Saturday, August 29, 2009

Map a UC appliance as a network drive

Remember the "good old days" of CCM 4.x? You could do almost anything on the box, because it was in fact a Windows 2000 box. However, this brought security and supportability issues.

With the introduction of the Linux-based Unified Communication appliance (CUCM 5.x), Cisco locked down the box. You can only access it via the admin web pages or a tailored command line.

One of the inconveniences is reviewing log files. On the old-school CCM 4.x, you could just view the logs in C:\Program Files\Cisco\Trace. On the new UC appliance, you have to use RTMT (Real-Time Monitoring Tool). This is especially annoying if you're testing your system: for each test, you have to download a new set of logs to your computer. (You may use 'Remote Browse' in RTMT, but its function is very limited.)

What if we can go back to the "good old days" and view the file system just like a Windows drive?

Take a look at the screenshot below. It's a CUCM 6.1.4 mapped as a drive on my Windows XP laptop. You can read/write files on CUCM just like on a local hard drive. If you're not a fan of vi, you may use your favorite editor (such as Notepad++ or UltraEdit). And you may use any Windows tools, such as Windows search, WinGrep, WinZip, etc. How's that? :)


To achieve this, you need two things: a root account on CUCM and software that can map an SFTP server as a network drive (such as sFTPdrive).

Wednesday, July 15, 2009

UC Appliance on VMWare

In theory, any software that runs on the x86 platform should be able to run on VMWare, unless the software vendor explicitly blocks it.

Cisco has many software products running on the x86 platform. We'll discuss Unified Communication products here - CUCM, CER, UCCX, etc.

CUCM

CUCM is the flagship of Cisco UC products. You may install CUCM on VMWare just fine. No hacking is required, but you'll receive a warning that VMWare is not "officially" supported, i.e. you shouldn't use it in production. Cisco plans to support VMWare in production in the future (probably with UC 8).

Though you may install CUCM on VMWare, there are some limitations.

Limitation #1 Licensing

Cisco limits the number of nodes and DLUs on a VMWare MAC address (3 nodes, 125 DLUs at the time I'm writing this blog). If you need more than 3 nodes and 125 DLUs, you may change the MAC address of the CUCM (change it in the VM guest, not in the VM host). Just Google the keywords "change MAC address on Linux" and you'll find the answer.

Limitation #2 SNMP Agent

You'll notice that the "SNMP Master Agent" service fails to start if CUCM was installed on VMWare. This will cause problems if you're testing CER (Emergency Responder), because CER needs an SNMP connection to CUCM to retrieve phone info. The workaround is to issue the command "/sbin/chkconfig snmpd off" in the CUCM root shell, then reboot the server.

Limitation #3 VMWare Acknowledgement

Since version 7, CUCM requires you to acknowledge the "VMWare agreement" during startup. If you reboot the CUCM remotely (either via the OS Admin web page or via CLI), the server will not boot up until you press the "Agree" button on the console. The workaround is to edit /usr/local/bin/base_scripts/hardware_check.sh,

change the line

if [ "$hwmodel" = "vmware" ];

to

if [ "$hwmodel" = "foobar" ];

On newer versions,

Change the line

    if isHardwareUnsupported || isOriginalHardwareUnsupported ; then

To

   #if isHardwareUnsupported || isOriginalHardwareUnsupported ; then
    if false ; then

CER

Besides the limitations mentioned above, CER has another limitation with VMWare.

CER retrieves information from CUCM via SNMP. This includes the machine type of CUCM. If CUCM is running on VMWare, the machine type will be "unsupported" from CER's point of view. The workaround is to edit the /usr/local/CER/etc/devices.xml file on the CER box. Add the following tag under the "CcmHost" family tag:

member OID="1.3.6.1.4.1.99.1.1.3.28" OIDNAME="vm-ware" CAPTION="VMWare"
Reboot CER or restart Phone Tracking Engine.

UCCX

So far, UCCX has been running on "Cisco OS" 2000/2003, which is in fact Windows 2000/2003. However, UCCX will refuse to install if a Cisco-specific registry key is missing. Follow the instructions on http://www.tek-tips.com/viewthread.cfm?qid=930128&page=1 to add the registry key.

The good news is that VMWare is supported since Cisco OS 2003.1.4. If you're using OS 2003.1.4 or newer, you don't need the registry hack.

Another tip: if you want to bypass the hard drive/memory requirement check, you may create an empty file named "crstest.ini" on C:\. Then CRS won't require a 72G HDD and 2G of memory to install. Of course, this is for testing purposes only.

Root shell on UC appliance

Many of the hacks above require root access to the appliance (CUCM, CER, etc.). Just use Google to find the answer. For example: http://www.blindhog.net/how-to-get-root-access-on-call-manager-56-server/

Saturday, June 20, 2009

Virtualize everything!

People in the network world should have heard about 'simulators'. A router simulator gives you a command line interface you can practice on.

With more and more network equipment moving to open source OSes (Linux) and the x86 platform, the word 'simulator' has taken on another meaning - virtualization. This means you can run the software (such as IOS, JUNOS, etc.) on an x86 computer just like it runs on the original hardware.

I still remember the excitement when I discovered that I could run JUNOS on a 486 PC back in 1999. I built my first JNCIE lab with nine of those PCs ($50 each).

Now, working in the Cisco Unified Communication team, one of the challenges I'm facing is the availability of equipment. Sure, we have access to IP phones, routers and switches. But getting mobile phones (BlackBerry, Nokia, WinMobile, Android, iPhone) and an ASA (Adaptive Security Appliance) for each engineer is not as easy as we thought.

Mobile phones are required to test CUMC (Cisco Unified Mobile Communicator). ASA is required to test CUMA (Cisco Unified Mobility Advantage), Phone Proxy and CUPS Inter-domain Federation.

Fortunately, with simulators, everything can be run on a PC (or a virtual machine).

Below is a BB simulator and ASA simulator running on a VM.


When running a network appliance image (such as JUNOS or ASA) on a PC or VM, one thing to notice is that you cannot use the monitor and keyboard as the console. Why? Because a router does not have a video card or a keyboard. The 'console' port is the serial port.

If you are using a PC, connect the console cable to the COM port.

If you are using a VM, you may direct the serial output to a named pipe.


For the VM that is running the appliance image (such as JUNOS or ASA), set the 'near end' to 'Server'. Set the 'far end' to 'A Virtual Machine'. You may use any name for 'pipe name'.

For the VM that is acting as the 'terminal' (such as WinXP or Linux), set the 'near end' to 'Client'. Set the 'far end' to 'A Virtual Machine'. The 'pipe name' needs to match the one you configured above. After this, it's as if a serial cable connects the terminal VM (WinXP) and the appliance VM (ASA).

Saturday, June 6, 2009

It's live - "Ask Expert" on Netpro

If you have questions regarding CUPS/CUPC, presence, OCS/MOC, etc., you may ask them on the Netpro forum. They have an "Ask the Expert" event this week for CUPS and presence.

Link as below:
http://forum.cisco.com/eforum/servlet/NetProf?page=netprof&forum=Unified%20Communications%20and%20Video&topic=Unified%20Communications%20Applications&topicID=.ee835d2&fromOutline=&CommCmd=MB%3Fcmd%3Ddisplay_location%26location%3D.2cd34986


Thanks!

Tuesday, May 26, 2009

UC Appliance Command Line - Part 2

Part 2 - Start, Stop, Restart

utils service list
This command lists all services on an appliance. It's usually used with the parameter 'page', so it pauses at each page.

utils service stop
utils service start
utils service restart
These commands are used to stop/start/restart services. For example, if you'd like to restart the "Cisco Tomcat" service, you type "utils service restart Cisco Tomcat".

utils system shutdown
utils system restart
These commands are used to shut down or restart the system.

utils system switch-version
This command is used to switch software versions (if you have two versions installed). For your information, a Cisco Unified appliance keeps two versions of software on the hard drive - one in the root partition, the other in the "PartB" partition. This gives you the option to fall back to an old version.

To see the versions installed, use the commands below:
show version active
show version inactive
Every time you run "utils system switch-version", it'll make the active partition inactive and make the inactive partition active.

Please note that each partition (version) has its own database, which means they don't share the same database (configuration). If you switch versions, you might lose any changes you made in the other version.
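To make the "each partition has its own database" point concrete, here is a tiny model of the behavior described above. It is only an illustration, not how the appliance is implemented: the class names, the version strings and the "new_dn" setting are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """One installed software version with its own private database."""
    version: str
    database: dict = field(default_factory=dict)


@dataclass
class Appliance:
    """Hypothetical model of an appliance holding two installed versions."""
    active: Partition
    inactive: Partition

    def switch_version(self) -> None:
        # The active partition becomes inactive and vice versa.
        self.active, self.inactive = self.inactive, self.active


# Version numbers below are placeholders, not taken from a real system.
box = Appliance(active=Partition("7.0.1"), inactive=Partition("6.1.4"))
box.active.database["new_dn"] = "2001"  # change made while 7.0.1 is active
box.switch_version()                    # now running 6.1.4
print(box.active.version, "new_dn" in box.active.database)
```

After the switch, the change is no longer visible because it lives in the other partition's database. Switching back makes it visible again.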

Wednesday, May 13, 2009

UC Appliance Command Line - Part 1

Cisco built many Unified Communication "appliances" based on Linux, such as CUCM (Communication Manager, a.k.a. CallManager), CUPS (Presence Server), CER (Emergency Responder, a.k.a. e911), etc.

Even though those appliances are built on Linux, Cisco does not give you shell access to the box (if you know about Linux, you know what a "shell" means). This is for security and maintenance purposes.

However, some of the maintenance work needs to be done via command line. Cisco built a customized command line interface (CLI) for UC appliances. Since most of the UC appliances share the same OS, they also share the same sets of CLI commands.

Mastering some of the CLI commands would make your life easier (or you may impress your colleagues or boss by showing off some of the rarely used commands).

Some basics:
0) To get access to the CLI, you need the "OS Administration" credential. "OS Administration" credential is stored in /etc/passwd file, while "Application Administration" credential is stored in database.

1) To access the CLI, you may either go to the server console or SSH to it. (Telnet is not supported for security reasons).

2) Cisco keeps adding new commands to CLI. Some of the commands are available on new versions (such as CUCM 7.x) but not available on old versions (such as CUCM 6.x).

3) You may always use question mark (?) and tab key to get help.

4) Unlike IOS, the UC Appliance CLI doesn't take abbreviations. You'll have to give the full command (either type it yourself or use the tab key).

5) "show" command is to display information

6) "set" and "unset" commands are to change configuration

7) "utils" command is to run maintenance utilities (such as system reboot, backup/restore, etc.)

8) "run sql" command is to run SQL query against the database.

Part 1: Getting system info

show status

This command will give you the following information:
  • Hostname of the box
  • Current date/time on the box
  • Current time zone configured on the box
  • Current version
  • How long the system has been up and running
  • CPU/Memory/Hard Disk usage
For example, suppose you cannot access the web interface of your CUCM box, so you open a case. The TAC engineer asks you what version the CUCM is. If you can access the CLI, you can find out the version. This could speed up the resolution.

show hardware

This command would give you the hardware information (such as serial number of the box). If you need to find out the serial number remotely, you may SSH to the box and use this command. Serial number is critical for entitlement and tech support.

show network eth0 detail
This gives you the following information:

  • IP address of the box
  • MAC address
  • DNS
  • Gateway
This command is useful if you need to check the MAC address quickly (for licensing purpose).

To see all "show" commands, type "show ?"

Thursday, May 7, 2009

"Ask The Expert" on Cisco NetPro Forum

Sorry I haven't posted any new articles lately.

I'll host an "Ask The Expert" event on the Cisco NetPro forum June 8 - June 12.

For those new to the forum, "Ask The Expert" is a periodic event in which a subject matter expert (SME) answers questions on a specific topic (such as licensing, contact center, video conferencing, etc.). Of course, I'll be answering questions on CUPS/CUPC and presence. Bring your toughest questions! :)

Though I cannot guarantee every question be answered immediately, I'll make sure they get to the right people.

We believe Unified Communication will make our life better (though the process of deploying it might make your life tougher... LOL)

Saturday, April 4, 2009

Book, charity and life

I'm a customer support engineer supporting Cisco Unified Communication products.

I built this blog and wrote the book "Deploying Cisco Unified Presence" with the intention of helping my customers (and my employer). The book was priced very cheaply ($39.99). After deducting the manufacturing and distribution costs, the revenue I receive from retailers is $5.16 for each copy sold (it's higher if the copy is sold from the publisher's website).

For some reasons, my motive was questioned. (Sorry I can't disclose more details. But it makes me feel really bad.)

Thus I make an announcement here (and Lulu.com), that all (100%) revenue from this book will go to "American Red Cross International Relief Fund".

I'll try my best to answer any technical questions you have. I'll be hosting an "Ask the Expert" session on the Cisco NetPro Forum in June 2009 (on CUPS/CUPC products).

God bless America. God bless you!

Donation receipts for Q1 2009:


Tuesday, March 31, 2009

"Hardware not supported"

When installing Cisco Unified Communication products (CUCM, CUPS, CER, UC, CUMA, etc.), you might get a message saying that the hardware is not supported. It's a pain in the butt, especially when you (or your client) spent quite a few $$$ on a brand new server and got "not supported".

History

Cisco itself did not manufacture servers (not until lately, with the introduction of 'Unified Computing'). Cisco OEMs x86 servers from IBM, HP and Dell and brands them as "Cisco MCS" (Media Convergence Server). Cisco also labels those servers with its own model numbers. For example, MCS-7845-H2 is actually an HP DL380 G5 server.

Cisco recommends customers purchase "Cisco MCS" server for Unified Communication products. The major advantage of that is it guarantees the compatibility between hardware and software. For example, you may find the CUCM compatibility matrix here: http://www.cisco.com/en/US/prod/collateral/voicesw/ps6790/ps5748/ps378/prod_brochure0900aecd8062a4f9.html


The mess

Even though Cisco recommends customers purchase MCS servers, it does not prohibit people from buying "equivalents" from the manufacturers directly (i.e. from IBM/HP/Dell).

If you decide to buy an "equivalent", be careful: Cisco has very strict requirements on that. If you don't order the right parts, the server could come up as "not supported" (i.e. the software install will fail).

What does it look for?

When the software is being installed, it usually looks for the following attributes on the system:
1) Machine type (model number in BIOS)
2) Hard drive and RAID card
3) CPU speed
4) Memory

Frequently seen issues:

#1 You have an MCS I-series ("I" stands for IBM) server from Cisco. One day, the motherboard burnt out. IBM replaced the motherboard (yes, it's IBM who services the server, though you bought it from Cisco).

You tried to reinstall CUCM 6.1.2, but it kept saying the server was not supported.

Cause of the problem:
In the IBM BIOS, there's a field called "Machine Type". The generic IBM machine type is different from the Cisco MCS machine type.

Solution:
1) Obtain a BIOS update disk for the server from here (2000.4.4 supported version 1.14, this link is for 1.17, please note that 1.17 has NOT been tested): http://www-304.ibm.com/jct01004c/systems/support/supportsite.wss/docdisplay?lndocid=MIGR-57074&brandind=5000008

2) Boot to the disk and flash the BIOS. During the BIOS flash, you should receive a prompt as to whether or not you would like to change the MTM. Please select yes, and enter the correct machine type (e.g. 884xxxx)

Note: the above should be performed by a Cisco TAC engineer.

#2 You have an MCS H-series ("H" stands for HP) server from Cisco. One day, the motherboard burnt out. HP replaced the motherboard (yes, it's HP who services the server, though you bought it from Cisco).

You tried to reinstall CUPS 6.0.4, but it kept saying the server was not supported.

Cause of the problem:
Your old motherboard had a 2.13 GHz CPU. Since the 2.13 GHz CPU was end of life, HP gave you a 2.8 GHz CPU and thought you'd be happy with that.

Resolution:
Though you might be happy with the faster CPU, the software was not happy at all. Based on the machine type, the software expects a 2.13 GHz CPU.
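The CPU case above can be pictured as an exact-match lookup keyed by machine type. The sketch below is purely illustrative: the real installer's logic is not shown in this post, and the machine type name "MCS-H-example", the table layout and the function name are all invented for the example.

```python
# Hypothetical expected-hardware table keyed by machine type.
# Only the 2.13 GHz figure comes from the case described above.
EXPECTED = {
    "MCS-H-example": {"cpu_ghz": 2.13},
}


def validate(machine_type, detected):
    """Return a list of mismatches; an empty list means 'supported'."""
    expected = EXPECTED.get(machine_type)
    if expected is None:
        return ["machine type not recognized"]
    errors = []
    for attr, want in expected.items():
        got = detected.get(attr)
        # Exact match: a faster CPU still counts as a mismatch.
        if got != want:
            errors.append(f"{attr}: expected {want}, found {got}")
    return errors


print(validate("MCS-H-example", {"cpu_ghz": 2.8}))
```

The point of the sketch is that "better" hardware does not pass a check that looks for one specific value.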

#3 You ordered a server from HP and made sure the CPU, memory and hard disks met the Cisco requirements. But the software still said "hardware not supported".

Root cause of the problem:
You forgot to order a "PCI-X/E Mixed Riser Option" from HP.

How do we find the "equivalent" of an MCS server (so you can order it from HP/IBM)?

Step 1: Go to the product compatibility page and find a supported MCS server. e.g. for CUPS 6, go to http://www.cisco.com/en/US/docs/voice_ip_comm/cups/7_0/english/compatibility/cupcompatibility.html#wp77512. Let's say you picked MCS-7825-I3-IPC1, which is good for a CUPS 6.0.4 new install.

Step 2: Go to this page and click on IBM or HP link (http://www.cisco.com/en/US/products/hw/voiceapp/ps378/product_solution_overview_list.html). MCS-7825-I3-IPC1 is an IBM server. So you would click on the IBM link.

Step 3: Search for the 2nd and 3rd part of the MCS model number. e.g. 7825-I3 in this case. You'll find something like this:

IBM x3250 with Intel 3050 Xeon 2.13-GHz Processor
"... It is the configuration equivalent of the Cisco MCS 7825-I3"

And you'll find the parts list for it:

Table 19. Non-Country Specific Hardware for IBM x3250 with Intel 3050 Xeon 2.13-GHz Processor*

Quantity  IBM Type-Model Feature  Description
--------  ----------------------  -----------
1         4364-AC1** or 4365-AC1  IBM System x3250
1         0992                    CPU Retention Module
1         1128                    x3250 Revision 1 System Planar
1         1272                    Dual Core Intel Xeon 3050 (2.13 GHz / 2M L2)
2         1903                    1-GB DDR2 667 SDRAM DIMM Memory
1         2007                    BIOS GBM
1         2046                    Front Bezel
1         2088                    3.5inch DASD Cage
2         2091                    SATA Filler 3.5inch
1         2268                    Base Hardware
1         4144                    CDRW/DVD Combo V UltraBay Enhanced
1         4256                    Rack Mount Kit
1         4367                    Simple Swap SATA RAID Kit
2         5291                    160GB 7200 RPM 3.5 inch Simple Swap SATA HDD
1         9011                    Internal RAID - Cabled only - setup by Customer


You'll have to order each piece on the list.
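The search key from Step 3 (the 2nd and 3rd parts of the MCS model number) can be derived mechanically. Here is a small sketch; the function name is mine, and the vendor lookup only knows the two series letters mentioned in this post ("I" for IBM, "H" for HP).

```python
VENDORS = {"I": "IBM", "H": "HP"}  # series letters mentioned in this post


def search_key(model: str):
    """Return (search key, vendor) for a model number like MCS-7825-I3-IPC1."""
    parts = model.split("-")
    if len(parts) < 3 or parts[0] != "MCS":
        raise ValueError(f"unexpected model number: {model!r}")
    key = f"{parts[1]}-{parts[2]}"                 # 2nd and 3rd parts
    vendor = VENDORS.get(parts[2][:1], "unknown")  # first letter of 3rd part
    return key, vendor


print(search_key("MCS-7825-I3-IPC1"))
print(search_key("MCS-7845-H2"))
```

The vendor tells you which link to click in Step 2, and the key is what you search for on that page.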

If we got a "hardware not supported" message, how do we know which part is not meeting the requirements?

To see the reason for the failure, you need to review the /tmp/hw_validation_err file on the hard drive. You may press ALT+F2 while the installation is in progress (before it gets to the halt state). This takes you to the Linux command prompt. Type the command below to display the content of the file:

cat /tmp/hw_validation_err

Other useful files include: /tmp/hw_info, /tmp/anaconda.log and /tmp/install.log.

Thursday, March 26, 2009

Licensing

Licensing is another mysterious area in Unified Communication. Each product has its own licensing model.

CUCM (CallManager)

CUCM license resides on publisher. The 'host ID' in license file needs to match the MAC address of the publisher.

There are three different types of license for CUCM: node license, feature license and device license (DLU). A node license is for a server node. A feature license is required for upgrades, or for CUCM 7 and above. A device license is required to configure devices (such as phones).

CER (Emergency Responder or e911)

CER license resides on publisher and subscriber (if you have CER subscriber). License file on pub needs to match the MAC address of pub. License file on sub needs to match the MAC address of sub.

If you receive a "license expired" message on a two-node CER deployment, 99% of the time you don't have a valid license for the sub (wrong MAC address).
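Since a wrong MAC address is the usual culprit, it helps to compare the host ID and the server's MAC in a normalized form, because the two are often written with different separators or letter case. The sketch below makes no assumption about the license file format; it only compares two strings you have already looked up, and both MAC values in the example are made up.

```python
import re


def normalize_mac(mac: str) -> str:
    """Strip common separators and upper-case the 12 hex digits."""
    digits = re.sub(r"[:\-. ]", "", mac.strip())
    if not re.fullmatch(r"[0-9A-Fa-f]{12}", digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits.upper()


def license_matches(host_id: str, server_mac: str) -> bool:
    """True if the license host ID refers to the same MAC as the server."""
    return normalize_mac(host_id) == normalize_mac(server_mac)


# Hypothetical values for illustration only.
print(license_matches("005056abcdef", "00:50:56:AB:CD:EF"))
print(license_matches("005056abcdef", "00:50:56:AB:CD:F0"))
```

For a two-node CER deployment you would run the comparison twice: pub license against pub MAC, and sub license against sub MAC.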

CUPS (Presence)

Like CUCM, CUPS also has node license and device license. Node license resides on CUPS publisher. Device license resides on CUCM (it's just a regular DLU license).

Node license could contain proxy or PE (Presence Engine) or both.

Wednesday, March 11, 2009

CCIE Voice Lab v3

Cisco is going to change the CCIE Voice Lab in mid-July 2009. Major changes are:

1) Remove analog devices (such as VG248, ATA)
2) Remove CatOS (Catalyst 65xx)
3) Replace CCM with CUCM 7 (Linux Appliance)
4) Replace Unity with Unity Connection 7 (Linux Appliance)
5) Add CUPS 7 (Linux Appliance)
6) Add SIP phones

IPExpert (a training company) provides some practice labs. The topology would be like below:



Here are some tips if you're going to build your own lab.

PSTN-WAN simulator

As shown in the diagram above, we have a PSTN cloud and a Frame-Relay cloud. We can use a single router to simulate that. And this router could also be the terminal server.

I would choose a Cisco 2811 router with the following modules:

PVDM2-16 (or other PVDM resource).
HWIC-4T : 4-port Serial module for Frame-Relay simulation.
VWIC2-2MFT-T1/E1 : 2-port T1/E1 module for PSTN simulation. You need at least three ports, so you may order two of these. Or, to save some money, order one 2-port module and one 1-port module (VWIC2-1MFT-T1/E1).
HWIC-8A & CAB-OCTAL-ASYNC : 8-port async module for terminal service

Don't forget the female serial cable (CAB-SS-V35FC). You need at least three.

On HQ, BR1, and BR2 Gateways:

PVDM2-8 (or other PVDM2)
HWIC-1T & CAB-SS-V35MT
VWIC2-1MFT-T1/E1

On BR1 and BR2 Gateways
HWIC-4ESW-POE : Ethernet module to power up the IP phones.

On BR2 Gateway
NM-CUE : Unity Express Module


Switch

Any Cisco IOS switch that supports PoE.

Servers

I recommend using virtual machine(VM) for the lab:

1. If you don't use VM, you're going to need the expensive Cisco/HP/IBM server to install CUCM, UC, CUPS.
2. If you don't use VM, you're going to need license files for CUCM, UC, CUPS.
3. If you don't use VM, you're going to need 5 servers and a couple of workstations (to simulate soft phone and run CUPC/CAD)
4. With VM, you can easily clone VMs, which saves you lots of time.

I myself use VMWare. I have no experience with Microsoft Virtual PC (HyperV).

Here are some caveats you need to know about VMWare:

1) You may either use "VMWare Server" (a.k.a. GSX) or "VMWare Infrastructure" (a.k.a. ESX)
2) GSX is free while ESX requires a commercial license
3) VMs on GSX have issues with NTP. This is a known issue.
4) ESX does not support audio devices. Your VM might need an audio device to launch a soft phone (such as CIPC or IP Blue). Try googling "virtual audio cable" and you should find a solution.

Following is my lab set up with four 2811 routers, one 3560 switch, one PC (8G RAM, 750G HDD):


When creating a VM, make sure you choose RHEL 4 (Red Hat), 32-bit, one CPU. For CUCM 7, you need at least 1G RAM and a 60G hard drive. For CUPS 7, you need 1296M RAM and a 75G hard drive.
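Before creating the VMs, it's worth adding up what they need against what the host has. The sketch below uses only the figures from this post (the 8G/750G PC and the CUCM 7 and CUPS 7 minimums). It ignores the host OS and hypervisor overhead and any other VMs in the lab, so treat the result as a lower bound rather than a sizing guide.

```python
HOST = {"ram_mb": 8 * 1024, "disk_gb": 750}  # the lab PC described above

# Minimums quoted in this post.
VMS = {
    "CUCM 7": {"ram_mb": 1024, "disk_gb": 60},
    "CUPS 7": {"ram_mb": 1296, "disk_gb": 75},
}


def totals(vms):
    """Sum RAM and disk across all VMs."""
    return {
        "ram_mb": sum(vm["ram_mb"] for vm in vms.values()),
        "disk_gb": sum(vm["disk_gb"] for vm in vms.values()),
    }


def fits(host, vms):
    """True if the summed VM requirements do not exceed the host."""
    need = totals(vms)
    return all(need[key] <= host[key] for key in need)


print(totals(VMS), fits(HOST, VMS))
```

Add one entry per extra VM (Unity Connection, workstations for soft phones, and so on) with whatever figures apply to your versions.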

Tuesday, March 3, 2009

Change Notification

On Cisco Unified Communication products, there's a term called "Change Notification".

What is "Change Notification"? And how does it affect products' functionality?

Typical symptom of "Change Notification Issue" is: the changes you made didn't seem to take effect, including:

#1 You add a new DN (Directory Number) to CUCM. But you got fast busy when calling that DN.
#2 You try to reset the phone from GUI. But the phone didn't get reset.
#3 You updated the information on CUCM (such as PIN#, line association, etc.). But CUPS didn't get those changes.
...

"Change Notification" is one component notifying another component that changes have been made. The other component should react to this notification.

For example, when you add a new DN to CUCM, the change is made to the database. The database component should send a notification to the call routing component, so the call routing function can work properly (calls can be routed to the new DN).

If the change notification fails (or is delayed), the call routing component will use outdated information to route calls. Thus calls to the new DN will fail.

How do you know if there's a change notification issue? If the components are on the same box, you may use the command "show tech notify".

An example output would look like this:

32 I 0 P 118 H 118 T 118 S 80 ccm
...
38 I 0 P 119 H 119 T 119 S 27 EPAS_SyncAgent[10.88.229.209]:32958

The lines that begin with numbers (32, 38, etc.) represent clients (applications) subscribed to change notification.

"32" means it is the 32nd client slot on this server.
"I 0" means there are 0 messages in shared memory to be processed by this client.
"P 118" means 118 messages have been processed.
"H 118" means the "Head" message position is 118.
"T 118" means the "Tail" message was in position 118.
This is the optimal situation: 118-118=0 (nothing to be processed).

“S 80 ccm” means there are 80 tables subscribed by this client and the client name is ccm (call processing).
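If you have a long "show tech notify" output, the field layout explained above is easy to parse with a script. This is a minimal sketch based only on the two sample lines in this post; the field names in the result (in_queue, backlog, etc.) are my own labels, and other software versions may print the lines differently.

```python
import re

LINE_RE = re.compile(
    r"^\s*(?P<slot>\d+)\s+I\s+(?P<in_queue>\d+)\s+P\s+(?P<processed>\d+)"
    r"\s+H\s+(?P<head>\d+)\s+T\s+(?P<tail>\d+)"
    r"\s+S\s+(?P<tables>\d+)\s+(?P<client>\S.*?)\s*$"
)


def parse_notify_line(line):
    """Parse one client line; return None for lines that don't match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    fields = {k: int(v) for k, v in match.groupdict().items() if k != "client"}
    fields["client"] = match.group("client")
    # Head minus tail, as in the 118-118=0 example above.
    fields["backlog"] = fields["head"] - fields["tail"]
    return fields


sample = [
    "32 I 0 P 118 H 118 T 118 S 80 ccm",
    "...",
    "38 I 0 P 119 H 119 T 119 S 27 EPAS_SyncAgent[10.88.229.209]:32958",
]
for line in sample:
    print(parse_notify_line(line))
```

A client whose in_queue or backlog stays above zero is the one to look at first.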

You may also use RTMT to see the CN (Change Notification) queue. If you see a non-zero value in the queue, either the server is busy or you have CN issues. Restarting the corresponding process (such as ccm) normally cleans up the queue and solves the problem.


Change notification across different boxes is a little more complex. In this scenario, CN is sent via an IPSEC tunnel. IPSEC is controlled by Cluster Manager. The trust relationship is established during installation (when you add a box to the cluster).

In order to add a new server into a cluster, two requirements have to be met:
1) The hostname of the new box needs to be present in the Publisher's server list or application list (this is done manually or automatically, depending on the server you are trying to add).
2) You need to know the cluster "secret password". The password needs to be entered during installation (of the new box).

If either one is changed after installation, the trust is broken. You can verify that by looking at the "Cluster Manager" logs.

If trust is broken, change notification won't work. For example, changes made on CUCM won't propagate to CUPS. Restarting Sync Agent forces a synchronization, which is not affected by change notification or IPSEC.

Friday, February 27, 2009

Phone reset issue

Phone reset is the most common yet nastiest issue (especially when it's "sporadic" or "intermittent").

Please note the difference between reset and re-register.

If the phone loses its connection with CUCM, it'll try to re-register. "Lost connection" means the phone missed three keepalives from CUCM in a row. By default, those keepalives are sent every 30 seconds. You may verify that from the "Cisco CallManager" trace. If CUCM sent keepalives but the phone didn't receive them, it's usually a network issue.
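If you pull keepalive timestamps out of a trace, you can flag the gaps long enough to mean three keepalives in a row were missed. The sketch below assumes you already have the times (in seconds) at which keepalives were seen; the function names and the sample timestamps are made up, and real timers have some jitter that this simple integer division ignores.

```python
KEEPALIVE_INTERVAL_S = 30  # default interval mentioned above
MISSED_LIMIT = 3           # three in a row means "lost connection"


def missed_keepalives(gap_s, interval_s=KEEPALIVE_INTERVAL_S):
    """Number of keepalives that should have arrived inside a gap."""
    return max(0, int(gap_s // interval_s) - 1)


def lost_connection_points(seen_at, interval_s=KEEPALIVE_INTERVAL_S):
    """Return the timestamps after which MISSED_LIMIT keepalives were missed."""
    return [
        prev
        for prev, cur in zip(seen_at, seen_at[1:])
        if missed_keepalives(cur - prev, interval_s) >= MISSED_LIMIT
    ]


# Keepalives seen at 0, 30 and 60 s, then nothing until 180 s.
print(lost_connection_points([0, 30, 60, 180, 210]))
```

In the sample, the keepalives expected at 90, 120 and 150 seconds never show up, so the gap after 60 is flagged.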

Reset usually happens when the IP address on the phone is lost. In that case, the phone needs to go through a reset process to acquire a new IP address. This is usually a DHCP (server) problem.

When the DHCP client reaches the half-life of its lease, it'll try to renew the lease with the DHCP server. e.g. if the DHCP lease is 72 hours, the client will try to renew at 36 hours. In a normal situation, the DHCP server will agree to renew, so the client can keep its IP address.

If the DHCP server explicitly refuses the renewal, the DHCP client has to release the IP. This is unusual and is probably a problem with the DHCP server.

On phone console log, you would see something like below:

NOT 08:11:10.854439 DHCP: Restart - delay = 0
NOT 08:11:10.866112 DHCP: Sending Release...
NOT 08:11:10.894059 DHCP: dhcpSendReq: status 0x12300000
NOT 08:11:10.894946 DHCP: Sending Request...
NOT 08:11:10.899614 DHCP: NAK received
NOT 08:11:10.901451 DHCP: clear info - IP = 10.2.16.37, state = 2
NOT 08:11:10.902400 DHCP: Sending Release...

"NAK received" means the DHCP server refused to renew the lease.

Next time, if your phones reset periodically (say, every 36 hours), check the DHCP lease time. If the cycle matches the "half-life", it's most likely a DHCP server issue.
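Both checks (finding the NAK in the phone console log and comparing the reset cycle with the lease half-life) are easy to script. The sketch below uses log lines copied from the sample above; the function names and the one-hour tolerance are my own assumptions, and other phone firmware may format the log differently.

```python
import re

LOG = """\
NOT 08:11:10.866112 DHCP: Sending Release...
NOT 08:11:10.894946 DHCP: Sending Request...
NOT 08:11:10.899614 DHCP: NAK received
NOT 08:11:10.901451 DHCP: clear info - IP = 10.2.16.37, state = 2
"""

NAK_RE = re.compile(r"^\w+\s+(\d{2}:\d{2}:\d{2}\.\d+)\s+DHCP: NAK received", re.M)


def nak_times(log_text):
    """Return the timestamps of every 'NAK received' line."""
    return NAK_RE.findall(log_text)


def renewal_hours(lease_hours):
    """The client tries to renew at half of the lease time."""
    return lease_hours / 2


def matches_half_life(reset_interval_hours, lease_hours, tolerance_hours=1.0):
    """True if the reset cycle is within tolerance of the lease half-life."""
    return abs(reset_interval_hours - renewal_hours(lease_hours)) <= tolerance_hours


print(nak_times(LOG))
print(matches_half_life(36, 72))
```

If the log shows NAKs and the reset cycle lines up with the half-life, look at the DHCP server before anything else.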

Wednesday, February 25, 2009

KISS - Keep It Simple Stupid

Most IT guys would know about KISS rule (Keep It Simple Stupid). However, not too many really understand and utilize it. Let's take a look at some Cisco Unified Communication products and see how we can utilize KISS rule.

The most frequently seen problem description is "it doesn't work".

"It" could mean lots of things. "It" could involve different products from different vendors (such as CUCM/IPCC/CUPS from Cisco, MOC from Microsoft, PBX from Avaya, T1 trunks from AT&T, etc.).

In order to simplify the problem, we have to narrow it down quickly.

For example, if a customer said "My call center agents cannot make phone calls", I would ask "can you make calls from IP phone to IP phone in the same office?". This question could potentially eliminate call center software, voice gateway, PSTN and codec issues. If you didn't ask, you'd have to troubleshoot those items one by one (assuming you know how to troubleshoot those items)

Another example is network issues. All Unified Communication software relies on network connectivity. It won't function properly if the network doesn't. Sometimes, a network issue is not as obvious as you might think. For example:

1) Windows Firewall service was stopped. But traffic wouldn't pass through until you explicitly open ports on it. (Hard to believe. But it happens)

2) You're not using VPN. But the VPN client was running as a service and had its firewall option turned on. (works as designed)

3) You claimed there was no firewall in the network, but there's a FWSM (Firewall Services Module) on the switch.

4) You opened all ports on the ASA (Firewall and VPN), but CUPC still won't work. That is because one of the ASA bugs prevents large SIP messages from passing through.

...

To troubleshoot network issues, you have to:
a) Have visibility into every component in the network
b) Be very good at all network layers (from physical layer to application layer)
c) Know how to use a sniffer (such as Wireshark)

The difficult part is: sometimes you wouldn't think it's the network because it's not that obvious. Hence you wouldn't go down that path at all. You have to use KISS rule to find it out.

Example #1:
Customer: "My CUPC doesn't work."
You: "Doesn't work for all users? Or for some users?"
Customer: "For those users working from home."
You: "If those users were in office, would CUPC work for them?"
Customer: "Yes."

Now you know the problem is outside CUPC. Probably on the network (VPN?)

Example #2:
Customer: "My CUPC doesn't work."
You: "Doesn't work for all users? Or for some users?"
Customer: "It works for John but doesn't work for Mary. And they are both in the same office."
You: "On John's computer, can you log into CUPC with Mary's account? See if it works?"
Customer: "Yes, it works."

Now you know the problem is outside CUPC. Probably on Mary's computer (Firewall?)

Some KISS rules for Cisco UC (Unified Communication) products:

#1 If you don't know if it's case sensitive, assume it is.
This becomes a problem when Cisco moved from Windows platform to Linux.

#2 Because of #1, try to use lower case as much as you could.
Some people use upper case for cosmetic purposes. This could potentially cause problems that take weeks to troubleshoot.

#3 Eliminate dependencies as much as possible.

Example A: When you're installing CUCM, you have the option to use DNS, NTP, etc. Do NOT use them. If you use them, the installation might fail if those components aren't configured properly. You'll have a chance to configure them after install. I can't tell you how many problems are caused by DNS (even after install).

Example B: Don't use the same "service account" for different applications. For example, say you used the same Active Directory account for CUCM LDAP integration, CUPC LDAP search and CUPS Calendar. If the CUPS admin changes the account password (for whatever reason), it breaks CUCM and CUPC.

Example C: Get rid of CUCM subscribers during Windows-to-Linux migration. When you migrate from CCM 3.x/4.x (Windows) to CUCM (Linux), DB replication is always a headache. DB replication will fail if the hostname or IP address was changed during migration (or if some other change was made between Pub and Sub). To avoid those headaches, remove the subscribers from the cluster before migration. With a single server (Publisher) in the cluster, your chance of failure is far lower than with a 10-server cluster. After migration, you may add the subscribers to the cluster one by one.

#4 Be a "minimalist"
Sales people tend to sell all the "bells and whistles" to the customer. Sure, those are the selling points. But as an engineer, if you want to get the job done smoothly, start with the minimum.

Example A: Use TCP instead of TLS.
Sure we want the security of TLS. But don't try to run before you can walk. Make sure the product is working before attempting TLS. If it then fails with TLS, you know where the problem is.

Example B: Use simple passwords.
Sure we want the security of long, complex passwords (how about 1024 characters long?). But for installation and troubleshooting purposes, keep them short and simple (don't use special characters).

Example C: Build a simple test bed.
I've seen some integrators try to deploy their first CUPS/CUPC installation over a VPN (because they were not onsite). This is a bad idea unless the VPN is what you want to test. If something doesn't work, you won't know if it's the VPN or CUPC.

Same for the computers. Instead of testing on a computer with a bunch of custom-installed software, test on a clean, freshly installed computer. Stick with Windows XP. Stay away from Vista unless you understand UAC, Windows Defender (or offender?), and other security "features".

Thursday, February 12, 2009

Decrypt CUCM version numbers

In an ideal world, version 6.x is better than 5.x, version 7.x is better than 6.x, and so on. However, we're not in an ideal world.

Cisco builds different "trains" in parallel. Currently, the active trains for CUCM are 5.x, 6.x and 7.x.

This "multiple trains" approach is a compromise between market demands and compatibility. In order to support new features, big changes need to be made to the infrastructure (e.g. database schema). Sometimes, the changes are so big that it's impossible to be compatible with previous versions. So they introduce a totally different "train" to lower the risk.

It's really hard to tell which train is the "best". Of course, a newer train has more features. But it also has more requirements. For example, CUCM 6.x is compatible with CUPS 6.x and 7.x. But CUCM 7.x is only compatible with CUPS 7.x.

On each train, there are many "sub-versions". For example, on the 6.x train, you have 6.1.1, 6.1.2 and 6.1.3. Read the release notes carefully. Some versions can't be upgraded to another train. For example, CUCM 6.1.3 can't be upgraded to 7.0.x (because of a different database schema).

On each sub-version, there are also "build-numbers". e.g. 6.1.2.1000, 6.1.2.2000, etc. Build-number is the most confusing part.

Generally speaking, build numbers should increase in steps of 1000, such as 6.1.2.1000, 6.1.2.2000, etc.

CUCM is built on a Linux OS. Whenever Cisco releases an OS security patch, they increase the build number by 1000. This is called a PSIRT patch.

Remember CUCM is an application running on Linux. An OS patch does not contain any CUCM bug fixes. Any bug fixes would be in an ES (Engineering Special). ES versions are identified by the last three digits of the build number (e.g. 6.1.2.1112).

The OS team and the CUCM (application) team are two different teams. When the OS team releases OS patches, they don't include any application patches at all. But the build number is still increased by 1000.

Quiz: 6.1.2.2000 and 6.1.2.1112, which one is "better"?

Answer: it depends on how you define "better". Most people would say "less buggy" is better, and by "less buggy" they mean "fewer bugs in CUCM". If that's the case, 6.1.2.1112 is better, because it has an ES number of 112, which means it fixes quite a lot of bugs, while 6.1.2.2000 has no CUCM bug fixes at all (it contains OS patches though).

Confusing enough? I don't know which genius invented this versioning scheme. But that's the way it is. If you try to "upgrade" 6.1.2.1112 to 6.1.2.2000, it'll fail with some vague error messages. You have to open a TAC case to understand why it failed.
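To make the quiz concrete, here is a small Python sketch that splits a version string the way this post describes: the thousands part of the build number as the OS patch level and the last three digits as the ES number. The names (decode_version, CucmVersion) are made up for illustration, and the split is my reading of the scheme, not an official Cisco tool.

```python
from typing import NamedTuple


class CucmVersion(NamedTuple):
    train: int        # major version, e.g. 6
    sub_version: str  # e.g. "6.1.2"
    os_patch: int     # thousands part of the build number (OS/PSIRT level)
    es: int           # last three digits of the build number (ES level)


def decode_version(version: str) -> CucmVersion:
    """Split a four-part version such as '6.1.2.1112' into its pieces."""
    major, minor, maint, build = (int(part) for part in version.split("."))
    return CucmVersion(
        train=major,
        sub_version=f"{major}.{minor}.{maint}",
        os_patch=build // 1000,
        es=build % 1000,
    )


for v in ("6.1.2.1112", "6.1.2.2000"):
    print(v, decode_version(v))
```

Read this way, 6.1.2.2000 has the higher OS patch level but an ES number of 0, while 6.1.2.1112 carries ES 112.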

Interesting? Yeah, that's the way to keep TAC engineers' jobs. :)

Sunday, February 8, 2009

NTP - Network Time Protocol

NTP is critical in Cisco voice products. Time synchronization not only provides consistent timestamps in trace files, it is also a mandatory requirement for some components.

Architecture

On a CUCM publisher, you may choose to use the internal clock (computer hardware clock) or an external clock (an NTP server, such as a router).

Regardless of your choice, all other servers in the cluster use NTP to synchronize time with the publisher. In other words, NTP is only configurable on the publisher.

Basic concepts

http://en.wikipedia.org/wiki/Network_Time_Protocol


Tips

1. Before you configure NTP on the publisher, set the local time as accurately as possible. This shortens the time to synchronize after you configure NTP.

2. Be patient after you configure NTP. It might take hours to synchronize, depending on the time difference between the publisher and the NTP source. This works as designed, to comply with the IETF RFC.

3. If NTP is configured on the publisher, subscribers won't synchronize to the publisher until the publisher is in sync with its NTP source. If you're having problems syncing the publisher to the NTP source but you want the whole cluster in sync, disable NTP on the publisher.

Frequently used commands

utils ntp status

ntpd (pid 3638) is running...

remote refid st t when poll reach delay offset jitter
==============================================================================
127.127.1.0 127.127.1.0 10 l 9 64 377 0.000 0.000 0.008
*171.68.10.80 64.103.34.14 2 u 921 1024 377 38.233 3.336 1.182
+171.68.10.150 10.81.254.202 2 u 988 1024 377 37.044 3.252 12.236


synchronised to NTP server (171.68.10.80) at stratum 3
time correct to within 60 ms
polling server every 1024 s

Current time in UTC is : Sun Feb 8 14:38:36 UTC 2009
Current time in America/Chicago is : Sun Feb 8 08:38:36 CST 2009


The output above tells you:
1. The box is synchronized to 171.68.10.80, which is a stratum 2 source. That puts the box itself at stratum 3.
2. The internal clock is at stratum 10 (the box won't synchronize to any time source with a stratum equal to or greater than 10)
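If you collect this output from many servers, a few lines of Python can pull out the same facts. This is only a sketch: it assumes the ten-column peer table shown above, treats a leading "*" as the selected peer, and parse_peers / selected_peer are hypothetical helper names.

```python
SAMPLE = """\
remote refid st t when poll reach delay offset jitter
==============================================================================
127.127.1.0 127.127.1.0 10 l 9 64 377 0.000 0.000 0.008
*171.68.10.80 64.103.34.14 2 u 921 1024 377 38.233 3.336 1.182
+171.68.10.150 10.81.254.202 2 u 988 1024 377 37.044 3.252 12.236
"""


def parse_peers(text):
    """Return one dict per peer line of the 'utils ntp status' table."""
    peers = []
    for line in text.splitlines():
        fields = line.split()
        # Peer lines have 10 columns and a numeric stratum in column 3.
        if len(fields) != 10 or not fields[2].isdigit():
            continue
        remote, tally = fields[0], ""
        if remote[0] in "*+-#":          # tally code glued to the address
            tally, remote = remote[0], remote[1:]
        peers.append({
            "remote": remote,
            "selected": tally == "*",
            "stratum": int(fields[2]),
            "offset_ms": float(fields[8]),
        })
    return peers


def selected_peer(peers):
    """Return the peer marked with '*', or None if nothing is selected."""
    for peer in peers:
        if peer["selected"]:
            return peer
    return None


peer = selected_peer(parse_peers(SAMPLE))
print("synced to", peer["remote"], "so local stratum is", peer["stratum"] + 1)
```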

Other commands include:

utils ntp config
utils ntp restart
utils ntp start

Troubleshooting

utils network capture port 123
Executing command with options:
size=128 count=1000 interface=eth0
src= dest= port=123
ip=

08:56:01.125718 cm6-sub.ntp > cm6-pub.ntp: v4 client strat 4 poll 10 prec -18 (DF) [tos 0x10]
08:56:01.125965 cm6-pub.ntp > cm6-sub.ntp: v4 server strat 3 poll 10 prec -17 (DF) [tos 0x10]
08:56:18.270720 cm6-pub.ntp > ntp-sj1.ntp: v4 client strat 3 poll 10 prec -17 (DF) [tos 0x10]
08:56:18.308956 ntp-sj1.ntp > cm6-pub.ntp: v4 server strat 2 poll 10 prec -18
08:57:24.271526 cm6-pub.ntp > ntp-sj2.ntp: v4 client strat 3 poll 10 prec -17 (DF) [tos 0x10]
08:57:24.309282 ntp-sj2.ntp > cm6-pub.ntp: v4 server strat 2 poll 10 prec -16


Port 123 is the NTP port. The output above shows the incoming/outgoing NTP packets on the publisher:
1) cm6-sub is the NTP client on stratum 4
2) cm6-pub is the NTP server on stratum 3 (because the external NTP source is on stratum 2)
3) ntp-sj1 and ntp-sj2 are the external NTP sources on stratum 2
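The same reading can be done by script. The sketch below assumes capture lines in exactly the format shown above and records the stratum each sender advertises about itself. The regular expression and the strata() helper are my own, not part of any Cisco tool.

```python
import re

CAPTURE = """\
08:56:01.125718 cm6-sub.ntp > cm6-pub.ntp: v4 client strat 4 poll 10 prec -18 (DF) [tos 0x10]
08:56:01.125965 cm6-pub.ntp > cm6-sub.ntp: v4 server strat 3 poll 10 prec -17 (DF) [tos 0x10]
08:56:18.270720 cm6-pub.ntp > ntp-sj1.ntp: v4 client strat 3 poll 10 prec -17 (DF) [tos 0x10]
08:56:18.308956 ntp-sj1.ntp > cm6-pub.ntp: v4 server strat 2 poll 10 prec -18
"""

# timestamp, source.ntp > destination.ntp: v4 <mode> strat <n>
LINE = re.compile(
    r"^\S+ (?P<src>\S+)\.ntp > (?P<dst>\S+)\.ntp: "
    r"v4 (?P<mode>client|server) strat (?P<strat>\d+)"
)


def strata(capture):
    """Map each sending host to the stratum it reports in its packets."""
    result = {}
    for line in capture.splitlines():
        match = LINE.match(line)
        if match:
            result[match.group("src")] = int(match.group("strat"))
    return result


print(strata(CAPTURE))
```

Each hop should sit one stratum below its source: ntp-sj1 at 2, cm6-pub at 3, cm6-sub at 4.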

NTP logs

Use RTMT to get "ntp logs".

Troubleshooting time offset on phones

If the time on CUCM server was correct, but the phones showed wrong time, it's most likely due to misconfiguration.

First of all, we need to understand the difference between UTC time and local time.

There are many different time zones in the world. In the US, we have EST, CST, MST, PST, etc. 8AM EST means 7AM CST. Daylight saving adds more complexity to this. Different countries have different daylight saving cutoff dates.

To provide consistency around the world, an NTP server feeds UTC (GMT) time to clients. Converting it to "local time" is the client's responsibility.

On CUCM Admin > System > Date/Time Group, you may configure different groups to reflect different time zones. Then you may associate date/time group to different device pools. Hence, different phones in different device pools can have different local time.

One thing to notice is:
The "old" phones (7940/7960) get local time from CUCM server.
The "new" phones (7941/7961 and newer) get UTC time and time zone info from CUCM server. Then they do the math and display the local time.
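The "math" the newer phones do is just a UTC-to-local conversion. Here is a minimal Python sketch of it, using the timestamp from the "utils ntp status" output earlier. It assumes a fixed UTC-6 offset for CST and ignores daylight saving, which a real phone would take from the time zone info.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-in for US Central Standard Time (no DST handling).
CST = timezone(timedelta(hours=-6), "CST")


def to_local(utc_time, tz):
    """Convert an aware UTC datetime to the given time zone."""
    return utc_time.astimezone(tz)


utc_now = datetime(2009, 2, 8, 14, 38, 36, tzinfo=timezone.utc)
local = to_local(utc_now, CST)
print(local.strftime("%H:%M:%S %Z"))
```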

Use Windows server as NTP source

Depending on your Windows version, there are some registry settings you need to set:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\NTPServer\Enabled
Changing the ‘Enabled’ flag to the value 1 enables the NTP Server.

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Parameters\Type
Change the server type to NTP by specifying ‘NTP’ in the ‘Type’ registry entry.

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Config\AnnounceFlags
Set the ‘Announce Flags’ registry entry to 5, to indicate a reliable time source.

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Config\LocalClockDispersion
Set 'LocalClockDispersion' to 0

The last one is the most important one.

After changing the registry, you need to restart the "Windows Time" service.
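If you'd rather script these changes than click through regedit, the sketch below builds the equivalent "reg add" command lines plus the service restart. It only prints them; nothing is executed. The value types (REG_DWORD / REG_SZ) are my assumption and the helper name reg_commands is made up, so check against your Windows version before running anything.

```python
W32TIME = r"HKLM\SYSTEM\CurrentControlSet\Services\W32Time"

# (subkey, value name, assumed value type, data) for the four settings above
SETTINGS = [
    (r"TimeProviders\NTPServer", "Enabled", "REG_DWORD", "1"),
    ("Parameters", "Type", "REG_SZ", "NTP"),
    ("Config", "AnnounceFlags", "REG_DWORD", "5"),
    ("Config", "LocalClockDispersion", "REG_DWORD", "0"),
]


def reg_commands():
    """Build the command lines: four registry writes, then a service restart."""
    commands = []
    for subkey, name, kind, data in SETTINGS:
        commands.append(
            f'reg add "{W32TIME}\\{subkey}" /v {name} /t {kind} /d {data} /f'
        )
    commands.append("net stop w32time")
    commands.append("net start w32time")
    return commands


for command in reg_commands():
    print(command)
```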

P.S. You have to either turn off Windows Firewall or allow UDP port 123, which is used by NTP.
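To check whether the Windows box really answers on UDP 123, you can send it a bare client request and look at the stratum in the reply. The following is a rough Python sketch of that idea. The packet layout (48 bytes, first byte carrying leap/version/mode, second byte carrying stratum) is standard NTP, while the function names and the IP address in the usage note are placeholders.

```python
import socket
import struct

NTP_PORT = 123


def build_request(version=4):
    """48-byte client request: LI=0, VN=version, Mode=3 (client)."""
    first_byte = (0 << 6) | (version << 3) | 3
    return bytes([first_byte]) + bytes(47)


def parse_reply(data):
    """Decode the first two bytes of an NTP reply."""
    if len(data) < 48:
        raise ValueError("short NTP packet")
    first, stratum = struct.unpack("!BB", data[:2])
    return {
        "leap": first >> 6,
        "version": (first >> 3) & 0x7,
        "mode": first & 0x7,       # 4 means "server"
        "stratum": stratum,
    }


def query(host, timeout=2.0):
    """Send one request to host:123 and return the decoded reply header."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (host, NTP_PORT))
        data, _ = sock.recvfrom(512)
    return parse_reply(data)
```

Usage would be something like query("192.168.1.10") (a placeholder address). A timeout usually means the firewall is still blocking port 123 or the NTP server isn't enabled.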

Wednesday, February 4, 2009

Integration between CUPS and MOC/OCS

For now, to integrate MOC (Microsoft Office Communicator) with Cisco IP phone system (CUCM) you need CUPS for phone presence and phone control.

Phone presence

Phone presence info will flow like this: CUCM -> CUPS -> OCS -> MOC.

Phone control

Phone control info will flow like this: MOC -> OCS -> CUPS -> CUCM.

Best Practices

Use TCP instead of TLS on your first deployment. TLS/Certificates are not something fun to play with. They are optional for the integration. Per the KISS (Keep It Simple, Stupid) principle, don't mess with TLS unless you have to.

Let's talk about phone control first. Currently, Cisco supports RCC (Remote Call Control) for MOC.

RCC is configured in Active Directory Users and Computers (ADUC) > Communications.


As you can see from the picture above, we need to configure Server URI and Line URI.

6002 is the IP phone DN (Directory Number).

htluo-cups6 is the proxy domain that will process the request. We'll discuss this part further later.

When MOC starts up, it'll send "INVITE 6002@htluo-cups6" to OCS.

When OCS receives the INVITE message, it'll try to route it to the right destination (CUPS in this case).

How OCS routes the message is more complicated than it looks. It could be a static route or a DNS lookup. For more details see "SIP domain and DNS domain".

Again, per the KISS principle, it's recommended to use a static route on OCS to rule out any DNS misconfiguration.


As shown above, the static route means "for any SIP message with domain htluo-cups6, forward it to 10.88.229.209, port 5060, with TCP protocol".

In order for CUPS to accept this message, OCS' IP address must be added to CUPS Incoming ACL. (or you may configure an "ALL" incoming ACL)

When the CUPS server receives the message from OCS, the first thing it does is determine whether the message has reached its final destination. CUPS compares its own configuration with the domain portion of the SIP message. If the domain portion matches one of the following, CUPS considers the message to have arrived at its final destination and handles it.

a) SIP domain name (configured under CUPS Admin > System > Service Parameters)
b) CUPS node name (configured under CUPS Admin > System > Server)
c) node name + SIP domain
d) other alias name configured on CUPS
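The four checks above boil down to a set lookup. Here's a minimal Python sketch of that logic. It is not CUPS code: the function name is invented, and I'm assuming an exact, case-sensitive string match (see KISS rule #1).

```python
def is_final_destination(request_domain, sip_domain, node_name, aliases=()):
    """Return True if the domain part of the SIP request is 'ours'."""
    candidates = {
        sip_domain,                    # a) SIP domain name
        node_name,                     # b) CUPS node name
        f"{node_name}.{sip_domain}",   # c) node name + SIP domain
        *aliases,                      # d) any other alias names
    }
    return request_domain in candidates


# "example.com" is a placeholder SIP domain for illustration.
print(is_final_destination("htluo-cups6", "example.com", "htluo-cups6"))
print(is_final_destination("HTLUO-CUPS6", "example.com", "htluo-cups6"))
```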

To see a full list of alias names, set "SIP Proxy" trace to detail. Restart SIP Proxy service. SIP Proxy would write a list of alias names to trace files during startup.

If CUPS decides to handle the INVITE message from OCS, it will do the following:

1) Determine if the MOC user has permission to control the phone
2) If step 1 was ok, open a CTI request to CUCM CTIManager
3) If step 2 was successful, return "200 OK" SIP message to OCS

Determine if the MOC user has permission to control the phone

In the Line URI (tel:6002;phone-context=dialstring;device=SEP001E7A24429A), if a device name is specified (as it is in this case), CUPS checks whether that device is in the "Controlled Devices" list on CUCM Admin > User Management > End User.

If no device name is specified in the Line URI, CUPS tries to find the device by DN. For details, please see: http://www.cisco.com/en/US/docs/voice_ip_comm/cups/6_0_1/install_upgrade/deployment/guide/dgmsint.html#wp1049685

Again, per the KISS principle, you'd better specify the device name in the Line URI on your first deployment.

Open a CTI request to CUCM CTIManager

CUPS will open a CTI request to CUCM CTIManager with the credential configured on CUPS Admin > Application > CTI Gateway > Settings (CUPS 6.x).

Of course, the credential needs to exist on CUCM > User Management > Application User. It needs to be in the "Standard CTI Enabled" and "Standard CTI Allow Control of All Devices" groups.

And of course, the phone device needs to be registered to CUCM.

If all of the above is successful, CUPS will send "200 OK" to OCS as a response to the INVITE.

At this point, CUPS has done its job. But it does not necessarily mean MOC gets phone control.

In order for OCS to accept the "200 OK" message from CUPS, CUPS' IP address must be added to OCS "Host Authorization" (please note, it's IP address, not hostname or FQDN).


Don't forget to restart OCS Front End services after making changes.

The best tool to debug OCS/CUPS integration is on OCS. Right-click on the pool > Logging Tool > New Debug Session. Choose SIP Stack. Optionally, you may filter by the MOC user ID in the filter settings.

Known caveats:

1. CUPS sent "200 OK" for the INVITE, but MOC still doesn't get phone control.

This is because OCS doesn't trust CUPS. OCS SIP stack log will show "SIPPROXY_E_INVALID_RECORD_ROUTE"

Resolution: Check "Host Authorization" on OCS.

2. Load balancer

If you have a load balancer for OCS, more likely than not you will run into the "one-way phone control" issue. The symptom is: you can make phone calls from MOC, but the call status is not updated on MOC. For example, when the call is connected, MOC still shows "calling".

This problem is caused by a load-balancing misconfiguration.

When OCS sends a message to CUPS, it doesn't go through the load balancer (based on your existing configuration).

When CUPS tries to reply to OCS, it looks up DNS, and DNS resolves the pool name to the load balancer's virtual IP. So the traffic goes through the load balancer to get to OCS. When OCS receives the message, the last hop was the load balancer. However, the load balancer didn't add its IP to the SIP header. OCS rejects this message and sends "400 Missing correct Via header" to CUPS.

Resolution:
Check whether your load balancer is capable of modifying the SIP header. Or contact Microsoft to see if they can turn off the "Via header" check on OCS.

Wednesday, January 28, 2009

How does CUPC determine presence address

In order for CUPC to receive presence information, it has to connect to a presence node.

Just to remind you again, "presence node" is NOT necessarily the same node as the logon server. For example, you have two CUPS servers - CUPS-A and CUPS-B. The logon server could be CUPS-A, but the "presence node" could be CUPS-B.

Even in a single-node environment, CUPC still has to run through a set of rules to determine the "presence address".

If you ever looked at CUPC > Show Server Health (or CUPC logs), you'll notice two parameters: "Presence.Primary.Address" and "Presence.Domain".

"Presence.Primary.Address" is the node name of the server. You may find this under CUPS Admin > System > Servers.

"Presence.Domain" is the SIP proxy domain. You may find this under CUPS Admin > System > Service Parameters > Cisco UP SIP Proxy.

If the node name is configured as a dotted IP address (e.g. 192.168.1.100), CUPC will use that as the presence address.

If the node name is not a dotted IP address, things are more complicated. Here are the rules CUPC uses to determine the presence address:

When you specify a non-IP string for the node name, CUPC looks at the string to see if it already ends with the domain value. If not, it appends it. Examples:

1)
Proxy.Domain = cisco.com
Presence.Primary.Address = cup1

Registration will be to user@cup1.cisco.com (hostname, domain is appended)

2)
Proxy.Domain = cisco.com
Presence.Primary.Address = cup1.cisco.com

Registration will be to user@cup1.cisco.com (FQDN, domain was already included)

3)
Proxy.Domain = cisco.com
Presence.Primary.Address = 10.11.12.13

Registration will be to user@10.11.12.13 (address was dotted-decimal)

4)
Proxy.Domain = 10.11.12.13
Presence.Primary.Address = 10.11.12.13

Registration will be to user@10.11.12.13 (address was dotted-decimal)

5)
Proxy.Domain = 10.11.12.13
Presence.Primary.Address = cup1

Registration will be to user@cup1.10.11.12.13 (hostname, domain is appended) WRONG

6)
Proxy.Domain = 10.11.12.13
Presence.Primary.Address = cup1.cisco.com

Registration will be to user@cup1.cisco.com.10.11.12.13 (FQDN, domain is appended) WRONG

7)
Proxy.Domain = PROXY_DOMAIN_NOT_SET
Presence.Primary.Address = cup1.cisco.com

Registration will be to user@cup1.cisco.com.PROXY_DOMAIN_NOT_SET (FQDN, domain is appended) WRONG

The last 3 of these are wrong (there are other permutations) because the domain portion of the URI is neither an IP address nor DNS-resolvable FQDN. While the proxy might be content to match the strings, CUPC's SIP stack needs to resolve to a routable address.

Please note everything is case sensitive. If node name was cup1.cisco.com, proxy domain was CISCO.COM, presence address will be "cup1.cisco.com.CISCO.COM", which is also WRONG.
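The rules above are simple enough to write down in a few lines of Python. This is only my sketch of the behavior described in this post, not CUPC's actual code, and the function names are invented. It reproduces all seven examples plus the case-sensitivity trap.

```python
import ipaddress


def is_dotted_ip(value):
    """True if value is a dotted-decimal IPv4 address such as 10.11.12.13."""
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


def presence_address(node_name, proxy_domain):
    """Host part of the presence address (user@<host>) per the rules above."""
    if is_dotted_ip(node_name):
        return node_name                    # IP address: used as-is
    if node_name.endswith(proxy_domain):    # case-sensitive comparison
        return node_name                    # domain already included
    return f"{node_name}.{proxy_domain}"    # otherwise append the domain


print(presence_address("cup1", "cisco.com"))
print(presence_address("cup1.cisco.com", "CISCO.COM"))
```

Note that the function happily builds the WRONG addresses too (examples 5 to 7). It only shows what CUPC would end up with, so you can spot a bad combination before the client does.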

Monday, January 26, 2009

Mysterious "Invalid Credentials" on CUPC

On CUPC > Help > Show Server Health, sometimes you would see failed items with the message "invalid credential", such as "Presence", "Desk Phone", or "Voicemail".

This is very confusing. Since you've already logged into CUPC, why is it giving you "invalid credential"? Which credential is it failing on?

Before we can move further, please take a look at "CUPS and CUPC, father and son? or not".

CUPS and CUPC's relationship is not as tight as you might think. CUPC has many features, but CUPS is only relevant to two of them (configuration repository and presence).

When you type your username and password in the CUPC login window, that is mainly for the "Configuration Repository". If you type the wrong password, CUPC won't be able to download its configuration from CUPS, and CUPC can't perform any other function without configuration.

However, successfully downloading the configuration does not guarantee other functionality. To use other functions, a 2nd authentication might be required (either explicitly or implicitly).

Presence - Invalid Credential

For the presence feature, a 2nd authentication is required at the SIP layer. This authentication is implicit. For more details on "Digest Authentication", please see http://www.ietf.org/rfc/rfc3261.txt.

Why is it implicit? Why does it fail?

Making it implicit was Cisco development's decision. If they had made it explicit, you'd have to provide the digest credential (a 2nd password) after login. That would be annoying, since SSO (Single Sign On) is what we prefer.

So Cisco development made CUPS/CUPC work this way:
1) You (system admin) configure digest credential on CUCM Admin > User Management > End User page.
2) CUPS synchronizes digest credentials from CUCM to CUPS.
3) CUPS transmits the digest credential to CUPC during logon.
4) CUPC uses that digest credential to authenticate with the SIP proxy.

Steps 3 and 4 look funny because it's like a doorkeeper giving you the key and asking you to open the door with it. But keep in mind:
a) The "doorkeeper" actually verified your identity (username/password) before giving you the key.
b) The key was encrypted during transmission.
c) The key the doorkeeper gave you might be for a different door (the SIP proxy could be on a different server than the logon server).
d) This is a compromise (or balance) between the convenience of SSO and SIP protocol requirements.
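For the curious, this is roughly what "authenticating with a digest credential" means on the wire. The Python sketch below follows the generic MD5 digest scheme that SIP borrows from HTTP, with the digest credential taking the place of the password. Which options (such as qop) CUPS and CUPC actually negotiate is not something I've verified, so treat the parameters as assumptions. The sample values are the well-known HTTP digest example, not CUPS data.

```python
import hashlib


def md5_hex(text):
    """Hex MD5 of a string, as used throughout digest authentication."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce,
                    nc=None, cnonce=None, qop=None):
    """Compute the 'response' value a client sends back to the server."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")   # who you are
    ha2 = md5_hex(f"{method}:{uri}")                  # what you're asking for
    if qop:
        return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


print(digest_response(
    "Mufasa", "testrealm@host.com", "Circle Of Life",
    "GET", "/dir/index.html",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
    nc="00000001", cnonce="0a4f113b", qop="auth",
))
```

The point is that the password itself never travels in the request, only a hash that both sides can compute if they hold the same credential.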

If there's no digest credential configured on CUCM (ie. it's blank), you'll get "Invalid Credential" for presence. To fix it, take one of the following options:

Option 1: Go to CUCM Admin > User Management > End User, configure a dummy value for "digest credential". It could be any value. Why? See workflow explained above.

Option 2: Go to CUPS Admin > Cisco Unified Presence > Proxy Server > Incoming ACL (on CUPS 7.x, it's "System > Security > Incoming ACL"). Configure an address pattern that covers your CUPC machines. For example, an "all" pattern matches all machines.

This option is considered less secure, because any machine in that address pattern (subnet) would be able to connect to SIP proxy without digest authentication challenge.

Option 3: Go to System > Service Parameters > Cisco UP SIP Proxy. Set "Authentication Module" to "off". This is the least secure option, since it turns off SIP authentication altogether.

Desk Phone - Invalid Credential

This usually happens when CUCM is configured to use "LDAP Authentication".

To control the desk phone from CUPC, the CTI protocol is used. Before a CTI client (CUPC) can control the phone, it needs to authenticate with the CTI server (CTIManager). This authentication is implicit. CUPC uses the same logon username/password to authenticate with CTIManager. CTIManager, in turn, authenticates that against LDAP.


Question: Why would the authentication fail?
Answer: In short, this is a bug in CUCM.

Question: Is there any workaround before we can upgrade CUCM?
Answer: On CUCM, change the LDAP authentication port to 3268 and restart CTIManager.

Question: Why would that fix the problem?
Answer: When an LDAP referral happens, CTIManager fails on authentication. Using port 3268 (Global Catalog) eliminates LDAP referrals.

Question: Why does it only affect CUPC?
Answer: CUPC is the only application (so far) that uses end user credentials to authenticate with CTIManager.

Voicemail - Invalid username/password or account locked

Depending on what Unity edition you're running (Unity or Unity Connection), the cause could be different.

Before moving on, please take a look at "How to test IMAP connection".

On Exchange 2007, it's because IMAP login is disabled on TCP (port 143) by default.

On Unity Connection, make sure you reset the "Web Application Password" rather than the voicemail password.

Saturday, January 24, 2009

Decrypt HTTPS traffic with Wireshark

Wireshark is a useful tool in troubleshooting. However, if the traffic was encrypted (such as https between CUPS and Exchange), it's unreadable unless you can decrypt it.



Look at packet 11 in the sniffer capture above. The application data is encrypted, so there's not much useful data in it.

To decrypt this data, we need the "private key" of the server certificate. You cannot get the private key from client side (such as web browsers). To get the private key, you need access to the server.

Step 1. Export the server certificate with private key

1-1: Go to IIS Admin > Right-click "Default Web Site" > Properties > "Directory Security" > "View Certificate".

1-2: Go to "Details" tab > "Copy to File" > Choose "Yes, export the private key"


1-3: You'll save the file in PKCS #12 (.PFX) with all three options UNCHECKED


1-4: You'll have to provide a password to protect the file, because the private key is very sensitive information.


1-5: Save the file (system will add ".pfx" extension to the file name)


Now we have a PKCS #12 file (.pfx file).

Step 2: Extract the private key from .pfx file

openssl pkcs12 -in test.pfx -nocerts -out privateKey.pem -nodes

The command above takes "test.pfx" as the input file, extracts the private key, and saves it unencrypted in the "privateKey.pem" file. You'll be asked for the password (the one you entered in step 1-4).

Where to find openssl? Google!

Step 3: Go to Wireshark > Edit > Preferences > Protocols > SSL. In "RSA keys list", type the following:

10.88.229.196,443,http,C:\privateKey.pem

Where "10.88.229.196" is the server IP, "443" is the port number (HTTPS), "http" is the protocol you want Wireshark to decode the traffic as, and "C:\privateKey.pem" is the file name of the private key. "SSL debug file" is optional.
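If you build these entries for several servers, it helps to treat the line as four comma-separated fields. Here's a tiny Python sketch that splits and rebuilds an entry in the format shown above. RsaKeyEntry, parse_entry and format_entry are names I made up; Wireshark itself only sees the resulting text.

```python
from typing import NamedTuple


class RsaKeyEntry(NamedTuple):
    ip: str         # server IP address
    port: int       # TCP port, 443 for HTTPS
    protocol: str   # protocol to decode the decrypted payload as
    key_file: str   # path to the unencrypted private key (.pem)


def parse_entry(entry):
    """Split 'ip,port,protocol,keyfile' into its four fields."""
    ip, port, protocol, key_file = entry.split(",", 3)
    return RsaKeyEntry(ip, int(port), protocol, key_file)


def format_entry(entry):
    """Rebuild the text you paste into the 'RSA keys list' field."""
    return f"{entry.ip},{entry.port},{entry.protocol},{entry.key_file}"


sample = parse_entry(r"10.88.229.196,443,http,C:\privateKey.pem")
print(sample)
print(format_entry(sample))
```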


Step 4: Once you click OK, you'll notice the change on the Wireshark screen. Now the data is decrypted!

Friday, January 23, 2009

CUPS Calendar integration

Calendar integration is probably the most mysterious part of CUPS.

There are many gotchas in Calendar integration due to the following:

1) CUPS uses the WebDAV protocol to query the calendar, which is a pretty old-school protocol and has many limitations. Microsoft recommends developers use EWS (Exchange Web Services) for better compatibility and features. Cisco will eventually change to EWS, but there's no ETA yet.

2) CUPS requires HTTPS (TLS/SSL) connection between CUPS and Exchange. This adds more complexity to the picture because you have to deal with CA, certificates, etc.

3) Exchange authentication and permissions
Exchange (OWA/IIS) has many authentication methods (FBA, classic, NTLM, etc.). Exchange also has two different sets of permissions: AD permission and Mailbox permission.

It's impossible to cover all scenarios in this blog. But here are some recommendations:

1) Avoid "2003/2007 mixed mode"
More likely than not, it's not going to work with CUPS due to the limitations of WebDAV.

2) Avoid any firewall (especially MSFT ISA server) between CUPS and Exchange.

3) Avoid any load balancer or traffic director between CUPS and Exchange.

4) Try to disable FBA (Forms-Based Authentication) for troubleshooting purposes.

5) If you don't care about certificates, use "makecert.exe" utility to create self-signed certificate for Exchange. See http://www.lulu.com/content/5552336 for details.

6) Make sure you set your meeting status to "Busy" for testing. Make sure you set it to "whole day event" to avoid time zone glitches. (Be aware: when you set it to "whole day event", the status in Outlook will be "FREE" by default.)

Wednesday, January 21, 2009

SIP domain and DNS domain

If you deal with SIP products (such as CUPS/CUPC), you'll have to deal with SIP domain sooner or later.

Here are some of the most asked questions:
1) What is a SIP domain?
2) Does the SIP domain have to match the DNS domain?
3) What if I'm not using DNS with the application?

The SIP domain is more of an application-layer concept, versus the DNS domain on the network layer.

Let's take a look at a real-life example. Let's say you have a SIP application that can send instant messages and make phone calls (such as CUPC and MOC).

When the application initiates a call, the SIP message looks like this: "INVITE 6002@acme.com". This message means "I want to call extension number 6002 at the ACME company".

Usually, the first stop of this SIP message would be your local proxy server (SIP proxy). The local proxy server will determine how to route this message to its destination.

Whenever a SIP proxy server receives a SIP message, it always looks at the domain part of the SIP request. Based on the domain, the SIP proxy determines how to route the message.

Here's the detailed workflow:

1) If the SIP message's domain matches the SIP domain configured on the SIP proxy, the SIP proxy handles it within the same domain. Otherwise, the SIP proxy forwards it to a different domain (or just ignores/discards it, depending on the design).

This is very important. Let's say CUPC sent a message "INVITE johndoe@acme.com" to the SIP proxy. However, the SIP domain configured on the SIP proxy was "abc.com". The SIP proxy will treat it as a "foreign message" and try to forward it to its destination domain, or discard/ignore it.

2) SIP proxy will look at the SIP static routes to determine the message's destination.

SIP static routes are configured on the SIP proxy server, on the application layer. Don't confuse them with TCP/IP static routes.

On CUPS 6.x, you may configure static routes on CUPS Admin > Cisco Unified Presence > Proxy Server > Static Routes. On OCS 2007, you may configure static routes by right-clicking the "Front End" folder > properties > routing.

You may configure an IP address or FQHN for static routes.

For example:

Static route: acme.com ---> apple.acme.com
This means all SIP messages with SIP domain acme.com will be routed to the host with FQHN apple.acme.com.

Static route: acme.com ---> 192.168.1.100
This means all SIP messages with SIP domain acme.com will be routed to the host with IP address 192.168.1.100.

Keep in mind, the above is on application layer.

3) If there's no static route configured, SIP Proxy would try to determine the next hop by name resolution.

If SRV records were configured, SIP proxy would try to resolve the domain by SRV records.

Then SIP proxy would try to resolve by A records.
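Putting steps 1 to 3 together, the decision looks roughly like the Python sketch below. It's one possible reading of the workflow, not how any particular proxy is implemented: plain dictionaries stand in for the static route table and for DNS, and all names and addresses are illustrative.

```python
def domain_of(request_line):
    """'INVITE johndoe@acme.com' -> 'acme.com'"""
    target = request_line.split()[1]
    return target.rsplit("@", 1)[1]


def next_hop(request_line, local_domain, static_routes, srv_records, a_records):
    """Return (how, where) for a request, following the order described above."""
    domain = domain_of(request_line)
    if domain == local_domain:
        return ("local", domain)                   # step 1: our own domain
    if domain in static_routes:
        return ("static", static_routes[domain])   # step 2: SIP static route
    if domain in srv_records:
        return ("srv", srv_records[domain])        # step 3a: SRV lookup
    if domain in a_records:
        return ("a", a_records[domain])            # step 3b: A record lookup
    return ("discard", None)                       # nowhere to send it


routes = {"acme.com": "192.168.1.100"}
print(next_hop("INVITE johndoe@acme.com", "abc.com", routes, {}, {}))
print(next_hop("INVITE johndoe@abc.com", "abc.com", routes, {}, {}))
```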

Now let's answer the questions.

Q1: What is a SIP domain?
Answer: A SIP domain is an application-layer configuration that defines the management domain of a SIP proxy.

Q2: Does the SIP domain have to match the DNS domain?
Answer: Yes, it has to match the DNS domain in most scenarios. And it's strongly recommended to match the SIP domain with the DNS domain.

Q3: What if I'm not using DNS with the application?
Answer: You may or may not be able to use application features if you don't have appropriate name resolution configured. For example, in CUPC 1.2.x, you may use the presence feature without name resolution for the presence FQHN. On CUPC 7.0.x, it doesn't work (because they changed the design in 7.0.x). If you don't have DNS, you may use local hosts files for name resolution.

To be continued...

Monday, January 19, 2009

LDAP account keeps locking out on CUPC

CUPC has the capability to search LDAP. So you can easily add contacts to your CUPC contact list.

In order to search LDAP, the application (CUPC) has to authenticate with LDAP first, using a service account. This service account is configured in CUPS Admin > Application > Cisco Unified Personal Communicator > LDAP Profile.

Once this service account is locked out, none of the CUPC clients can search LDAP.

The strange thing is: as soon as you unlock the account, it gets locked out again pretty soon. Looking at the Windows Event Viewer (Security Log), you'd see the source was a CUPC computer. You changed the password in LDAP, and changed it on CUPC. But the account still got locked out.

Now you're confused. Since you already "refreshed" the password, why does the account still get locked out?

The answer is: the "refresh" didn't propagate to CUPC. Some of the CUPC clients were still trying LDAP with the old (wrong) password.

When you change the LDAP profile on CUPS, CUPC doesn't get the updated profile (password) until the next logon. It keeps trying LDAP with the old password and keeps locking out the account.

To solve this problem, you have to log out all CUPC clients before you unlock the LDAP account.

Sometimes, this is "mission impossible" in a large network where you have hundreds of users.

The workaround is:
1) Create a 2nd LDAP account.
2) On CUPS, update LDAP profile to use the new account.

In step 2, you have to make sure you put in the correct information in one shot. If you misconfigure something and then try to correct it, chances are some CUPC clients will get the wrong info before you correct it, and the loop starts again.