Saturday, January 23, 2010

Hard Drive Partitions and Version Control of UC Appliances

Active and Inactive Version

Many Cisco Unified Communications appliances (CUCM, CUPS, CER, CUMA, etc.) share the same OS (a Cisco-customized Linux).

For maintenance purposes, you may install two copies (versions) of the system on the hard drive. Cisco calls them the "active version" and the "inactive version".

CLI commands:

show version active
show version inactive
utils system switch-version

Please note: "active" and "inactive" are relative. When you switch versions (utils system switch-version), the "active" version becomes "inactive" and the "inactive" version becomes "active".

If you're a Windows guy, you should be familiar with C:\boot.ini.

If you're a Linux guy, you should be familiar with grub.conf.

Cisco UC appliances control which version to boot from in the same way: a boot configuration file selects the partition to start from.

Partitions

The two copies of the software are installed into two partitions: / (referred to as Partition A) and /partB (referred to as Partition B). Whenever you use "utils system switch-version", the active partition becomes inactive and the inactive partition becomes active.
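
To make the "relative" nature of active and inactive concrete, here is a minimal Python sketch. It is only a toy model, not anything that runs on the appliance: the Appliance class and the version labels are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Appliance:
    """Toy model of a UC appliance with two installed versions (hypothetical)."""
    active: str    # version on the partition the system currently boots from
    inactive: str  # version on the other partition ("" if nothing is installed)

    def switch_version(self) -> None:
        """Mimic the effect of 'utils system switch-version': swap the roles."""
        if not self.inactive:
            raise RuntimeError("no inactive version installed")
        self.active, self.inactive = self.inactive, self.active


# Illustrative labels only.
box = Appliance(active="old-version", inactive="new-version")
box.switch_version()
print(box.active, box.inactive)  # new-version old-version
box.switch_version()
print(box.active, box.inactive)  # old-version new-version
```

Switching twice brings you back to where you started, which is why falling back is as simple as switching again.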

Upgrade and Switch Version

The words "upgrade" and "patch" have a different meaning in the Cisco UC world. Instead of replacing files in place, the upgrade/patch process installs a full copy of the system into the inactive partition. This has two benefits:

1) You may perform an upgrade/patch during production hours.
2) It's easy to fall back to the old version.

Scenario:
You have 6 CUCM servers in the cluster. Let's say upgrading each server takes about 2 hours. Your business does not allow any downtime during business hours.

Questions:
Q1: How long does it take to upgrade the whole cluster?
Q2: How much after-hours time will you have to spend on the upgrade?

Answers:
A1: About 4 hours (2 + 2)
A2: A few minutes (usually less than 10).

Explanation:

1. You need to finish the upgrade on the CUCM publisher before you can upgrade the subscribers. It takes 2 hours to upgrade the publisher. The new code is installed into the inactive partition. You don't have to switch to the new version right after the install, so you can do it during business hours.

2. Once the new version has been installed on the publisher (even though it's in the inactive partition), you may start the upgrade process on all subscribers simultaneously. This takes about 2 hours in total, because the subscribers are upgraded in parallel. Again, you don't have to switch to the new version right after the install, so you can do it during business hours.

3. After hours, you can use the "utils system switch-version" command to switch all boxes to the new version. This usually takes less than 10 minutes.
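
The arithmetic behind A1 can be sketched in a few lines of Python. The function name and parameters are my own, and the 2-hour figure is just the assumption from the scenario above. The sequential case is included only for comparison.

```python
def upgrade_hours(subscribers, hours_per_server, parallel=True):
    """Rough install time for a cluster: publisher first, then subscribers.

    Hypothetical helper for illustration; it ignores the switch-version step.
    """
    publisher = hours_per_server
    if subscribers == 0:
        return publisher
    if parallel:
        # All subscribers install at the same time.
        subs = hours_per_server
    else:
        # One subscriber after another.
        subs = subscribers * hours_per_server
    return publisher + subs


# 6 servers = 1 publisher + 5 subscribers, about 2 hours each.
print(upgrade_hours(5, 2))                  # 4
print(upgrade_hours(5, 2, parallel=False))  # 12
```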

However, there's a catch: if you make any configuration changes after the upgrade, those changes won't be reflected in the new version (apart from the UFF data described below). For example, say you performed the upgrade at 10AM but didn't switch to the new version until 6:30PM. Any configuration changes made between 10AM and 6:30PM will be lost.
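
If you keep a change log, it's easy to work out which changes fall into that window and need to be redone after the switch. Here is a minimal Python sketch. The change log, its entries, and the function name are invented for illustration; the times match the example above.

```python
from datetime import datetime


def changes_at_risk(changes, upgraded_at, switched_at):
    """Return descriptions of changes made after the upgrade but before the switch."""
    return [desc for desc, when in changes if upgraded_at < when < switched_at]


day = "2010-01-23 "  # arbitrary date for the example


def at(hhmm):
    """Build a datetime on the example day from an HH:MM string."""
    return datetime.strptime(day + hhmm, "%Y-%m-%d %H:%M")


# Hypothetical change log: (description, time of change).
changes = [
    ("add phone for new hire", at("09:15")),
    ("new route pattern", at("11:40")),
    ("change device pool", at("15:05")),
    ("add hunt group", at("19:00")),
]

print(changes_at_risk(changes, at("10:00"), at("18:30")))
# ['new route pattern', 'change device pool']
```

The 9:15 change was made before the upgrade, so it's carried over. The 19:00 change was made on the new version after the switch, so it's safe as well.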

Under the hood of "utils system switch-version"

What actually happens when you type the command "utils system switch-version"?

1) It modifies the /grub/boot/grub/grub.conf file to make the other partition active.
2) It synchronizes UFF (User Facing Feature) data to the other version. UFF refers to Call Forwarding, MWI (Message Waiting Indicator), etc.
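
Step 1 can be pictured with a small Python sketch. I'm assuming here that the file follows the usual GRUB legacy layout, where a "default=N" line picks which "title" entry boots. The sample content below is made up; the real file on the appliance will look different, and this is not the actual code Cisco runs.

```python
import re

# Hypothetical grub.conf content, for illustration only.
SAMPLE = """\
default=0
timeout=5
title Version A (hypothetical entry)
    root (hd0,0)
title Version B (hypothetical entry)
    root (hd0,1)
"""


def flip_default(conf):
    """Toggle the GRUB legacy 'default=' index between 0 and 1."""
    new_conf, count = re.subn(
        r"^default=([01])[ \t]*$",
        lambda m: "default=%d" % (1 - int(m.group(1))),
        conf,
        count=1,
        flags=re.MULTILINE,
    )
    if count != 1:
        raise ValueError("expected a 'default=0' or 'default=1' line")
    return new_conf


switched = flip_default(SAMPLE)
print(switched.splitlines()[0])         # default=1
print(flip_default(switched) == SAMPLE)  # True
```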

If the system fails to switch versions, here are some options:

Option 1: Try "utils system switch-version nodatasync"
This turns off the UFF data sync.

Option 2: Use "Recovery CD" (downloadable from CCO) to switch version.

Option 3: If you're a Linux guy, it shouldn't be too difficult for you to get access to /grub/boot/grub/grub.conf.

8 comments:

  1. Do we run "utils system switch-version" on just the publisher first and wait for it to come back up before starting the process on any subscribers? Is it possible to eliminate any downtime during the cluster upgrade by only rebooting one CM at a time?

    Thanks, Jeff

  2. Hi Michael,

    I'm becoming a regular reader of your blog, which I find very interesting because of your technical experience with Cisco products.

    Regarding the swapping of the active/inactive versions, I would like to know how the active/inactive partitions get synchronized in the DB. I guess there must be some kind of DB sync between the active and inactive DBs during the swap?

    thanks,

  3. The "switch version" command only synchronizes UFF (User Facing Features), such as MWI and call forwarding. Nothing else!

    Which means, if you make configuration changes in one version, don't expect them to be synchronized to the other version when you do "switch version".

  4. I just wanted to thank you for an answer you posted at Cisco re: BLF issues. SUBSCRIBE css was indeed the issue I was having. Thanks for the expertise, you saved me more hours and frustration.

  5. Hi... I just upgraded CUCM from 6.1.(1a) -> 6.1.(3a)

    but I get the error message "Sync after switch version failed" and the CUCM version is still 6.1.(1a).
    I have tried:
    1. utils system switch-version
    2. utils system switch-version nodatasync

    Can you tell me where to get the Recovery CD to switch the CUCM version?

    note: I'm not linux guy :)

    thank's

  6. Great post. This one and your Ultraiso post to boot any CUCM CD have saved me a lot of time.

  7. @apa aja deh: I know it's a bit after the fact, but you're always supposed to download the latest recovery CD. For 6.1, that would be http://www.cisco.com/cisco/software/release.html?mdfid=281023410&softwareid=282074294&release=6.1(5)&relind=AVAILABLE&rellifecycle=&reltype=latest

  8. Wow.... awesome explanation of it all.
    Thanks for taking the time to post this.

    Much gratitude.
