E-Merging Companies
(Originally published in Messaging Magazine, November/December 1998)

By Stan Foster, Compaq Computer Corporation

In January 1998 Compaq Computer Corporation announced that it would acquire Digital Equipment Corporation. My first action on hearing the news was to send an e-mail message to MrNobody@compaq.com to see what mail system generated the non-delivery report. I was relieved to get a non-delivery from an Exchange server. It was clear that we would have a lot of work ahead as we merged the infrastructure of the two companies but at least we would speak the same language when we talked about merging the mail and directory systems.

We knew from the beginning that merging two very large Exchange networks was going to take some time. Digital had 200 servers supporting 65,000 employees. Compaq had 200 servers supporting 35,000 employees. With the current state of the tools and the ongoing work on the Transmission Control Protocol/Internet Protocol (TCP/IP) network interconnect, Domain Name System (DNS), Windows Internet Name Service (WINS) and the NT domain trusts, we would not be merging these Exchange systems over a weekend.

Instead we decided that an extended period of seamless coexistence leading up to a full merge would be a more pragmatic approach. We would make the interconnect seamless by configuring the Internet Mail Service on each network to achieve the highest level of message fidelity, and by synchronizing the directories so that both systems would include all Digital and Compaq employees as well as selected distribution lists. We would add public folder synchronization as time permitted. We felt that this strategy would relieve the immediate demand for a single Exchange network and buy us some time until the tools and the infrastructure would support a true merge.

While both Digital and Compaq used Exchange, the two directories were not identical. Digital had adopted a "Firstname Lastname" display name format while Compaq used "Lastname, Firstname". There were other differences in the use of Custom attributes, telephone numbers and the interpretation of some fields such as Company name. We drew up a list of characteristics that a directory synchronization solution must have:

We needed flexibility to handle the mapping between one directory and the other. For example, the tool had to re-write display names so that all names appeared consistent in each directory and deal with the other differences.

We wanted a lightweight solution with as few moving parts as possible. We wanted a system that would synchronize quickly with little load on the source and target systems so that we could run it frequently and keep the directories as up to date as possible.

We did not want to introduce additional complex components that would need specialized expertise or that might require constant babysitting. The solution should not require additional head count or expertise that we did not currently have available.

We evaluated several solutions and ruled them all out because they did not meet our requirements. Most of the products failed two of our three criteria, and some failed all three.

We knew that Exchange 5.5 supported Lightweight Directory Access Protocol (LDAP) for both read and write operations and so in theory it was possible to interconnect them using LDAP. LDAP specifies a standard protocol and there is some agreement about the Applications Programming Interface (API) and a standard schema. However LDAP does not yet standardize replication or synchronization between directories. We were fortunate to learn that a Digital product that had previously been used to synchronize X.500 directories was about to be released with an update that would allow it to synchronize any two LDAP capable directories. This product is called LDAP Directory Synchronization Utility (LDSU) and it was in late beta at the time we were seeking a solution.

LDSU met all of our needs. There was only one component (no database required) and it had a flexible scripting language that would handle the complex attribute mapping that we needed. Since it introduced no additional components other than the LDSU process we felt that we could configure it once and just let it run automatically as often as we chose. Since it was LDAP based, the process operated via direct connections between the directory servers. In some of the other solutions that we evaluated, directory updates were sent via e-mail and processed by some daemon in the target system. The direct LDAP interconnect seemed much more natural and involved no additional load on our busy Internet Mail connectors.

The basic function of LDSU is to use LDAP to query the source directory, calculate the difference in synchronized objects since the last time it ran and propagate the changes to the target directory, again using LDAP. Since only the updates were propagated to the target system, we chose to run two instances of LDSU, one in each network. This way the LDAP extract would be local and only the updates would travel over the Wide Area Network (WAN) connection. We could consolidate this down to a single LDSU instance if we desired.
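The difference calculation can be pictured roughly as follows. This is a minimal Python sketch, not LDSU's actual implementation: the snapshot dictionaries keyed by distinguished name and the function name `compute_delta` are assumptions made purely for illustration.

```python
def compute_delta(previous, current):
    """Compare two directory snapshots and return the changes to propagate.

    Both arguments map an entry's distinguished name to a dict of attributes.
    Returns (adds, modifies, deletes).
    """
    # Entries present now but absent from the last run must be added.
    adds = {dn: attrs for dn, attrs in current.items() if dn not in previous}
    # Entries present in the last run but absent now must be deleted.
    deletes = sorted(dn for dn in previous if dn not in current)
    # Entries present in both runs whose attributes differ must be modified.
    modifies = {
        dn: attrs
        for dn, attrs in current.items()
        if dn in previous and previous[dn] != attrs
    }
    return adds, modifies, deletes


# Hypothetical snapshots from two consecutive runs.
previous = {
    "cn=mjones": {"displayName": "Mary Jones (Sales)"},
    "cn=bsmith": {"displayName": "Bill Smith"},
}
current = {
    "cn=mjones": {"displayName": "Mary Jones (Marketing)"},
    "cn=jwhite": {"displayName": "John White"},
}
adds, modifies, deletes = compute_delta(previous, current)
```

Only the three result sets would need to cross the WAN, which is the reason for running the extract close to the source directory.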

In about a day we had installed LDSU and configured it for the basic features we needed for mailbox and custom recipient synchronization. Mailboxes and Custom Recipients (representing our Unix and Virtual Memory System (VMS) mail users) from the source system were synchronized as Custom Recipients in the target system, preserving all the attributes we wished to synchronize, and using the Simple Mail Transfer Protocol (SMTP) address to allow message delivery.

We used the LDSU scripting language to implement the display name mapping and the alignment of other attributes where our syntax or semantics was different. The Display Name was particularly interesting because both companies used a convention of appending some descriptive information after the employee name to help disambiguate that person from another with the same name. We needed to preserve that detail as we mapped the Display Name. In addition, to help employees differentiate based on the original company affiliation we also appended the company name to the display name. For example "Mary Jones (Sales)" from Digital had to appear as "Jones, Mary (Digital—Sales)" in the Compaq directory.
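The display name mapping in the Digital-to-Compaq direction might look something like the sketch below. It is written in Python rather than the LDSU scripting language, and it rests on several assumptions: the function name is hypothetical, the last word of the name is treated as the surname, and the separator between company and description is a parameter that defaults to a spaced hyphen.

```python
import re

# A name, optionally followed by a parenthesised description.
_NAME_RE = re.compile(r"^(?P<name>[^()]+?)\s*(?:\((?P<desc>[^()]*)\))?\s*$")


def digital_to_compaq(display_name, company="Digital", sep=" - "):
    """Rewrite 'Firstname Lastname (Desc)' as 'Lastname, Firstname (Company<sep>Desc)'."""
    match = _NAME_RE.match(display_name)
    if match is None:
        return display_name  # leave unrecognised formats untouched
    name = match.group("name").strip()
    desc = (match.group("desc") or "").strip()
    # Assumption: the last word is the surname, everything before it the first name(s).
    first, _, last = name.rpartition(" ")
    reordered = f"{last}, {first}" if first else last
    # Keep the disambiguating description and prepend the company affiliation.
    tag = f"{company}{sep}{desc}" if desc else company
    return f"{reordered} ({tag})"


with_description = digital_to_compaq("Mary Jones (Sales)")
without_description = digital_to_compaq("Bill Smith")
```

Names that do not fit the simple "last word is the surname" rule would need extra handling in a real mapping script.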

LDSU also allowed us to filter out many mailboxes that we did not want to synchronize. In some cases there were test mailboxes, or mailboxes that were used for internal processes that had no meaning in the other company. We used the LDSU script to filter these out using the pattern matching features in the script language.

The initial synchronization went without a hitch and we managed to create the impact we desired by arranging the synchronization to start the second that the legal process had been finalized at the shareholders' meeting on June 11. This single act of merging the directories made a great impact on the employees of both companies. We really were one company because all employees were visible in the directory. I have a 45-minute commute home and on June 11th I grinned like an idiot all the way home.

It took several hours to complete this initial synchronization process and several more hours before the Exchange network had fully replicated all the synchronized objects. Within 24 hours all servers worldwide were fully replicated and our directories were now showing over 110,000 entries.

This is how it looked to the employees:

[Image: scrshot.GIF]

We scheduled the synchronizer to run four times a day, at six-hour intervals. During each synchronization period we typically see a few hundred transactions (Add, Modify or Delete) as LDSU detects the changes and updates the target system. A full synchronization typically takes 15 minutes from the Compaq directory and 25 minutes from Digital. The bulk of that time (95 percent) goes to the extract and difference calculation. The remaining 5 percent is LDSU performing the updates.

We created a special container in each directory just for the synchronized objects. We created a "Digital" container in the Compaq directory, and a "Compaq" container in the Digital directory. We used permissions to allow the LDSU process to have write-access to only this container. We basically "leased" the container to the other company and let the LDSU process from their side read and write into the special container but nowhere else.

Our next task was to synchronize distribution lists. We had two requirements here: distribution list owners should decide whether a list is synchronized, and the synchronized copy should appear as natural as possible in the other directory. This meant that the synchronized object should be a distribution list (not just a custom recipient referencing the original list) so that it displayed with the correct icon when viewed by the clients. The synchronized list should also retain the original owner so that an employee in the other system knew whom to contact when a change in the list membership was required.

With these requirements in hand we decided on the following approach. Distribution lists would be synchronized to the target system as hidden custom recipients with the SMTP address of the original list. Then for each synchronized list in the source system we would create a new list with the same name (and with the company affiliation in the Display Name) in the target directory. This synchronized copy would have just one member—the hidden custom recipient representing the original list.

Finally we added the original list owner to the synchronized list by adding the synchronized copy of the owner's mailbox as the owner of the synchronized distribution list. So now we had a complete replica of the original list in the target system with the same name and apparently the same owner, displayed in the native format of the target system. We created a special container for the synchronized distribution lists and let LDSU do all the work automatically.
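The pair of objects created for each synchronized list can be sketched as plain data. The Python below is only an illustration of the approach: the dictionary keys and the function name are hypothetical and do not correspond to actual Exchange attribute names.

```python
def build_synchronized_list(source_list, company):
    """Return (hidden_recipient, synchronized_list) for one source distribution list."""
    # Hidden custom recipient that carries the SMTP address of the original list.
    hidden_recipient = {
        "type": "custom-recipient",
        "display_name": source_list["display_name"],
        "smtp_address": source_list["smtp_address"],
        "hidden": True,
    }
    # Visible distribution list whose only member is the hidden recipient.
    synchronized_list = {
        "type": "distribution-list",
        "display_name": f"{source_list['display_name']} ({company})",
        # Stands in for the synchronized copy of the owner's mailbox.
        "owner": f"{source_list['owner']} ({company})",
        "members": [hidden_recipient["display_name"]],
    }
    return hidden_recipient, synchronized_list


source = {
    "display_name": "CIO Staff",
    "owner": "White, John",
    "smtp_address": "CIOstaff@compaq.com",
}
hidden, synced = build_synchronized_list(source, "Compaq")
```

The real members of the source list are never copied, so changes to the membership cause no updates in the target directory.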

To allow users to select which lists to synchronize we created two empty lists, one in each directory. In the Compaq directory we created a distribution list with no members called "Publish to Digital". In the Digital directory we created a similar empty list called "Publish to Compaq". The instructions to the user community were simple. To synchronize your distribution list, add the "Publish to Digital" list (or "Publish to Compaq" for Digital employees) as a member of your distribution list. LDSU was configured to synchronize only those lists that had the so-called "cookie" list as a member.

This is very efficient from an LDAP perspective because we could find all distribution lists to be synchronized using a single search filter that requests all the objects of type Distribution List that have the "Publish to Compaq" distribution list as a member. Or as LDAP would phrase that: "(&(objectclass=groupofnames)(member=cn=publish-to-compaq, cn=compaq, ou=amexch1, o=digital))"
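For readers who want to build such a filter programmatically, here is a small Python sketch. The helper names are hypothetical. The escaping follows the string representation of LDAP search filters (RFC 4515) and makes no difference for this particular distinguished name, but it matters if a name contains parentheses or an asterisk.

```python
def escape_filter_value(value):
    """Escape the characters that are special inside an LDAP filter value."""
    replacements = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\0": r"\00"}
    return "".join(replacements.get(ch, ch) for ch in value)


def published_lists_filter(cookie_dn):
    """Filter for every distribution list that has the cookie list as a member."""
    return f"(&(objectclass=groupofnames)(member={escape_filter_value(cookie_dn)}))"


cookie_dn = "cn=publish-to-compaq, cn=compaq, ou=amexch1, o=digital"
search_filter = published_lists_filter(cookie_dn)
```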

The "cookie" list has no members so the addition of the "cookie" to the source list does not generate any additional mail traffic.

Since we do not populate the synchronized list with the original members (we only add the hidden custom recipient representing the original list as the single member), we do not have to worry about frequent changes of the membership causing large amounts of churn in the target directory, or worse, a user of the list in the synchronized domain using out-of-date membership. The synchronized copy, via the SMTP address in the hidden custom recipient, always sends mail to the source list where it is expanded and delivered.

Confused? Here is a diagram that might help:

NATIVE COPY                       SYNCHRONIZED COPY

Dlist:   CIO Staff                Dlist:   CIO Staff (Compaq)
Owner:   White, John              Owner:   White, John (Compaq)
Members: Smith, Bill              Members: CIO Staff
         Jones, Mary
Email:   CIOstaff@compaq.com
                                  Hidden Cust Recip: CIO Staff
                                  Email:   CIOStaff@compaq.com

With the distribution list synchronization behind us we started hearing that our filter for mailboxes that we did not want to synchronize had been overly aggressive. The help desk was reporting complaints that "Herman Von Testmbox", who of course is a Vice President, was not appearing in the other directory. We needed a way to allow Exchange administrators to override our filter on a per mailbox basis.

Since our distribution list cookie had worked out so well we adopted a similar convention for the mailbox filter override. An Exchange administrator can add the text "Publish to Digital" (or "Publish to Compaq" on the Digital side) to Custom-Attribute-10 to force the mailbox to be synchronized. We modified the LDSU script so that if this string exists anywhere in the Custom-Attribute-10 value then LDSU will bypass the filter tests and always synchronize the object.
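The interplay between the pattern filter and the override can be sketched as follows. This is Python, not the LDSU script we ran. The exclusion patterns, the dictionary keys and the function name are hypothetical, and treating the override text as case-sensitive is an assumption.

```python
import fnmatch

# Hypothetical exclusion patterns; the real filter rules are not reproduced here.
EXCLUDE_PATTERNS = ["*test*", "*mbox*"]


def should_synchronize(entry, override_text="Publish to Digital"):
    """Decide whether a mailbox entry is propagated to the other directory."""
    # The override wins: if the text appears anywhere in Custom-Attribute-10,
    # skip the filter tests entirely.
    if override_text in (entry.get("custom_attribute_10") or ""):
        return True
    name = entry["display_name"].lower()
    return not any(fnmatch.fnmatchcase(name, pattern) for pattern in EXCLUDE_PATTERNS)


ordinary = should_synchronize({"display_name": "Mary Jones"})
filtered = should_synchronize({"display_name": "Herman Von Testmbox"})
overridden = should_synchronize(
    {"display_name": "Herman Von Testmbox", "custom_attribute_10": "VP; Publish to Digital"}
)
```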

The final major feature we added was to accommodate the fact that employees in the Digital mail system were requiring addresses in the Compaq name space (compaq.com). In some cases this was because of business needs, in other cases we had Compaq employees moving into Digital buildings and due to network constraints, getting mailboxes in the Digital Exchange system. We needed to accommodate that in the synchronization strategy since we could not synchronize a compaq.com address into the Compaq Exchange directory unless there was a digital.com target address to deliver to. So once again the LDSU script came to the rescue and now we can tolerate mailboxes in the Digital directory having compaq.com addresses as either the primary or secondary address. Provided there is also a digital.com address on the mailbox, we can use that as the target address in the synchronized copy.
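The address selection rule amounts to picking the first address in the home domain and skipping the mailbox if there is none. A minimal Python sketch, with a hypothetical function name and a plain list of SMTP addresses standing in for the mailbox's proxy addresses:

```python
def pick_target_address(smtp_addresses, home_domain="digital.com"):
    """Return the first address in the home domain, or None if there is none.

    A mailbox with no home-domain address cannot be synchronized, because the
    synchronized copy would have nowhere to deliver to.
    """
    suffix = "@" + home_domain.lower()
    for address in smtp_addresses:
        if address.lower().endswith(suffix):
            return address
    return None


target = pick_target_address(["mary.jones@compaq.com", "mary.jones@digital.com"])
missing = pick_target_address(["mary.jones@compaq.com"])
```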

While this project involved two Exchange implementations, the tools, techniques and caveats can be applied to almost any LDAP capable directory. The only requirement for the source directory is that it must allow LDAP read access and of course the target directory must allow LDAP write access. The access control for read or write operations may be secured according to the security implementation of the particular product. After a long gestation period, LDAP is now implemented by most of the major mail and groupware vendors including Novell, Lotus, Netscape, Hewlett Packard, Digital (now Compaq) and Microsoft.

Most X.500 or X.500-like directories and meta-directories also support LDAP. These products are frequently used to aggregate employee information and synchronize with mail systems. So no matter what you have chosen for a mail and groupware strategy, LDAP-based directory synchronization can probably be used to assist during the migration to a new platform or co-exist in a mixed environment.

During this project we learned a number of lessons. The first lesson is that using LDAP is a really elegant solution to the directory synchronization problem. The Exchange 5.5 server supports both read and write access and has no problem doing the synchronization process 4 times a day against a very large directory. LDSU has been running for over two months now and has not missed a beat.

We also learned that using LDAP to write to the directory is like using Exchange Admin in raw mode. All of the normal constraint checks and automatic address proxy generation that are normally provided by the Exchange Admin utility are not in effect. So the rule is to test everything and use objects created by Admin as a template for how the object should look when created by LDAP. Be especially careful when creating X.400 proxies for your synchronized objects, right down to the order of the terms and placement of semicolons. The Exchange Message Transfer Agent (MTA) is very fussy about the X.400 addresses as we learned to our cost!

LDAP servers are generally good at serving large numbers of connections for clients that request small subsets of the directory. To use LDAP to synchronize it is likely that you will need to adjust the LDAP server default configuration to allow clients to fetch a larger set of results and to increase the LDAP timeout to accommodate the longest search. We found that we got the best results by allowing LDAP clients to fetch up to 20,000 objects and set the LDAP timeout to 10 minutes. We do the extract in small increments using a series of LDAP searches such as (rdn=a*), (rdn=b*) etc. The actual search expression may vary depending on the LDAP implementation.
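The series of incremental searches can be generated rather than typed by hand. The Python sketch below is an illustration only: the function name is hypothetical, and the choice of lowercase letters plus digits as the set of leading characters is an assumption. Entries whose relative distinguished name starts with any other character would need an additional search.

```python
import string


def prefix_filters(attribute="rdn", alphabet=string.ascii_lowercase + string.digits):
    """Return one LDAP search filter per leading character, e.g. (rdn=a*)."""
    return [f"({attribute}={ch}*)" for ch in alphabet]


filters = prefix_filters()
```

Each search then returns a slice of the directory small enough to stay under the server's result limit and timeout.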

We chose to do our synchronization into our two large Americas sites, which in both cases included over 80 servers. This was probably a mistake because during the initial synchronization we found the target directory became preoccupied with intra-site replication and the response to the LDAP requests dropped dramatically, even on the big Digital Alpha server we were connecting to. In retrospect it may have been better to synchronize into a small site, or even create a special site just for this purpose so that we could synchronize into one server that was replicating only to a bridgehead server in the site. Then we could let the bridgehead handle the inter-site replication and let the target server focus on the LDAP updates. After the initial load this is not a problem since the delta updates do not present any measurable load during updates or replication.

In conclusion we are very satisfied with the tools we chose and the end result. The user community is happy, the help desk is not inundated with calls and we met our objective to buy a little time while we wait for the network infrastructure changes to take place and for the mailbox/server move tools to mature. Then we can begin the process of truly merging the two large Exchange networks.