At Tekelec, we have several new products that provide a data provisioning interface to one of our core products. These interfaces must be extremely reliable and therefore require fault-tolerant hardware and software on "highly available" platforms. This means that we must provide two independent servers, each running our software and one or more databases. Our architecture provides an "active/standby" interface to the customer, where the customer need only send updates to one server or the other. Our software distributes each update to both servers as appropriate.
As with all ambitious software development projects, we encountered a bug or two during development. More than one bug resulted in a difference between the two databases. Isolating the point at which the log files diverge is the key to solving these types of problems. Finding these differences was often difficult for the following reasons:
Process logs are distributed across the two servers
Although most log entries contain a timestamp, the time on each machine may be different
The transaction rate is high; log files contain a lot of information
This type of task is why Perl (Practical Extraction and Report Language) was originally created. With the support of such a large development community, it is no surprise that a module for computing differences was available on CPAN. This module is called Algorithm::Diff.
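As a rough illustration of the idea, the sketch below compares two small in-memory logs with the diff function exported by Algorithm::Diff. The log format (an HH:MM:SS timestamp followed by a message), the sample entries, and the helper names strip_timestamp and log_differences are assumptions made for this example, not our production code. The timestamp is removed before comparing because the clocks on the two servers may disagree.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Algorithm::Diff qw(diff);

# Hypothetical log format: "HH:MM:SS message".
# Drop the timestamp so that clock skew between servers is ignored.
sub strip_timestamp {
    my ($line) = @_;
    (my $text = $line) =~ s/^\d\d:\d\d:\d\d\s+//;
    return $text;
}

# Return one report line per entry that appears in only one log.
# diff() returns a list of hunks; each hunk is a list of
# [sign, index, element] triples, where sign is '-' for an entry
# found only in the first log and '+' for one found only in the second.
sub log_differences {
    my ($log_a, $log_b) = @_;
    my @a = map { strip_timestamp($_) } @$log_a;
    my @b = map { strip_timestamp($_) } @$log_b;

    my @report;
    for my $hunk (diff(\@a, \@b)) {
        for my $change (@$hunk) {
            my ($sign, $index, $text) = @$change;
            push @report, "$sign $index: $text";
        }
    }
    return @report;
}

# Made-up sample data: the second server never logged the update for key=2.
my @server_a = (
    "10:00:01 update key=1",
    "10:00:02 update key=2",
    "10:00:03 update key=3",
);
my @server_b = (
    "10:00:04 update key=1",
    "10:00:06 update key=3",
);

print "$_\n" for log_differences(\@server_a, \@server_b);
```

In practice the two arrays would be read from the log files on each server. Stripping the timestamp is only one possible normalization, and real logs may need other fields removed before the comparison is meaningful.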