August 24, 2016

Systems Integration: Using Batch Import as a Stepping Stone

By Martin Lowe

As companies grow, so do the internal systems they use to manage every facet of the business. Most of these systems are based on commercial off-the-shelf (COTS) platforms, and it comes as no surprise that the need to integrate disparate systems grows as the business becomes larger and more complex. This is particularly true when building dynamic data-driven websites, where web content management systems (WCMS) expose or react to backend CRM, analytics or sales data. When those systems share similarities or even a vendor, the integration may be as simple as updating a few configuration files. A Hundred Answers is often engaged to integrate systems where this is not the case.

Most enterprise systems have application programming interfaces (APIs), which are ideal for automating the integration between two systems. There are, however, cases where this may not be a practical first step. Often, the requirements for an initial integration are too unclear or ill-defined to warrant the hefty effort involved in a fully automated integration. To complicate matters, APIs often differ in implementation approach, messaging systems, and functional capabilities, so bridging two of them is rarely a simple task.

When prototyping an integration between two systems, consider that most have some mechanism to export content into a structured, machine-readable format such as XML, JSON, CSV or Excel. If the source system can produce (or be configured to produce) an extract of relevant content in a consistent format, that export can be used to establish a “data contract” between the two systems. Rather than connecting two systems with different purposes and different operating models, we simplify the problem to one of exporting content on one side and ingesting it on the other.
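
To make the idea of a data contract concrete, here is a minimal sketch in Python of what the ingesting side might check. The column names (`package_id`, `priority`, `start_date`, `end_date`) and the function `read_extract` are assumptions made up for this illustration, not taken from any particular system.

```python
import csv
import io

# Hypothetical contract: the columns both sides agree the extract will carry.
REQUIRED_COLUMNS = ("package_id", "priority", "start_date", "end_date")


def read_extract(text):
    """Parse a CSV extract and return its rows as dictionaries.

    Raises ValueError if a contract column is missing from the header.
    Columns outside the contract are ignored rather than rejected.
    """
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        raise ValueError("extract is missing columns: " + ", ".join(missing))
    # Short rows yield None for absent cells, so fall back to an empty string.
    return [{c: (row[c] or "").strip() for c in REQUIRED_COLUMNS} for row in reader]
```

Because the reader looks columns up by name and ignores anything it does not recognize, the exporting side can append extra columns later without breaking the import.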

Not only is this approach much easier to implement, but it also lends itself to iterative development. On a first pass, authors from one system may have to manually trigger the content extract, and email the file to another author for importing into the target system. A second pass may automatically generate the extract based on certain system events or on a regular basis, and the target system may be configured to automatically import files from a specified network share. A third pass could add data points or new content types.
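
The second pass described above could start out as little more than a scheduled job that sweeps a shared folder. The sketch below assumes a hypothetical `import_file` callable standing in for whatever import routine the target system provides; the folder names are placeholders as well.

```python
import shutil
from pathlib import Path


def import_pending(inbox, archive, import_file):
    """Import every CSV found in `inbox`, then move it to `archive`.

    `import_file` is a hypothetical callable that receives the file path.
    Returns the names of the files processed, in alphabetical order.
    """
    inbox, archive = Path(inbox), Path(archive)
    archive.mkdir(parents=True, exist_ok=True)
    processed = []
    for path in sorted(inbox.glob("*.csv")):
        import_file(path)
        # Moving the file out of the inbox keeps it from being imported twice.
        shutil.move(str(path), str(archive / path.name))
        processed.append(path.name)
    return processed
```

If `import_file` raises, the file stays in the inbox, so a failed import is retried on the next sweep instead of being silently archived.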

Consider the following use case: a client wants to promote certain packages on their e-commerce website, based on sales data from another financial system. We used the above approach to define an exchange format in which the configuration data is specified as CSV and imported into the WCMS. The import process was built to automatically create and update native WCMS records, allowing web authors to add “romance” content (descriptive marketing copy) to the financial priorities specified by the sales group. This mechanism kept the WCMS implementation decoupled from the details of how the financial data is compiled.

The key concern with this approach is that it introduces the possibility of human error. It therefore becomes necessary to implement checks that confirm the imported data is well-formed and complies with its intended use on the target system. These rules can range from checking that a value falls within a specific range to more advanced validation of the consistency of complex data across records.
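
As an illustration of the two kinds of rule, the following Python sketch applies field-level checks (a numeric range, well-formed dates) and one cross-record check (no package listed twice). The field names and the 1 to 10 priority range are assumptions chosen for the example.

```python
from datetime import date


def validate_records(records):
    """Return a list of (row_number, message) pairs; an empty list means valid."""
    errors = []
    first_seen = {}
    for number, rec in enumerate(records, start=1):
        # Field-level rule: priority must be an integer from 1 to 10.
        try:
            priority = int(rec["priority"])
        except ValueError:
            priority = None
        if priority is None or not 1 <= priority <= 10:
            errors.append((number, "priority must be an integer from 1 to 10"))

        # Field-level rule: both dates must parse and be in order.
        try:
            start = date.fromisoformat(rec["start_date"])
            end = date.fromisoformat(rec["end_date"])
        except ValueError:
            errors.append((number, "dates must be valid ISO dates"))
        else:
            if end < start:
                errors.append((number, "end_date is before start_date"))

        # Cross-record rule: a package may appear only once per file.
        first = first_seen.setdefault(rec["package_id"], number)
        if first != number:
            errors.append((number, f"duplicate of row {first}"))
    return errors
```

Collecting every error with its row number, rather than stopping at the first one, lets the person who prepared the file fix it in a single round trip.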

The solution, in this case, relies on OpenText TeamSite as the WCMS platform. The implementation makes use of multiple modules within the WCMS, starting with customizable workflows to trigger the action. Using a workflow engine gives authors more convenient and familiar access to the functionality, and allows for future extensibility of the processing logic. From the workflow, the system displays a custom dialog, built as a Java servlet. The servlet acts as the controller for the actions the system performs; it also triggers the initial user view and passes processed information to that view. The screen was built using AJAX, with the controller acting as a data interchange between the platform API and the browser front end, which allows for immediate feedback to the user.

This interface informs the user of the current step of the import, as well as any validation errors or other issues encountered while importing. These fields are updated by AJAX callbacks, which drive the servlet forward while also retrieving information on the current step of the import and merge process. Once the file has been pre-processed and a preview is available, the interface asks the user whether to continue. The preview includes context on changes and the success or failure state of the record elements parsed from the file.
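
The actual dialog was a Java servlet, so the following is only a language-neutral sketch, written in Python, of the kind of state such a controller might track between callbacks. The step names and the `ImportSession` class are hypothetical.

```python
import json
from enum import Enum


class Step(Enum):
    UPLOADED = 1
    PREVIEW = 2
    IMPORTED = 3
    CANCELLED = 4


class ImportSession:
    """Tracks where one import run stands and what the user should see."""

    def __init__(self):
        self.step = Step.UPLOADED
        self.errors = []

    def preprocess(self, errors):
        """Record validation results and move to the preview step."""
        self.errors = list(errors)
        self.step = Step.PREVIEW

    def decide(self, proceed):
        """Apply the user's choice; only allowed once a preview exists."""
        if self.step is not Step.PREVIEW:
            raise RuntimeError("no preview to confirm")
        self.step = Step.IMPORTED if proceed else Step.CANCELLED

    def status(self):
        """JSON payload that a polling callback could render."""
        return json.dumps({"step": self.step.name, "errors": self.errors})
```

Keeping the step explicit means the server can refuse a "continue" request that arrives before the preview has been produced, rather than trusting the browser to call things in order.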

The validation of the records, and the writing of approved and valid data to the file system, is driven by a back-end class containing the model and transformation logic. On initiation, the records to be processed are imported and stored for later transformation. As the process continues, the data set is annotated with information such as whether a record is new, whether it is a modified version of existing data, or whether it is to be removed from the file system. Once the user accepts the preview, this class creates the records within the file system, storing the data in a pre-configured format for retrieval and use within other parts of the CMS.
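
The new, modified and removed classification can be sketched as a comparison keyed on a record identifier. This Python version is an illustration only; the key name `package_id` and the function `classify_changes` are assumptions.

```python
def classify_changes(existing, incoming, key="package_id"):
    """Compare incoming rows with stored records (both lists of dicts).

    Returns a dict mapping 'new', 'modified', 'unchanged' and 'removed'
    to lists of record keys.
    """
    old = {rec[key]: rec for rec in existing}
    new = {rec[key]: rec for rec in incoming}
    result = {"new": [], "modified": [], "unchanged": [], "removed": []}
    for k, rec in new.items():
        if k not in old:
            result["new"].append(k)
        # Only fields present in the import are compared, so content added
        # to the stored record by authors does not count as a modification.
        elif any(old[k].get(field) != value for field, value in rec.items()):
            result["modified"].append(k)
        else:
            result["unchanged"].append(k)
    result["removed"] = [k for k in old if k not in new]
    return result
```

The resulting lists are the kind of information a preview screen needs in order to show the user what accepting the import would change.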

The final part of this solution is the method of storage, which needs to be standardized so that information can be retrieved, viewed, and used within the system. We created a form template in the system that lets the records be viewed there. It also lets authors modify the records to capture additional information not present in the external data source. The import process is then able to reconcile changes in financial configuration data while retaining the relevant romance content surrounding the core configuration.
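
One simple way to express that reconciliation is to let imported values win only for the fields the source system owns, and leave everything else on the stored record alone. The sketch below is in Python, and the field names, including `romance`, are hypothetical.

```python
def merge_record(stored, imported, imported_fields):
    """Return a new record in which imported values replace the stored ones
    for `imported_fields`, while every other stored field is kept as-is."""
    merged = dict(stored)  # copy, so the stored record is not mutated
    for field in imported_fields:
        if field in imported:
            merged[field] = imported[field]
    return merged


stored = {"package_id": "A", "priority": "1", "romance": "Sun, sand and savings"}
imported = {"package_id": "A", "priority": "3", "comment": "raised by sales"}
merged = merge_record(stored, imported, ("package_id", "priority"))
# merged keeps the author's romance text and takes the new priority
```

Declaring field ownership up front also documents which side is allowed to change what, which is useful when the exchange format grows.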

Designing the implementation in this way yields numerous benefits. Chief among them is that the underlying data exchange (the CSV) can be extended with extra columns, for author comments or other auditing data, without impacting the import. It also allows the WCMS to capture additional content that is relevant to the website, but not to backend systems. The separation of concerns allows either side to change its internal data storage format (due to version upgrades or data model changes) without impacting the data transfer process between systems. Lastly, and core to the topic, it allows for incremental development with little overhead.