Data Warehousing: Operational Data Store (ODS)

By Thomas C. Hammergren

One of the more confusing concepts in the world of data warehousing is the operational data store (ODS). No one really agrees on what an ODS actually is.

Some definitions of an ODS make it sound like a classical data warehouse, with periodic (batch) inputs from various operational sources into the ODS, except that the new inputs overwrite existing data.

In a bank, for example, an ODS (by this definition) has, at any given time, one account balance for each checking account, courtesy of the checking account system, and one balance for each savings account, as provided by the savings account system.

The various systems send the account balances periodically (such as at the end of each day), and an ODS user can then look in one place to see each bank customer’s complete profile (such as the customer’s basic information and balance information for each type of account).

If you want to call an environment such as this one an ODS, by all means, go right ahead. Terminology aside, this example is just a batch-oriented data warehousing environment doing an update-and-replace operation on each piece of data that resides there (and, of course, adding new data as applicable), rather than keeping a running history of whatever measures are stored there.
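To make the update-and-replace idea concrete, here is a minimal Python sketch of the bank example. The account IDs, balances, dates, and function names are all made up for illustration; a real environment would use database tables and batch middleware rather than in-memory structures. The sketch loads the same two end-of-day feeds into an ODS-style store, which keeps only the latest balance, and into a history-keeping store, which appends a dated row on every load.

```python
from datetime import date

# Hypothetical end-of-day feeds from the checking and savings systems:
# each entry is (account_id, balance).
monday_feed = [("CHK-1001", 250.0), ("SAV-2001", 1200.0)]
tuesday_feed = [("CHK-1001", 180.0), ("SAV-2001", 1200.0), ("CHK-1002", 75.5)]


def load_ods(ods, feed):
    """Update-and-replace: overwrite existing balances, add new accounts."""
    for account_id, balance in feed:
        ods[account_id] = balance


def load_history(history, feed, as_of):
    """Running history: append a dated row for every balance received."""
    for account_id, balance in feed:
        history.append((as_of, account_id, balance))


ods = {}       # one current balance per account
history = []   # every balance ever loaded, with its date

for as_of, feed in [(date(2024, 1, 1), monday_feed),
                    (date(2024, 1, 2), tuesday_feed)]:
    load_ods(ods, feed)
    load_history(history, feed, as_of)

print(len(ods), "accounts in the ODS")
print(len(history), "rows in the history store")
```

After both loads, the ODS holds a single balance for each of the three accounts, and Monday's balance for CHK-1001 is gone. The history store still has both days' rows for that account.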

You can implement this so-called ODS pretty easily, and you can even use batch-oriented middleware tools and services, along with reporting and OLAP tools.

Another version of an ODS is a little more architecturally challenging. It uses an end-to-end approach that requires warehouse-enabled applications (applications built with the knowledge that they’ll provide data to a data warehouse). Warehouse-enabled applications support a push or pull architecture and enable an informational database to be refreshed in real time (or near real time).

Although the premise of breaking down application and system barriers is very much in concert with what you do with a data warehouse, you have one major problem: The pace of updates into your informational and analytical environment is much too slow if you use classical data warehousing and its batch-oriented processes for extracting and moving data.
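As a rough illustration of the push style, the following Python sketch shows a warehouse-enabled application that notifies the ODS on every change instead of waiting for an end-of-day batch. The class names, method names, and account ID are hypothetical. In-process callbacks stand in for whatever messaging middleware a real deployment would use.

```python
class OperationalDataStore:
    """Informational store that always holds the latest balance."""

    def __init__(self):
        self.balances = {}

    def apply_change(self, account_id, new_balance):
        # Overwrite in place, as in the batch version, but per change.
        self.balances[account_id] = new_balance


class CheckingSystem:
    """Hypothetical warehouse-enabled operational application."""

    def __init__(self):
        self._balances = {}
        self._subscribers = []

    def subscribe(self, callback):
        # Register a downstream consumer of balance changes.
        self._subscribers.append(callback)

    def post_transaction(self, account_id, amount):
        new_balance = self._balances.get(account_id, 0.0) + amount
        self._balances[account_id] = new_balance
        # Push: publish the change as part of handling the transaction.
        for notify in self._subscribers:
            notify(account_id, new_balance)
        return new_balance


checking = CheckingSystem()
ods = OperationalDataStore()
checking.subscribe(ods.apply_change)

checking.post_transaction("CHK-1001", 100.0)
checking.post_transaction("CHK-1001", -40.0)
print(ods.balances["CHK-1001"])
```

A pull variant would turn the relationship around: the ODS would ask the application for changes at short intervals, which trades some freshness for looser coupling.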

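Forget about terminology and buzzwords. Focus instead on the architectural and time-oriented differences between the two kinds of ODS: the batch-oriented, update-and-replace version and the version that’s refreshed in real time (or near real time).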