Data Mart or Data Warehouse?

By Thomas C. Hammergren

The idea of a data mart is hardly revolutionary, despite what you might read on blogs and in the computer trade press, and what you might hear at conferences or seminars. A data mart is simply a scaled-down data warehouse — that’s all.

Vendors do their best to define data marts in the context of their products; consultants and analysts usually define data marts in a way that’s advantageous to their particular offerings and specialties. That’s the way this business goes; be prepared to ask the tough questions.

If you start a project from the outset with either of the following premises, you already have two strikes against you:

  • “We’re building a real data warehouse, not a puny little data mart.”

  • “We’re building a data mart, not a data warehouse.”

Labeling your project with one of these terms brings preconceived notions about the work you’ll do, before you even begin to dig into the business problem. Until you understand the following three issues, you have no foundation on which to classify your impending project as either a data mart or a data warehouse:

  • The volumes and characteristics of data you need

  • The business problems you’re trying to solve and the questions you’re trying to answer

  • The business value you expect to gain when your system is successfully built

If you’re extracting and re-hosting a subset of data from an existing application into another environment, you can accurately call what you’re building a data mart.

But if you’re starting from scratch, extracting data from one or more source systems, handling the quality assurance and transformation, and copying that data into a separate environment, what determines whether you’re building a data warehouse or a data mart?

Although some guidelines exist, such as the number of subject areas and the volume of data, it all comes down to this: As soon as you start labeling your environment as one or the other, you’re adding preconceived notions and beliefs about its characteristics that might not fit your business needs.

Here’s the answer: Forget about the terms data warehouse and data mart. Concentrate instead on your business problem and its possible solution. What data do you need in order to perform certain informational and analytical functions; where is that data now and in what form; and what do you have to do to make it available to your users?

Leave the terminology wars to the vendors and analysts. Don’t get caught up in the hype.