What to Put in a Data Mart

By Thomas C. Hammergren

If a data mart is a smaller-scale version of a data warehouse, what does “smaller scale” mean for the contents of a data mart? Typically, it means that the data mart holds a subset of the overall enterprise data.

Geography-bounded data

A data mart might contain only the information relevant to a certain geographical area, such as a region or territory within your company. This figure illustrates an example of geography-bounded data.


Although a geography-bounded data mart is technically straightforward to build and use, you probably don’t want to subset your data in this manner. Users often want to see cross-geography comparisons (for example, “How are our Arizona stores doing versus our Pennsylvania stores?”) in their data warehouse environment. When you create separate data marts for different geographical areas, these types of comparisons become much more difficult to make.
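As a rough sketch of why geographic subsetting complicates comparisons, the following Python works on a small, made-up set of store sales rows. The field names, the `build_geo_mart` and `total_by_state` helpers, and all figures are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical enterprise-level sales rows (illustrative values only).
SALES = [
    {"store": "Phoenix-01", "state": "AZ", "amount": 1200.0},
    {"store": "Tucson-02", "state": "AZ", "amount": 800.0},
    {"store": "Pittsburgh-01", "state": "PA", "amount": 1500.0},
    {"store": "Erie-03", "state": "PA", "amount": 500.0},
]


def build_geo_mart(rows, state):
    """Return the subset of rows that belongs to one geographic area."""
    return [row for row in rows if row["state"] == state]


def total_by_state(rows):
    """Sum sales amounts per state for whatever rows are supplied."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["state"]] += row["amount"]
    return dict(totals)


arizona_mart = build_geo_mart(SALES, "AZ")
pennsylvania_mart = build_geo_mart(SALES, "PA")

# Each mart can answer questions only about its own geography.
arizona_only = total_by_state(arizona_mart)

# A cross-geography comparison must first recombine the separate marts.
combined = total_by_state(arizona_mart + pennsylvania_mart)
```

In this sketch the Arizona mart alone has no Pennsylvania rows to compare against, so the comparison requires an extra step that stitches the marts back together.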

Organization-bounded data

When deciding what you want to put in your data mart, you can base decisions on what information a specific organization needs when it’s the sole (or, at least, primary) user of the data mart. As shown in this figure, a bank might create one data mart for consumer checking-account analysis and another data mart for commercial checking accounts.


This approach works well when the overwhelming majority of inquiries and reports are organization-oriented. For example, the commercial checking group has no need whatsoever to analyze consumer checking accounts and vice versa.

It pays to dig into the business needs during the scope phase of a data warehousing or data mart project. Outsiders, for example, might think, “Okay, put all checking-account information, both consumer and commercial, into the same environment so that marketing or risk management analysts can run reports comparing average balances and other information for the entire checking-account portfolio at the bank.”

After additional analysis, though, you might notice that the bank doesn’t do this type of comparison, so why not keep the two areas separate and avoid unnecessary complexity?
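The scoping check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list of report requests, the segment labels, and the `cross_segment_reports` and `recommend_scope` helpers are all hypothetical, standing in for whatever requirements you actually gather.

```python
# Hypothetical report requirements gathered during the scope phase.
REPORT_REQUESTS = [
    {"name": "Consumer average balance", "segments": {"consumer"}},
    {"name": "Commercial overdraft trend", "segments": {"commercial"}},
    {"name": "Consumer fee income", "segments": {"consumer"}},
]


def cross_segment_reports(requests):
    """List the reports that need data from more than one segment."""
    return [req["name"] for req in requests if len(req["segments"]) > 1]


def recommend_scope(requests):
    """Suggest separate marts unless some report spans several segments."""
    if cross_segment_reports(requests):
        return "combined mart"
    return "separate marts"


recommendation = recommend_scope(REPORT_REQUESTS)

# Adding a single portfolio-wide report changes the recommendation.
with_portfolio_report = REPORT_REQUESTS + [
    {"name": "Portfolio average balance", "segments": {"consumer", "commercial"}},
]
revised_recommendation = recommend_scope(with_portfolio_report)
```

The point of the sketch is that the decision follows from the reports people actually run, not from what an outsider assumes they might run.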

Function-bounded data

Using an approach that crosses organizational boundaries, you can establish a data mart’s contents based on a specific function (or set of related functions) within the company. A multinational chemical company, for example, might create a data mart exclusively for the sales and marketing functions across all organizations and across all product lines, as shown in this figure.
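A function-bounded subset can be sketched the same way: filter on the business function and deliberately ignore organization and product line. The records, field names, and `build_function_mart` helper below are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical enterprise records; every field and value is illustrative.
RECORDS = [
    {"org": "Polymers", "product_line": "Resins",
     "function": "sales", "value": 10},
    {"org": "Polymers", "product_line": "Resins",
     "function": "manufacturing", "value": 7},
    {"org": "Agrochemicals", "product_line": "Fertilizers",
     "function": "marketing", "value": 4},
    {"org": "Agrochemicals", "product_line": "Fertilizers",
     "function": "finance", "value": 9},
]

# The functions this mart is meant to serve.
MART_FUNCTIONS = {"sales", "marketing"}


def build_function_mart(rows, functions):
    """Keep rows for the chosen functions, whatever the org or product line."""
    return [row for row in rows if row["function"] in functions]


sales_marketing_mart = build_function_mart(RECORDS, MART_FUNCTIONS)

# The resulting mart still spans every organization that has matching rows.
orgs_in_mart = sorted({row["org"] for row in sales_marketing_mart})
```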


Market-bounded data

A company might occasionally be so focused on a specific market and the associated competitors that it makes sense to create a data mart built around that focus. As shown in this figure, this type of environment might include competitive sales, all available public information about the market and competitors (particularly if you can find this information on the Internet), and industry analysts’ reports, for example.


To truly provide the business intelligence that a company needs in a competitor-driven situation, construct the data mart to include multimedia information, in addition to the traditional data types typically found in a data warehouse.

Answers to specific business questions

The answers to a select number (often a handful) of business questions occasionally drive an organization’s operations. Based on the answers, a company might speed up or slow down production lines, start up extra shifts to increase production or initiate layoffs, or decide whether to acquire other companies.

Business questions that have this degree of weighty importance traditionally cause nightmares for the in-house employees chartered with digging out data and reports, consolidating and checking the information, and reporting the results to executive management.

Sounds like a job for a data warehouse, you say? Unfortunately, business analysts have often used spreadsheets, such as Microsoft Excel, instead. These types of “spread marts” often lack the repeatability and data quality required to leverage the data for more than one moment in time.

Before constructing a full-scale data warehouse that can answer these (and many other) business questions, however, you probably want to consider whether a small-scale data mart designed specifically to answer those high-impact, high-value “How are we doing?” type of questions can get the job done.

Later, this type of environment might grow into a larger-scale data warehouse. It often makes more sense, however, to concentrate your efforts on supporting a data mart that has known business value, rather than supplementing it with volumes of additional data that might provide business value (but can also slow response time or significantly complicate the end-to-end architecture).

Again, the job you do in the early phases of your project makes a big difference in the direction you take and your level of success.


Any set of criteria that you can dream up can determine a data mart’s contents. Some make sense; others don’t. Some take you into an architectural dead end because you get only limited value and have to start all over to expand your capabilities.