Designing a Relational Database for Querying and Reporting Support
Your data warehousing environment or a specific data mart that your main data warehouse will feed might have the mission of generating a finite and predictable set of reports. Here is one approach to designing a relational database to support that mission, built around the principle of database denormalization, or deliberately violating good relational database design principles in the interest of performance efficiency.
Denormalization is best suited for quick-hit solutions, in which you must get a small-scale relational data warehouse or a data mart up and running quickly. For example, you might create a denormalized relational database for a specific charter to produce a certain set of reports that will no longer be available as a result of a legacy system migration effort.
Although denormalization isn’t quite a dead end, it does create a great deal of duplicate data, and the database structures you create don’t have much flexibility. Additionally, you probably have limited querying capabilities (in addition to your standard reports) because those capabilities are closely tied to the reporting structures formalized in the table design. Still, you might want to check out this approach.
A simple example of denormalization, shown in the figure, shows what the source database tables look like in an application that tracks sales performance, with those tables structured primarily according to standard relational database design principles (they’re normalized).
To support the report format shown at the bottom of the figure, the source structures are mapped into a denormalized table from which the report can be generated without having to join any tables. (To put it more simply, your report runs very quickly.)
Note: A real-world example would involve many more tables (from 10 to 50 or more) and many more reports than shown in the figure. This figure should get the idea across, however.
Alternatively, you might want to follow the principles and techniques of dimensional design. Because RDBMSs now have much less trouble dealing with dimensionally oriented structures than in the past, you’re likely to get adequate performance for your reporting needs and still have the flexibility to support a large variety of ad hoc, multidimensional queries.
For rapid deployment that’s reporting-oriented, though, at least consider denormalization-based design for relational data.