SQL Domain-Key Normal Form (DK/NF) and Abnormal Form
After a SQL database is in third normal form, you’ve eliminated most, but not all, chances of modification anomalies. Normal forms beyond the third are defined to squash those few remaining bugs.
Domain-key normal form (DK/NF)
Boyce-Codd normal form (BCNF), fourth normal form (4NF), and fifth normal form (5NF) are examples of such forms. Each form eliminates a possible modification anomaly but doesn’t guarantee prevention of all possible modification anomalies. Domain-key normal form, however, provides such a guarantee.
A relation is in domain-key normal form (DK/NF) if every constraint on the relation is a logical consequence of the definition of keys and domains. A constraint in this definition is any rule that’s precise enough that you can evaluate whether or not it’s true. A key is a unique identifier of a row in a table. A domain is the set of permitted values of an attribute.
Look at this database, which is in 1NF, to see what you must do to put that database in DK/NF.
Table: SALES (Customer_ID, Product, Price)
Customer_ID determines Product
Product determines Price
Customer_ID must be an integer > 1000
To enforce Constraint 3 (that Customer_ID must be an integer greater than 1000), you can simply define the domain for Customer_ID to incorporate this constraint. That makes the constraint a logical consequence of the domain of the CustomerID column. Product depends on Customer_ID, and Customer_ID is a key, so you have no problem with Constraint 1, which is a logical consequence of the definition of the key.
Constraint 2 is a problem. Price depends on (is a logical consequence of) Product, and Product isn’t a key. The solution is to divide the SALES table into two tables. One table uses Customer_ID as a key, and the other uses Product as a key. The database, besides being in 3NF, is also in DK/NF.
Design your databases so they’re in DK/NF if possible. If you can do that, enforcing key and domain restrictions causes all constraints to be met, and modification anomalies aren’t possible. If a database’s structure is designed to prevent you from putting it into DK/NF, then you have to build the constraints into the application program that uses the database. The database itself doesn’t guarantee that the constraints will be met.
As in life, so in databases: Sometimes being abnormal pays off. You can get carried away with normalization and go too far. You can break up a database into so many tables that the entire thing becomes unwieldy and inefficient. Performance can plummet. Often the optimal structure for your database is somewhat denormalized.
In fact, practical databases (the really big ones, anyway) are almost never normalized all the way to DK/NF. You want to normalize the databases you design as much as possible, however, to eliminate the possibility of data corruption that results from modification anomalies.
After you normalize the database as far as you can, make some retrievals as a dry run. If performance isn’t satisfactory, examine your design to see whether selective denormalization would improve performance without sacrificing integrity. By carefully adding redundancy in strategic locations and denormalizing just enough, you can arrive at a database that’s both efficient and safe from anomalies.