Exploring Database Design Tips
One of the most important aspects of any application-development project is the database design. And so, without further ado, here are some tips for designing good databases.
Use the right number of tables
In Amadeus, the Emperor of Germany criticizes one of Mozart’s works as having “too many notes.” Mozart replies indignantly that he uses neither too many nor too few notes, but the exact number of notes that the composition requires.
So it should be with database design. Your database should have as many tables as the application requires — not more, not fewer. There is no single “right” number of tables for all databases.
Inexperienced database designers have a tendency to use too few tables — sometimes trying to cram an entire database-worth of information into a single table. At the other extreme are databases with dozens of tables, each consisting of just a few fields.
Avoid repeating data
One of the core principles of relational database design is to handle repeating data by breaking it out into a separate table. For example, in the old days of flat-file processing, it was common to create invoice records that had room for a certain number of line items. Thus the invoice record would have fields with names like Item1, Item2, Item3, and so on.
Whenever you find yourself numbering field names like that, you should create a separate table. In the case of the invoice record, you should create a separate table to store the line item data.
Avoid redundant data
When designing the tables that make up your database, try to avoid creating redundant data. Whenever redundant data creeps into a database, it introduces the likelihood that the data will become corrupt. For example, suppose you store a customer’s name in two different tables. Then, if you update the name in one of the tables but not the other, the database has become inconsistent.
The most obvious type of redundant-data mistake is to create a field that exists in two or more tables. But there are more subtle types of redundant data. For example, consider an Invoice table that contains a LineItemTotal field that represents the sum of the Total fields in each of the invoice’s line items. Technically, this field represents redundant data; the data is also stored in the Total fields of each line item.
Whether you should allow this type of redundancy depends on the application. In many cases, it’s better to put up with the redundancy for the convenience and efficiency of not having to recalculate the total each time the data is accessed. But it’s always worth considering whether the added convenience is worth the risk of corrupting the data.
Use a naming convention
To avoid confusion, pick a naming convention for your database objects and stick to it. That way your database tables, columns, constraints, and other objects will be named in a consistent and predictable way. (Just think of the savings on aspirin.)
You can argue from now until St. Swithen’s day about what the naming conventions should be. That doesn’t matter so much. What does matter is that you make a convention — and follow it.
Allowing nulls in your database tables significantly complicates the application programming required to access the tables. As a result, avoid nulls by specifying NOT NULL whenever you can. Use nulls rarely, and only when you truly need them.
Nulls are often misused anyway. The correct use of null is for a value that is unknown; not for a blank or empty value. For example, consider a typical address record that allows two address lines, named Address1 and Address2. Most addresses have only one address, so the second address line is blank. The value of this second address line is, in fact, known — it’s blank. That’s not the same thing as null. Null would imply that the address may have a second address line; we just don’t know what it is.
Even for columns that might seem appropriate for nulls, it’s usually more convenient to just leave the column value blank for values that aren’t known. For example, consider a phone number column in a Customer table. It’s safe to assume that all your customers have phone numbers, so it would be correct to use null for phone numbers that you don’t know. However, from a practical point of view, it’s just as easy to disallow nulls for the phone number column, and leave the unknown phone numbers blank.
Avoid secret codes
Avoid fields with names like CustomerType, where the value of the field is one of several constants that aren’t defined elsewhere in the database, such as R for Retail or W for Wholesale. You may have only these two types of customers today, but the needs of the application may change in the future, requiring a third customer type.
An alternative would be to create a separate table of customer-type codes (call it CustomerTypes), and then create a foreign-key constraint so the value of the CustomerType column must appear in the CustomerTypes table.
Use constraints wisely
Constraints let you prevent changes to the database that violate the internal consistency of your data. For example, a check constraint lets you validate only data that meets certain criteria. For example, you can use a check constraint to make sure the value of a field named Price is greater than zero.
A foreign-key constraint requires that the value of a column in one table must match the value that exists in some other table. For example, if you have a LineItems table with a column named ProductID, and a Products table with a column also named ProductID, you could use a foreign-key constraint to make sure that the ProductID value for each row in the LineItems table matches an existing row in the Products table.
Use triggers when appropriate
A trigger is a procedure that kicks in when certain database data is updated or accessed. Triggers are a great way to enforce those database rules that are more complicated than simple constraints. For example, suppose an Invoice table contains an ItemCount column whose value is the number of line items for the invoice. One way to maintain the value of this column automatically would be to create triggers that increment the ItemCount column whenever a line item is inserted, and decrement the ItemCount column whenever a line item is deleted. Sometimes automation is a beautiful thing.
Use stored procedures
Stored procedures are SQL procedures that are tucked away in the database and are part of it. There are several advantages to using stored procedures instead of coding SQL in your applications:
- Using stored procedures removes the burden of SQL programming from your application programmers. Instead, it makes the SQL used to access the database a part of the database itself — no fuss, no muss. All the application programs have to do is call the appropriate stored procedures to select, insert, update, or delete database data.
- Stored procedures are more efficient as a way of handling transactions, because the database server handles the entire transaction.
- Stored procedures are also more efficient because they reduce the amount of network traffic between the database server and the Web server.
- Finally, stored procedures are more secure because they reduce the risk of SQL injection attacks.