HTML5 and CSS3 All-in-One For Dummies
Book image
Explore Book Buy On Amazon

Trying to cram all your data into a single SQL table usually causes problems for HTML5 and CSS3 programming. The process for solving these problems is called data normalization. Normalization is really a set of rules. When your database follows the first rule, it's said to be in first normal form. Here, you get to the third normal form, which is suitable for most applications.

First normal form

The official definitions of the normal forms sound like the offspring of a lawyer and a mathematician. Here's an official definition of the first normal form:

A table is in first normal form if and only if it represents a relation. It does not allow nulls or duplicate rows.

Yeah, whatever.

Here's what it means in practical terms:

Eliminate listed fields.

A database is in first normal form if

  • It has no repeating fields. Take any data that would be in a repeating field and make it into a new table.

  • It has a primary key. Add a primary key to each table. (Some would argue that this requirement isn't necessarily part of first normal form, but it'll be necessary in the next step, anyway.)

In a practical sense, the first normal form means getting rid of listed fields and making a new table to contain powers. You'll need to go back to the model view to create a new table and then create the diagram again.

image0.jpg

A couple of things happen here:

  1. Make a new table called power.

    This table contains nothing but a key and the power name.

  2. Take the power field away from the hero table.

    The hero table no longer has a power field.

  3. Add a primary key to both tables.

    Both tables now have an integer primary key. Looking over the tables, there are no longer any listed fields, so you’re in first normal form.

Second normal form

The official terminology for the second normal form is just as baffling as the first normal form:

A table is in second normal form (2NF) only if it is in 1NF and all nonkey fields are dependant entirely on the entire candidate key, not just part of it.

Huh? You've gotta love these computer scientists.

In practical terms, second normal form is pretty easy, too. It really means

Eliminate repetition.

Look at all those places where you've got duplicated data and create new tables to take care of them.

In the hero data, you can eliminate a lot of problems by breaking the hero data into three tables.

Many of the problems in the badhero design happen because apparently more than one hero can be on a particular mission, and thus the mission data gets repeated. By separating mission data into another table, you guarantee that the data for a mission is entered only once.

Note that each table has a primary key, and none of them has listed fields. The same data won't ever be entered twice. The solution is looking pretty good!

Notice that everything related to the mission has been moved to the mission table. One field was added to the hero table, which contains an integer. This field is called a foreign key reference.

image1.jpg

Third normal form

The third normal form adds one more requirement. Here is the official definition:

A table is in 3NF if it is in 2NF and has no transitive dependencies on the candidate key.

Wow! These definitions get better and better. Once again, it's really a lot easier than it sounds:

Ensure functional dependency.

In other words, check each field of each table and ensure that it really describes what the table is about. For example, is the plot related to the mission or the hero? What about the villain?

The tricky thing about functional dependency is that you often don't really know how the data is supposed to be connected. Only the person who uses the data really knows how it's supposed to work. You have to work with the client to figure out exactly what the business rules (the rules that describe how the data really works) are. You can't really tell from the data itself.

The good news is that, for simple structures like the hero data, you're often already in third normal form by the time you get to second normal form. Still, you should check. After a database is in third normal form, you've reduced the possibility of several kinds of anomalies, so your data is far more reliable than it was in the past.

Several other forms of normalization exist, but third normal form is enough for most applications.

About This Article

This article is from the book:

About the book author:

Andy Harris taught himself programming because it was fun. Today he teaches computer science, game development, and web programming at the university level; is a technology consultant for the state of Indiana; has helped people with disabilities to form their own web development companies; and works with families who wish to teach computing at home.

This article can be found in the category: