It is easier to illustrate the concept of relational database than explain it, but I will try to do both.
A relational database uses related tables of data to optimize the database structure. There are a few goals you want to keep in mind when designing a database:
- Efficiency and speed of retrieving data
- The ease of using your data for reports, data exports, web pages, etc.
- Data integrity: ensuring that data is accurate
A poorly designed database makes it hard to get data out. What purpose does a database serve if you can't retrieve the data in a predictable way?
Here is an example of how a poor structure can interfere with a database's purpose. The illustration above shows a table containing friends. The database designer simply uploaded an Excel spreadsheet, and it had multiple columns for phone numbers. What if:
- The person has more than 4 telephone numbers?
- The person has no home phone, but 3 cells?
- The person has fewer than 4 telephone numbers?
Instead of creating repeating columns in a table (a violation of database normalization rules), the database designer should have created a phone numbers table and allowed the user to enter as many or as few phone numbers as there were for a particular person. Features could be added to accurately describe which number was which. Perhaps additional information would be useful, for example a ranking of phone numbers: if someone has several, which should I try first?
Removing repeating columns to related tables also makes the database more efficient. Let's say in the example I gave that the database designer accommodated additional numbers by having 10 phone number columns in the table. Lauren Appel has eight phone numbers, Jonathan Bell has five, and Jacob Smith has just two. But the database now has several records with null values in many of the phone number fields. This is a waste of space, and a poor design for reporting as well.
Remove related data to separate tables.
Remember the primary key?
How do we relate the phone numbers table to the friends table? Were you thinking ahead to this question?
In the database conception article, I talked about the primary key (PK). This is what we use to create a relationship between the master table and its children, if you will, except in the subordinate table this field is called the foreign key (FK).
Subordinate tables need their own primary key as well as a foreign key that ties them to the master table. A simple primer is addressed in Relating Content with Keys.