Why Talk about Relational Database Design?



The Power of Relational Databases
Relational database programs (like Access, Filemaker pro, or Infomaker) let us extract information from collections of tables using queries.

The diagram above shows examples of tables that may be found in databases within departments. The blue lines signify relationships between departmental tables. The red lines show potential interactions among tables in other departments. Computer Resources and Student Services are experimenting with some interdepartmental linkages. We are learning that for the vision of data sharing to work, departmental databases have to be designed according to relational principles.

Reason #1: Database systems designed according to relational principles facilitate information access and sharing through the use of queries.

Queries:
Queries carry out three basic operations on relational tables:

1.

Join Rows from Different Tables

Example: Join rooms table with classes table to compare # of seats with # of students

2.

Select Rows from Tables

Example: Select students who graduate this term

3.

Summarize Rows

Example: Calculate the sum of all of the staff hours charged to each department.

Queries result in new tables which can be used for reports, or as targets for new queries.

Relational Tables
Although lots of things we use in our work are tabular in format -- like spreadsheets, or ledger reports. Tables in a relational environment have special qualities that assure relational query tools will work reliably. Folks who create relational tables for their data can assure that their tables have these relational qualities by following certain principles when building tables.

Reason #2: If we ignore the principles of relational database design when building our tables, then we shouldn't be surprised if our queries don't return reliable results, and if our information cannot be shared easily.

The most common misconception that folks suffer from when building a database is to think of the report that you want to see, and to build a table which looks like that report. This temptation arises from a familiarity with spreadsheets -- a pre-relational mind set.

Relational Principles
The principles that relational databases are based upon are derived from the mathematical theory of sets (E.F. Codd, 1969.) They are surprisingly simple and elegant:

  1. A table is a collection of rows. Each row in a table describes a single entity or relationship. The entities or relationships described by different rows in a table are unique (none repeat). Therefore, each entity in a table can be referenced by a single row-ID. Example: In a table of employees, you would not want to list Edna three times (if you did, the count of employees would be incorrect if you ever summarized this way.)
  2. When one table references information in another, the unique ID of the appropriate row in the other table is used. Example: In a table that contains the relationship between class meetings and rooms, the classes are referenced by the unique class ID, and the rooms are referenced by the unique room ID.
  3. There may be tables that deal only with relationships. Example: A table may hold the relation between employees and departments. Such a table may have Edna listed three times.
  4. The columns in each table describe one to one relationships. Example: A course has one title, and one description, but may have more than one instructor; Title, and Description, are valid columns for the course table, instructors and meetings belong in other tables. Where necessarry, addititonal tables can define the one-to-many relationships between courses and instructors, or meetings, courses and rooms.
  5. The information in columns for a particular row should all be attributes of the particular entity described by that row -- not attributes of some other entity that is referenced in another column. Example: In a table about books, "Publisher" is an attribute of the book, "Publisher's Address" is an attribute of the publisher and belongs in a Publisher table. If you follow this principle, there will be no relationship in your database which is defined more than once.
  6. A column should contain one particle of information. Example: Resist the temptation to put City and Zipcode in the same column; you never know when someone may want you to select students in a particular zipcode!

Creating tables according to these principles usually results in breaking information into more tables than you expected and joining information from multiple tables through queries. It is tempting to exclaim that this makes the database 'more complicated' but from a relational perspective, this process is making the database as simple as it can be.

A set of tables that conforms to these relational norms creates a 'relational environment.' The query tools provided by access and similar software assume that the tables adhere to these norms. The query tools may partially work in cases where the principles are not followed, but not totally, and not reliably.

Do I Really Need to Understand all of this?
It is not necessary to understand all of these relational design concepts before working with databases in Access. But you will eventually come to appreciate them if you spend time trying to figure out why queries don't work reliably, or why the report your boss asks for ought to be possible, and yet the information in the database cannot be rearranged that way without writing a macro. If we want to eventually arrive at a future where we are sharing information across departments, ultimately all of those participating will need to understand these things.