cmcculloh web design
cmcculloh.com - Personal - Tutorials - SQL Lectures - 2.14.06: Relational Tables

Relational Tables

Michael Xue
Christopher McCulloh
2.14.06

Valid XHTML 1.0 Strict

You can write a program to interface with the server, but in this class we will probably just use the premade "MySQL" client.

We are going to talk about how to create tables that are related.

Lets say we have a person, and we want to keep track of the information of this person (last name, first name, etc) as well of the food that this person likes.

Now, a naive view of this (especially of people addicted to MS) would be to keep this information in excel. This excel file has 500 columns. An excel person would make a table called "everything". DON'T EVER CREATE A TABLE LIKE THIS. THIS IS WHAT NOT TO DO. Then we will have a column called "last name" and one called "first name" and "address". Lets just say you keep track of just that for now. Now we add a column for food that they like. So person firstname => AA and lastname => AA and address => AA and food => pizza. Then person BB etc likes hotdogs. on and on until you get zz etc who likes Chinese food. Except that for a given person, they may like multiple foods. Like what if AA likes pizza AND hotdogs. So now we have to add AA to the table again and this time just say he likes hotdogs. Then if this person ALSO likes Chinese food, you have to make a whole new row just for them with the only change being that they like Chinese food. This is a bad idea. It contains alot of duplicate data. So now let's do the following:

Let's have a table called "people". In there we will only store information about people and nothing else. We will have a lastname column, a firstname column, and an address column.

Now let's go back to the bad way. Let's say that we are also keeping track of the drinks that people like. So now we stupidly add "drink". Now lets say the first guy likes water. But (heaven forbit) let's say that he likes pepsi too! So now we go down to hotdog and put pepsi next to it. Then we find out they like mountain dew, and so we put it next to Chinese food. Now lets say they find out they like beer too! So now we create a new row with just beer and no food. Over the years you will end up with a spreadsheet full of holes and duplicates. People run around and say, "yeah, who cares, it works!". But if you have a huge spreadsheet with all this info, it takes up alot of space! It's just a really really bad idea.

So let's look at the good way to say it. Ok, looking at the people table, we should add a column that has a unique ID column. Now, remember that this table only keeps unique information about people. So if we put food in there too, it ruins the DB design. It turns your DB into a giant spreadsheet. This is ugly. So, people should be in the people table, and food should be in the food table. So if we want to keep track of food, we just have a table with food, and a unique ID column.

The good thing about this is that people is by itself. If you want to keep track of food, you don't have to change the people table. You just change the food table. Now lets say you want to keep track of drinks. You make a drink table with two columns, one for drink and one for the unique ID. That way no table is actually part of the other. Now everything is independant, and you don't effect everything by changing one table.

Now we have an issue. How do we relate all of this together? How do we tell who likes pizza? Here is where relational databases come in.

In the food table we are going to somehow relate it to the people table. How do we do this? One way is to take the unique ID columns for people and for food, and make them relate to eachother. We rename the unique ID column in food to people ID. The one in people is a primary key, and the one in food is a foreign key. So if you want to know who likes pizza, you take the people ID column next to pizza in the food column, and then match that number with the number in people and see who is associated with in people. A foreign key points to a primary key. Now what is the primary key for food? It is the combination of the two columns.

Now we will look at another design that does not work in this case. We have so many pizza values in the food column, why not do the following to get rid of duplication?

We have a many to many situation. So we now need an intermediate table. It is called "people food". It has two columns. On is called "pid" the other is "fid". pid stores people's id. fid stores food ids. so lets say that a person with id 1 likes pizza (with an id of 1), then we just have the fid be 1 and the pid be 1. then if they also like hot dog (with an id of 2) then the fid will be 2 and the pid will be 1.

Now let's create the people table.

Now let's insert data into the People table.

Now we are going to make the second table, the "favorite food" table.

Now we are going to insert some data into the table.

MySQL has a special feature that allows you to just put a comma between the data to be insterted so that you don't have to define the columns that you want to insert into over and over again.

Now let's try putting some sort of a rediculous value like "9999" in for person_id. It should give you an error. The constraint for the foreign key should catch it since that person does not exist.

MySQL has two engines running the database. NODB and Storage Engine. Only when you are running storage engine will the violation be caught. But the default is NODB. So we need to use the storage engine. Oracle does not have that issue. If you violate your foriegn constraint it will give you an error.

Now let's go back to our person creation statement, and put gender "z" in. It works. It is easy to type the wrong data and your database will just accept it. Having invalid data is worse than having no data at all. Let's see if we can fix that.

Let's put something in that will only allow a certain range of values. (for oracle it won't even let you insert the row. For MySQL it will give you a warning, but it will set the value to nothing so that it won't have invalid data, but it still inserts the row. So we will use enum('M','F'), and it will only let you use Ms or Fs.

Foreign Key violation is caught only when you use innoDB storage Engine in MySQL. Use 'regular Engine' in MySQL will NOT do that. Check with Oracle.

So now we need to drop the table, and re-create it with the enum thing.

Now we need to learn how to modify existing data within a table without having to delete it and re-insert it.

In summary, we learned about primary and foreign key. Primary Key is the parent, and the foreign key is the child. We also learned about enum() to constrain data. We learned about the Date column, and the format must match YYYY-MM-DD (all DB systems are different). We also learned of the importance of using the where clause.

If you really want to understand something, you have to be a jedi. It may be harder in the beginning to be a jedi, it's not cool and neat because it's in command line and not graphical. But you will really understand the way everything is getting done. In the end the graphical programs are doing exactly what we are doing. You just get to use graphics, and it hides what is going on for you. If you use graphical methods, then you have to re-learn the graphical user interface for every different program. This is why we learn from the fundamental way, and we try to avoid product specific methods so that we can go and use anything and not just one specific product that we learn here. This is the Jedi way...

Once you understand the fundamental way of doing it, it is likely you will have a GUI interface to help you. But you will have bugs. But since you are a Jedi, you can go behind the scenes and fix it.

We will be using a database based upon a miniature bank. Below is a link to download the script file that will create this database for us to use, as well as a link to exercises...

Database Creation Script
Exercises

Select everything in the database creation script page, and paste it into a text file and name it something.sql. Then use the "source something.sql" command in MySQL to create your table.

Look around this database and get familiar with it. Try and figure out how the tables relate to eachother. Then look at the exercise, and see if you can do the exercises.