cmcculloh web design
cmcculloh.com - Personal - Tutorials - SQL Lectures - 2/28/06

Multi-Table Queries

Michael Xue
Christopher McCulloh
2.28.06

Valid XHTML 1.0 Strict

Up until now all info has come from just one table. Tonight we are going to talk about querying multiple tables.

So we will select from the employee and the department table. We want to get their fname, lname, and department name.

We need to link the tables. The linkage goes from a foreign key from the department table to a primary key in the employee table.

So we can do this:

Please note that we gave aliases to the employee and the department databases so that we could reference them.

If we do the code as written above, we are not specifying any keys. This is acalled a cartition join, and will give you alot of data. This is not correct.

Ok, now let's add something to make it more correct

In the first query when it ran, it actually just joined all employees to all departments. The next way actually did the query that we intended. But there is even a better way:

(you can actually just say join here, and it will default to inner join)

On functions as a where clause. It simply tells them what column to join them "on". But if you look at them they are the same except that you put on instead of where. We will see the advantage in the next example. Because although the syntax is almost exactly the same, it functions much differently.

Next we want to select the account id, customer id and open date from the account table. But we only want those that are opened by an experienced teller. This means that they are hired before the year 2003. Not only that, but we want those experienced tellers that are currently assigned to a branch called "woburn Branch".

So let's try and change that to an SQL statement...

The main information we are going to get will be from the account table. But we need to find the start date of the teller. That comes from the employee table. In terms of what branch the employee belongs to, we have to look at the branch table. Not only that, but we only want the tellers who opened the account. This could be a teller or a head teller. This is going to take three table joins.

The account links to the employee by openemp_id and emp_id (respectivly). In the employee table we have a start date, and a column that tell what their title is, as well as their assigned branch id. So we need to match the assigned_branch_id in the employee table and match it with the branch_id in the branch table.

So, our joins are:

open_emp_id to emp_id (account to employee)
assigned_branch_id to branch_id (employee to branch)

So what we need to get from each table is:

account:
open_emp_id where open_emp_id == emp_id
account_id
cust_id
open_date

employee:
start_date where start_date < 2003
title

branch:
name where branch_id == assigned_branch_id

So let's try something like this:

Here the teacher mentions that we could also use the IN clause. At the end of last weeks class he told us we would be talking about it this week, but he doesn't seem to remember that now. Here is a link where you can learn about "IN". To integrate "IN" into your statement, do the following:

So basically you just say the column name, then in, then the values you are searching for inside of parenthesis.

column_name in ('value1', 'value2', 'etc')

Now we are going to do a harder query.

You have an account table, and you want to select an account. You want a list of account ids along with their federal tax ids.

So first let's see what tables we need to draw from:

account
customer

Now let's see what fields we need:

account:
account_id

customer:
fed_id

Now we determine where our joins need to take place:

account.cust_id to customer.cust_id

So our query might look something like this:

Now let's say we only want a certain account. Let's say we only want accounts that the customer type is business.

So we need to also select the cust_type_cd from the customer table, and only select the accounts where the cust_type_cd is equal to B

Now let's make this harder still. Beyond the account_id and the fed_id we want to return the first and last name of the employee who opened the account.

Ok, so that information is going to come from the employee table. So now lets do a full layout of all of the data we need for our query to work:

So first let's see what tables we need to draw from:

account
customer
employee

Now let's see what fields we need:

account:
account_id

customer:
fed_id

employee:
fname
lname

Now we determine where our joins need to take place:

account.cust_id to customer.cust_id
account.open_emp_id to employee.emp_id

so our query might look something like this:

Now we are going to look at a join that only involves one table.

Let's look at our employee table. There is a column that has the employee id, and the superior employee id. What that means is, if you look at the employee table, and you see emp_id and superior_emp_id that means that the superior is also an employee in that table with an emp_id value equal to superior_emp_id in a row.

So one column in the table has a one to many quality with the table.

So we want to have the employee's last name and the employee's first name, then we want to put the supervisor's first name, and the supervisor's last name.

This is called a self join

We can look at one table as two seperate tables through aliasing. So we look at the first table, table a, and it has only supervisors records. The second table, table b, it only has subordinates.

So what we need is this:

employee.peon table:
fname
lname

employee.manager table:
fname
lname

join on:
employee.peon.superior_emp_id to employee.manager.emp_id

So let's try and write this query

If any column has a null value, that condition fails. Therefore while there are 18 employees our results only contain 17 people because the "big cheese" is not included. To show them we would have to use the "outer join" (which we will learn later).

If you want to just see how many records a table contains, you can use the count function:

Now we want to get a list of the employees and their supervisors who work at the woburn branch.

So the records we would need are:

tables:
employee
branch

rows:
employee.fname
emplyee.lname
branch.name

join:
employee.assigned_branch_id to branch.branch_id

where:
branch.name = 'Woburn Branch'

So our query might look something like this:

Now we are going to go back to the exercise we didn't finish last week. This is what I had at the end of class:

4. Fill in blanks(denoted by <#>) for the multi-data-set query to achieve the results shown
   below:

   select p.product_cd, a.cust_id, a.available_balance
   from product p inner join account <1> on p.product_cd = <2>
   where p.<3> = 'ACCOUNT'
   order by <4>

+------------+---------+---------------+
| product_cd | cust_id | avail_balance |
+------------+---------+---------------+
| CD         |       1 |       3000.00 |
| CD         |       6 |      10000.00 |
| CD         |       7 |       5000.00 |
| CD         |       9 |       1500.00 |
| CHK        |       1 |       1057.75 |
| CHK        |       2 |       2258.02 |
| CHK        |       3 |       1057.75 |
| CHK        |       4 |        534.12 |
| CHK        |       5 |       2237.97 |
| CHK        |       6 |        122.37 |
| CHK        |       8 |       3487.19 |
| CHK        |       9 |        125.67 |
| CHK        |      10 |      23575.12 |
| CHK        |      12 |      38552.05 |
| MM         |       3 |       2212.50 |
| MM         |       4 |       5487.09 |
| MM         |       9 |       9345.55 |
| SAV        |       1 |        500.00 |
| SAV        |       2 |        200.00 |
| SAV        |       4 |        767.77 |
| SAV        |       8 |        387.99 |
+------------+---------+---------------+

This doesn't work!

Man, I was close. Really, I only have two errors here, but all of the logic is correct. See if you can figure it out.

What happened was that I have two ";" and I have written "a.available_balance" instead of "a.avail_balance". So let's fix it:

yay

Now we are looking at set.

The first function of set is Union. That is everything that is A and everything that is B.

The next function is Intersection. That is everything that is A and B, but not just A or B.

Next is Minus. That is everything that is A but not B.

Next is Not Both. That is everything that is A or B but not both.

Sql has union and union all operations. As well as intersect, and except (mySql) or minus (oracle).

If you have a set that is (A, B) (A, C). When you union them, the result is (A, B, C). The duplicates are filtered out. But with union all you get (A, B, A, C).

If you have a set that is (A, B) Intersect (A, C) the result is (A).

If you have a set that is (A, B) \ (A, C) ("\" means except) the result is (B). This is because whatever is on in the first one stays, the rest is dropped (whatever is in the first one and the second one or just the second one gets dropped). Since B is only in the first, it is the only thing that is left.

If you have two seperate queries and you want to combine them, they must have the same number of columns. Also, they must have the same data type. So if you have an ID, fname, lname and gender in the first query. And in the second query you have ID, fname, lname and sex. They match up. Cool, it works.

Normally the operation goes like this: You have two seperate select statements, and then you say things like "union" or whatever.

Ok, now we are going to do two queries. In the first one we are going to select everything from the employee table, and then in the second query we are going to select everything from (suprise!) the employee table

Since this drops duplicates, we get 18 rows. But we want to see the duplicates so we will change it to:

and now we get 36 rows.

Now let's do something a bit harder. First we are going to select the customer ID and last name from a table called individual. Second we are going to select the customer ID, and their name, from the business table

Notice that when we got our results the heading in the second column was "lname". We should use an alias so that when it comes out that column will be correct.

Write a query that selects all of the employees' employee ID's that do not belong to Woburn Branch.

In SQL when we want to select things equal to something, we put "=", but for is not, we need to use "<>"

He wants us to go home and select out all of the employee IDs and then do an except to take out the ones that work at the Woburn Branch...