Wrox Home  
Search P2P Archive for: Go

  Return to Index  

sql_language thread: Outer Join Fun


Message #1 by "Jon Hanlon" <JHanlon@c...> on Tue, 27 Aug 2002 02:51:33
There is a difference in the timing of the evaluation of the conditions in a
WHERE clause versus the conditions in the FROM clause (ON expression), which
is why you are seeing the different behaviors.

Although systems optimize all over the place, you can think of the
processing of a query to follow these steps.  Because of optimizations,
things don't actually happen this way, but it is helpful to think of them as
though they do:

1.  The first thing that happens is that the FROM clause is evaluated.  A
working table is built from all of the tables in the FROM clause.  Each row
in the working table is constructed containing all of the columns in in all
of the tables.  In the case of an INNER JOIN, only those rows are retained
where the join condition is true.  In the case of an OUTER JOIN, all rows in
the first (preserved) table are retained, and only those rows in the second
are retained which match the join condition.  If there is no row in the
second table which matches the outer join condition for a given row of the
preserved table, then all of the columns corresponding to that (missing) row
from the second table are set to NULL in the working table.  The 'preserved'
table is that table on the left of the LEFT OUTER JOIN expression, and on
the right for a RIGHT OUTER JOIN.  If you use the<tablename> AS
<correlationname> syntax, you are giving an alias to the column names in
this working table, and you must use this name in the remainder of the
query, since all the rest of the expressions in this query refer to columns
in the working table.  JOIN operations are processed left to right unless
modified by parenthesis.

2.  Next, the WHERE clause is evaluated and it removes rows from the working
table that do not pass criteria; that is, that do not test to TRUE -
conditions which evaluate to FALSE (or UNKNOWN because of comparisons to
NULL) mean the row is removed from the working table.

3.  If there is an optional GROUP BY clause,  then groups are constructed by
reducing each group to a single row, replacing the original working table
with the new grouped table. All of the columns in the rows of a grouped
table must be either a grouping column or an aggregate functions on the
column.

4.  If there is an optional HAVING clause, it is applied to the grouped
working table in much the same manner as a WHERE clause.

5.  Only the columns mentioned in the SELECT clause are retained in the
final result.

Note that expressions in the FROM clause refer to columns in the tables
mentioned in the FROM clause, whereas the columns in a WHERE clause actually
refer to columns in the working table.

That's why you get different answers depending on where you place an
expression like 'o.OrderID IS NULL'.  In a JOIN ... ON expression, it refers
to the column in the actual table; in a WHERE clause it refers to the column
named 'o.OrderID' in the working table.  This column can be NULL if there is
no match on the OUTER JOIN condition, even if the actual table doesn't have
a row with a NULL in it.

--
Jeff Mason			Custom Apps, Inc.
Jeff@c...

-----Original Message-----
From: Jon Hanlon [mailto:JHanlon@c...]
Sent: Tuesday, August 27, 2002 2:52 AM
To: sql language
Subject: [sql_language] Outer Join Fun


I have a question regarding outer join syntax.  It's best explained by
example.
In the Microsoft Northwind database there are seven customers starting
with 'B' (ignore the other letters for the moment).  Their CustomerID's
are:
Bergs  Blaus  Blonp  Bolid  Bonap  Bottm  Bsbev

Now, on the Orders table, we see that Employee 5 has had dealings with
three of them:
Bergs  Blonp  & Bonap.

I want a list of Customers that Employee 5 hasn't dealt with.  ie:
Blaus  Bolid  Bottm  Bsbev

The following query works:
SELECT c.customerID, o.employeeID
       FROM customers c LEFT OUTER JOIN  orders o
       ON c.customerID = o.customerID AND o.employeeID = 5
       WHERE o.OrderID IS NULL
       ORDER BY c.customerID

However, if I swop the conditions in the AND and WHERE clauses:
SELECT c.customerID, o.employeeID
       FROM customers c LEFT OUTER JOIN  orders o
       ON c.customerID = o.customerID AND o.OrderID IS NULL
       WHERE o.EmployeeID = 5
       ORDER BY c.customerID
It returns no records.

Neither does:
SELECT c.customerID, o.employeeID
       FROM customers c LEFT OUTER JOIN  orders o
       ON c.customerID = o.customerID AND o.OrderID IS NULL and
o.employeeID = 5
       ORDER BY c.customerID

Nor:
SELECT c.customerID, o.employeeID
       FROM customers c LEFT OUTER JOIN  orders o
       ON c.customerID = o.customerID
       WHERE o.OrderID IS NULL and o.employeeID = 5
       ORDER BY c.customerID

Nor, in fact, any other combination of clauses.

Could someone please explain the rules of ON and OUTER and WHERE to me?
Thanks.




  Return to Index