A Visual Explanation of SQL Joins Assume we have the following two tables. TableA is on the left, and TableB is on the right. We'll populate them with four records each. TABLE A id name -- ---1 Pirate 2 Monkey 3 Ninja 4 Spaghetti TABLE B id name -- ---1 Rutabaga 2 Pirate 3 Darth Vader 4 Ninja create table TableA (id number(2), name varchar2(30)); create table TableB (id number(2), name varchar2(30)); insert insert insert insert into into into into TableA TableA TableA TableA values(1,'Pirate'); values(2,'Monkey'); values(3,'Ninja'); values(4,'Spaghetti'); insert into insert into insert into insert into COMMIT; TableB TableB TableB TableB values(1,'Rutabaga'); values(2,'Pirate'); values(3,'Darth Vader'); values(4,'Ninja'); Let's join these tables by the name field in a few different ways and see if we can get a conceptual match to those nifty Venn diagrams. INNER JOIN produces only the set of records that match in both Table A and Table B. SELECT * FROM TableA A, TableB b where A.name = B.name; SELECT * FROM TableA INNER JOIN TableB ON TableA.name = TableB.name; id -1 3 name ---Pirate Ninja id -2 4 name ---Pirate Ninja FULL OUTER JOIN produces the set of all records in Table A and Table B, with matching records from both sides where available. If there is no match, the missing side will contain null. SELECT * FROM TableA FULL OUTER JOIN TableB ON TableA.name = TableB.name; id -1 2 3 4 null null name ---Pirate Monkey Ninja Spaghetti null null id -2 null 4 null 1 3 name ---Pirate null Ninja null Rutabaga Darth Vader LEFT OUTER JOIN(With common data) produces a complete set of records from Table A, with the matching records(where available) in Table B. If there is no match, the right side will contain null. SELECT * FROM TableA A, TableB B where A.name = B.name(+) order by 1; SELECT * FROM TableA LEFT OUTER JOIN TableB ON TableA.name = TableB.name order by 1; id -1 2 3 4 name ---Pirate Monkey Ninja Spaghetti id -2 null 4 null name ---Pirate null Ninja null LEFT OUTER JOIN(With UNIQUE data) produces a complete set of records from Table A with NO MATCHING records in Table B. To produce the set of records only in Table A, but not in Table B, we perform the same left outer join, then exclude the records we don't want from the right side via a where clause. SELECT * FROM TableA A, TableB B where A.name = B.name(+) and B.id is null order by 1; SELECT * FROM TableA LEFT OUTER JOIN TableB ON TableA.name = TableB.name WHERE TableB.id IS null; id -2 4 name ---Monkey Spaghetti id -null null name ---null null or SELECT * FROM TableA MINUS SELECT * FROM TableB; SELECT * FROM TableA FULL OUTER JOIN TableB ON TableA.name = TableB.name WHERE TableA.id IS null OR TableB.id IS null; id -2 4 null null name ---Monkey Spaghetti null null id -null null 1 3 name ---null null Rutabaga Darth Vader To produce the set of records unique to Table A and Table B, we perform the same full outer join, then exclude the records we don't want from both sides via a where clause. There's also a cartesian product or cross join, which as far as I can tell, can't be expressed as a Venn diagram: SELECT * FROM TableA , TableB; SELECT * FROM TableA CROSS JOIN TableB; This joins "everything to everything", resulting in 4 x 4 = 16 rows, far more than we had in the original sets. If you do the math, you can see why this is a very dangerous join to run against large tables. More on Outer Joins Outer joins enable rows to be returned from a join where one of the tables does not contain matching rows for the other table. eg. Suppose we have two tables: Person -----Person_id --------- Name ---------------- Address_id ---------- 00001 00002 00003 00004 Address ------Address_id ---------00057 00092 00113 Fred Bloggs Joe Smith Jane Doe Sue Jones 00057 00092 00111 Address_Desc ------------------------1, Acacia Avenue, Anytown 13, High Street, Anywhere 52, Main Road, Sometown Then the simple join: SELECT PERSON.NAME, ADDRESS.ADDRESS_DESC FROM PERSON, ADDRESS WHERE PERSON.ADDRESS_ID = ADDRESS.ADDRESS_ID returns: NAME ---------Fred Bloggs Joe Smith ADDRESS_DESC -----------1, Acacia Avenue, Anytown 13, High Street, Anywhere But the outer join: SELECT PERSON.NAME, ADDRESS.ADDRESS_DESC FROM PERSON, ADDRESS WHERE PERSON.ADDRESS_ID = ADDRESS.ADDRESS_ID(+) returns: NAME ---------Fred Bloggs Joe Smith Jane Doe Sue Jones ADDRESS_DESC -----------1, Acacia Avenue, Anytown 13, High Street, Anywhere Note the two new rows for Jane Doe and Sue Jones. These are the people who do not have matching records on the ADDRESS table. Sue Jones had an address_id on her PERSON record, but this didn't match an address_id on the ADDRESS table. ( Probably a data inconsistency ). Jane Doe had NULL in her PERSON.ADDRESS_ID field, which obviously doesn't match any address_id on the ADDRESS table. Note that the outer join is created by including (+) on the WHERE clause which joins the two tables. The (+) is put against the column-name on the deficient table, ie. the one with the missing rows. It is very important to put the (+) on the correct table: putting it on the other table will give different results. eg. the query: SELECT PERSON.NAME, ADDRESS.ADDRESS_DESC FROM PERSON, ADDRESS WHERE PERSON.ADDRESS_ID(+) = ADDRESS.ADDRESS_ID returns: NAME ---------Fred Bloggs Joe Smith ADDRESS_DESC -----------1, Acacia Avenue, Anytown 13, High Street, Anywhere 52, Main Road, Someplace Anti Joins and Semi-Joins Anti-joins: Anti-joins are written using the NOT EXISTS or NOT IN constructs. An anti-join between two tables returns rows from the first table for which there are no corresponding rows in the second table. In other words, it returns rows that fail to match the sub-query on the right side. Suppose you want a list of departments with no employees. You could write a query like this: SELECT d.department_name FROM departments d MINUS SELECT d.department_name FROM departments d, employees e WHERE d.department_id = e.department_id ORDER BY department_name; The above query will give the desired results, but it might be clearer to write the query using an anti-join: SELECT d.department_name FROM departments d WHERE NOT EXISTS (SELECT NULL FROM employees e WHERE e.department_id = d.department_id) ORDER BY d.department_name; Semi-joins: Semi-joins are written using the EXISTS or IN constructs. A semi-join between two tables returns rows from the first table where one or more matches are found in the second table. The difference between a semi-join and a conventional join is that rows in the first table will be returned at most once. Suppose you want a list of departments with at least one employee. You could write the query like this: SELECT d.department_name FROM departments d, employees e WHERE d.department_id = e.department_id ORDER BY department_name; The department name in the query result will appear as many times as the number of employees in it. So, for example if a department has 30 employees then that department will appear in the query output 30 times. To eliminate the duplicate rows, you could use the DISTINCT or GROUP BY keywords. A more elegant solution is to use a semi-join between the departments and employees tables instead of a conventional join: SELECT d.department_name FROM departments d WHERE EXISTS (SELECT NULL FROM employees e WHERE e.department_id = d.department_id) ORDER BY d.department_name; The above query will list the departments that have at least one employee. The department will appear only once in the query output no matter how many employees it has. Equi and non-Equijoins The join condition determines whether a join is an equijoin or a non-equijoin. An equijoin is a join with a join condition containing an equality operator. An equijoin combines rows that have equivalent values for the specified columns. When a join condition relates two tables by an operator other than equality, it is a non-equijoin. A query may contain equijoins as well as non-equijoins. Equijoins are the most commonly used. An example of an equijoin: SELECT e.first_name, d.department_name FROM employees e INNER JOIN departments d ON e.department_id = d.department_id; FIRST_NAME -------------------Steven Neena Lex Alexander Bruce DEPARTMENT_NAME -----------------------------Executive Executive Executive IT IT Non-equijoins are less frequently used. An example of a non-equijoin: SELECT zip_codes.zip_code, zones.ID AS zip_zone, zones.low_zip, zones.high_zip FROM zones INNER JOIN zip_codes ON zip_codes.zip_code BETWEEN zones.low_zip AND zones.high_zip; ZIP_CODE ZIP_ZONE LOW_ZIP HIGH_ZIP -------- -------- ------- -------57000 1 57000 57999 84006 2 84000 84999