ANSI Join SQL - csit.parkland.edu

To ANSI or Not To ANSI
Gravenstein, Costello, Maurer
Amtrust Bank
Session #420
Speaker Qualifications
• Rumpi Gravenstein, Application Developer, Senior, Amtrust Bank
• Newsletter Editor, North East Ohio Oracle Users Group
• Has been working with Oracle since 1988
• Past presenter at meetings of:
  - Independent Oracle Users Group
  - North East Ohio Oracle Users Group
Agenda
• Brief History
• Review Join Technologies
• Analysis
• Recommendation
ANSI/Oracle Support History
• "ANSI" here refers to the SQL/99 join syntax
  - The standard with which all RDBMS vendors strive to comply
• SQL/99 support started with Oracle 9i in 2001
  - Oracle supported it two years after the release of the standard
• This presentation is restricted to the Oracle implementation of the ANSI standard
3 Join Condition Types
• Equijoin
  - Columns with the same name
  - Columns with different names
• Outerjoin
  - Left (left driving table)
  - Right (right driving table)
  - Full (both tables driving)
• Cross/Cartesian product
Traditional Equijoin - Same Name
• Traditional Oracle Approach
    SELECT e.ename AS employee_name,
           d.deptno,
           d.dname AS department_name
      FROM emp e,
           dept d
     WHERE e.deptno = d.deptno;
Notes:
- Table prefix is required to remove ambiguity on common columns
- Join conditions must be listed
ANSI Equijoin Natural Syntax
• ANSI SQL Natural Join
    SELECT ename AS employee_name,
           deptno,
           dname AS department_name
      FROM emp
   NATURAL JOIN dept;
Notes:
- No table prefix if the column is part of the join condition
- No commas between tables
- Join columns are implied, based on columns that have the same name
- God forbid that you accidentally add a same-named non-join column to both tables (um... audit columns...): it silently becomes part of the join
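The audit-column hazard can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption, made only so the example runs anywhere); the created_by audit column and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle scott schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER, dname TEXT);
    CREATE TABLE emp  (ename TEXT, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH');
    INSERT INTO emp  VALUES ('CLARK', 10), ('SMITH', 20);
""")

natural = "SELECT ename, deptno, dname FROM emp NATURAL JOIN dept ORDER BY ename"
using = "SELECT ename, deptno, dname FROM emp JOIN dept USING (deptno) ORDER BY ename"

# Only deptno is shared, so NATURAL JOIN behaves like USING (deptno).
rows_before = conn.execute(natural).fetchall()

# Hypothetical audit column added to BOTH tables, with different values.
conn.executescript("""
    ALTER TABLE dept ADD COLUMN created_by TEXT;
    ALTER TABLE emp  ADD COLUMN created_by TEXT;
    UPDATE dept SET created_by = 'dba';
    UPDATE emp  SET created_by = 'hr';
""")

# NATURAL JOIN now also joins on created_by; USING (deptno) is unaffected.
rows_after = conn.execute(natural).fetchall()
rows_using = conn.execute(using).fetchall()

print(rows_before)
print(rows_after)
print(rows_using)
```

The same query text returns different rows after an unrelated schema change, which is the reason the deck later recommends against NATURAL joins.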
ANSI Equijoin Using Syntax
• ANSI SQL Join USING
    SELECT d.dname,
           e.ename
      FROM emp e
      JOIN dept d USING ( deptno );
Notes:
- Table prefix allowed on columns that are not part of the USING clause (join condition)
- Add additional join columns with USING ( deptno, join_col2, join_col3, … )
- Use when several columns share the same name but you join on only some of them, in this case deptno
Traditional Equijoin Syntax
• Traditional Join, columns different
    SELECT d.department_name,
           l.city
      FROM departments d,
           locations l
     WHERE d.location_id = l.id;
Notes:
- Join column names are different
ANSI On Equijoin Syntax
• ANSI SQL ON
    SELECT d.department_name,
           l.city
      FROM departments d
      JOIN locations l ON ( d.location_id = l.id );
Notes:
- Use ON when join column names are different
- List join conditions in the ON clause, as in the traditional WHERE syntax
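As a quick check that the two spellings are interchangeable for an inner equijoin, the sketch below runs both forms against the same data. It uses Python's built-in sqlite3 module in place of Oracle (an assumption for portability), and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database; table and column names follow the slides,
# the rows themselves are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locations   (id INTEGER, city TEXT);
    CREATE TABLE departments (department_name TEXT, location_id INTEGER);
    INSERT INTO locations   VALUES (1, 'Cleveland'), (2, 'Akron');
    INSERT INTO departments VALUES ('Sales', 1), ('IT', 2), ('Audit', 1);
""")

# Traditional syntax: join condition lives in the WHERE clause.
traditional = """
    SELECT d.department_name, l.city
      FROM departments d, locations l
     WHERE d.location_id = l.id
     ORDER BY d.department_name
"""

# ANSI syntax: join condition lives in the ON clause.
ansi = """
    SELECT d.department_name, l.city
      FROM departments d
      JOIN locations l ON ( d.location_id = l.id )
     ORDER BY d.department_name
"""

traditional_rows = conn.execute(traditional).fetchall()
ansi_rows = conn.execute(ansi).fetchall()
print(traditional_rows == ansi_rows)
```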
ANSI Equijoin Syntax
• ANSI SQL Multi Table On
    SELECT e.empno,
           l.loc_id,
           d.dname,
           l.state_tx
      FROM locations l
      JOIN dept d ON ( d.location_id = l.id )
      JOIN emp e ON ( d.deptno = e.deptno );
Notes:
- No commas between tables
- The first JOIN brings in the first joined table; the second JOIN brings in the second
- Any prior table column is visible: joins are evaluated from left to right
ANSI Equijoin Syntax
• ANSI SQL INNER
    SELECT e.emp_id,
           l.city,
           d.dept_name,
           d.deptno
      FROM locations l
     INNER JOIN dept d ON ( d.location_id = l.id )
     INNER JOIN emp e ON d.deptno = e.deptno;
Notes:
- ON clause requires reference to join columns by table name to resolve ambiguity
- INNER is an optional keyword stating that this is an inner join (not an outer or cross join)
- Parentheses are optional; we like to include them for clarity
Traditional Outerjoin Syntax
• Traditional Outer Join
    SELECT e.ename,
           d.dname
      FROM emp e,
           dept d
     WHERE e.deptno (+) = d.deptno;
Notes:
- NULL in employee name if there are no employees in the department
- Traditional outer join notation: the (+) indicator denotes "expand records on this side if needed"
ANSI Outerjoin Syntax
• Left Outer Join
    SELECT e.ename,
           d.dname
      FROM dept d
      LEFT OUTER JOIN emp e
        ON (e.deptno = d.deptno);
Notes:
- NULL in employee name if there are no employees in the department
- OUTER keyword is optional
- LEFT denotes that the dominant table is to the left (dept) and that all of its rows will be returned. The right table is expanded with NULL records.
ANSI Outerjoin Syntax
• Left Outer Join
    SELECT e.ename,
           d.dname
      FROM dept d
   NATURAL LEFT JOIN emp e;
Notes:
- NATURAL can be used in an INNER and OUTER join. We don't recommend using it here either!
- USING can also be used in an INNER and OUTER join, e.g. LEFT JOIN emp e USING (deptno)
ANSI Outerjoin Syntax
• Right Outer Join
    SELECT e.ename,
           d.dname
      FROM emp e
     RIGHT OUTER JOIN dept d
        ON (e.deptno = d.deptno);
Notes:
- NULL in employee name if there are no employees in the department
- RIGHT OUTER denotes that the dominant table is to the right. The left table gets expanded with NULLs.
Traditional Outerjoin Syntax - Close
• Full Outer Join
  - Can only be represented with a "UNION" query.
    SELECT e.ename,
           d.dname
      FROM emp e, dept d
     WHERE e.deptno (+) = d.deptno
     UNION
    SELECT e.ename,
           d.dname
      FROM emp e, dept d
     WHERE e.deptno = d.deptno (+);
Notes:
- This "full" join syntax can be found on the internet as a "true" full join
- Shouldn't see many of these (we've never needed one)
- UNION ALL is incorrect as it results in duplicate rows
- UNION performs an implicit "DISTINCT" on the result, possibly removing desired rows
Traditional Full Close-Execution
    scott@VOTER> SELECT e.ename,
                        d.dname
                   FROM emp e, dept d
                  WHERE e.deptno (+) = d.deptno
                  UNION
                 SELECT e.ename,
                        d.dname
                   FROM emp e, dept d
                  WHERE e.deptno = d.deptno (+)
                 ;
    16 rows selected.

    Execution Plan
    ----------------------------------------------------------
       0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=15 Card=28 Bytes=588)
       1    0   SORT (UNIQUE) (Cost=15 Card=28 Bytes=588)
       2    1     UNION-ALL
       3    2       HASH JOIN (OUTER) (Cost=7 Card=14 Bytes=294)
       4    3         TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60)
       5    3         TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=126)
       6    2       HASH JOIN (OUTER) (Cost=7 Card=14 Bytes=294)
       7    6         TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=126)
       8    6         TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60)
Notes:
- Plan reflects the implicit DISTINCT inherent in the UNION clause: SORT (UNIQUE)
- Notice Cost 15
Traditional Full Join - True
    SELECT e.ename, d.dname
      FROM emp e, dept d
     WHERE e.deptno = d.deptno(+)
     UNION ALL
    SELECT NULL, d.dname
      FROM dept d
     WHERE NOT EXISTS
           (SELECT 1
              FROM emp e
             WHERE e.deptno = d.deptno)
Notes:
- First query: outer join the emp table with the dept table, so all emp rows are returned
- Use UNION ALL to avoid the implicit DISTINCT inherent in UNION
- Second query: add dept rows that have no match from emp
- This syntax can be found on the internet at http://optimizermagic.blogspot.com under the post on "Outerjoins in Oracle"
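The difference between the "close" and "true" emulations shows up only when legitimate duplicate rows exist. The sketch below is a rough illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than Oracle, it writes the outer joins as LEFT JOIN because SQLite has no (+) operator, and the two employees named SMITH plus the unassigned JONES are hypothetical data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER, dname TEXT);
    CREATE TABLE emp  (ename TEXT, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH'), (40, 'OPERATIONS');
    -- Two different employees share a name and department; JONES has no department.
    INSERT INTO emp  VALUES ('CLARK', 10), ('SMITH', 20), ('SMITH', 20), ('JONES', NULL);
""")

# "Close" emulation: two outer joins glued together with UNION (implicit DISTINCT).
close_sql = """
    SELECT e.ename, d.dname FROM dept d LEFT JOIN emp e ON e.deptno = d.deptno
    UNION
    SELECT e.ename, d.dname FROM emp e LEFT JOIN dept d ON e.deptno = d.deptno
"""

# "True" emulation: one outer join, plus the unmatched dept rows via an anti-join.
true_sql = """
    SELECT e.ename, d.dname FROM emp e LEFT JOIN dept d ON e.deptno = d.deptno
    UNION ALL
    SELECT NULL, d.dname
      FROM dept d
     WHERE NOT EXISTS (SELECT 1 FROM emp e WHERE e.deptno = d.deptno)
"""

close_rows = conn.execute(close_sql).fetchall()
true_rows = conn.execute(true_sql).fetchall()

print(len(close_rows), close_rows.count(("SMITH", "RESEARCH")))
print(len(true_rows), true_rows.count(("SMITH", "RESEARCH")))
```

With this data the UNION version collapses the two SMITH rows into one, while the UNION ALL plus NOT EXISTS version keeps both and still returns the department that has no employees.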
Traditional Full Join True-Execution
    scott@VOTER> SELECT e.ename, d.dname
                   FROM emp e, dept d
                  WHERE e.deptno = d.deptno(+)
                  UNION ALL
                 SELECT NULL, d.dname
                   FROM dept d
                  WHERE NOT EXISTS
                        (SELECT 1
                           FROM emp e
                          WHERE e.deptno = d.deptno);
    16 rows selected.

    Execution Plan
    ----------------------------------------------------------
       0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=13 Card=16 Bytes=324)
       1    0   UNION-ALL
       2    1     HASH JOIN (OUTER) (Cost=7 Card=14 Bytes=294)
       3    2       TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=126)
       4    2       TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60)
       5    1     HASH JOIN (ANTI) (Cost=7 Card=2 Bytes=30)
       6    5       TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60)
       7    5       TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=42)
Notes:
- Explain plan for the "true" traditional outer join: SORT (UNIQUE) is missing
- Cost 13 here; the prior plan had a cost of 15
ANSI Outerjoin Syntax
• Full Outer Join
    SELECT e.ename,
           d.dname
      FROM emp e
      FULL OUTER JOIN dept d
        ON (e.deptno = d.deptno);
Notes:
- NULL in employee name if there are no employees in the department
- NULL in department name if the employee is not in a department
- OUTER is optional
- FULL OUTER denotes that the table to the right AND the table to the left will have all their records returned
ANSI Full Execution Plan
    scott@VOTER> SELECT e.ename,
                        d.dname
                   FROM emp e
                   FULL OUTER JOIN dept d
                     ON (e.deptno = d.deptno);
    16 rows selected.

    Execution Plan
    ----------------------------------------------------------
       0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=13 Card=16 Bytes=256)
       1    0   VIEW (Cost=13 Card=16 Bytes=256)
       2    1     UNION-ALL
       3    2       HASH JOIN (OUTER) (Cost=7 Card=14 Bytes=294)
       4    3         TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=126)
       5    3         TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60)
       6    2       HASH JOIN (ANTI) (Cost=7 Card=2 Bytes=30)
       7    6         TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60)
       8    6         TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=42)
Notes:
- Cost 13 = cost of the traditional "true" full join
Traditional Cross Join/Cross Product
• Cross Join
    SELECT emp_id,
           ename,
           dname
      FROM emp e,
           dept d
     WHERE d.deptno = 10;
Notes:
- Inadvertent cross join is not obvious if the joined table has a single row
- No join condition between tables… um, normally not good.
ANSI Cross Join/Cross Product
• Cross Join
    SELECT emp_id,
           ename,
           dname
      FROM emp e
     CROSS JOIN dept d
     WHERE d.deptno = 10;
Notes:
- Explicit CROSS condition, impossible to do this accidentally.
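To see why the single-row case hides the problem, the sketch below runs the comma form, the explicit CROSS JOIN form, and the unfiltered product. It is an illustration only: Python's built-in sqlite3 module stands in for Oracle, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER, dname TEXT);
    CREATE TABLE emp  (ename TEXT, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH');
    INSERT INTO emp  VALUES ('CLARK', 10), ('SMITH', 20), ('JONES', 20);
""")

# Comma syntax with no join condition: a cross join that is easy to miss.
implicit_sql = """
    SELECT e.ename, d.dname FROM emp e, dept d
     WHERE d.deptno = 10 ORDER BY e.ename
"""
# Same result, but the intent is spelled out.
explicit_sql = """
    SELECT e.ename, d.dname FROM emp e CROSS JOIN dept d
     WHERE d.deptno = 10 ORDER BY e.ename
"""
# Without the filter the full Cartesian product appears.
product_sql = "SELECT e.ename, d.dname FROM emp e CROSS JOIN dept d"

implicit_rows = conn.execute(implicit_sql).fetchall()
explicit_rows = conn.execute(explicit_sql).fetchall()
product_rows = conn.execute(product_sql).fetchall()

print(implicit_rows)
print(len(product_rows))
```

Every employee is reported under ACCOUNTING, including SMITH and JONES who belong to department 20, yet the row count (one per employee) looks plausible at a glance.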
ANSI Correlated Join
• Correlated Query Syntax Join
    SELECT empno,
           ename
      FROM emp e
     WHERE EXISTS
           ( SELECT NULL
               FROM dept d
              INNER JOIN locations l
                 ON ( l.loc_id = d.loc )
              WHERE d.deptno = e.deptno
           );
Notes:
- ANSI doesn't allow a join clause on the first table
- Only tables in the "current" FROM clause are visible to ANSI join logic
- Not really a "mixed" syntax join
ANSI Correlated Subquery Issues
    SELECT *
      FROM dept d
     INNER JOIN locations2 l
     USING ( loc )
     WHERE EXISTS
           (
             SELECT NULL
               FROM emp e
              WHERE e.loc = l.loc
           )

    ORA-25154: column part of USING clause cannot have qualifier
Notes:
- "NATURAL" joins have the same issue
- Don't be tempted to remove the table prefix "l": scope of reference dictates that the closest column be used
ANSI Outer Join Subtleties
    SELECT d.deptno, e.ename, e.job
      FROM dept d
      LEFT JOIN emp e
        ON (    e.deptno = d.deptno
            AND e.job = 'SALESMAN');

    DEPTNO  ENAME   JOB
    ======  ======  ========
        30  ALLEN   SALESMAN
        30  WARD    SALESMAN
        30  MARTIN  SALESMAN
        30  TURNER  SALESMAN
        50
        40
        20
        10
Notes:
- Filter is applied before the join is executed
- A number of rows are returned that have "OUTER" joined (NULL) emp data
- Only dept 30, Sales, has SALESMAN as a job
ANSI Outer Join Subtleties
    SELECT d.deptno, e.ename, e.job
      FROM dept d
      LEFT JOIN emp e
        ON (e.deptno = d.deptno)
     WHERE e.job = 'SALESMAN';

    DEPTNO  ENAME   JOB
    ======  ======  ========
        30  ALLEN   SALESMAN
        30  WARD    SALESMAN
        30  MARTIN  SALESMAN
        30  TURNER  SALESMAN
Notes:
- Filter is applied after the join is executed
- No outer joined data
- The condition has effectively converted this OUTER JOIN query to an INNER JOIN
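The ON versus WHERE distinction from the last two slides can be checked side by side. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Oracle (an assumption); the data is a cut-down, hypothetical version of the scott tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER, dname TEXT);
    CREATE TABLE emp  (ename TEXT, job TEXT, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH'), (30, 'SALES');
    INSERT INTO emp  VALUES ('ALLEN', 'SALESMAN', 30), ('WARD', 'SALESMAN', 30),
                            ('SMITH', 'CLERK', 20), ('CLARK', 'MANAGER', 10);
""")

# Filter inside ON: it restricts which emp rows can match, dept rows all survive.
filter_in_on = """
    SELECT d.deptno, e.ename, e.job
      FROM dept d
      LEFT JOIN emp e
        ON (e.deptno = d.deptno AND e.job = 'SALESMAN')
     ORDER BY d.deptno, e.ename
"""

# Filter in WHERE: it runs after the join and discards the NULL-extended rows.
filter_in_where = """
    SELECT d.deptno, e.ename, e.job
      FROM dept d
      LEFT JOIN emp e
        ON (e.deptno = d.deptno)
     WHERE e.job = 'SALESMAN'
     ORDER BY d.deptno, e.ename
"""

on_rows = conn.execute(filter_in_on).fetchall()
where_rows = conn.execute(filter_in_where).fetchall()

print(on_rows)
print(where_rows)
```

The first query keeps departments 10 and 20 with NULL employee data; the second returns only the department 30 salesmen, exactly as an inner join would.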
Mixed Traditional/ANSI Join
(Presentation bonus: this section is not in the conference proceedings.)
• Mixed Syntax Join
    SELECT emp_id,
           ename,
           dname
      FROM emp e
     INNER JOIN dept d USING (deptno),
           dual
     WHERE deptno = 10;
Notes:
- Oracle will run this without throwing an error
- Ugh… Choose one or the other, but not both please!
Mixed Syntax Join
    scott@VOTER> SELECT empno,
                        ename,
                        dname,
                        dummy
                   FROM dual,
                        emp e
                  INNER JOIN dept d ON (d.deptno = e.deptno)
                  WHERE e.deptno = 10;

         EMPNO ENAME      DNAME          D
    ---------- ---------- -------------- -
          7782 CLARK      ACCOUNTING     X
          7839 KING       ACCOUNTING     X
          7934 MILLER     ACCOUNTING     X

    3 rows selected.
Notes:
- Oracle will run this without throwing an error
- Multiple traditional and ANSI sections are allowed
Mixed Syntax Join
    scott@VOTER> SELECT empno,
      2         ename,
      3         dname,
      4         dummy
      5    FROM emp e
      6         dual,
      7   INNER JOIN dept d
      8      ON ( d.deptno = e.deptno )
      9   WHERE e.deptno = 10;
           dual,
           *
    ERROR at line 6:
    ORA-00933: SQL command not properly ended
Notes:
- Traditional syntax mixed in between ANSI joins
Mixed Syntax Join
    SELECT empno,
           ename,
           dname,
           dummy
      FROM dual,
           emp e
     INNER JOIN dept d on d.deptno = e.deptno
       and dummy IS NOT NULL
     WHERE e.deptno = 10;

       and dummy IS NOT NULL
           *
    ERROR at line 8:
    ORA-00904: "DUMMY": invalid identifier
Notes:
- Presence of both a comma (traditional) and JOIN makes this a mixed syntax join
- ANSI joins can only see other tables taking part in the ANSI join
ANSI vs Traditional Join Analysis
• Impact areas
  - Code Clarity
    · Readability
    · Join Errors
  - Flexibility
  - Ease of Use
    · Developer Training
    · DBA Training
  - Legacy Code
  - Standards
Code Clarity – Readability (Traditional)
SELECT /*+ qb_name(orig) */
fdla.dim_borrower_v_id dim_borrower_v_id
FROM dim_as_of_date_vw daod,
dim_daily_loan_applctn_detl ddlad,
dim_disbursement_date_vw dddv,
dim_loan_originator dlo,
fact_daily_loan_application fdla,
dim_loan_applctn_status_vw dlasv
WHERE
daod.dim_as_of_date_v_id = ddlad.dim_as_of_date_v_id
AND daod.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id
AND ddlad.dim_daily_loan_applctn_detl_id =fdla.dim_daily_loan_applctn_detl_id
AND ddlad.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id
AND dddv.dim_disbursement_date_v_id = fdla.dim_disbursement_date_v_id
AND dlo.dim_loan_originator_id = fdla.dim_loan_originator_id
AND dlasv.DIM_LOAN_APPLCTN_STATUS_V_ID = fdla.DIM_LOAN_APPLCTN_STATUS_V_ID
AND NOT (dlasv.STATUS_CODE BETWEEN '700' AND '740')
AND NOT (dlasv.status_code BETWEEN '000' AND '429')
AND daod.as_of_calendar_date = (CASE WHEN &in_DATE_SLICE IS NULL THEN
LAST_DAY (ADD_MONTHS (TRUNC(SYSDATE), -1)) +
&c_DEFAULT_SLICE_OFFESET
ELSE TO_DATE( &in_DATE_SLICE, &c_DATE_FORMAT )
END)
AND dddv.disburse_date BETWEEN
TRUNC(NVL(TO_DATE(&in_START_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1)), 'MM')
AND TRUNC (LAST_DAY
(NVL(TO_DATE(&in_END_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1))))
AND ddlad.loan_transfer_status_code != 'T'
Notes:
- Can you quickly determine how the tables are joined?
- This is a real join we've implemented as part of a recent project
Code Clarity – Readability (ANSI)
SELECT /*+ qb_name(orig) */
fdla.dim_borrower_v_id dim_borrower_v_id
FROM
dim_as_of_date_vw daod
INNER JOIN fact_daily_loan_application fdla
ON (daod.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id)
INNER JOIN dim_daily_loan_applctn_detl ddlad
ON (
ddlad.dim_as_of_date_v_id = daod.dim_as_of_date_v_id
AND ddlad.dim_daily_loan_applctn_detl_id =
fdla.dim_daily_loan_applctn_detl_id
AND ddlad.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id )
INNER JOIN dim_disbursement_date_vw dddv
ON (
dddv.dim_disbursement_date_v_id = fdla.dim_disbursement_date_v_id)
INNER JOIN dim_loan_originator dlo
ON (dlo.dim_loan_originator_id = fdla.dim_loan_originator_id)
INNER JOIN dim_loan_applctn_status_vw dlasv
ON (dlasv.dim_loan_applctn_status_v_id = fdla.dim_loan_applctn_status_v_id)
WHERE
NOT (dlasv.STATUS_CODE BETWEEN '700' AND '740')
AND NOT (dlasv.status_code BETWEEN '000' AND '429')
AND daod.as_of_calendar_date = (CASE WHEN &in_DATE_SLICE IS NULL THEN
LAST_DAY (ADD_MONTHS (TRUNC(SYSDATE), -1)) +
&c_DEFAULT_SLICE_OFFESET
ELSE TO_DATE( &in_DATE_SLICE, &c_DATE_FORMAT )
END)
AND dddv.disburse_date BETWEEN
TRUNC(NVL(TO_DATE(&in_START_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1)), 'MM')
AND TRUNC (LAST_DAY
(NVL(TO_DATE(&in_END_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1))))
AND ddlad.loan_transfer_status_code != 'T'
Notes:
- Table join conditions are easily identified
Code Clarity – Join Errors
• Traditional Syntax
    SELECT col1,
           col2,
           ...
      FROM tab1 t1,
           tab2 t2,
           ...
     WHERE ...
Notes:
- Is a table join condition missing? How do you know?
Code Clarity – Join Errors
• ANSI Syntax
    SELECT col1,
           col2,
           ...
      FROM tab1 t1
      [join type] tab2 t2
      [join condition]
      ...
     WHERE ...
Notes:
- [join type]: must identify the join type (INNER, OUTER, FULL, CROSS)
- [join condition]: identifies the join condition, USING or ON
- "Impossible" to do an inadvertent CROSS join
Flexibility – (+) Restrictions not present in ANSI
• You cannot specify the (+) operator in a query block that also contains FROM clause join syntax.
• The (+) operator can appear only in the WHERE clause or, in the context of left-correlation (that is, when specifying the TABLE clause) in the FROM clause, and can be applied only to a column of a table or view.
• If A and B are joined by multiple join conditions, then you must use the (+) operator in all of these conditions. If you do not, then Oracle Database will return only the rows resulting from a simple join, but without a warning or error to advise you that you do not have the results of an outer join.
• The (+) operator does not produce an outer join if you specify one table in the outer query and the other table in an inner query.
• You cannot use the (+) operator to outer-join a table to itself, although self joins are valid. For example, the following statement is not valid:
    SELECT employee_id, manager_id
      FROM employees
     WHERE employees.manager_id(+) = employees.employee_id;
  However, the following self join is valid:
    SELECT e1.employee_id, e1.manager_id, e2.employee_id
      FROM employees e1, employees e2
     WHERE e1.manager_id(+) = e2.employee_id;
• The (+) operator can be applied only to a column, not to an arbitrary expression. However, an arbitrary expression can contain one or more columns marked with the (+) operator.
• A WHERE condition containing the (+) operator cannot be combined with another condition using the OR logical operator.
• A WHERE condition cannot use the IN comparison condition to compare a column marked with the (+) operator with an expression.
• A WHERE condition cannot compare any column marked with the (+) operator with a subquery.
Notes:
- "Oracle strongly recommends that you use the more flexible FROM clause (ANSI) join syntax"
Ease of Use/Developer & DBA Training
• Traditional Join
  - Long-time Oracle developers do nothing
  - New Oracle developers need to learn Oracle syntax
• ANSI Join
  - Works with SQL Server/Oracle/MySQL/…
  - Syntax is more readable/self-documenting
  - Natural join is "un-natural": it can lead to errors
Installed Code Base
• ANSI joins not present in the Oracle installed code base
• ANSI joins present in other RDBMS installed code
  - More of these databases coming all the time
• Harm in having two join syntaxes
  - Support personnel have to be comfortable with both syntaxes
  - Additional training required
Fishbone Diagram
[Figure: fishbone diagram. Branches: "Only Allow Traditional Join Syntax" and "Allow ANSI Join Syntax". Outcome: "Error Free SQL".]
Recommendations
• Allow both
• Provide training so that all are familiar with both
• Place some restrictions on ANSI syntax to prevent problems
  - Do not allow NATURAL joins and possibly the USING clause
• Single SQL statements should use one or the other but not both
  - New development should try to use the same syntax throughout
• Long-term goal: ANSI only
Session Goals
• Familiarity with ANSI join syntax
• Understanding the merits of the ANSI join syntax
• Intention to start using the ANSI syntax
Questions?
Thank You
• Please complete the evaluation forms
• My Name: Rumpi Gravenstein
• Session Title: To ANSI or Not To ANSI
• Session #: 420
• If you have additional questions, I can be reached at rgravenstein@amtrust.com