To ANSI or Not To ANSI Gravenstein, Costello, Maurer Amtrust Bank Session #420 Speaker Qualifications Rumpi Gravenstein, Application Developer, Senior, Amtrust Bank Newsletter Editor, North East Ohio Oracle Users Group Has been working with Oracle since 1988 Past Presenter at meetings of: Independent Oracle Users Group North East Ohio Oracle Users Group Agenda Brief History Review Join Technologies Analysis Recommendation ANSI/Oracle Support History ANSI here refers to SQL/99 Join Syntax The standard to which all RDBMS vendors strive to comply SQL/99 support started with Oracle 9i in 2001 Two years after the release of the standard, Oracle supports it. This presentation restricted to the Oracle implementation of the ANSI standard 3 Join Condition Types Equijoin Columns with the same name Columns with different names Outerjoin Left (left driving table) Right (right driving table) Full (both tables driving) Cross/Cartesian product Traditional Equijoin - Same Name Traditional Oracle Approach SELECT e.ename AS Employee_name, d.deptno, d.dname AS Department_name FROM emp e, dept d WHERE e.deptno = d.deptno; Table prefix is required to remove ambiguity on common columns Join conditions must be listed ANSI Equijoin Natural Syntax ANSI SQL Natural Join SELECT ename AS employee_name, deptno, dname AS department_name FROM emp NATURAL JOIN dept; God forbid that you accidentally add a non-join column to both tables…(um.. audit columns...) No table prefix if column is part of join condition. No commas between tables. Join columns implied, based on columns that have the same name ANSI Equijoin Using Syntax ANSI SQL Join USING Table prefix allowed on columns that are not part of the using clause (join condition) SELECT d.dname, e.ename FROM emp d JOIN dept d USING ( deptno ); Add additional join columns using ( deptno, join_col2, join_col3, …) Several columns share same name, only joining on some of them, in this case deptno Traditional Equijoin Syntax Traditional Join, columns different SELECT d.department_name, l.city FROM departments d, locations l WHERE d.location_id = l.id; Join column names are different ANSI On Equijoin Syntax ANSI SQL ON SELECT d.department_name, l.city FROM departments d JOIN locations l ON ( d.location_id = l.id ); Use ON when join column names are different List join conditions here like traditional syntax ANSI Equijoin Syntax ANSI SQL Multi Table On Bring in first table join SELECT e.empno, l.loc_id, No commas between tables d.dname, l.state_tx FROM locations l JOIN dept d ON ( d.location_id = l.id ) JOIN emp e ON ( d.deptno = e.deptno ); Bring in second table join Any prior table column is visible – joins from left to right ANSI Equijoin Syntax ANSI SQL INNER SELECT e.emp_id, l.city, ON clause requires reference to join columns by table name to resolve d.dept_name, ambiguity d.deptno FROM locations l INNER JOIN dept d ON ( d.location_id = l.id ) INNER JOIN emp e ON d.deptno = e.deptno; INNER – an optional keyword stating this is an equijoin (not an outer or cross join) Parenthesis are optional, we like to include them for clarity Traditional Outerjoin Syntax Traditional Outer Join SELECT e.ename, d.dname FROM emp e, dept d WHERE e.deptno (+) = d.deptno NULL in name if no employees in the department. Traditional Outer Join Notation (+) indicator denotes expand records on this side if needed ANSI Outerjoin Syntax Left Outer Join NULL in last name if no employees in the department. SELECT e.ename, d.dname OUTER keyword is optional. FROM dept d LEFT OUTER JOIN emp e ON (e.deptno = d.deptno); LEFT denotes that the dominant table is to the left (dept) and that all of it’s rows will be returned. The right table is expanded with NULL records ANSI Outerjoin Syntax Left Outer Join USING e.g. LEFT JOIN emp e USING (deptno) can be used in an INNER and OUTER join. SELECT e.ename, d.dname FROM dept d NATURAL LEFT JOIN emp e; We don’t recommend using it here either! NATURAL can be used in an INNER and OUTER join. ANSI Outerjoin Syntax Right Outer Join NULL in last name if no employees in the department. SELECT e.ename, d.dname FROM emp e RIGHT OUTER JOIN dept d ON (e.deptno = d.deptno); RIGHT OUTER denotes that the dominant table is to the right. The left table gets expanded with NULLS. Traditional Outerjoin Syntax -Close Full Outer Join Can only be represented with a “UNION” query. SELECT e.ename, d.dname FROM emp e, dept d WHERE e.deptno (+) = d.deptno UNION SELECT e.ename, d.dname FROM emp e, dept d WHERE e.deptno = d.deptno (+) This “full” join syntax can be found on internet as “true” full join Shouldn’t see many of these (We’ve never needed one) UNION ALL is incorrect as it results in duplicate rows UNION performs an implicit “DISTINCT” on result – possibly removing desired rows Traditional Full Close-Execution scott@VOTER> SELECT e.ename, 2 d.dname 3 FROM emp e, dept d 4 WHERE e.deptno (+) = d.deptno 5 UNION 6 SELECT e.ename, 7 d.dname 8 FROM emp e, dept d 9 WHERE e.deptno = d.deptno (+) 10 ; 16 rows selected. Plan reflects the implicit DISTINCT inherent in UNION clause Notice Cost 15 Execution Plan ---------------------------------------------------------0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=15 Card=28 Bytes=588) 1 0 SORT (UNIQUE) (Cost=15 Card=28 Bytes=588) 2 1 UNION-ALL 3 2 HASH JOIN (OUTER) (Cost=7 Card=14 Bytes=294) 4 3 TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60) 5 3 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=126) 6 2 HASH JOIN (OUTER) (Cost=7 Card=14 Bytes=294) 7 6 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=126) 8 6 TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60) Traditional Full Join - True SELECT e.ename, d.dname FROM emp e, dept d WHERE e.deptno = d.deptno(+) UNION ALL SELECT NULL, d.dname FROM dept d WHERE NOT EXISTS (SELECT 1 FROM emp e WHERE e.deptno = d.deptno) Outer join emp table with the dept table – all emp rows now returned Use UNION ALL to avoid implicit DISTINCT inherent in UNION Add Dept rows that have no match from emp This syntax can be found on internet at http://optimizermagic.blogspot.com under post on “Outerjoins in Oracle” Traditional Full Join True-Execution scott@VOTER> SELECT e.ename, d.dname 2 FROM emp e, dept d 3 WHERE e.deptno = d.deptno(+) 4 UNION ALL 5 SELECT NULL, d.dname 6 FROM dept d 7 WHERE NOT EXISTS 8 (SELECT 1 9 FROM emp e 10 WHERE e.deptno = d.deptno); 16 rows selected. Explain plan for “True” traditional outer join – SORT (UNIQUE) missing Cost 13 here, prior plan had cost of 15 Execution Plan ---------------------------------------------------------0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=13 Card=16 Bytes=324) 1 0 UNION-ALL 2 1 HASH JOIN (OUTER) (Cost=7 Card=14 Bytes=294) 3 2 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=126) 4 2 TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60) 5 1 HASH JOIN (ANTI) (Cost=7 Card=2 Bytes=30) 6 5 TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60) 7 5 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=42) ANSI Outerjoin Syntax Full Outer Join NULL in last name if no employees in the department. SELECT e.ename, d.dname NULL in department name if employee not in a department. FROM emp e FULL OUTER JOIN dept d ON (e.deptno = d.deptno); OUTER is optional FULL OUTER denotes that the table to the right AND the table to the left will have all their records returned ANSI Full Execution Plan scott@VOTER> SELECT e.ename, 2 d.dname 3 FROM emp e 4 FULL OUTER JOIN dept d 5 ON (e.deptno = d.deptno); 16 rows selected. Cost 13 = Cost of Traditional True Execution Plan ---------------------------------------------------------0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=13 Card=16 Bytes=256) 1 0 VIEW (Cost=13 Card=16 Bytes=256) 2 1 UNION-ALL 3 2 HASH JOIN (OUTER) (Cost=7 Card=14 Bytes=294) 4 3 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=126) 5 3 TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60) 6 2 HASH JOIN (ANTI) (Cost=7 Card=2 Bytes=30) 7 6 TABLE ACCESS (FULL) OF 'DEPT' (TABLE) (Cost=3 Card=5 Bytes=60) 8 6 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Bytes=42) Traditional Cross Join/Cross Product Cross Join SELECT emp_id, ename, dname Inadvertent cross join not obvious if FROM emp e, joined table has a single row dept d WHERE d.deptno = 10; No join condition between tables… um, normally not good. ANSI Cross Join/Cross Product Cross Join SELECT emp_id, ename, dname FROM emp e CROSS JOIN dept d WHERE d.deptno = 10; Explicit CROSS condition, impossible to do this accidentally. ANSI Correlated Join Correlated Query Syntax Join ANSI doesn’t allow join clause on first table SELECT empno, ename Only tables in “current” FROM emp e FROM clause visible to WHERE EXISTS ANSI join logic ( SELECT NULL FROM dept d INNER JOIN locations l ON ( l.loc_id = d.loc ) WHERE d.deptno = e.deptno ); Not really a “mixed” syntax join ANSI Correlated Subquery Issues SELECT * FROM dept d INNER JOIN locations2 l USING ( loc ) “NATURAL” joins have the same WHERE EXISTS issue ( Don’t be tempted to remove the SELECT NULL table prefix “l” FROM emp e WHERE e.loc = l.loc ) Scope of reference dictates that the closest column be used ORA-25154: column part of USING clause cannot have qualifier ANSI Outer Join Subtleties SELECT d.deptno, e.ename, e.job FROM dept d LEFT JOIN emp e ON ( e.deptno = d.deptno AND e.job = 'SALESMAN'); DEPTNO ====== 30 30 30 30 50 40 20 10 ENAME ====== ALLEN WARD MARTIN TURNER JOB ======== SALESMAN SALESMAN SALESMAN SALESMAN Filter is applied before join is executed A number of rows returned that have “OUTER” joined emp data Only dept 30, Sales, has SALESMAN as a job. ANSI Outer Join Subtleties SELECT d.deptno, e.ename, e.job FROM dept d LEFT JOIN emp e ON ( e.deptno = d.deptno) WHERE e.job = 'SALESMAN'; DEPTNO ====== 30 30 30 30 ENAME ====== ALLEN WARD MARTIN TURNER JOB ======== SALESMAN SALESMAN SALESMAN SALESMAN Filter is applied after join is executed No outer joined data The condition has effectively converted this OUTER JOIN query to an INNER JOIN Mixed Traditional/ANSI Join Mixed Syntax Join Presentation Bonus – This section not in conference proceedings. SELECT emp_id, Oracle will run this without ename, throwing an error dname FROM emp e INNER JOIN dept d USING (deptno), dual WHERE deptno = 10; Uhg…. Choose one or the other, but not both please! Mixed Syntax Join scott@VOTER> SELECT empno, Oracle will run this without 2 ename, throwing an error 3 dname, 4 dummy 5 FROM dual, 6 emp e 7 INNER JOIN dept d ON (d.deptno = e.deptno) 8 WHERE e.deptno = 10; EMPNO ---------7782 7839 7934 ENAME ---------CLARK KING MILLER 3 rows selected. DNAME -------------ACCOUNTING ACCOUNTING ACCOUNTING D X X X Multiple traditional and ANSI sections are allowed Mixed Syntax Join scott@VOTER> SELECT empno, Traditional mixed in between 2 ename, ANSI 3 dname, 4 dummy 5 FROM emp e 6 dual, 7 INNER JOIN dept d 8 ON ( d.deptno = e.deptno ) 9 WHERE e.deptno = 10; dual, * ERROR at line 6: ORA-00933: SQL command not properly ended Mixed Syntax Join Presence of both comma (Traditional) and JOIN Mixed syntax join SELECT empno, ename, dname, dummy FROM dual, emp e INNER JOIN dept d on and WHERE e.deptno = 10; and ANSI joins can only see other tables taking part in ANSI join d.deptno = e.deptno dummy IS NOT NULL dummy IS NOT NULL * ERROR at line 8: ORA-00904: "DUMMY": invalid identifier ANSI vs Traditional Join Analysis Impact areas Code Clarity Flexibility Ease of Use Readability Join Errors Developer Training DBA Training Legacy Code Standards Code Clarity – Readability (Traditional) SELECT /*+ qb_name(orig) */ fdla.dim_borrower_v_id dim_borrower_v_id FROM dim_as_of_date_vw daod, dim_daily_loan_applctn_detl ddlad, dim_disbursement_date_vw dddv, dim_loan_originator dlo, fact_daily_loan_application fdla, dim_loan_applctn_status_vw dlasv WHERE daod.dim_as_of_date_v_id = ddlad.dim_as_of_date_v_id AND daod.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id AND ddlad.dim_daily_loan_applctn_detl_id =fdla.dim_daily_loan_applctn_detl_id AND ddlad.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id AND dddv.dim_disbursement_date_v_id = fdla.dim_disbursement_date_v_id AND dlo.dim_loan_originator_id = fdla.dim_loan_originator_id AND dlasv.DIM_LOAN_APPLCTN_STATUS_V_ID = fdla.DIM_LOAN_APPLCTN_STATUS_V_ID AND NOT (dlasv.STATUS_CODE BETWEEN '700' AND '740') AND NOT (dlasv.status_code BETWEEN '000' AND '429') AND daod.as_of_calendar_date = (CASE WHEN &in_DATE_SLICE IS NULL THEN LAST_DAY (ADD_MONTHS (TRUNC(SYSDATE), -1)) + &c_DEFAULT_SLICE_OFFESET ELSE TO_DATE( &in_DATE_SLICE, &c_DATE_FORMAT ) END) AND dddv.disburse_date BETWEEN TRUNC(NVL(TO_DATE(&in_START_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1)), 'MM') AND TRUNC (LAST_DAY (NVL(TO_DATE(&in_END_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1)))) AND ddlad.loan_transfer_status_code != 'T' Can you quickly determine how tables are joined? This is a real join we’ve implemented as part of a recent project Code Clarity – Readability (ANSI) SELECT /*+ qb_name(orig) */ fdla.dim_borrower_v_id dim_borrower_v_id FROM dim_as_of_date_vw daod INNER JOIN fact_daily_loan_application fdla ON (daod.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id) INNER JOIN dim_daily_loan_applctn_detl ddlad ON ( ddlad.dim_as_of_date_v_id = daod.dim_as_of_date_v_id AND ddlad.dim_daily_loan_applctn_detl_id = fdla.dim_daily_loan_applctn_detl_id AND ddlad.dim_as_of_date_v_id = fdla.dim_as_of_date_v_id ) INNER JOIN dim_disbursement_date_vw dddv ON ( dddv.dim_disbursement_date_v_id = fdla.dim_disbursement_date_v_id) INNER JOIN dim_loan_originator dlo ON (dlo.dim_loan_originator_id = fdla.dim_loan_originator_id) INNER JOIN dim_loan_applctn_status_vw dlasv ON (dlasv.dim_loan_applctn_status_v_id = fdla.dim_loan_applctn_status_v_id) WHERE NOT (dlasv.STATUS_CODE BETWEEN '700' AND '740') AND NOT (dlasv.status_code BETWEEN '000' AND '429') AND daod.as_of_calendar_date = (CASE WHEN &in_DATE_SLICE IS NULL THEN LAST_DAY (ADD_MONTHS (TRUNC(SYSDATE), -1)) + &c_DEFAULT_SLICE_OFFESET ELSE TO_DATE( &in_DATE_SLICE, &c_DATE_FORMAT ) END) AND dddv.disburse_date BETWEEN TRUNC(NVL(TO_DATE(&in_START_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1)), 'MM') AND TRUNC (LAST_DAY (NVL(TO_DATE(&in_END_REPORT_MONTH,&c_DATE_FORMAT),ADD_MONTHS(SYSDATE, -1)))) AND ddlad.loan_transfer_status_code != 'T' Table Join Conditions Are Easily Identified Code Clarity – Join Errors Traditional Syntax SELECT col1, col2, ... FROM tab1 t1, tab2 t2, ... WHERE ... Is a table join condition missing? How do you know? Code Clarity – Join Errors ANSI Syntax must identify join type: INNER SELECT col1, OUTER FULL col2, CROSS ... FROM tab1 t1 [join type] tab2 t2 [join condition] ... WHERE ... “Impossible” to do inadvertent CROSS join Identifies join condition USING or ON Flexibility – (+) Restrictions not present in ANSI You cannot specify the (+) operator in a query block that also contains FROM clause join syntax. The (+) operator can appear only in the WHERE clause or, in the context of left-correlation (that is, when specifying the TABLE clause) in the FROM clause, and can be applied only to a column of a table or view. If A and B are joined by multiple join conditions, then you must use the (+) operator in all of these conditions. If you do not, then Oracle Database will return only the rows resulting from a simple join, but without a warning or error to advise you that you do not have the results of an outer join. The (+) operator does not produce an outer join if you specify one table in the outer query and the other table in an inner query. You cannot use the (+) operator to outer-join a table to itself, although self joins are valid. For example, the following statement is not valid: SELECT employee_id, manager_id FROM employees WHERE employees.manager_id(+) = employees.employee_id; However, the following self join is valid: SELECT e1.employee_id, e1.manager_id, e2.employee_id FROM employees e1, employees e2 WHERE e1.manager_id(+) = e2.employee_id; “Oracle strongly recommends that you use the more flexible FROM clause (ANSI) join syntax” The (+) operator can be applied only to a column, not to an arbitrary expression. However, an arbitrary expression can contain one or more columns marked with the (+) operator. A WHERE condition containing the (+) operator cannot be combined with another condition using the OR logical operator. A WHERE condition cannot use the IN comparison condition to compare a column marked with the (+) operator with an expression. A WHERE condition cannot compare any column marked with the (+) operator with a subquery. Ease of Use/Developer & DBA Training Traditional Join Long time Oracle developers do nothing New Oracle Developers need to learn Oracle syntax ANSI Join Works with SQL Server/Oracle/MySQL/… Syntax is more readable/self documenting Natural join is “un-natural” - can lead to errors Installed Code Base ANSI Joins not present in the Oracle installed code base ANSI Joins present in other RDBMS installed code More of these databases coming all the time Harm in having two join syntaxes Support personnel have to be comfortable with both syntaxes Additional training required Fishbone Diagram Only Allow Traditional Join Syntax Error Free SQL Allow ANSI Join Syntax Recommendations Allow both Provide training so that all are familiar with both Place some restrictions on ANSI syntax to prevent problems Do not allow NATURAL joins and possibly USING clause Single SQL statements should use one or the other but not both New development should try to use same syntax throughout Long term goal, ANSI only Session Goals Familiarity with ANSI Join Syntax Understanding the merits of the ANSI join syntax Intention to start using the ANSI syntax ? Questions Thank You Please complete the evaluation forms My Name: Rumpi Gravenstein Session Title: To ANSI or not to ANSI Session #: 420 If you have additional questions, I can be reached at rgravenstein@amtrust.com