Enhancing ETL Performance

Copyright © 2009, Oracle. All rights reserved.

Objectives
After completing this lesson, you should be able to:
• Describe the performance enhancement methods that you can apply in ETL mapping design
• Describe configuration of performance parameters in schema design
• Describe the process of gathering statistics in Warehouse Builder

Lesson Agenda
• Performance parameters and strategies
• Performance parameters in ETL design:
  – Operating modes
  – Commit control
  – Partition exchange loading
• Performance parameters in schema design:
  – Indexes and partitions
  – Enabling parallel access
• Error logging and gathering statistics

Devise a Performance Strategy Early
• Involve a data architect (if you have one) prior to Warehouse Builder implementation to determine a performance strategy.
• A performance strategy influences:
  – Technology choices
  – Refresh frequency
  – Whether to use staging tables, partition exchange loading (PEL), or other technologies
  – How and when to collect data
  – How to organize source data
Have the data architect think about performance first!

Performance Tuning at Various Levels
You can tune performance at various levels of data warehouse implementation:
• ETL mapping design
• Schema design
• Possible performance bottlenecks in:
  – Hardware
  – Operating system
  – Network
  – Database
  – Application

ETL Design: Mappings
• Extract, transform, and load (ETL) involves the movement and transformation of data from your sources to your targets.
• Oracle Warehouse Builder (OWB) mappings specify which source data objects provide data to which target data objects.
• A well-designed ETL mapping can make all the difference in the performance of your data warehouse.
• Consider the cost of using function calls.
• Minimize context switching.
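
The cost of per-row function calls mentioned above can be illustrated with a small sketch. It is not Warehouse Builder or PL/SQL code: it assumes Python's standard sqlite3 module as a stand-in database, and the table src, the function custom_upper, and the call counter are hypothetical names chosen for the illustration. The point is only that a custom function is entered once for every row, whereas a built-in expression stays inside the SQL engine.

```python
import sqlite3

calls = {"n": 0}  # counts how often the engine leaves SQL to run user code

def to_upper(value):
    # Hypothetical custom transformation, invoked once per row.
    calls["n"] += 1
    return value.upper()

conn = sqlite3.connect(":memory:")
conn.create_function("custom_upper", 1, to_upper)
conn.execute("CREATE TABLE src (name TEXT)")
conn.executemany(
    "INSERT INTO src VALUES (?)",
    [(f"customer{i}",) for i in range(1000)],
)

# Custom function: every row crosses from the SQL engine into user code.
via_function = conn.execute(
    "SELECT custom_upper(name) FROM src ORDER BY rowid"
).fetchall()

# Built-in expression: the transformation stays inside the SQL engine.
via_builtin = conn.execute(
    "SELECT upper(name) FROM src ORDER BY rowid"
).fetchall()

print("rows:", len(via_builtin), "custom function calls:", calls["n"])
```

Both queries return the same data; the difference is the extra call per row, which is the overhead that a mapping design should try to keep low.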

Performance-Related Parameters in ETL Design
When creating ETL logic, you need to keep in mind the run-time performance expectations.
• Assess your mappings on the following parameters:
  – Operating modes: “Set based” or “Row based”
  – Commit Control
  – Enable Parallel DML
  – Enable Partition Exchange Loading for targets

Pros and Cons of Operating Modes
Set based
• Processes rows as one transaction
• Fastest way
• Offers DML error logging for rows with problems
Row based
• Processes data row by row
• Continues processing despite an error for a row
• Logs information for each row
[Figure: source-to-target data flow for set-based processing and for row-based processing]

Configuring Mappings for Operating Modes
In the default “Set based fail over to row based” mode, Warehouse Builder uses the “Set based” operating mode, but switches to the slower “Row based” mode if data errors are encountered.

DML Error Logging
• To enable DML error logging, specify the name of the error table by using the DML Error table name property.
• The Error table name property is used for “logical errors,” which include the orphan management errors and the data rule violation errors.
• The DML error table format and the logical error table format are not the same, so the two kinds of errors cannot share one table.

Commit Control
• Commit Control: The default is Automatic. With this setting, OWB commits automatically when it senses the need, based on algorithms in the code.
• Commit frequency: Specifies how often data is committed. The default is 1000 rows.

Setting Default Audit Level
Use “Default audit level” to indicate the audit level used when executing the package. You can set it to NONE, ERROR DETAILS, STATISTICS, or COMPLETE.
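
The operating modes and the commit setting described above can be sketched in a few lines. This is an illustration only, not the code that Warehouse Builder generates: it assumes Python's standard sqlite3 module as a stand-in database, and the tables src and tgt, the function names, and the commit_frequency argument are hypothetical. It mimics the “Set based fail over to row based” behavior: try one set-based statement first, and fall back to a row-by-row load that skips and records bad rows and commits every commit_frequency loaded rows.

```python
import sqlite3

def load_set_based(conn):
    # One INSERT ... SELECT in one transaction: a single bad row fails the lot.
    with conn:
        conn.execute("INSERT INTO tgt (id, amount) SELECT id, amount FROM src")

def load_row_based(conn, commit_frequency=1000):
    # Row by row: a failing row is recorded and skipped, the load continues.
    loaded, errors = 0, []
    for row_id, amount in conn.execute("SELECT id, amount FROM src").fetchall():
        try:
            conn.execute(
                "INSERT INTO tgt (id, amount) VALUES (?, ?)", (row_id, amount)
            )
            loaded += 1
            if loaded % commit_frequency == 0:
                conn.commit()  # commit after every commit_frequency loaded rows
        except sqlite3.IntegrityError as exc:
            errors.append((row_id, str(exc)))
    conn.commit()  # commit whatever remains
    return loaded, errors

def load_with_failover(conn, commit_frequency=1000):
    # Try the fast set-based path; switch to row based on a data error.
    try:
        load_set_based(conn)
        count = conn.execute("SELECT COUNT(*) FROM src").fetchone()[0]
        return "set based", count, []
    except sqlite3.IntegrityError:
        loaded, errors = load_row_based(conn, commit_frequency)
        return "row based", loaded, errors

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id INTEGER, amount REAL)")
conn.execute("CREATE TABLE tgt (id INTEGER PRIMARY KEY, amount REAL NOT NULL)")
# The second source row violates the NOT NULL constraint on the target.
conn.executemany("INSERT INTO src VALUES (?, ?)", [(1, 10.0), (2, None), (3, 30.0)])
conn.commit()

mode, loaded, errors = load_with_failover(conn, commit_frequency=2)
print(mode, loaded, errors)
```

With the sample data, the set-based attempt is rolled back because of the one bad row, and the row-based pass then loads the two good rows and records the rejected one. A lower commit frequency means more commits and usually a slower load, which is the trade-off the commit properties control.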

Additional Run-Time Parameters for Mappings
• Warehouse Builder uses the “Bulk size” parameter only when the “Bulk processing code” option is selected and the operating mode is set to “Row based.”
• Use “Maximum number of errors” to indicate the maximum number of errors allowed when executing the package.

Enable Partition Exchange Loading (PEL) for Targets
Partition Exchange Loading loads new data by exchanging it into a target table as a partition. To enable PEL for a mapping, set the PEL Enabled property for the target to true.

Best Practices Tips: ETL Design Factors That Impact Performance
• Custom transformation impact
  – Inefficient context switch to the PL/SQL engine, which must interpret and execute the function for every row
• Loading type impact
  – Insert/Update and Update/Insert should be used when appropriate.
• External tables versus SQL*Loader
  – External tables can be queried as if they were read-only tables.
  – Enabling parallel access for the external tables can activate parallelism and enhance the performance.

Quiz
In Warehouse Builder, you can set the “Default audit level” to indicate how detailed the audit information generation will be.
a. True
b. False

Quiz
From the following, select other performance parameters you should consider for better ETL performance:
a. Commit Control
b. Use external tables versus SQL*Loader
c. Minimize context switches
d. Use Partition Exchange Loading

Performance-Related Parameters in Schema Design
Warehouse Builder provides schema design capabilities for:
• Index management
• Partitions
• Enabling parallel access
• Configuration of physical implementation properties on objects
  – Tablespace
  – Constraints management

Benefits of Using Schema Design Capabilities in Warehouse Builder Design Center
• When designing database objects such as tables in Design Center, you need not switch to the SQL*Plus environment.
  – You need not write SQL commands to define partitions, indexes, constraints, and so on.
  – While generating code, Warehouse Builder automatically incorporates the schema design settings.
• Schema design is integrated with logical design, exposing full editing capabilities within the editors.
• Warehouse Builder facilitates better schema design by allowing detailed configuration of physical storage and sizing properties of the database objects per configuration.

Indexing
To create an index, you use the Indexes tab in the Table Editor. You can specify the type, key columns, partitions, values, and local or global scope.

Configuring Properties of Indexes
To configure an index, open the configuration editor for the associated table. The tool-tip text shows each property's description.

Index Performance Considerations: Drop Indexes Before the Load Process
• Drop the indexes of the target object before the loading process and re-create the indexes after the load is completed.
  – This significantly improves performance because indexes will not have to be maintained during the load.
• To achieve this, in load mappings:
  – Use a Pre-mapping process operator that will invoke a procedure to drop the indexes before the loading starts.
  – Use a Post-mapping process operator that will invoke a procedure to re-create the indexes after the loading is complete.

Index Performance Considerations: Drop Indexes Before the Load Process (Example Mapping)
[Figure: example mapping]

Constraints Management
Constraints checking may slow load performance.
• Enable Constraints: Slower data load because constraints are checked for each row
• Exceptions Table Name property: Row IDs of records that do not conform will be logged into this table.

Configuring Constraints Individually
You can also configure constraints individually in the Configuration Properties window of the owning object.
[Figure: configuring a foreign key constraint]

Defining Partitions in Warehouse Builder
• You can define partitions in the Partitions tab of the Table Editor. Different types of partitions are supported.
• Set the partition tablespace parameters in the configuration properties of the partitioned table.

Parallelism
• You can enable parallel access for tables in their configuration properties.
• You can set the Parallel Access mode to PARALLEL or NOPARALLEL. The default is PARALLEL.
• You can specify the Parallel Degree, which is the number of parallel threads used in the parallel operation.

Enable Parallel DML
If you select this option, Warehouse Builder enables Parallel DML at run time. The objects involved in the mapping should be enabled for parallelism to take advantage of this option.
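
The drop-and-re-create index pattern described earlier in this lesson can be sketched as follows. This is a sketch under stated assumptions, not Warehouse Builder output: it uses Python's standard sqlite3 module as a stand-in database, and the table sales, the index sales_cust_idx, and the functions pre_mapping, post_mapping, and run_load are hypothetical names that play the roles of the Pre-mapping and Post-mapping process operators.

```python
import sqlite3

def pre_mapping(conn):
    # Hypothetical pre-mapping procedure: drop the target index before the load.
    conn.execute("DROP INDEX IF EXISTS sales_cust_idx")

def post_mapping(conn):
    # Hypothetical post-mapping procedure: re-create the index after the load.
    conn.execute("CREATE INDEX sales_cust_idx ON sales (customer_id)")

def run_load(conn, rows):
    pre_mapping(conn)
    try:
        # The load runs without index maintenance on every inserted row.
        with conn:
            conn.executemany(
                "INSERT INTO sales (customer_id, amount) VALUES (?, ?)", rows
            )
    finally:
        # Re-create the index even if the load fails, so queries stay indexed.
        post_mapping(conn)

def index_names(conn):
    query = "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'sales'"
    return [name for (name,) in conn.execute(query)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer_id INTEGER, amount REAL)")
conn.execute("CREATE INDEX sales_cust_idx ON sales (customer_id)")

run_load(conn, [(i % 100, float(i)) for i in range(5000)])
print(index_names(conn))
```

Re-creating the index in a finally clause is a design choice of this sketch: it keeps the target usable for queries even when the load itself fails.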

Setting Tablespace Properties
You can set the tablespace properties at:
• User level
• Module level
• Object level

Minimum Error Logging
• Set the LOGGING MODE property to NOLOGGING when logging is not needed.
• Set the default audit level property for mappings appropriately.

Gathering Statistics
If you want the modification statistics to be collected on a table, set the Statistics Collection property to MONITORING in the Configuration Properties window.
[Figure: configuration properties of a table]

Analyze Table Statements Property
• Set the “Analyze table statements” property for a mapping.
• By default, the “Analyze table statements” property is not enabled.

Gathering Schema Statistics
In Design Center, select Tools > Preferences. In the Preferences dialog box, expand OWB, select Environment, and then check “Allow Optimize Repository Warning on Startup” to update the schema statistics every time you log in to Warehouse Builder.

Quiz
When you enable parallel DML for a mapping, you must also enable the source and target objects for parallel access.
a. True
b. False

Summary
In this lesson, you should have learned how to:
• Describe the performance enhancement methods that you can apply in ETL mapping design
• Describe configuration of performance parameters in schema design
• Describe the process of gathering statistics in Warehouse Builder

Practice 11-1 Overview: Configuring Performance Parameters for Mappings and Tables
In this practice, you examine the following:
• The performance, in terms of elapsed time, of a mapping when executed in two different operating modes (Row based and Set based)
• Configuration properties of mappings
• Indexes, constraints, and partitions on a table (you also configure these)