Enhancing ETL Performance
Copyright © 2009, Oracle. All rights reserved.
Objectives
After completing this lesson, you should be able to:
• Describe the performance enhancement methods that you
can apply in ETL mapping design
• Describe configuration of performance parameters in
schema design
• Describe the process of gathering statistics in Warehouse
Builder
Lesson Agenda
• Performance parameters and strategies
• Performance parameters in ETL design:
– Operating modes
– Commit control
– Partition exchange loading
• Performance parameters in schema design:
– Indexes and partitions
– Enabling parallel access
• Error logging and gathering statistics
Devise a Performance Strategy Early
• Involve a data architect (if you have one) prior to
Warehouse Builder implementation to determine a
performance strategy.
• A performance strategy influences:
– Technology choices
– Refresh frequency
– Whether to use staging tables, partition exchange loading
(PEL), or other technologies
– How and when to collect data
– How to organize source data
Have the data architect think
about performance first!
Performance Tuning at Various Levels
You can tune performance at various levels of data warehouse
implementation:
• ETL mapping design
• Schema design
• Possible performance bottlenecks in:
– Hardware
– Operating system
– Network
– Database
– Application
ETL Design: Mappings
• Extract, transform, and load (ETL) involves the movement
and transformation of data from your sources to your
targets.
• OWB mappings specify which source data objects provide
data to which target data objects.
• A well-designed ETL mapping can make all the difference in
the performance of your data warehouse.
• Consider the cost of
using function calls.
• Minimize context
switching.
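The cost of a per-row function call can be sketched outside Warehouse Builder. The Python example below uses SQLite purely as a stand-in for the database. The function add_tax is a hypothetical custom transformation registered as a SQL function, so the engine must call out to Python once per row, much as a SQL statement switches to the PL/SQL engine for a custom function. The table and column names are assumptions made for illustration.

```python
import sqlite3
import time


def add_tax(amount):
    """Hypothetical custom transformation, evaluated outside the SQL engine."""
    return amount * 1.2


def build_source(conn, n_rows):
    """Create a source table holding n_rows numeric amounts."""
    conn.execute("CREATE TABLE src (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany(
        "INSERT INTO src (id, amount) VALUES (?, ?)",
        ((i, float(i)) for i in range(n_rows)),
    )
    conn.commit()


def total_with_function(conn):
    # The engine leaves its own execution loop once per row to call add_tax.
    return conn.execute("SELECT SUM(add_tax(amount)) FROM src").fetchone()[0]


def total_with_expression(conn):
    # The same logic as a plain SQL expression stays inside the engine.
    return conn.execute("SELECT SUM(amount * 1.2) FROM src").fetchone()[0]


def timed(fn, conn):
    """Return the result of fn(conn) and the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn(conn)
    return result, time.perf_counter() - start


conn = sqlite3.connect(":memory:")
conn.create_function("add_tax", 1, add_tax)
build_source(conn, 100_000)
for fn in (total_with_function, total_with_expression):
    result, elapsed = timed(fn, conn)
    print(f"{fn.__name__}: total={result:.1f} elapsed={elapsed:.4f}s")
```

Both queries return the same total; only the elapsed time differs, and the size of the gap depends on the machine and data volume. The same comparison is worth making on your own mappings before replacing a function call with inline logic.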
Performance-Related Parameters in ETL Design
When creating ETL logic, you need to keep in mind the run-time
performance expectations.
• Assess your mappings on the following parameters:
– Operating modes: “Set based” or “Row based”
– Commit Control
– Enable Parallel DML
– Enable Partition Exchange Loading for targets
Pros and Cons of Operating Modes
Set based
• Processes rows as one
transaction
• Fastest way
• Offers DML error logging for
rows with problems
Row based
• Processes data row by row
• Continues processing
despite an error for a row
• Logs information for each
row
[Figure: set-based processing and row-based processing, each moving rows from a source to a target]
Configuring Mappings for Operating Modes
In the default “Set
based fail over to row
based” mode,
Warehouse Builder
uses the “Set based”
operating mode, but
switches to the
slower “Row based”
mode if data errors
are encountered.
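The failover behavior can be pictured as a try-then-fall-back routine. The sketch below is a simplified Python model that uses SQLite as a stand-in; the table names, the make_db helper, and the load_with_failover function are hypothetical, and the code that Warehouse Builder actually generates may differ.

```python
import sqlite3


def make_db(rows):
    """Build an in-memory database with a loose source and a strict target."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src (id INTEGER, qty INTEGER)")
    conn.execute(
        "CREATE TABLE tgt (id INTEGER PRIMARY KEY, qty INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO src VALUES (?, ?)", rows)
    conn.commit()
    return conn


def load_with_failover(conn):
    """Try one set-based statement; on a data error, redo the load row by row.

    Returns (mode used, rows loaded, rows rejected).
    """
    try:
        with conn:  # rolls the statement back if it raises
            conn.execute("INSERT INTO tgt (id, qty) SELECT id, qty FROM src")
        loaded = conn.execute("SELECT COUNT(*) FROM tgt").fetchone()[0]
        return "set based", loaded, 0
    except sqlite3.IntegrityError:
        pass  # fall through to the slower row-based path

    loaded = rejected = 0
    for row in conn.execute("SELECT id, qty FROM src").fetchall():
        try:
            conn.execute("INSERT INTO tgt (id, qty) VALUES (?, ?)", row)
            loaded += 1
        except sqlite3.IntegrityError:
            rejected += 1
    conn.commit()
    return "row based", loaded, rejected


print(load_with_failover(make_db([(1, 10), (2, 20)])))
print(load_with_failover(make_db([(1, 10), (2, None), (3, 30)])))
```

Clean data never leaves the fast path. Data with errors costs one failed set-based attempt plus a full row-based pass, which is worth keeping in mind when errors are frequent.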
DML Error Logging
To enable DML error logging,
specify the name of the error table
by using the DML Error table name
property.
The Error table name property is
used for “logical errors,” which
include the orphan management
errors and the data rule violation
errors.
The DML error table format and the logical error table format are not
the same, so they cannot actually share the same table.
Commit Control
Commit Control: Default is set to Automatic.
With this setting, OWB
commits automatically
when it senses the need,
based on algorithms in the
code.
Commit frequency: Specifies how often data is
committed.
Default is set to 1000 rows.
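The effect of committing every N rows can be sketched as follows. The example is plain Python over SQLite, not Warehouse Builder output; the tgt table, the make_target helper, and the load_with_commit_frequency function are assumptions, and the value 1000 simply mirrors the default mentioned above.

```python
import sqlite3


def make_target():
    """Create an empty in-memory target table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tgt (id INTEGER PRIMARY KEY, val TEXT)")
    return conn


def load_with_commit_frequency(conn, rows, commit_frequency=1000):
    """Insert rows one at a time, committing after every commit_frequency rows.

    Returns the number of commits issued.
    """
    commits = 0
    pending = 0
    for row in rows:
        conn.execute("INSERT INTO tgt (id, val) VALUES (?, ?)", row)
        pending += 1
        if pending == commit_frequency:
            conn.commit()
            commits += 1
            pending = 0
    if pending:  # commit the final partial batch
        conn.commit()
        commits += 1
    return commits


rows = [(i, f"row {i}") for i in range(2500)]
# 3 commits: after rows 1000 and 2000, then the final 500
print(load_with_commit_frequency(make_target(), rows))
```

A lower frequency value means more commits and less work lost on failure; a higher value means fewer commits and larger transactions.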
Setting Default Audit Level
Use “Default audit level” to
indicate the audit level used when
executing the package. You can
set it to NONE, ERROR DETAILS,
STATISTICS, or COMPLETE.
Additional Run-Time Parameters for Mappings
Warehouse Builder uses the “Bulk size” parameter only when the
“Bulk processing code” option is selected and the operating mode
is set to “Row based.”
Use “Maximum number of errors” to indicate the maximum number of
errors allowed when executing the package.
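How a bulk size and an error limit interact can be sketched in a few lines. The example below is an analogy in Python over SQLite, not the PL/SQL that Warehouse Builder generates: bulk_size and max_errors are hypothetical parameters named after the two properties, their default values here are arbitrary, and the tgt table is assumed.

```python
import sqlite3
from itertools import islice

INSERT_SQL = "INSERT INTO tgt (id, val) VALUES (?, ?)"


class TooManyErrors(Exception):
    """Raised when the load exceeds its error limit."""


def chunks(rows, bulk_size):
    """Yield successive lists of at most bulk_size rows."""
    it = iter(rows)
    while chunk := list(islice(it, bulk_size)):
        yield chunk


def load_in_bulk(conn, rows, bulk_size=50, max_errors=50):
    """Load rows in chunks; isolate bad rows and stop past max_errors."""
    loaded = errors = 0
    for chunk in chunks(rows, bulk_size):
        try:
            with conn:  # one call and one transaction per chunk
                conn.executemany(INSERT_SQL, chunk)
            loaded += len(chunk)
            continue
        except sqlite3.IntegrityError:
            pass  # the chunk was rolled back; retry its rows one at a time
        for row in chunk:
            try:
                with conn:
                    conn.execute(INSERT_SQL, row)
                loaded += 1
            except sqlite3.IntegrityError:
                errors += 1
                if errors > max_errors:
                    raise TooManyErrors(
                        f"{errors} errors exceed the limit of {max_errors}"
                    )
    return loaded, errors


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tgt (id INTEGER PRIMARY KEY, val TEXT)")
rows = [(i, "x") for i in range(10)] + [(3, "duplicate key"), (10, "y")]
print(load_in_bulk(conn, rows, bulk_size=4, max_errors=5))
```

Larger chunks reduce per-row overhead, while the error limit keeps a badly broken source from running to completion and filling the audit tables.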
Enable Partition Exchange Loading (PEL)
for Targets
Partition Exchange Loading loads new data by exchanging it
into a target table as a partition.
To enable PEL for a
mapping, set the PEL
Enabled property for
the target to true.
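The reason an exchange is fast is that the loaded data is attached to the target rather than copied into it; in Oracle Database the underlying operation is ALTER TABLE ... EXCHANGE PARTITION. The toy model below is plain Python, not Warehouse Builder or Oracle code, and all class and partition names are hypothetical. It only illustrates that the exchange swaps references to stored rows instead of moving the rows.

```python
class StagingTable:
    """Toy staging table: holds the newly loaded rows."""

    def __init__(self, rows):
        self.rows = list(rows)


class PartitionedTable:
    """Toy partitioned table: each partition stores its rows separately."""

    def __init__(self, partition_names):
        self.partitions = {name: [] for name in partition_names}

    def exchange_partition(self, name, staging):
        """Swap the rows of one partition with those of a staging table.

        Only the two references are swapped, so the cost does not depend
        on the number of rows involved.
        """
        self.partitions[name], staging.rows = (
            staging.rows,
            self.partitions[name],
        )

    def count(self):
        return sum(len(rows) for rows in self.partitions.values())


sales = PartitionedTable(["2009_Q1", "2009_Q2"])
staging = StagingTable((i, "2009-04-01", 9.99) for i in range(100_000))

sales.exchange_partition("2009_Q2", staging)
print(sales.count(), len(staging.rows))
```

After the exchange, the target partition holds the new rows and the staging table holds whatever the partition contained before, which here is nothing.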
Best Practices Tips: ETL Design Factors
That Impact Performance
• Custom transformation impact
– Inefficient context switch to the PL/SQL engine, which must
interpret and execute the function for every row
• Loading type impact
– Insert/Update and Update/Insert should be used when
appropriate.
• External tables versus SQL*Loader
– External tables can be queried as if they were read-only tables.
– Enabling parallel access for the external tables can activate
parallelism and enhance the performance.
Quiz
In Warehouse Builder, you can set the “Default audit level” to
indicate how detailed the audit information generation will be.
a. True
b. False
Quiz
From the following, select other performance parameters you
should consider for better ETL performance:
a. Commit Control
b. Use External tables versus SQL*Loader
c. Minimize context switches
d. Use Partition Exchange loading
Performance-Related Parameters
in Schema Design
Warehouse Builder provides schema design capabilities for:
• Index management
• Partitions
• Enabling parallel access
• Configuration of physical implementation properties on
objects
– Tablespace
– Constraints management
Benefits of Using Schema Design Capabilities
in Warehouse Builder Design Center
• When designing database objects such as tables in Design
Center, you need not switch to the SQL*Plus environment.
– You need not write SQL commands to define partitions,
indexes, constraints, and so on.
– While generating code, Warehouse Builder automatically
incorporates the schema design settings.
• Schema design is integrated with logical design, exposing
full editing capabilities within the editors.
• Warehouse Builder facilitates better schema design by
allowing detailed configuration of physical storage and
sizing properties of the database objects per configuration.
Indexing
To create an index, you use the Indexes tab in the Table Editor.
You can specify the type, key columns, partitions, values, and
local or global scope.
Configuring Properties of Indexes
To configure an index, open the configuration editor for the
associated table.
The tool-tip text shows each
property's description.
Index Performance Considerations:
Drop Indexes Before the Load Process
• Drop the indexes of the target object before the loading
process and re-create the indexes after the load is
completed.
– This significantly improves performance because indexes will
not have to be maintained during the load.
• To achieve this, in load mappings:
– Use a Pre-mapping process operator that will invoke a
procedure to drop the indexes before the loading starts.
– Use a Post-mapping process operator that will invoke a
procedure to re-create the indexes after the loading is
complete.
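The drop, load, and re-create sequence can be sketched as three steps. The example is Python over SQLite and only models the pattern: pre_mapping and post_mapping are hypothetical stand-ins for the procedures that the Pre-mapping and Post-mapping process operators would invoke, and the sales table and sales_cust_idx index are assumed names.

```python
import sqlite3

INDEX_NAME = "sales_cust_idx"


def make_sales():
    """Create an indexed target table in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, cust_id INTEGER)")
    conn.execute(f"CREATE INDEX {INDEX_NAME} ON sales (cust_id)")
    return conn


def pre_mapping(conn):
    """Drop the index so it is not maintained row by row during the load."""
    conn.execute(f"DROP INDEX IF EXISTS {INDEX_NAME}")


def post_mapping(conn):
    """Rebuild the index once, over the fully loaded table."""
    conn.execute(f"CREATE INDEX IF NOT EXISTS {INDEX_NAME} ON sales (cust_id)")


def index_names(conn):
    """Return the names of the indexes currently defined on sales."""
    query = (
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'sales'"
    )
    return {row[0] for row in conn.execute(query)}


def run_mapping(conn, rows):
    pre_mapping(conn)
    try:
        with conn:
            conn.executemany(
                "INSERT INTO sales (id, cust_id) VALUES (?, ?)", rows
            )
    finally:
        post_mapping(conn)  # re-create the index even if the load fails


conn = make_sales()
run_mapping(conn, [(i, i % 100) for i in range(10_000)])
print(index_names(conn))
```

Re-creating the index in a finally clause is a design choice of this sketch: it keeps the target usable for queries after a failed load, at the cost of building the index over partial data.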
Index Performance Considerations: Drop Indexes
Before the Load Process (Example Mapping)
Constraints Management
Constraints checking may slow load performance.
• Enable Constraints: Slower data load because constraints
are checked for each row
• Exceptions Table Name property: Row IDs of records that do
not conform will be logged into this table.
Configuring Constraints Individually
You can also configure constraints individually in the
Configuration Properties window of the owning object.
Configuring foreign
key constraint
Defining Partitions in Warehouse Builder
You can define partitions in the Partitions tab of the Table
Editor.
Different types of
partitions are
supported.
Defining Partitions in Warehouse Builder
Set the partition tablespace
parameters in the configuration
properties of the partitioned table.
Parallelism
• You can enable parallel access for tables in their
configuration properties.
• You can set the Parallel Access mode to PARALLEL or
NOPARALLEL. The default is PARALLEL.
• You can specify the Parallel Degree, which is the number
of parallel threads used in the parallel operation.
Enable Parallel DML
If you select this option,
Warehouse Builder enables
Parallel DML at run time.
The objects involved in the
mapping should be
enabled for parallelism to
take advantage of this
option.
Setting Tablespace Properties
You can set the tablespace properties at:
• User level
• Module level
• Object level
Minimum Error Logging
Set the LOGGING MODE property to NOLOGGING when
logging is not needed.
Set the default audit level property
for mappings appropriately.
Gathering Statistics
If you want the modification statistics to be collected on a table,
set the Statistics Collection property to MONITORING in the
Configuration Properties window.
Configuration
properties of
a table.
Analyze Table Statements Property
• Set the “Analyze table statements” property for a mapping.
• By default, the “Analyze table statements” property is not
enabled.
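The idea of refreshing optimizer statistics at the end of a load can be sketched as an optional final step. The example uses Python with SQLite, whose ANALYZE command stores statistics in the sqlite_stat1 table; in Oracle Database, statistics are gathered differently (for example with the DBMS_STATS package). The sales table, the analyze flag, and the function names are assumptions for illustration.

```python
import sqlite3


def make_sales():
    """Create an indexed target table in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, cust_id INTEGER)")
    conn.execute("CREATE INDEX sales_cust_idx ON sales (cust_id)")
    return conn


def load_and_analyze(conn, rows, analyze=False):
    """Load the target and, if requested, gather statistics afterwards."""
    with conn:
        conn.executemany("INSERT INTO sales (id, cust_id) VALUES (?, ?)", rows)
    if analyze:
        conn.execute("ANALYZE sales")


def has_statistics(conn):
    """Report whether statistics have been stored for the sales table."""
    stat_table = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE name = 'sqlite_stat1'"
    ).fetchone()[0]
    if not stat_table:
        return False
    found = conn.execute(
        "SELECT COUNT(*) FROM sqlite_stat1 WHERE tbl = 'sales'"
    ).fetchone()[0]
    return found > 0


rows = [(i, i % 10) for i in range(1000)]

conn = make_sales()
load_and_analyze(conn, rows)                 # flag off, as in the default
print(has_statistics(conn))

conn = make_sales()
load_and_analyze(conn, rows, analyze=True)   # statistics gathered after load
print(has_statistics(conn))
```

Gathering statistics adds time to every run, so whether to enable it per mapping or to gather statistics separately on a schedule is a trade-off that depends on how much each load changes the data.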
Gathering Schema Statistics
In Design Center, select
Tools > Preferences. In the
Preferences dialog box,
expand OWB, select
Environment, and then
check “Allow Optimize
Repository Warning on
Startup” to update the
schema statistics every time
you log in to Warehouse
Builder.
Quiz
When you enable parallel DML for a mapping, you must also
enable the source and target objects for parallel access.
a. True
b. False
Summary
In this lesson, you should have learned how to:
• Describe the performance enhancement methods that you
can apply in ETL mapping design
• Describe configuration of performance parameters in
schema design
• Describe the process of gathering statistics in Warehouse
Builder
Practice 11-1 Overview: Configuring Performance
Parameters for Mappings and Tables
In this practice, you examine the following:
• The performance, in terms of elapsed time, of a mapping
when executed in two different operating modes (Row
based and Set based)
• Configuration properties of mappings
• Indexes, constraints, and partitions on a table (you also
configure these)