ODI Scheduler - Sonra. Unleash the Value of your Data.

AGENDA
• ODI Performance
• ODI Scheduling
• ODI Deployment/Release
BI-Quotient
www.bi-q.ie
ULI BETHKE
• Dublin based
• Blog www.bi-q.ie
• ODI 2007
• Reviewer of two ODI books
• ODI articles on OTN
• Deputy chair OUG BI SIG. Next event 11th June
• ODI advanced trainer
ODI PERFORMANCE
ODI is a metadata driven (SQL) code generator using
code templates (knowledge modules). It uses a Java
agent to communicate and send data between
source and target systems and the repository over
the network.
SQL
- More than 80% of ODI performance issues are SQL issues, so SQL is the main ODI skill
- Perfect your SQL: advanced SQL, analytic functions
- Know your database(s) inside out, in particular the target
- Understand, write, and modify Knowledge Modules
AGENT
Light weight Java based application
Tied to host OS
Generates code based on ODI metadata.
Communicates source, target, repository.
JDBC data transport
XML
Jetty
Interpreters: Jython, Java BeanShell (JBS), JavaScript, Groovy
HSQLDB in memory database
Scheduler
Sizing
AGENT
Target
- Least amount of roundtrips. Network (JDBC, XML)
- One target database server only (DW)
Another Server
- ODBC drivers
- JEE agent on Weblogic
- No support for target OS
- Resources on target
- DBA
INTERFACES
- No KMs that use row-by-row processing!
- Use ODI functions rather than DB functions
- Don’t overuse the CKM (especially for large data volumes)
- Temp indexes (I$)
- Gather statistics (C$, I$, TGT when applicable)
- Rule of thumb: use loader KMs or DB link KMs rather than JDBC KMs
SOURCE/TARGET
- Schemas on same database server. Physical
schema and not data server.
- Have sources physically close to target
- Minimize impact on source
- Chunking
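The slide does not spell out what chunking means here. One common reading is to extract a large source table in bounded key ranges, so each extract query stays small and the source is loaded gently. A minimal sketch under that assumption; the function name and the numeric key are hypothetical, not ODI features:

```python
def chunk_ranges(low, high, chunk_size):
    """Split the inclusive key range [low, high] into consecutive inclusive sub-ranges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ranges = []
    start = low
    while start <= high:
        # The last chunk may be shorter than chunk_size.
        end = min(start + chunk_size - 1, high)
        ranges.append((start, end))
        start = end + 1
    return ranges


# Each tuple could drive one extract, e.g. a filter "ID BETWEEN :start AND :end".
for start, end in chunk_ranges(1, 10, 4):
    print(start, end)
```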
CRITICAL PATH
NETWORK PATHS and PATH DURATIONS:
B > E > H: 6 + 2 + 11 = 19
B > D > F: 6 + 4 + 14 = 24
B > D > G: 6 + 4 + 10 = 20
A > C > G: 9 + 8 + 10 = 27  CRITICAL PATH
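The path arithmetic above can be checked with a few lines of Python. The per-step durations are read off the four sums on the slide; the function names are illustrative only:

```python
# Step durations inferred from the path sums on the slide.
durations = {"A": 9, "B": 6, "C": 8, "D": 4, "E": 2, "F": 14, "G": 10, "H": 11}

# The four network paths listed on the slide.
paths = [
    ["B", "E", "H"],
    ["B", "D", "F"],
    ["B", "D", "G"],
    ["A", "C", "G"],
]


def path_duration(path, durations):
    """Total duration of one path: the sum of its step durations."""
    return sum(durations[step] for step in path)


def critical_path(paths, durations):
    """The critical path is the path with the longest total duration."""
    return max(paths, key=lambda p: path_duration(p, durations))


for p in paths:
    print(" > ".join(p), path_duration(p, durations))
print("critical:", " > ".join(critical_path(paths, durations)))
```

Tuning any step that is not on A > C > G leaves the overall duration at 27, which is why the critical path is the place to look first.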
MICRO TUNING
• JDBC drivers
• JVM
• Type 4 or 5 JDBC drivers (Data Direct)
• Array fetch size
• DB packet size
• Network packet size
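Why array fetch size matters can be seen from a simplified model: if every fetch returns at most one array of rows, the number of fetch roundtrips is the row count divided by the array fetch size, rounded up. This is a sketch under that assumption only; real drivers may prefetch or add extra roundtrips, and the function name is hypothetical:

```python
def fetch_roundtrips(row_count, array_fetch_size):
    """Roundtrips needed to fetch row_count rows, array_fetch_size rows at a time."""
    if array_fetch_size <= 0:
        raise ValueError("array_fetch_size must be positive")
    # Integer ceiling division.
    return (row_count + array_fetch_size - 1) // array_fetch_size


for size in (10, 100, 500):
    print(size, fetch_roundtrips(1_000_000, size))
```

In this model, going from a fetch size of 10 to 500 for one million rows cuts the roundtrips from 100,000 to 2,000.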
PERFORMANCE MONITORING
• ODI Log Data Mart
• Facts
• Dimensions
• Metrics
• Frontend
DBMS_SQLTUNE_UTIL0
• dbms_sqltune_util0.sqltext_to_sqlid
• Link to Data Dictionary Tables
MACIEJ KOCON
• Dublin based
• ODI 2005 (Sunopsis)
• Reviewer of two ODI books
• Blog www.bi-q.ie
• maciek@bi-q.ie
ORCHESTRATING DWH PROCESSES
• Orchestration of Data Process Flow
– Standard DWH Process flow orchestration
– Packages in Oracle Data Integrator 10g
– Load Plans in Oracle Data Integrator 11g
• Process Flow use cases - efficiency analysis
• Alternative scheduling
– benefits
TYPICAL DATA FLOW in DWH
step 1: STAGE (DATA EXTRACT) loads data from sources
step 2: DIMs (LABEL) provides structured labeling information
step 3: FACTS consists of measurements, metrics or facts
E-LT: data transport & transform units
orchestration: ODI 10g Packages, ODI 11 Load Plans
ORCHESTRATION – ODI PACKAGES
using object directly: PKG_ABC (INT_A, PRC_B, INT_C), PKG_DE (INT_D, INT_E)
using scenarios – compiled code: PKG_ABCDE (INT_A, PRC_B, INT_C, PKG_DE), run SYNCHRONOUS or ASYNCHRONOUS
ODI 10g vs. ODI 11
[Figure: ODI 10g Packages PKG_DM, PKG_ABC (STAGE), PKG_DE (DIMs), PKG_FG (FACTS) next to ODI 11 Load plans running steps A to G]
SAME EFFECT!
PROCESS FLOW EFFICIENCY ANALYSIS
Standard Flow Orchestration:
Stage-(stop)DIMs-(stop)Facts
[Figure: processes A to G with durations of 10 or 30; the three stages run sequentially, processes within a stage run in parallel]
30 + 30 + 10 = 70
DOWNSIDES:
• POSSIBLE INEFFICIENCIES (IDLE RESOURCES)
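The 30 + 30 + 10 = 70 figure for the standard flow can be reproduced with a small sketch. The grouping A, B, C in STAGE, D, E in DIMs and F, G in FACTS follows the package names used in the deck. The individual durations are an assumption read from the flattened diagram, and the function names are hypothetical:

```python
# Assumed per-process durations, grouped by stage.
STAGES = {
    "STAGE": {"A": 30, "B": 10, "C": 10},
    "DIMs": {"D": 10, "E": 30},
    "FACTS": {"F": 10, "G": 10},
}


def stage_gated_duration(stages):
    """Stages run one after another; inside a stage everything runs in parallel,
    so each stage takes as long as its slowest process."""
    return sum(max(procs.values()) for procs in stages.values())


def total_work(stages):
    """Elapsed time if every process ran strictly one after another."""
    return sum(sum(procs.values()) for procs in stages.values())


print(stage_gated_duration(STAGES))  # 30 + 30 + 10
print(total_work(STAGES))
```

The (stop) between stages is where the idle resources come from: B and C finish after 10 but nothing downstream may start until A finishes at 30.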
ADVANCED DATA FLOW EXAMPLE
ENTERPRISE DWH DATA FLOW EXAMPLE
PROCESS FLOW EFFICIENCY ANALYSIS
OPTIMIZATION ATTEMPT
[Figure: processes A to G with durations of 10 or 30, regrouped into sequential and parallel steps]
30 + 10 + 10 = 50 (other branch: 10 + 30)
70 / 50 = 1.4 times quicker!
UPSIDE:
• EFFICIENCY IMPROVED
DOWNSIDES:
• TIMINGS KNOWLEDGE REQUIRED
• OVERALL DEPENDENCY KNOWLEDGE REQUIRED
PROCESS FLOW EFFICIENCY ANALYSIS
OPTIMIZATION ATTEMPT
[Figure: processes A to G with durations of 10, 30 and 70; sequential and parallel execution]
30 + 30 + 10 = 70
DOWNSIDE:
• INEFFICIENCY EXISTS BUT CAN’T BE RESOLVED
• CONSUMER WAITING & IMPACT
TRADITIONAL SCHEDULING - LIMITATIONS
• Possible inefficiencies (idle resources)
• Timings knowledge required
• Overall dependency knowledge required
• Inefficiency exists but can’t be resolved
• Consumer waiting & impact
SCHEDULER
DEPENDENCY DRIVEN SCHEDULING
[Figure: processes A to E; PACKAGES & LOAD PLANS]
PROCESS FLOW EFFICIENCY ANALYSIS
[Figure: processes A to G with durations of 10, 30 and 70; sequential and parallel execution compared]
30 + 30 + 10 = 70
70 / 30 = 2.3 times faster!
DEPENDENCY DRIVEN SCHEDULING
• Simplifies orchestrating the flow
– only immediate upstream definition required
– execution timings not relevant
– self-adapts in the most effective way
• Improves overall E-LT performance
– Fewer idle resources, better utilization
– Independence
– unveils its full potential in complex Enterprise class DWHs (Inmon)
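The idea that only immediate upstream definitions are required can be sketched as follows: each process lists its direct predecessors and starts the moment they have all finished. This is not the scheduler presented in the deck. The durations are assumed from the earlier efficiency slides and the dependency graph is hypothetical, so the result does not reproduce the slide's own figures:

```python
def finish_times(durations, upstream):
    """Earliest finish time of every process when each one starts as soon as
    all of its immediate upstream processes have finished."""
    finish = {}

    def visit(name, trail=()):
        if name in finish:
            return finish[name]
        if name in trail:
            raise ValueError(f"cyclic dependency at {name}")
        # A process with no upstream starts at time 0.
        start = max(
            (visit(u, trail + (name,)) for u in upstream.get(name, ())),
            default=0,
        )
        finish[name] = start + durations[name]
        return finish[name]

    for name in durations:
        visit(name)
    return finish


# Assumed durations and a hypothetical dependency graph.
durations = {"A": 30, "B": 10, "C": 10, "D": 10, "E": 30, "F": 10, "G": 10}
upstream = {"D": ["B"], "E": ["C"], "F": ["D"], "G": ["D", "E"]}

times = finish_times(durations, upstream)
print(times)
print("elapsed:", max(times.values()))
```

With this made-up graph the run ends at 50 instead of the stage-gated 70, because D and E no longer wait for A. No execution timings were needed to define the flow.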
DEPENDENCY DRIVEN SCHEDULING
• Notifications
– errors (+auto-restartability)
– finish summary
– logging
• Multiple/overlapping E-LT streams
– load with different frequencies
• Parameterization
– improved system stress control
– process prioritization
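Stress control and prioritization can be pictured as two parameters of a dependency driven run: a cap on how many processes run at once, and a priority that decides which ready process gets a free slot first. The simulation below is a sketch under those assumptions. The parameter names, durations and dependencies are hypothetical and not taken from the scheduler shown in the deck:

```python
import heapq


def simulate(durations, upstream, max_parallel, priority=None):
    """Return (elapsed, start_times) for a dependency driven run in which at most
    max_parallel processes run concurrently; higher priority starts first."""
    priority = priority or {}
    pending, done, start_times = set(durations), set(), {}
    running = []  # min-heap of (finish_time, name)
    now = 0
    while pending or running:
        # Ready = every immediate upstream process has finished.
        ready = sorted(
            (n for n in pending if all(u in done for u in upstream.get(n, ()))),
            key=lambda n: (-priority.get(n, 0), n),
        )
        for name in ready:
            if len(running) >= max_parallel:
                break  # stress control: no free slot left
            pending.remove(name)
            start_times[name] = now
            heapq.heappush(running, (now + durations[name], name))
        if not running:
            raise ValueError("cyclic or unsatisfiable dependencies")
        # Advance the clock to the next completion and collect ties.
        now, finished = heapq.heappop(running)
        done.add(finished)
        while running and running[0][0] == now:
            done.add(heapq.heappop(running)[1])
    return now, start_times


durations = {"A": 30, "B": 10, "C": 10, "D": 10, "E": 30, "F": 10, "G": 10}
upstream = {"D": ["B"], "E": ["C"], "F": ["D"], "G": ["D", "E"]}

print(simulate(durations, upstream, max_parallel=3)[0])
print(simulate(durations, upstream, max_parallel=2)[0])
print(simulate(durations, upstream, max_parallel=2, priority={"C": 10, "E": 10})[0])
```

In this toy graph, lowering the cap from 3 to 2 stretches the run from 50 to 70, and giving priority to the long C, E chain recovers part of that (60) without raising the load on the system.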
FIRST RUN: 10 processes
TODAY: 584 processes, 1389 DEPENDENCIES, 132 231 SCENARIOS RUN
TIME: LOAD PLANS 12h43m, SCHEDULER 4h21m
2.9 TIMES FASTER
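The 2.9 factor follows directly from the two run times on the slide; a quick check (the helper name is illustrative):

```python
def to_minutes(hours, minutes):
    """Convert an hours/minutes pair to minutes."""
    return hours * 60 + minutes


load_plans = to_minutes(12, 43)  # run time with Load Plans
scheduler = to_minutes(4, 21)    # run time with the dependency driven scheduler

speedup = load_plans / scheduler
print(round(speedup, 1))
```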
ENTERPRISE DWH DATA FLOW
RELEASE 1.0
RELEASE 2.0 TST
TESTING RELEASE 2.0
DEPLOY RELEASE 2.0 PRD
THE HOT FIX SITUATION
RELEASE FREQUENTLY
CI ENVIRONMENT
THE BUILD MASTER
AUTOMATE STUFF
ODI VS. SOURCE CONTROL
ODI STRUCTURE
BEYOND INTRA BUILD DEPENDENCIES