
SAP DS10 (BODS 4.2)

Data Services - Platform and Transforms
PARTICIPANT HANDBOOK
INSTRUCTOR-LED TRAINING
Course Version: 10
Course Duration: 3 Day(s)
Material Number: 50120656
SAP Copyrights and Trademarks

© 2014 SAP AG. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice.

Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors.

• Microsoft, Windows, Excel, Outlook, and PowerPoint are registered trademarks of Microsoft Corporation.
• IBM, DB2, DB2 Universal Database, System i, System i5, System p, System p5, System x, System z, System z10, System z9, z10, z9, iSeries, pSeries, xSeries, zSeries, eServer, z/VM, z/OS, i5/OS, S/390, OS/390, OS/400, AS/400, S/390 Parallel Enterprise Server, PowerVM, Power Architecture, POWER6+, POWER6, POWER5+, POWER5, POWER, OpenPower, PowerPC, BatchPipes, BladeCenter, System Storage, GPFS, HACMP, RETAIN, DB2 Connect, RACF, Redbooks, OS/2, Parallel Sysplex, MVS/ESA, AIX, Intelligent Miner, WebSphere, Netfinity, Tivoli, and Informix are trademarks or registered trademarks of IBM Corporation.
• Linux is the registered trademark of Linus Torvalds in the U.S. and other countries.
• Adobe, the Adobe logo, Acrobat, PostScript, and Reader are either trademarks or registered trademarks of Adobe Systems Incorporated in the United States and/or other countries.
• Oracle is a registered trademark of Oracle Corporation.
• UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group.
• Citrix, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, and MultiWin are trademarks or registered trademarks of Citrix Systems, Inc.
• HTML, XML, XHTML, and W3C are trademarks or registered trademarks of W3C, World Wide Web Consortium, Massachusetts Institute of Technology.
• Java is a registered trademark of Sun Microsystems, Inc.
• JavaScript is a registered trademark of Sun Microsystems, Inc., used under license for technology invented and implemented by Netscape.
• SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP BusinessObjects Explorer, StreamWork, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries.
• Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects Software Ltd. Business Objects is an SAP company.
• Sybase and Adaptive Server, iAnywhere, Sybase 365, SQL Anywhere, and other Sybase products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Sybase, Inc. Sybase is an SAP company.

All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary.
These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.
About This Handbook
This handbook is intended both to complement the instructor-led presentation of this course and to serve as a reference for self-study.
Typographic Conventions
American English is the standard used in this handbook.
The following typographic conventions are also used.
The following icons flag special content in the margin (the icons themselves are not reproduced here):
• Information displayed in the instructor's presentation
• Demonstration
• Procedure
• Warning or Caution
• Hint
• Related or Additional Information
• Facilitated Discussion
In addition, Example text indicates user interface controls and window titles.
© Copyright . All r ights r eserved.
v ~
~
For Any SAP / IBM / Oracle - Materials Purchase Visit : www.erpexams.com OR Contact Via Email Directly At : sapmaterials4u@gmail.com
For Any SAP / IBM / Oracle - Materials Purchase Visit : www.erpexams.com OR Contact Via Email Directly At : sapmaterials4u@gmail.com
For Any SAP / IBM / Oracle - Materials Purchase Visit : www.erpexams.com OR Contact Via Email Directly At : sapmaterials4u@gmail.com
VI
© Copyright . All rights reserved.
For Any SAP / IBM / Oracle - Materials Purchase Visit : www.erpexams.com OR Contact Via Email Directly At : sapmaterials4u@gmail.com
For Any SAP / IBM / Oracle - Materials Purchase Visit : www.erpexams.com OR Contact Via Email Directly At : sapmaterials4u@gmail.com
Contents

ix   Course Overview

1    Unit 1: Data Services
2      Lesson: Defining Data Services

15   Unit 2: Source and Target Metadata
16     Lesson: Defining Datastores in Data Services
21     Exercise 1: Create Source and Target Datastores
29     Lesson: Defining a Data Services Flat File Format
35     Exercise 2: Create a Flat File Format

41   Unit 3: Batch Job Creation
42     Lesson: Creating Batch Jobs
55     Exercise 3: Create a Basic Data Flow

63   Unit 4: Batch Job Troubleshooting
64     Lesson: Writing Comments with Descriptions and Annotations
67     Lesson: Validating and Tracing Jobs
75     Exercise 4: Set Traces and Annotations
78     Lesson: Debugging Data Flows
83     Exercise 5: Use the Interactive Debugger
88     Lesson: Auditing Data Flows
95     Exercise 6: Use Auditing in a Data Flow

107  Unit 5: Functions, Scripts, and Variables
108    Lesson: Using Built-In Functions
117    Exercise 7: Use the search_replace Function
121    Exercise 8: Use the lookup_ext() Function
125    Exercise 9: Use Aggregate Functions
130    Lesson: Using Variables, Parameters, and Scripts
141    Exercise 10: Create a Custom Function

151  Unit 6: Platform Transforms
153    Lesson: Using Platform Transforms
156    Lesson: Using the Map Operation Transform
159    Exercise 11: Use the Map Operation Transform
163    Lesson: Using the Validation Transform
169    Exercise 12: Use the Validation Transform
184    Lesson: Using the Merge Transform
187    Exercise 13: Use the Merge Transform
197    Lesson: Using the Case Transform
201    Exercise 14: Use the Case Transform
207    Lesson: Using the SQL Transform
209    Exercise 15: Use the SQL Transform

217  Unit 7: Error Handling
218    Lesson: Setting Up Error Handling
227    Exercise 16: Create an Alternative Work Flow

235  Unit 8: Changes in Data
236    Lesson: Capturing Changes in Data
242    Lesson: Using Source-Based Change Data Capture (CDC)
247    Exercise 17: Use Source-Based Change Data Capture (CDC)
256    Lesson: Using Target-Based Change Data Capture (CDC)
263    Exercise 18: Use Target-Based Change Data Capture (CDC)

271  Unit 9: Data Services Integrator Transforms
272    Lesson: Using Data Services Integrator Transforms
277    Lesson: Using the Pivot Transform
281    Exercise 19: Use the Pivot Transform
287    Lesson: Using the Data Transfer Transform
293    Exercise 20: Use the Data Transfer Transform
Course Overview
TARGET AUDIENCE
This course is intended for the following audiences:
• Data Consultant/Manager
• Solution Architect
• Super / Key / Power User
Unit 1: Data Services

2    Lesson 1: Defining Data Services

UNIT OBJECTIVES
• Define Data Services
Unit 1
Lesson 1: Defining Data Services
LESSON OVERVIEW
Data Services is a graphical interface for creating and staging jobs for data integration and data quality purposes. Create jobs that extract data from heterogeneous sources, transform the data to meet the business requirements of your organization, and load the data into a single location.
This unit describes the Data Services platform and its architecture, Data Services objects, and its graphical interface, the Data Services Designer.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Define Data Services
Data Services
Business Example
For reporting in SAP NetWeaver Business Warehouse, your company needs data from diverse data sources, such as SAP systems, non-SAP systems, the Internet, and other business applications. Examine the technologies that SAP Data Services offers for data acquisition.
Data Services
Data Services provides a graphical interface that allows:
• The easy creation of jobs that extract data from heterogeneous sources
• The transformation of data to meet the business requirements of the organization
• The loading of data to one or more locations
Data Services combines both batch and real-time data movement and management with intelligent caching to provide a single data integration platform. As shown in the figure Data Services Architecture - Access Server, the platform is used to manage information from any information source, for any information use.
Figure 1: Data Services Architecture - Access Server (diagram of the platform components: a real-time client, adapters, the Access Server, the Job Server and its engines, sources and targets, the local repository, and a browser connecting to the Management Console)
Data Staging
This unique combination allows you to:
• Stage data in an operational data store, data warehouse, or data mart
• Update staged data in batch or real-time modes
• Create a single environment for developing, testing, and deploying the entire data integration platform
• Manage a single metadata repository to capture the relationships between different extraction and access methods and provide integrated lineage and impact analysis
For most Enterprise Resource Planning (ERP) applications, Data Services generates SQL that is optimized for the specific target database (for example, Oracle, DB2, SQL Server, and Informix). Automatically generated, optimized code reduces the cost of maintaining data warehouses and enables building data solutions that meet user requirements faster than other methods (for example, custom coding, direct-connect calls, or PL/SQL).
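As a rough illustration of this pushdown behavior, a simple data flow whose source and target reside in the same database can collapse into a single SQL statement executed by the database itself. The schema, table, and column names below are hypothetical, and the exact SQL that Data Services generates depends on the database type and the data flow design:

-- Minimal sketch of full pushdown, assuming source and target
-- tables in one datastore: extract, transform, and load become
-- a single INSERT ... SELECT run entirely inside the database.
INSERT INTO dw.CUSTOMER_DIM (CUST_ID, CUST_NAME, REGION_ID)
SELECT c.CUSTOMER_ID,
       UPPER(c.NAME),        -- mapping expression from a Query transform
       c.REGION_ID
FROM   ods.CUSTOMER c
WHERE  c.STATUS = 'ACTIVE';  -- filter pushed down as a WHERE clause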
Data Services can apply data changes in various data formats, including any custom format, using a Data Services adapter. Enterprise users can apply data changes against multiple back-office systems singularly or sequentially. By generating calls native to the system in question, Data Services makes it unnecessary to develop and maintain customized code to manage the process. It is also possible to design access intelligence into each transaction by adding flow logic that checks values in a data warehouse or in the transaction itself before posting it to the target ERP system.
Data Services Architecture
Data Services relies on several unique components to accomplish the data integration and data quality activities required to manage corporate data.
Data Services Standard Components
• Designer
• Repository
• Job server
• Engines
• Access server
• Adapters
• Real-time services
• Management console
The figure Data Services Architecture illustrates the relationships between components.
Figure 2: Data Services Architecture (diagram: the Designer and a real-time client connect through adapters to the Access Server and Job Server; the engines move data between sources and targets; the Job Server works against the local repository; a browser connects to the Management Console)
The Data Services Designer
Data Services Designer is a Windows client application used to create, test, and manually execute jobs that transform data and populate a data warehouse. Using the Designer, as shown in the figure Data Services Designer Interface, create data management applications that consist of data mappings, transformations, and control logic.
Figure 3: Data Services Designer Interface (screenshot of the Designer, showing the project area, workspace, tool palette, and Local Object Library)
Create objects that represent data sources, and then drag, drop, and configure them in flow diagrams. The Designer allows for the management of metadata stored in a local repository. From the Designer, trigger the job server to run jobs for initial application testing.
The Data Services Repository
The Data Services repository is a set of tables that stores user-created and predefined system objects, source and target metadata, and transformation rules. It is set up on an open client/server platform to facilitate sharing metadata with other enterprise tools. Each repository is stored on an existing Relational Database Management System (RDBMS).
Data Services Repository Tables
The Data Services repository is a set of tables that includes:
• User-created and predefined system objects
• Source and target metadata
• Transformation rules
Each repository is stored on a supported RDBMS such as MySQL, Oracle, Microsoft SQL Server, Sybase, or DB2. Each repository is associated with one or more job servers.
Repository Types
• Local repository: Known in the Designer as the Local Object Library, the local repository is used by an application architect to store definitions of source and target metadata and Data Services objects.
• Central repository: Known in the Designer as the Central Object Library, the central repository is an optional component that can be used to support a multiuser environment. The Central Object Library provides a shared library that allows developers to check objects in and out for development.
• Profiler repository: Used to store information that determines the quality of data.
Data Services Job Server
Each repository is associated with at least one Data Services job server, which retrieves the job from its associated repository and starts the data movement engine, as shown in the figure Data Services Architecture - Job Server. The data movement engine integrates data from multiple heterogeneous sources, performs complex data transformations, and manages extractions and transactions from ERP systems and other sources. The job server can move data in batch or real-time mode and uses distributed query optimization, multithreading, in-memory caching, in-memory data transformations, and parallel processing to deliver high data throughput and scalability.
Figure 4: Data Services Architecture - Job Server (diagram: the Designer and a real-time client connect to the Access Server and Job Server; the Job Server starts engines that move data between sources and targets, using the local repository; a browser connects to the Management Console)
Designing a Job
When designing a job, run it from the Designer. In the production environment, the job server runs jobs triggered by a scheduler or by a real-time service managed by the Data Services access server. In production environments, balance job loads by creating a server group (multiple job servers), which executes jobs according to the overall system load. Data Services provides distributed processing capabilities through server groups. A server group is a collection of job servers that each reside on a different Data Services server computer. Each Data Services server can contribute one job server to a specific server group. Each job server collects resource utilization information for its computer. Data Services uses this information to determine where a job, data flow, or sub data flow (depending on the distribution level specified) is executed.
The Data Services Engines
When Data Services jobs are executed, the job server starts Data Services engine processes to perform data extraction, transformation, and movement. Data Services engine processes use parallel processing and in-memory data transformations to deliver high data throughput and scalability.
The Data Services Management Console
The Data Services Management Console provides access to the following features:
• Administrator
Administer Data Services resources, including:
Scheduling, monitoring, and executing batch jobs.
Configuring, starting, and stopping real-time services.
Configuring job server, access server, and repository usage.
Configuring and managing adapters.
Managing users.
Publishing batch jobs and real-time services via web services.
Reporting on metadata.
Data Services object promotion, which is used to promote objects between repositories for the development, testing, and production phases.
• Auto Documentation
View, analyze, and print graphical representations of all objects as depicted in Data Services Designer, including their relationships and properties.
• Data Validation
Evaluate the reliability of the target data based on the validation rules created in Data Services batch jobs, to review, assess, and identify potential inconsistencies or errors in source data.
• Impact and Lineage Analysis
Analyze end-to-end impact and lineage for Data Services tables and columns, and SAP BusinessObjects Business Intelligence platform objects such as universes, business views, and reports.
• Operational Dashboard
View dashboards of status and performance execution statistics of Data Services jobs for one or more repositories over a given time period.
• Data Quality Reports
Use data quality reports to view and export SAP Crystal Reports for batch and real-time jobs that include statistics-generating transforms. Report types include job summaries, transform-specific reports, and transform group reports. To generate reports for the Match, US Regulatory Address Cleanse, and Global Address Cleanse transforms, enable the Generate Report Data option in the Transform Editor.
Other Data Services Tools
There are also several tools that assist with managing a Data Services installation. The Data Services Repository Manager allows the creation, upgrade, and version checking of the local, central, and profiler repositories.
The Data Services Server Manager allows adding, deleting, or editing the properties of job servers. It is automatically installed on each computer on which a job server is installed.
Use the Server Manager to define links between job servers and repositories. Link multiple job servers on different machines to a single repository (for load balancing), or link each job server to multiple repositories (with one default) to support individual repositories (for example, separating test and production environments).
The License Manager displays the Data Services components for which a license is currently held.
Data Services Objects
Data Services provides various objects that are used when building data integration and data quality applications.
Data Services Object Types
• Projects: for example, folders for organizing a repository
• Jobs: executable units of work
• Work flows: control operations, for example, sub-jobs
• Data flows: where the ETL occurs
• Scripts: code embedded in other objects (see the sketch after this list)
• Datastores: sources and targets, for example, a database
• File formats: for example, flat files, XML schemas, and Excel
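To give a feel for the script object, the fragment below is a minimal sketch in the Data Services scripting language; the global variable name is hypothetical and would be declared at the job level:

# Minimal sketch of a Data Services script object, assuming a
# job-level global variable $G_START_DATE has been declared.
$G_START_DATE = sysdate();
print('Load started on [$G_START_DATE]');

Scripts like this typically run at the start or end of a job or work flow, for example to initialize variables that parameterize the data flows that follow.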
Objects Used
In Data Services, all entities that are added, defined, modified, or worked with are objects. Some of the most frequently used objects are:
• Projects
• Jobs
• Work flows
• Data flows
• Transforms
• Scripts
The figure Data Services Objects shows some common objects.
Figure 5: Data Services Objects (the project area shows the hierarchy: a project contains the job CustomerDim_Job, which contains the work flow CustomerDim_WF; the work flow contains the data flow CustomerDim_DF with source table ods_customer (DEMO_Source), a Query transform, target table cust_dim (DEMO_Target), and a script)
Objects
All objects have options, properties, and classes, and can be modified to change the behavior of the object.
Options control the operation of the object; for example, to set up a connection to a database, the database name is an option for the connection. Properties describe the object; for example, the name and creation date describe what the object is used for and when it became active. Attributes are properties used to locate and organize objects. Classes define how an object can be used.
Every object is either reusable or single-use. Single-use objects appear only as components of other objects. They operate only in the context in which they were created. Single-use objects cannot be copied. A reusable object has a single definition, and all calls to the object refer to that definition. If you change the definition of the object in one place, and then save the object, the change is reflected in all other calls to the object.
Most objects created in Data Services are available for reuse. After you define and save a reusable object, Data Services stores the definition in the repository. Reuse the definition as necessary by creating calls to it; for example, a data flow within a project is a reusable object. Multiple jobs, such as a weekly load job and a daily load job, can call the same data flow. If this data flow is changed, both jobs call the new version of the data flow.
Edit reusable objects at any time, independent of the currently open project; for example, if you open a new project, you can open a data flow and edit it. However, the changes made to the data flow are not stored until they are saved.
Defining the Relationship between Objects
Jobs are composed of work flows and/or data flows, as shown in the figure Data Services Object Relationships:
• A work flow is the incorporation of several data flows into a sequence
• A data flow is a process that transforms source data into target data
Figure 6: Data Services Object Relationships. The diagram shows projects containing jobs; jobs containing work flows, conditionals, scripts, and data flows; and data flows drawing their sources, targets, and transforms from datastores (tables, template tables, functions, documents, outbound messages) and file formats (flat files, XML files, template XML files, message functions). A key marks which objects are batch only, real-time only, or both; work flows and conditionals are optional and can be embedded.
Work and Data Flows
A work flow manages data flows and the operations that support them. It also defines the interdependencies between data flows; for example, if one target table depends on values from other tables, use the work flow to specify the order in which Data Services populates the tables. Work flows are also used to define strategies for handling errors that occur during project execution, or to define conditions for running sections of a project.
A data flow defines the basic task that Data Services accomplishes: moving data from one or more sources to one or more target tables or files. Define data flows by identifying the sources from which to extract data, the transformations the data should undergo, and the targets.
Defining Projects and Jobs
A project is the highest-level object and is scheduled independently for execution. A project is a single-use object that allows the grouping of jobs; for example, use a project to group jobs that have schedules that depend on one another or that are monitored together.
Project Characteristics
• Projects are listed in the Local Object Library
• Only one project can be open at a time
• Projects cannot be shared among multiple users
The objects in a project appear hierarchically in the project area. If a plus sign (+) appears next to an object, it can be expanded to view the lower-level objects contained in the object. Data Services displays the contents as both names and icons in the project area hierarchy and in the workspace. Jobs must be associated with a project before they can be executed in the project area of the Designer.
Using Work Flows
Jobs with data flows can be developed without using work flows. However, nesting data flows inside work flows by default is considered good practice. This practice can provide various benefits.
By always using work flows, jobs are more adaptable to additional development and/or specification changes. For instance, if a job initially consists of four data flows that are to run sequentially, they could be set up without work flows. But what if specification changes require that they be merged into another job instead? The developer must create a correct replica of their sequence in the other job. If they had been initially added to a work flow, the developer could instead copy the work flow into the correct position within the new job. It is unnecessary to learn, copy, and verify the previous sequence. The change is made more quickly and with greater accuracy.
If there is one data flow per work flow, there are further benefits to adaptability. Initially, it may have been decided that recovery units are not important, the expectation being that if the job fails, the whole process can simply be rerun. However, as data volumes tend to increase, it may be determined that a full reprocessing is too time consuming. The job may then be changed to incorporate work flows to benefit from recovery units that bypass reprocessing of successful steps. However, these changes can be complex and can consume more time than allotted in a project plan. They also open up the possibility that units of recovery are not properly defined. Setting these up during initial development, when the full analysis of the processing nature is available, is preferred.
Note:
This course focuses on creating batch jobs using database datastores and file formats.
Using the Data Services Designer
The Data Services Designer interface allows the planning and organizing of data integration and data quality jobs in a visual way. Most of the components of Data Services can be programmed with this interface.
Describing the Designer Window
The Data Services Designer interface consists of a single application window and several embedded supporting windows. The application window contains the menu bar, toolbar, Local Object Library, project area, tool palette, and workspace.
Using the Local Object Library
The Local Object Library gives access to Data Services object types. Import objects to and export objects from the Local Object Library as a file. Importing objects from a file overwrites existing objects with the same names in the destination Local Object Library.
Whole repositories can be exported in either .atl or .xml format. Using the .xml file format can make repository content easier to read. It also allows for the exporting of Data Services objects to other products.
Using the Tool Palette
The tool palette is a separate window that appears by default on the right edge of the Designer workspace. Move the tool palette anywhere on the screen or dock it on any edge of the Designer window.
The icons in the tool palette allow for the creation of new objects in the workspace. Icons are disabled when they would be invalid entries for the diagram open in the workspace. To show the name of each icon, hold the cursor over the icon until the tool tip for the icon appears.
When you create an object from the tool palette, you create a new definition of an object. If a new object is reusable, it is automatically available in the Local Object Library after it has been created.
For example, if the data flow icon is selected from the tool palette and a new data flow called DF1 is defined, it is possible to later drag the existing data flow from the Local Object Library and add it to another data flow called DF2.
Using the Workspace
When you open a job or any object within a job hierarchy, the workspace becomes active with your selection. The workspace provides a place to manipulate objects and graphically assemble data movement processes.
These processes are represented by icons that are dragged and dropped into a workspace to create a diagram. This diagram is a visual representation of an entire data movement application or some part of a data movement application.
Specify the flow of data by connecting objects in the workspace from left to right, in the order that the data is to be moved.
LESSON SUMMARY
You should now be able to:
• Define Data Services
Unit 1
Learning Assessment
1. Which of the following statements about work flows are true? A work flow:
Choose the correct answers.
[ ] A Transforms source data into target data.
[ ] B Incorporates several data flows into a sequence.
[ ] C Makes jobs more adaptable to additional development and/or specification changes.
[ ] D Is not an object like a project, a job, a transform, or a script.
Unit 1
Learning Assessment - Answers
1. Which of the following statements about work flows are true? A work flow:
Choose the correct answers.
[ ] A Transforms source data into target data.
[x] B Incorporates several data flows into a sequence.
[x] C Makes jobs more adaptable to additional development and/or specification changes.
[ ] D Is not an object like a project, a job, a transform, or a script.
Unit 2: Source and Target Metadata

16   Lesson 1: Defining Datastores in Data Services
21   Exercise 1: Create Source and Target Datastores
29   Lesson 2: Defining a Data Services Flat File Format
35   Exercise 2: Create a Flat File Format

UNIT OBJECTIVES
• Define types of datastores
• Define flat file formats
Unit 2
Lesson 1: Defining Datastores in Data Services
LESSON OVERVIEW
Use datastores to define data movement requirements in Data Services.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Define types of datastores
Datastores
Business Example
You are responsible for extracting data into the company's SAP NetWeaver Business Warehouse system and want to convert to using Data Services as the new data transfer process.
A datastore provides a connection or multiple connections to data sources such as a database. Using the datastore connection, Data Services can import the metadata that describes the data from the data source, as shown in the figure Datastore.
Figure 7: Datastore (a database, an ERP system, and an application each connect to Data Services through its own datastore)
Datastore
• Connectivity to data source
• Import metadata from data source
• Read and write capability to data source
Data Services uses datastores to read data from source tables or to load data to target tables.
Each source or target is defined individually and the datastore options available depend on which
Relational Database Management System (RDBMS) or application is used for the datastore.
The specific information that a datastore contains depends on the connection. When a database or application changes, corresponding changes must be made to the datastore information in Data Services, as these structural changes are not detected automatically.
Datastore Types
• Database datastores: Import metadata directly from an RDBMS.
• Application datastores: Import metadata from most ERP systems, including J.D. Edwards OneWorld and J.D. Edwards World, Oracle Applications, PeopleSoft, SAP Applications, SAP Master Data Services, SAP NetWeaver BW, and Siebel Applications (see the appropriate supplement guide).
• Adapter datastores: Access application data and metadata, or just metadata. For example, if the data source is SQL-compatible, the adapter might be designed to access metadata, while Data Services extracts data from or loads data directly to the application.
• Web service datastores: Represent a connection from Data Services to an external web service-based data source.
Adapters
Adapters provide access to a third-party application's data and metadata. Depending on the implementation, adapters can be used to:
• Browse application metadata
• Import application metadata to the Data Services repository
For batch and real-time data movement between Data Services and applications, SAP offers an Adapter Software Development Kit (SDK) to develop custom adapters. It is also possible to buy Data Services prepackaged adapters to access application data and metadata in any application. Use the Data Mart Accelerator for SAP Crystal Reports adapter to import metadata from the SAP BusinessObjects Business Intelligence platform.
Datastore Options and Properties
Changing a Datastore Definition
Like all Data Services objects, datastores are defined by both options and properties:
• Options control the operation of objects, including the database server name, database name, user name, and password for the specific database.
The Edit Datastore dialog box allows for the editing of all connection properties except datastore name and datastore type for adapter and application datastores. For database datastores, edit all connection properties except datastore name, datastore type, database type, and database version.
• Properties document the object; for example, the name of the datastore and the date on which it was created are datastore properties. Properties are descriptive of the object and do not affect its operation.
Table 1: Properties
The properties tabs and their descriptions are outlined below:
• General: Contains the name and description of the datastore, if available. The datastore name appears on the object in the Local Object Library and in calls to the object. The name of the datastore cannot be changed after creation.
• Attributes: Include the date when the datastore was created. This value is not changeable.
• Class attributes: Include overall datastore information, such as description and date created.
Metadata
Importing Metadata from Data Sources
Data Services determines and stores a specific set of metadata information for tables. Import metadata by name, by searching, or by browsing. After importing metadata, edit column names, descriptions, and data types. The edits are propagated to all objects that call these objects.
Datastore Metadata
• External metadata: Connects to the database and displays the objects for which access is granted.
• Repository metadata: Metadata that has been imported into the repository and is used by Data Services.
• Reconcile vs. reimport: Reconcile compares external metadata to repository metadata. Reimport overwrites repository metadata with the external metadata.
Table 2: Metadata
The metadata types and their descriptions are outlined below:
• Table name: The name of the table as it appears in the database.
• Table description: The description of the table.
• Column name: The name of the table column.
• Column description: The description of the column.
• Column data type: The data type for each column. If a column is defined as an unsupported data type, Data Services converts the data type to one that is supported. In some cases, if Data Services cannot convert the data type, it ignores the column entirely. Supported data types are: BLOB, CLOB, date, datetime, decimal, double, int, interval, long, numeric, real, time, timestamp, and varchar.
• Primary key column: The column or columns that comprise the primary key for the table. After a table has been added to a data flow diagram, these columns are indicated in the column list by a key icon next to the column name.
• Table attribute: Information that Data Services records about the table, such as the date created and date modified, when available.
• Owner name: The name of the table owner.
It is also possible to import stored procedures from DB2, MS SQL Server, Oracle, SAP HANA, SQL Anywhere, Sybase ASE, Sybase IQ, and Teradata databases, and to import stored functions and packages from Oracle. Use these functions and procedures in the extraction specifications given to Data Services.
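Once imported, such a function can be called like a built-in function, for instance in a script or in a Query transform mapping. This is a minimal sketch; the datastore, owner, function, and variable names are all hypothetical:

# Hypothetical call to an imported database function, using the
# datastore.owner.function(...) naming convention.
$G_REGION_NAME = alpha.dbo.get_region_name(1);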
Imported Information
Information that is imported for functions includes:
• Function parameters
• Return type
• Name and owner
Imported functions and procedures appear in the Function branch of each datastore tree on the Datastores tab of the Local Object Library.
Importing Metadata from Data Sources
The easiest way to import metadata is by browsing. Note that functions cannot be imported using this method.
Import Metadata by Browsing
1. On the Datastores tab of the Local Object Library, right-click the datastore and select Open from the menu.
The items available to import appear in the workspace. Choose External Metadata.
2. Navigate to and select the tables for which you want to import metadata.
Hold down the Ctrl or Shift key to select multiple tables.
3. Right-click the selected items and select Import from the menu.
The workspace contains columns that indicate whether the table has already been imported into Data Services (Imported) and whether the table schema has changed since it was imported (Changed). To verify whether the repository contains the most recent metadata for an object, right-click the object and select Reconcile.
4. In the Local Object Library, expand the datastore to display the list of imported objects, organized into functions, tables, and template tables.
5. To view data for an imported table, right-click the table and select View Data from the menu.
Unit 2
Exercise 1: Create Source and Target Datastores
Business Example
You are working as an ETL developer using SAP Data Services Designer. You will create datastores for the source, target, and staging databases.
Note:
When the data values for the exercise include XX, replace XX with the number that
your instructor has provided to you.
Start the SAP BusinessObjects Data Services Designer
1. Log in to the Data Services Designer.
Create Datastores and import metadata for the Alpha Acquisitions, Delta, HR_Datamart, and
Omega databases.
1. In your Local Object Library, create a new source Datastore for the Alpha Acquisitions
database.
Table 3: Alpha Datastore Values
• Datastore name: Alpha
• Datastore type: Database
• Database type: Sybase ASE
• Database version: Sybase ASE 15.X
• Database server name: WDFLBMT5074
• Database name: ALPHA
• User name: sourceuser
• Password: sourcepass
2. Import the metadata for the Alpha Acquisitions database source tables.
3. In your Local Object Library, create a new Datastore for the Delta staging database.
• Datastore name: Delta
• Datastore type: Database
• Database type: Sybase ASE
• Database version: Sybase ASE 15.X
• Database server name: WDFLBMT5074
• Database name: DELTAXX
• User name: studentXX
• Password: studentXX
4. In your Local Object Library, create a new target Datastore for the HR Data Mart.
• Datastore name: HR_datamart
• Datastore type: Database
• Database type: Sybase ASE
• Database version: Sybase ASE 15.X
• Database server name: WDFLBMT5074
• Database name: HR_DATAMARTXX
• User name: studentXX
• Password: studentXX
5. Import the metadata for the HR_datamart database source tables.
6. In your Local Object Library, create a new target Datastore for the Omega data warehouse.
• Datastore name: Omega
• Datastore type: Database
• Database type: Sybase ASE
• Database version: Sybase ASE 15.X
• Database server name: WDFLBMT5074
• Database name: OMEGAXX
• User name: studentXX
• Password: studentXX
Unit 2
Solution 1: Create Source and Target Datastores
Business Example
You are working as an ETL developer using SAP Data Services Designer. You will create datastores for the source, target, and staging databases.
Note:
When the data values for the exercise include XX, replace XX with the number that
your instructor has provided to you.
Start the SAP BusinessObjects Data Services Designer
1. Log in to the Data Services Designer.
a) On the Windows Terminal Server (WTS) training environment desktop, choose Start → All Programs → SAP Data Services 4.2 → Data Services Designer.
b) The System-host[:port] field should be: WDFLBMT5074:6400
c) In the SAP Data Services Repository Login dialog box, in the User name field, enter your user ID, train-XX.
d) In the Password field, enter your password, which is the same as your user name.
e) Choose Log on.
f) From the list of repositories, choose your repository, DSREPOXX.
g) Choose OK.
Create Datastores and import metadata for the Alpha Acquisitions, Delta, HR_Datamart, and
Omega databases.
1. In your Local Object Library, create a new source Datastore for the Alpha Acquisitions database.
Table 3: Alpha Datastore Values
• Datastore name: Alpha
• Datastore type: Database
• Database type: Sybase ASE
• Database version: Sybase ASE 15.X
• Database server name: WDFLBMT5074
• Database name: ALPHA
• User name: sourceuser
• Password: sourcepass
a) In the Local Object Library, choose the Datastores tab.
b) Right-click the white workspace of the tab and choose New.
c) In the resulting dialog box, in the appropriate fields, enter the values from the Alpha Datastore Values table.
d) To save the Datastore, choose OK.
e) To close the display, choose the X icon in the upper right corner of the data display.
2. Import the metadata for the Alpha Acquisitions database source tables.
a) Right-click the Alpha datastore that you just created and choose Open.
You will see the following list of tables:
• dbo.category
• dbo.city
• dbo.country
• dbo.customer
• dbo.department
• dbo.employee
• dbo.hr_comp_update
• dbo.order_details
• dbo.orders
• dbo.product
• dbo.region
b) To select all of the tables, hold the CTRL key and click each table name.
c) Right-click the selected tables and choose Import.
d) To close the view of Alpha tables, on the toolbar, choose Back.
e) To confirm that there are four records in the category table, right-click the category table in the Local Object Library Datastores tab and choose View Data.
f) Close the data display.
3. In your Local Object Library, create a new Datastore for the Delta staging database.
• Datastore name: Delta
• Datastore type: Database
• Database type: Sybase ASE
• Database version: Sybase ASE 15.X
• Database server name: WDFLBMT5074
• Database name: DELTAXX
• User name: studentXX
• Password: studentXX
a) In the Local Object Library, choose the Datastores tab.
b) Right-click the white workspace of the tab and choose New.
c) In the resulting dialog box, in the appropriate fields, enter the values from the table above.
d) To save the Datastore, choose OK.
4. In your Local Object Library, create a new target Datastore for the HR Data Mart.
• Datastore name: HR_datamart
• Datastore type: Database
• Database type: Sybase ASE
• Database version: Sybase ASE 15.X
• Database server name: WDFLBMT5074
• Database name: HR_DATAMARTXX
• User name: studentXX
• Password: studentXX
a) In the Local Object Library, choose the Datastores tab, right-click the white workspace of the tab, and choose New.
b) In the resulting dialog box, in the appropriate fields, enter the values from the table above.
c) To save the Datastore, choose OK.
5. Import the metadata for the HR_datamart database source tables.
a) Right-click the HR_datamart datastore that you have just created.
b) Choose Import By Name.
c) Enter the table name EMP_DEPT.
The owner is dbo.
d) Repeat steps b and c for the following table names:
• EMPLOYEE
• HR_COMP_UPDATE
• RECOVERY_STATUS
e) To close the view of the HR_datamart table, choose Back.
6. In your Local Object Library, create a new target Datastore for the Omega data warehouse.
• Datastore name: Omega
• Datastore type: Database
• Database type: Sybase ASE
• Database version: Sybase ASE 15.X
• Database server name: WDFLBMT5074
• Database name: OMEGAXX
• User name: studentXX
• Password: studentXX
a) In the Local Object Library, choose the Datastores tab, right-click the white workspace of the tab, and choose New.
b) In the resulting dialog box, enter the values from the table above.
c) To import the metadata for the Omega database source tables, right-click the Omega datastore that you just created and choose Open.
You will see a list of tables:
• dbo.emp_dim
• dbo.product_dim
• dbo.product_target
• dbo.time_dim
d) To select all tables, select the first table and, while holding down the Shift key on the keyboard, select the last table.
e) Right-click the selected tables and choose Import.
f) Close the view of Omega tables.
g) To save your work, from the main menu, choose Project → Save All.
LESSON SUMMARY
You should now be able to:
• Define types of datastores
Unit 2
Lesson 2: Defining a Data Services Flat File Format
LESSON OVERVIEW
Use flat file formats to create datastores to help define data movement requirements in Data Services.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Define flat file formats
File Formats
Business Example
You are responsible for extracting flat file data into the company's SAP NetWeaver Business Warehouse system and want to convert to using Data Services as the new data transfer process. You must know how to create flat file formats as the basis for creating a datastore.
File formats are connections to flat files, just as datastores are connections to databases.
Explaining File Formats
As shown in the figure File Format Editor, a file format is a set of properties that describes the structure of a flat file (ASCII). File formats describe the metadata structure. A file format describes a specific file. A file format template is a generic description that can be used for multiple data files.
The software uses data stored in files for data sources and targets. A file format defines a connection to a file. Therefore, use a file format to connect to source or target data when the data is stored in a file rather than a database table. The Local Object Library stores file format templates that are used to define specific file formats as sources and targets in data flows.
Figure 8: File Format Editor (the editor groups file format properties into sections such as General, Data File(s), Delimiters, Default Format, and Input/Output)
File Format Objects
File format objects describe the following file types:
• Delimited: Characters such as commas or tabs separate each field.
• Fixed width: Specify the column width.
• SAP transport: Define data transport objects in SAP application data flows.
• Unstructured text: Read one or more files of unstructured text from a directory.
• Unstructured binary: Read one or more binary documents from a directory.
Table 4: File Format Editor Modes
Use the file format editor to set properties for file format templates, and source or target file formats. Available properties vary by the mode of the file format editor:

Mode          Description
New mode      Create a new file format template
Edit mode     Edit an existing file format template
Source mode   Edit the file format of a particular source file
Target mode   Edit the file format of a particular target file
Table 5: Date Formats
In the Property Values work area, it is possible to override default date formats for files at the field level. The following date format codes can be used:

Code    Description
DD      2-digit day of the month
MM      2-digit month
MONTH   Full name of the month
MON     3-character name of the month
yy      2-digit year
yyyy    4-digit year
HH24    2-digit hour of the day (0-23)
MI      2-digit minute (0-59)
SS      2-digit second (0-59)
FF      Up to 9-digit subseconds
Create a New File Format
1. On the Formats tab of the Local Object Library, right-click Flat Files and select New from the
menu to open the file format editor.
To make sure your file format definition works properly, finish inputting the values for the file
properties before moving on to the Column Attributes work area.
2. In the Type field, specify the file type (for a contrast between these two types, see the sketch after this procedure):
• Delimited: Select this file type if the file uses a character sequence to separate columns.
• Fixed width: Select this file type if the file uses specified widths for each column.
If a fixed-width file format uses a multibyte code page, then no data is displayed in the Data Preview section of the file format editor for its files.
3. In the Name field, enter a name that describes the file format template.
Once the name has been created, it cannot be changed. If you make an error, you must delete the file format and create a new one.
4. Specify the location information of the data file including Location, Root Directory, and
File name.
The Group File Read can read multiple flat files with identical formats with a single file format. By substituting a wildcard character or a list of file names for the single file name, multiple files can be read.
5. Select Yes to overwrite the existing schema.
This happens automatically when you open a file.
6. Complete the other properties to describe files that this template represents.
Overwrite the existing schema as required.
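To contrast the Delimited and Fixed width types from step 2 outside of the tool, here is a minimal Python sketch; the sample rows and column widths are hypothetical.

    # A delimited row is split on a separator character; a fixed-width
    # row is sliced at predetermined column positions.
    delimited_row = "1001;Smith;Chicago"
    print(delimited_row.split(";"))              # ['1001', 'Smith', 'Chicago']

    fixed_width_row = "1001Smith     Chicago   "
    widths = [(0, 4), (4, 14), (14, 24)]         # start/end offsets per column
    print([fixed_width_row[a:b].rstrip() for a, b in widths])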
Table 6: Column Attributes Work Area
For source files, specify the structure of each column in the Column Attributes work area:
Column       Description
Field Name   Enter the name of the column.
Data Type    Select the appropriate data type from the dropdown list.
Field Size   For columns with a data type of varchar, specify the length of the field.
Precision    For columns with a data type of decimal or numeric, specify the precision of the field.
Scale        For columns with a data type of decimal or numeric, specify the scale of the field.
Format       For columns with any data type but varchar, select a format for the field, if desired. This information overrides the default format set in the Property Values work area for that data type.
Columns do not need to be specified for files used as targets. If the columns are specified and
they do not match the output schema from the preceding transform, Data Services writes to the
target file using the transform's output schema.
For a decimal or real data type, if the column names and data types in the target schema do not match those in the source schema, Data Services cannot use the source column format specified. Instead, it defaults to the format used by the code page on the computer where the job server is installed.
Select Save & Close to save the file format and close the file format editor. In the Local Object Library, right-click the file format and select View Data from the menu to see the data.
Create File from Existing Format
To create a file format from an existing file format:
1. On the Formats tab of the Local Object Library, right-click an existing file format and select Replicate.
The file format editor opens, displaying the schema of the copied file format.
2. In the Name field, enter a unique name for the replicated file format.
Data Services does not allow you to save the replicated file with the same name as the original
(or any other existing file format object). After it is saved, you cannot modify the name again.
3. Edit the other properties as desired.
4. Select Save & Close to save the file format and close the file format editor.
Multiple Flat Files
To read multiple flat files with identical formats with a single file format:
1. On the Formats tab of the Local Object Library, right-click an existing file format and select
Edit from the menu.
The format must be based on one single file that shares the same schema as the other files.
2. In the Location field of the format wizard, enter one of:
• Root directory (optional, to avoid retyping)
• List of file names, separated by commas
• File name containing a wildcard character (*)
When you use the wildcard (*) to match several files, Data Services reads one file, closes it, and then proceeds to read the next one. For example, if you specify the file name revenue*.txt, Data Services reads all flat files whose names start with revenue.
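As a rough illustration of this wildcard behavior outside of Data Services, the following Python sketch lists matching files and reads them one at a time; the directory contents are hypothetical.

    import glob

    # Read every flat file whose name starts with "revenue",
    # one file at a time, as the wildcard file reader does.
    for path in sorted(glob.glob("revenue*.txt")):
        with open(path, encoding="utf-8") as handle:
            rows = handle.read().splitlines()
        print(path, len(rows), "rows")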
There are new unstructured_text and unstructured_binary file reader types for reading all files in a specific folder as long/BLOB records. There is also an option for trimming fixed-width files.
File Format Error Handling
One of the features available in the File Format Editor is error handling, as shown in the figure Flat File Error Handling. The option Capture data conversion errors is not a simple yes or no switch: it determines whether conversion problems are captured as errors or as warnings. Selecting Yes treats them as errors; selecting No treats them as warnings.
Figure 9: Flat File Error Handling. The figure shows the Error handling section of the file format editor with the following settings:
Log data conversion warnings: Yes
Log row format warnings: Yes
Maximum warnings to log: {no limit}
Capture data conversion errors: Yes
Capture row format errors: Yes
Maximum errors to stop job: {no limit}
Write error rows to file: Yes
Error file root directory: C:\Documents and Settings\Administrator\Desktop\Scripts
Error file name: file_errors.txt
Error Handling for a File Format
When error handling is enabled for a file format, Data Services will:
• Check for two types of flat-file source errors:
  - Data type conversion errors; for example, a field might be defined in the file format editor as the integer data type, but the data encountered is the varchar data type.
  - Row-format errors; for example, in the case of a fixed-width file, Data Services identifies a row that does not match the expected width value.
• Stop processing the source file after reaching a specified number of invalid rows.
• Log errors in the Data Services error log. It is possible to limit the number of log entries allowed without stopping the job.
It is possible to write rows with errors to an error file, which is a semicolon-delimited text file that is created on the same machine as the job server.
Entries in an error file have the following syntax:
source file path and name; row number in source file; Data Services error; column number where
the error occurred; all columns from the invalid row
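As an illustration of that layout, here is a minimal Python sketch that splits one error-file entry into its documented parts. The file name comes from the figure above; everything else is hypothetical.

    # Split one semicolon-delimited error-file entry into its parts.
    with open("file_errors.txt", encoding="utf-8") as handle:
        entry = handle.readline().rstrip("\n")
    parts = entry.split(";")

    source_file = parts[0]      # source file path and name
    row_number = parts[1]       # row number in the source file
    ds_error = parts[2]         # Data Services error
    column_number = parts[3]    # column number where the error occurred
    invalid_row = parts[4:]     # all columns from the invalid row
    print(source_file, row_number, ds_error, column_number, invalid_row)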
Flat File Error Handling
To enable flat file error handling in the file format editor:
1. On the Formats tab of the Local Object Library, right-click the file format and select Edit from
the menu.
2. Under the Error Handling section, in the Capture Data Conversion Errors dropdown list, select Yes.
3. In the Capture Row Format Errors dropdown list, select Yes.
4. In the Write Error Rows to File dropdown list, select Yes.
It is possible to specify the maximum warnings to log and the maximum errors before a job is stopped.
5. In the Error File Root Directory field, select the folder icon to browse to the directory in which you have stored the error handling text file you created.
6. In the Error File Name field, enter the name for the text file created to capture the flat file error logs in that directory.
7. Select Save & Close.
Unit 2
Exercise 2
Create a Flat File Format
Business Example
In addition to the main databases for source information, records for orders are stored in flat files. You need to extract data from these flat files, and so must create the appropriate file format for the extraction.
1. Create a file format Orders_Format for an orders flat file so that you can use it as a source object for extraction.
2. Adjust the datatypes for the columns proposed by the Designer based on their content.
Table 7: Column Attributes Values

Column        Datatype             Field Size
ORDERID       int
EMPLOYEEID    varchar              15
ORDERDATE     date (dd-mon-yyyy)
CUSTOMERID    int
COMPANYNAME   varchar              50
CITY          varchar              50
COUNTRY       varchar              50
Unit 2
Solution 2
Create a Flat File Format
Business Example
In addition to the main databases for source information, records for orders are stored in flat files. You need to extract data from these flat files, and so must create the appropriate file format for the extraction.
1. Create a file format Orders_Format for an orders flat file so that you can use it as a source object for extraction.
a) In the Local Object Library, choose the Formats tab.
b) Right-click the Flat Files node and choose New.
c) Enter Orders_Format as the format name.
d) In the Data File(s) section, use the dropdown menu to change Location to Job Server.
e) In the Root Directory field, enter D:\CourseFiles\DataServices\Activity_Source.
f) In the File name field, enter orders_12_21_06.txt.
A pop-up message "Overwrite the current schema with the schema from the file you selected?" opens. Choose Yes.
g) In the Delimiters section, in the Column field, use the dropdown menu to change the file delimiter to Semicolon.
A pop-up message "Overwrite the current schema with the schema from the file you selected?" opens. Choose Yes.
h) In the Input/Output section, in the Skip Row Header field, use the dropdown menu to change the value to Yes.
i) Save your work.
2. Adjust the datatypes for the columns proposed by the Designer based on their content.
Table 7: Column Attributes Values

Column        Datatype             Field Size
ORDERID       int
EMPLOYEEID    varchar              15
ORDERDATE    date (dd-mon-yyyy)
CUSTOMERID    int
COMPANYNAME   varchar              50
CITY          varchar              50
COUNTRY       varchar              50
a) In the Column Attributes pane, change the field datatypes to the datatypes in the Column Attributes Values table.
b) In the ORDERDATE field, to change the format of the date, enter dd-mon-yyyy.
c) Choose Save & Close.
d) Right-click your new file format Orders_Format and choose View Data.
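Outside of the Designer, a minimal Python sketch can mimic what this format now describes: a semicolon-delimited file whose first row is a header that is skipped. The path and file name are the ones used in this solution; the column handling is illustrative only.

    import csv

    path = r"D:\CourseFiles\DataServices\Activity_Source\orders_12_21_06.txt"

    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=";")   # Semicolon column delimiter
        header = next(reader)                        # Skip Row Header = Yes
        for row in reader:
            print(dict(zip(header, row)))            # one order per row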
©Copyright . All rights reserved.
37
~
~
For Any SAP / IBM / Oracle - Materials Purchase Visit : www.erpexams.com OR Contact Via Email Directly At : sapmaterials4u@gmail.com
For Any SAP / IBM / Oracle - Materials Purchase Visit : www.erpexams.com OR Contact Via Email Directly At : sapmaterials4u@gmail.com
Unit 2: Source and Target Metadata
LESSON SUMMARY
You should now be able to:
• Define flat file formats
Unit 2
Learning Assessment
1. Which of the following statements are true?
Choose the correct answers.
[ ] A A datastore provides a connection or multiple connections to data sources.
[ ] B Data Services uses the datastore to import metadata that describes the data in the data source.
[ ] C Data Services uses the datastore to read data from the source.
[ ] D Data Services cannot use the datastore to write data to a target.
[ ] E Datastore options are the same across all Relational Database Management Systems.
2. When do you use a file format to connect to source or target data?
Unit 2
Learning Assessment - Answers
1. Which of the following statements are true?
Choose the correct answers.
[X] A A datastore provides a connection or multiple connections to data sources.
[X] B Data Services uses the datastore to import metadata that describes the data in the data source.
[X] C Data Services uses the datastore to read data from the source.
[ ] D Data Services cannot use the datastore to write data to a target.
[ ] E Datastore options are the same across all Relational Database Management Systems.
2. When do you use a file format to connect to source or target data?
Use a file format to connect to source or target data when the data is stored in a file rather than a database table.
Unit 3: Batch Job Creation

Lesson 1: Creating Batch Jobs
  Exercise 3: Create a Basic Data Flow
UNIT OBJECTIVES
• Examine batch jobs
• Create a basic data flow
Unit 3
Lesson 1
Creating Batch Jobs
LESSON OVERVIEW
Once metadata has been imported for your datastores, create data flows to define data movement requirements. Data flows consist of a source and a target connected with a transform. Data flows can optionally be placed into a work flow. Data flows must be placed in a job for execution.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Examine batch jobs
• Create a basic data flow
Data Services Projects
Data flows define how information is moved from a source to a target. Data flows are organized into executable jobs, and executable jobs are grouped into projects. The figure Data Services Project Area shows the project area in the Designer.
Figure 10: Data Services Project Area (the project area shows the object hierarchy: a project contains the job CustomerDim_Job, which contains the work flow CustomerDim_WF, which contains the data flow CustomerDim_DF with its source table ods_customer, target table cust_dim, a Query transform, and a script)
Project Creation
A project is a single-use object that allows you to group jobs. A project is the highest level of
organization offered by SAP Data Services as shown in the figure Data Services Project. When a
project is opened, one group of jobs is accessible in the user interface. Projects are stored in the Local Object Library and only one project can be open at a time.
Figure 11: Data Services Project (the project area shows a project containing a job; the job contains a data flow with a source table CUSTOMERS, a Query transform, and a target table CUST_DIM)
A project is used solely for organizational purposes, for example, to group jobs that have
schedules that depend on one another or to group jobs that you want to monitor together.
The objects in a project appear hierarchically in the project area of the Designer. If a plus sign (+) appears next to an object, expand it to view the lower-level objects.
Data Services Jobs
A job is the only executable object in Data Services. When developing data flows, you can manually execute and test jobs directly in Data Services. In production, you can schedule batch jobs and set up real-time jobs as services that execute a process when Data Services receives a message request. The figure Data Services Job shows where new batch jobs and new real-time jobs are created.
Job Execution
• A job is a schedulable object.
• A job is always executed immediately in the Designer.
• Execution can be scheduled in the Data Services Management Console or by a third-party scheduler.
• The steps of a job are executed together.
A job is made up of steps that are executed together. Each step is represented by an object icon that you place in the workspace to create a job diagram. A job diagram is made up of two or more objects connected together.
Figure 12: Data Services Job (right-clicking in the project area opens a context menu with New Batch Job and New Real-time Job commands, along with Rename and Properties)
Job Definition Objects
You can include any of the following objects in a job definition:
• Work Flows
• Scripts
• Conditionals
• While Loops
• Try/Catch Blocks
• Data Flows, which contain:
  - Source Objects
  - Target Objects
  - Transforms
If a job is complex, organize its content into individual work flows and create a single job that calls the work flows.
Hint:
Follow the recommended naming conventions for consistent object identification
across all systems in your enterprise.
Data Services Designer Workspace
When a job is created, add objects to the job workspace area using the Local Object Library or the tool palette.
Add Local Object Library Objects
To add objects from the Local Object Library to the workspace:
1. Select the tab in the Local Object Library for the type of object you want to add.
2. Select the object and drag the object icon to the workspace.
Add Tool Palette Objects
To add objects from the tool palette to the workspace:
1. Select the desired object in the tool palette.
2. Move the cursor to the workspace.
3. Select the workspace to add the object.
Data Services Work Flows
A work flow is an optional object that defines the decision-making process for executing other
objects. Work flows are created in the Local Object Library or in the tool palette. The figure Data Services Work Flow shows an example of a work flow.
Figure 13: Data Services Work Flow (a work flow diagram combining a script (LastUpdate), a try/catch block, data flows (DataFlowA, DataFlowB), nested work flows (WorkFlowA, WorkFlowB), and a SetStatus script)
Elements in a work flow can determine the path of execution based on a value set by a previous
job or can indicate an alternative path if something goes wrong in the primary path. The purpose
of a work flow is to prepare for executing data flows and to set the state of the system when the
data flows are complete.
Work flows can contain data flows, conditionals, while loops, try/catch blocks, and scripts. They can also call other work flows, and you can nest calls to any depth. A work flow can even call itself.
You can connect objects in the workspace area by dragging the right-hand triangle or square of one object to the left-hand triangle or square of the second object.
To disconnect objects in the workspace area, select the connecting line between the objects and press Delete.
Note:
Jobs are just work flows that can be executed. Almost all of the features documented
for work flows also apply to jobs.
Work Flow Order of Execution
The connections you make between the icons in the workspace determine the order in which
work flows execute, unless the jobs containing those work flows execute in parallel.
Steps in a work flow execute in a sequence from left to right. You must connect the objects in a work flow when there is a dependency between the steps. To execute more complex work flows in parallel, define each sequence as a separate work flow, and then call each of the work flows from another work flow.
Single and Continuous Work Flows
When you create a work flow, a job can execute the work flow one time only, as a single process, or the job can execute the work flow as a continuous process, even if the work flow appears in the job multiple times.
• Single work flow: A single work flow runs all of its child data flows in one operating system process. If the data flows are designed to be run in parallel, then they are run in different threads instead of in different processes. A single work flow is an option when developing complex jobs with multiple paths, such as jobs with try/catch blocks or conditionals. It is possible to share resources such as database connections across multiple data flows.
• Continuous work flow: A continuous work flow runs all data flows in a loop but keeps them in memory for the next iteration. This eliminates the need to repeat some of the common steps of execution, for example, connecting to the repository, parsing/optimizing/compiling ATL, or opening database connections.
Data Services Data Flows
Data flows contain the source, transform, and target objects that represent the key activities in data integration and data quality processes.
Data Flow Usage
Data flows determine how information is extracted from sources, transformed, and loaded into targets. The lines connecting objects in a data flow represent the flow of data with data integration and data quality processes.
Data flows:
• Extract, transform, and load data.
• Determine the flow of data.
• Are closed operations.
• Are created in the Local Object Library or in the tool palette.
Each icon you place in the data flow diagram becomes a step in the data flow, as shown in the figure Data Services Data Flow.
Figure 14: Data Services Data Flow (a source table CUSTOMERS connected to a Query transform, which is connected to a target table CUST_DIM)
You can use source and target objects and transforms as steps in a data flow. Make connections
between the icons to determine the order in which Data Services completes the steps.
Data Flow Steps in Work Flows
Each step in a data flow, up to the target definition, produces an intermediate result. The intermediate result is called a data set.
For example, the results of an SQL statement containing a WHERE clause flow to the next step in the data flow. The intermediate result is a set of rows from the previous operation and the schema in which the rows are arranged. This data set may be further filtered and directed into another data set.
Data flows are closed operations, even when they are steps in a work flow. Any data set created within a data flow is not available to other steps in the work flow.
Work Flow Operations
A work flow does not operate on data sets and cannot provide more data to a data flow. However, a work flow can perform the following operations:
• Call data flows to perform data movement operations
• Define the conditions appropriate to run data flows
• Pass parameters to and from data flows
Data Flow Properties
Table 8: Data Flow Properties and Descriptions
You can specify the following data properties for a data flow:

Data Flow Property      Description
Execute only once       Specify that a data flow executes only once. Do not select this option if the parent work flow is a recovery unit.
Use database links      Database links are communication paths between one database server and another. Database links allow local users to access data on a remote database.
Degree of parallelism   Degree of parallelism (DOP) defines how many times each transform within a data flow processes a parallel subset of data.
Cache type              Cache data to improve the performance of operations such as joins, groups, sorts, filtering, lookups, and table comparisons. Choose the default Pageable value to return only a small amount of data at a time. Choose the In Memory value if your data flow produces a small amount of data that will fit in the available memory.
Source and Target Objects
A data flow reads data directly from source objects and loads the data to target objects.
Before you add source and target objects to a data flow, create the datastore and import the
table metadata for any databases, or create the file format for flat files.
Table 9: Object Types and Descriptions
The following table contains a description of source and target objects that can be used in a data flow:

Object              Type                Description
Table               Source and target   A file formatted with columns and rows, as used in relational databases.
Template table      Source and target   A template table that has been created and saved in another data flow.
File                Source and target   A delimited or fixed-width flat file.
Document            Source and target   A file with an application-specific format that is not readable by an SQL or XML parser.
XML file            Source and target   A file formatted with XML tags.
XML message         Source only         A source in real-time jobs.
XML template file   Target only         An XML file with a format based on the preceding transform output. An XML template file is used primarily for debugging data flows in development.
Transform           Source only         A prebuilt set of operations that can create new data, such as the Date Generation transform.
The Query Transform
The Query transform is the most commonly used transform, and is included in most data flows. It enables you to select data from a source and filter it or reformat it as it moves to the target. The figure Query Transform shows how a filter can be applied between the source and the target.
Figure 15: Query Transform. The figure shows a filter that keeps only rows whose Region is IL:

Input:
Name      Address             Region
Charlie   11 Main Street      IL
Daisy     13 Nelson Avenue    IL
Dolly     9 Park Lane         CA
Joe       77 Miller Street    NY
Megan     21 Brand Avenue     CA
Sid       8 Andrew Crescent   IL

Output:
Name      Address             Region
Charlie   11 Main Street      IL
Daisy     13 Nelson Avenue    IL
Sid       8 Andrew Crescent   IL
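The same filtering logic, sketched in plain Python for illustration; the record values are those shown in the figure, while the Query transform itself is configured graphically, not in code.

    # Keep only the rows whose Region is 'IL', as the figure's filter does.
    customers = [
        {"Name": "Charlie", "Address": "11 Main Street", "Region": "IL"},
        {"Name": "Daisy", "Address": "13 Nelson Avenue", "Region": "IL"},
        {"Name": "Dolly", "Address": "9 Park Lane", "Region": "CA"},
        {"Name": "Joe", "Address": "77 Miller Street", "Region": "NY"},
        {"Name": "Megan", "Address": "21 Brand Avenue", "Region": "CA"},
        {"Name": "Sid", "Address": "8 Andrew Crescent", "Region": "IL"},
    ]
    output = [row for row in customers if row["Region"] == "IL"]
    for row in output:
        print(row["Name"], row["Address"], row["Region"])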
Transform Editor
The Transform Editor is a graphical interface for defining the properties of transforms. The
workspace contains the following areas:
• Input schema
• Output schema
• Parameters
Query Transform Operations
The Query transform is included in the tool palette with other standard objects. The Query transform performs the following operations:
• Filters data extracted from sources
• Joins data from multiple sources
• Maps columns from input to output schemas
• Performs transformations and functions on the data
• Performs data nesting and unnesting
• Adds new columns, nested schemas, and function results to the output schema
• Assigns primary keys to output columns
Input and Output Schemas
The figure Query Transform Editor displays the input and output schema areas.
Figure 16: Query Transform Editor (the editor shows the input schema area with the ods_customer source columns on the left and the output schema area with the Query output columns on the right)
The input schema area displays the schema of the input data set. For source objects and some transforms, this area is not available.
The output schema area displays the schema of the output data set, including any functions. For
template tables, the output schema can be defined based on your preferences.
Define a relationship between the input and output schemas to move data from the source to the
target. Map each input column to the corresponding output column to create this relationship.
Map Input Columns to Output Columns
Perform one of the following actions in the transform editor to map input columns to output
columns:
• Drag a single column from the input schema area to the output schema area.
• Drag a single input column over the corresponding output column, release the cursor, and select Remap Column from the menu.
• Select multiple input columns using Ctrl+click or Shift+click on your keyboard and drag to the Query output schema for automatic mapping.
• Select the output column and manually enter the mapping on the Mapping tab in the parameters area. You can type the column name in the parameters area or drag the column from the input schema pane.
• Select the output column, then highlight and manually delete the mapping on the Mapping tab in the parameters area.
Parameters Area
The Parameters area is below the input and output schema areas in the Query Transform Editor.
The options on this tab vary based on the transform or object you are modifying.
Table 10: Parameters Area Options
The following table describes the Parameters area tabs in the Query Transform Editor:

Tab          Description
Mapping      Specify how the selected output column is derived.
Select       Select only distinct rows, discarding duplicate rows.
From         Specify the input schemas used in the current output schema.
Outer Join   Specify an inner table and an outer table for joins that you want to treat as outer joins.
Where        Set conditions to determine which rows are output.
Group By     Specify a list of columns to combine output.
Order By     Specify columns to sort the output data set.
Advanced     Create separate subflows to process resource-intensive query clauses.
Find         Search for a specific item in the input schema or in the output schema.
Query Transform Editor for Joins
Define joins in the From tab of the Transform Editor. The From tab is displayed in the figure Query Transform Editor for Joins.
Figure 17: Query Transform Editor for Joins (the From tab lists the input schemas MATS_EMP, MATS_DEPT, and EXEC_EMP; the join pairs, with MATS_EMP left outer joined to MATS_DEPT and an inner join to EXEC_EMP; and a read-only FROM clause, for example: FROM MATS_EMP LEFT OUTER JOIN MATS_DEPT ON (MATS_DEPT.DEPTNO = MATS_EMP.DEPTNO))
Join ranks and cache settings show in the From tab. Outer and inner joins show in the FROM clause.
Join ranks and cache options impact the performance of joins that are not pushed down to the database. Tune your data flow's performance with this advanced setting. For example, use the Query transform to select a subset of the data in a table to show records from a specific region only.
Target Tables
The target object for your data flow can be a physical table or file, or a template table.
When your target object is a physical table in a database, the target table editor opens in the workspace. The workspace contains tabs for database type properties, table loading options, and tuning techniques for loading a job.
Note:
Most of the tabs in the target table editor focus on migration or on performance-tuning techniques. This course concentrates on the Options tab only.
Table 11: Target Table Editor Options
You can set the following table loading options in the Options tab of the target table editor:

Option                                    Description
Rows per commit                           Specify the transaction size in number of rows.
Column comparison                         Specify how the input columns are mapped to output columns. Validation errors occur if the data types of the columns do not match.
Delete data from a table before loading   Use this option to send a TRUNCATE statement to clear the contents of a table before loading during batch jobs. The option defaults to not selected.
Number of loaders                         Specify the number of loaders and the number of rows per commit that each loader receives during parallel loading.
Use overflow file                         Use this option to write rows that cannot be loaded to the overflow file for recovery purposes. Options are enabled for the file name and file format.
Ignore columns with value                 Specify a value in a source column that you do not want updated in the target table.
Ignore columns with null                  Ensure that NULL source columns are not updated in the target table during auto correct loading.
Use input keys                            Enable Data Services to use the primary keys from the source table. By default, Data Integrator uses the primary key of the target table.
Update key columns                        Update key column values when loading data to the target.
Auto correct load                         Ensure that the same row is not duplicated in a target table. This option is useful for data recovery operations.
Include in transaction                    Commit data to multiple tables as part of the same transaction. The tables must be from the same datastore.
Transaction order                         By default, there is no ordering of the tables being loaded. Specify orders among the tables, and the loading operations are applied according to the order. Tables with the same transaction order are loaded together.
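To illustrate what Auto correct load guards against, here is a minimal upsert-style sketch in Python, keyed on a primary key. The table and key names are hypothetical; Data Services applies this logic inside the engine or the database, not in user code.

    # Upsert semantics: a row whose key already exists is updated,
    # not inserted again, so the target never holds duplicate rows.
    target = {}  # key -> row, standing in for the target table

    def auto_correct_load(rows, key="CUSTOMERID"):
        for row in rows:
            target[row[key]] = row  # insert new keys, overwrite existing ones

    auto_correct_load([{"CUSTOMERID": 1, "CITY": "Chicago"}])
    auto_correct_load([{"CUSTOMERID": 1, "CITY": "Boston"}])  # same key: update
    print(target)  # {1: {'CUSTOMERID': 1, 'CITY': 'Boston'}}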
Template Tables
Use template tables in early application development when you are designing and testing a
project.
With template tables, you do not have to create a new table in your datastore and import the
metadata into Data Services. Data Services automatically creates the table in the database with
the schema defined by the data flow when you execute a job.
When you create a template table as a target in one data flow, you can use it as a source in other data flows.
• Make schema changes without going to the Relational Database Management System (RDBMS).
• Template tables do not exist in the underlying database until the data flow has been executed successfully once.
• Once executed, template tables become actual tables in the underlying database.
• Template tables are only identified as template tables in the metadata within the Data Services Repository.
• The Target Table Editor has an option to Drop and Recreate Table for template tables.
• The Import Table option converts a template table into a normal table.
You must convert template tables to normal tables so that you can use the new table in expressions, functions, and transform options. When you convert the template table, you can no longer alter the schema.
Job Execution
When you create your project, jobs, and associated data flows, you can execute the job in Data Services to move the data from source to target.
Immediate Jobs and Scheduled Jobs
You can run jobs in the following two ways:
• Immediate jobs: Data Services initiates batch and real-time jobs and runs them immediately from within the Designer. The Designer and the designated job server must be running to execute the job. Run immediate jobs only during the development cycle.
• Scheduled jobs: Batch jobs are scheduled. Use the Data Services Management Console or a third-party scheduler to schedule the job. The job server must be running to execute a scheduled job.
Note:
A job does not execute if it has syntax errors.
Table 12: Execution Properties
The following table shows the available options in the Execution Properties window:

Option                                          Description
Print all trace messages                        Record all trace messages in the log.
Disable data validation statistics collection   Do not collect audit statistics for this specific job execution.
Enable auditing                                 Collect audit statistics for this specific job execution.
Enable recovery                                 Enable recovery to save the results from completed steps and to resume failed jobs.
Recover from last failed execution              Retrieve the results from any steps that were previously executed successfully and re-execute any other steps. This option is only available if a job has been executed once and if recovery mode was enabled during the previous run.
Collect statistics for optimization             Collect statistics so that the Data Services optimizer can choose an optimal in-memory or pageable cache type.
Collect statistics for monitoring               Display cache statistics in the Performance Monitor in Administrator.
Use collected statistics                        Use the cache statistics collected on a previous execution of the job.
System configuration                            Specify the system configuration to use when executing the job. A system configuration defines a set of datastore configurations, which define the datastore connections.
Job server or server group                      Specify the job server or server group to execute the job.
Distribution level                              Distribute a job to multiple job servers for processing. Execute the entire job on one server, execute each data flow within the job on a separate server, or execute each subflow within a data flow on a separate job server.
Unit 3
Exercise 3
Create a Basic Data Flow
Business Example
You are an ETL developer working on a data warehousing project. You need to load data from the
product table and create a table for North American customers.
Use the Query transform to change the schema of the Alpha Acquisitions Customer table.
1. Create a new project called Omega.
2. In the Omega project, create a new batch job Alpha_Product_Job with a new data flow,
Alpha_Product_DF.
3. In the workspace for Alpha_Product_DF, add the product table from the Alpha datastore
as the source object.
4. In the workspace for Alpha_Product_DF, add the PRODUCT_TARGET table from the Omega datastore as the target object.
5. View the data for both tables. Verify that both tables have the same column names, and that the target table is empty.
6. Connect the source table to the target table.
7. Open the target table editor to view the Schema In and Schema Out.
8. Save and execute the job, Alpha_Product_Job.
9. In the Omega project, create a new batch job Alpha_NACustomer_Job with a new data flow
called Alpha_NACustomer_DF.
10. In the workspace for Alpha_NACustomer_DF, add the customer table from the Alpha datastore as the source object.
11. Create a new template table alpha_NA_customer in the Delta datastore as the target object.
12. Add the Query transform to the workspace between the source and target.
13. In the transform editor for the Query transform, map all columns from the Schema In to the
Schema Out.
14. Use a WHERE clause to select only customers in North America (North American countries are the United States, Canada, and Mexico, which have COUNTRYID values of 1, 2, and 11).
15. Save and execute the Alpha_NACustomer_Job.
Unit 3
Solution 3
Create a Basic Data Flow
Business Example
You are an ETL developer working on a data warehousing project. You need to load data from the
product table and create a table for North American customers.
Use the Query transform to change the schema of the Alpha Acquisitions Customer table.
1. Create a new project called Omega.
a) In the Project menu, choose New → Project.
b) In the Project New dialog box, in the Project name field, enter Omega.
c) Choose Create.
The new project appears in the Project area.
2. In the Omega project, create a new batch job Alpha_Product_Job with a new data flow, Alpha_Product_DF.
a) In the Project area, right-click the project name and, in the context menu, choose New Batch Job.
b) Enter the job name, Alpha_Product_Job and, on your keyboard, press the Enter key.
The job should open automatically. If it does not, open it by double-clicking.
c) In the Alpha_Product_Job workspace, in the tool palette, choose the Data Flow icon.
d) Click in the workspace where you want to add the data flow, and enter the name Alpha_Product_DF.
3. In the workspace for Alpha_Product_DF, add the product table from the Alpha datastore
as the source object.
a) In the Local Object Library, choose the Datastores tab.
b) Select the product table from the Alpha datastore.
c) Drag the table to the data flow workspace and choose Make Source.
4. In the workspace for Alpha_Product_DF, add the PRODUCT_TARGET table from the Omega
datastore as the target object.
a) In the Local Object Library, select the Datastores tab.
b) Select the PRODUCT_TARGET table from the Omega datastore.
c) Drag the table to the data flow workspace and choose Make Target.
5. View the data for both tables. Verify that both tables have the same column names, and that the target table is empty.
a) Click the View Data icon (magnifying glass) for the source table in the workspace.
b) Click the View Data icon (magnifying glass) for the target table in the workspace.
c) Close both view data windows.
6. Connect the source table to the target table.
a) Click the right side of the source table.
b) Hold the left mouse button down while dragging to the left side of the target table.
7. Open the target table editor to view the Schema In and Schema Out.
a) Double-click the target table.
Note that the source and target tables have the same schema and can connect directly without the use of a Query transform.
8. Save and execute the job, Alpha_Product_Job.
a) In the main menu, choose Project → Save All.
b) To save changes, choose OK.
c) In the Project Area, right-click the Alpha_Product_Job.
d) Choose Execute.
e) To accept the default execution properties, choose OK.
9. In the Omega project, create a new batch job Alpha_NACustomer_Job with a new data flow called Alpha_NACustomer_DF.
a) In the Project area, right-click the project name.
b) Choose New Batch Job.
c) Name the job Alpha_NACustomer_Job and, on your keyboard, press Enter.
The job should open automatically. If it does not, open it by double-clicking.
d) In the tool palette, choose the Data Flow icon.
e) Click the workspace where you want to add the data flow.
f) Name the data flow Alpha_NACustomer_DF and, on your keyboard, press Enter.
The data flow should open automatically. If it does not, open it by double-clicking.
10. In the workspace for Alpha_NACustomer_DF, add the customer table from the Alpha datastore as the source object.
a) In the Local Object Library, select the Datastores tab.
b) Select the customer table from the Alpha datastore.
c) Drag the table to the data flow workspace and choose Make Source.
11. Create a new template table alpha_NA_customer in the Delta datastore as the target object.
a) To add a new template table to the workspace, in the tool palette, choose the Template Table icon, and click the workspace.
b) In the Create Template dialog box, in the Table name field, enter alpha_NA_customer.
c) In the In datastore field, choose Delta from the dropdown list.
d) Choose OK.
12. Add the Query transform to the workspace between the source and target.
a) To add a new Query transform to the data flow, in the tool palette, choose the Query Transform icon, and click the workspace.
b) To connect the source table to the Query transform, select the source table, hold down the left mouse button, drag the cursor to the Query transform, and release the mouse button.
c) To connect the Query transform to the target template table, select the Query transform, hold down the left mouse button, drag the cursor to the target table, and release the mouse button.
13. In the transform editor for the Query transform, map all columns from the Schema In to the Schema Out.
a) To open the Query Editor, in the data flow workspace, double-click the Query transform.
b) To select all columns in the Schema In, choose the CUSTOMERID column, hold the Shift key, and choose the PHONE column.
c) Drag all columns to the Schema Out.
14. Use a WHERE clause to select only customers in North America (North American countries are the United States, Canada, and Mexico, which have COUNTRYID values of 1, 2, and 11).
a) In the Query editor, choose the WHERE tab.
b) Enter the WHERE clause: customer.COUNTRYID in (1, 2, 11)
15. Save and execute the Alpha_NACustomer_Job.
a) In the main menu, choose Project → Save All.
b) To save all changes, choose OK.
c) In the Project Area, right-click the Alpha_NACustomer_Job and choose Execute.
d) To accept the default execution properties, choose OK.
e) To return to the Job workspace, in the toolbar, choose the Back icon.
f) To return to the Data Flow workspace, double-click the Data Flow.
g) To view the template table data, select the small magnifying glass in the lower right-hand corner of the template table.
h) Confirm that 22 records were loaded.
i) Close the table.
j) Close the Data flow.
LESSON SUMMARY
You should now be able to:
• Examine batch jobs
• Create a basic data flow
Unit 3
Learning Assessment
1. A job is the highest level of organization offered by SAP Data Services.
Determine whether this statement is true or false.
[ ] True
[ ] False

2. A job is a schedulable object.
Determine whether this statement is true or false.
[ ] True
[ ] False

3. When you connect objects in the workspace area you cannot disconnect them again.
Determine whether this statement is true or false.
[ ] True
[ ] False
4. Describe the function of these tabs in the Parameters area of the Query Transform Editor.
Match the item in the first column to the corresponding item in the second column.

Mapping      Specify how the selected output column is derived.
Outer Join   Specify an inner table and an outer table for joins that you want to treat as outer joins.
Group By     Specify a list of columns to combine output.
Unit 3
Learning Assessment - Answers
1. A job is the highest level of organization offered by SAP Data Services.
Determine whether this statement is true or false.
[ ] True
[X] False

2. A job is a schedulable object.
Determine whether this statement is true or false.
[X] True
[ ] False

3. When you connect objects in the workspace area you cannot disconnect them again.
Determine whether this statement is true or false.
[ ] True
[X] False

4. Describe the function of these tabs in the Parameters area of the Query Transform Editor.
Match the item in the first column to the corresponding item in the second column.

Mapping      Specify how the selected output column is derived.
Outer Join   Specify an inner table and an outer table for joins that you want to treat as outer joins.
Group By     Specify a list of columns to combine output.
Unit 4: Batch Job Troubleshooting

Lesson 1: Writing Comments with Descriptions and Annotations
Lesson 2: Validating and Tracing Jobs
  Exercise 4: Set Traces and Annotations
Lesson 3: Debugging Data Flows
  Exercise 5: Use the Interactive Debugger
Lesson 4: Auditing Data Flows
  Exercise 6: Use Auditing in a Data Flow
UNIT OBJECTIVES
• Write comments with descriptions and annotations
• Validate jobs
• Trace jobs
• Use the interactive debugger
• Audit data flows
Unit 4
Lesson 1
Writing Comments with Descriptions and Annotations
LESSON OVERVIEW
Add descriptions and annotations to jobs, work flows, and data flows to document decisions when
executing your jobs.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Write comments with descriptions and annotations
Descriptions and Annotations
Business Example
Your company has recognized how useful it can be to integrate people, information, and business
processes in a heterogeneous system landscape and would like to obtain this benefit. Practice
has shown, though, that loading large datasets makes considerable demands on hardware and
system performance. It is therefore necessary to examine if and how the data records can be
loaded into SAP NetWeaver Business Warehouse with a delta process and to understand the
modes of operation and the different variants of a delta loading process.
Descriptions and annotations are a convenient way to add comments to objects and workspace
diagrams.
Annotations
• A sticky note with a folded-down corner
• Added from the tool palette

Descriptions
• To make them visible in the Designer:
  - Input a description (in the object's Properties)
  - Enable the description (right-click the object)
  - View Enabled Descriptions (toolbar or menu)
Using Descriptions with Objects
A description is associated with a particular object. When a repository object is imported and
exported, the description is also imported or exported.
The Designer determines when to show object descriptions based on a system-level setting and
an object-level setting. Both settings must be activated to view the description for a particular
object.
Note:
The system-level setting is unique to the setup.
Requirements for Displaying Descriptions
• A description has been entered into the properties of the object
• The description is enabled on the properties of that object
• The global View Enabled Object Descriptions option is enabled
Show Object Descriptions at System Level
• From the View menu, select Enabled Descriptions
Note:
The Enabled Descriptions option is only available when there is at least one object
present in the workspace.
Add Description to an Object
1. In the project area or the workspace, right-click an object and select Properties from the
menu.
The Properties dialog box displays.
2. In the Description text box, enter your comments.
3. Select OK.
If you are modifying the description of a reusable object, Data Services provides a warning
message that all instances of the reusable object are affected by the change.
4. Select Yes.
The description for the object displays in the Local Object Library.
Display Description in the Workspace
• The description displays under the object in the workspace
Using Annotations to Describe Objects
An annotation is a sticky-note-like object in the workspace that describes a flow, part of a flow, or
a diagram. An annotation is associated with the object where it appears. When a job, work flow, or
data flow is imported or exported, and includes annotations, the associated annotations are also
imported or exported.
Add Annotation to the Workspace
1. In the workspace, from the tool palette, select the Annotation icon and then select the
workspace.
An annotation appears on the diagram.
2. Double-click the annotation.
3. Add text to the annotation.
4. Select the cursor outside of the annotation to commit the changes.
Resize and move the annotation by clicking and dragging. It is not possible to hide annotations
that are added to the workspace, but they can be moved out of the way or deleted.
LESSON SUMMARY
You should now be able to:
• Write comments with descriptions and annotations
Unit 4
Lesson 2
Validating and Tracing Jobs
LESSON OVERVIEW
Validate your jobs and their components before execution and trace jobs to troubleshoot and
evaluate job execution.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Validate jobs
• Trace jobs
Job Validation
It is advisable to validate jobs when ready for job execution to ensure that there are no errors. It is
also possible to select and set specific trace properties, which allow for the use of various log files
to help read job execution statuses or to troubleshoot job errors.
As a best practice, validate work while objects are being built to avoid many warnings and errors
at run time. Validate objects as the job is created, or automatically validate all jobs before
executing them, as shown in the figure Validating Jobs.
Figure 18: Validating Jobs (the Designer Options dialog used for automatic validation before job execution, and the Output window used for validation on demand)
Validate Jobs
To validate jobs automatically before job execution:
1. From the Tools menu, select Options.
The Options dialog box displays.
2. In the Category pane, expand the Designer branch and select General.
3. Select the Perform Complete validation before job execution option.
4. Select OK.
Validate Objects
To validate objects on demand:
1. From the Validation menu, choose Validate → Current View or All Objects in View.
The Output dialog box displays.
2. To navigate to the object where an error occurred, right-click the validation error message and
select Go To Error from the menu.
Tracing Jobs
Use trace properties to select the information that Data Services monitors and writes to the trace
log file during a job. Data Services writes trace messages to the trace log associated with the
current job server and writes error messages to the error log associated with the current job
server.
Setting Traces in Job Execution Properties
• Determines what information is written to the log
• Can be changed temporarily (Execution Properties) or persistently (Job Properties)
• Some trace options write a message for every row (for example, Trace Row, SQL, and Loaders)
Figure 19: Setting Traces in Job Execution Properties (the Trace tab of the Execution Properties dialog, listing trace options such as Trace Row, Trace Session, Trace Work Flow, and Trace Data Flow with Yes/No values)
Job Tracing
Table 13: Trace Options
The following trace options are available:

Row: Writes a message when a transform imports or exports a row.

Session: Writes a message when the job description is read from the repository, when the job is
optimized, and when the job runs.

Work flow: Writes a message when the work flow description is read from the repository, when
the work flow is optimized, when the work flow runs, and when the work flow ends.

Data flow: Writes a message when the data flow starts and when the data flow successfully
finishes or terminates due to error.

Transform: Writes a message when a transform starts and completes or terminates.

Custom transform: Writes a message when a custom transform starts and completes
successfully.

Custom function: Writes a message for all user invocations of the AE_LogMessage function from
custom C code.

SQL functions: Writes data retrieved before SQL functions:
• Every row retrieved by the named query before the SQL is submitted in the key_generation
function
• Every row retrieved by the named query before the SQL is submitted in the lookup function
(but only if PRE_LOAD_CACHE is not specified)
• When mail is sent using the mail_to function

SQL transforms: Writes a message (using the Table Comparison transform) about whether a row
exists in the target table that corresponds to an input row from the source table.

SQL readers: Writes the SQL query block that a script, query transform, or SQL function submits
to the system, and writes the SQL results.
SQL loaders: Writes a message when the bulk loader starts, submits a warning message, or
completes successfully or unsuccessfully.

Memory source: Writes a message for every row retrieved from the memory table.

Memory target: Writes a message for every row inserted into the memory table.

Optimized data flow: For SAP BusinessObjects consulting and technical support use.

Tables: Writes a message when a table is created or dropped.

Scripts and script functions: Writes a message when a script is called, a function is called by a
script, and a script successfully completes.

Trace parallel execution: Writes messages describing how data in a data flow is processed in
parallel.

Access Server communication: Writes messages exchanged between the Access Server and a
service provider.

Stored procedure: Writes a message when a stored procedure starts and finishes, and includes
key values.

Audit data: Writes a message that collects a statistic at an audit point and determines if an audit
rule passes or fails.
Set Trace Options
1. From the project area, right-click the job name and do one of these actions:
• To set trace options for a single instance of the job, select Execute from the menu.
• To set trace options for every execution of the job, select Properties from the menu.
Depending on which option you selected, the Execution Properties dialog box or the Properties
dialog box displays.
2. Select the Trace tab.
3. Under the Name column, select a trace object name.
The Value dropdown list is enabled when you select a trace object name.
4. From the Value dropdown list, select Yes to turn the trace on.
5. Select OK.
Log Files
As a job executes, Data Services produces three log files, which are viewed from the Monitor tab
of the project area. The log files are, by default, also set to display automatically in the workspace
when a job is executed.
Select the Trace, Monitor, and Error icons to view the log files, which are created during job
execution.
Examining Trace Logs
Use the trace logs to determine where an execution failed, whether the execution steps occur in
the order you expect, and which parts of the execution are the most time consuming, as shown in
the figure Using the Trace Log.
Figure 20: Using the Trace Log (sample trace log entries showing job, work flow, and data flow start and completion messages for Demo_Job, Demo_WF, and Demo_DF)
Examining Monitor Logs
Use monitor logs to quantify the activities of the components of the job, as shown in the figure
Using the Monitor and Error Logs. The monitor log lists the time spent in a given component of a
job and the number of data rows that streamed through the component.
Figure 21: Using the Monitor and Error Logs (a monitor log listing row counts and processing times per component, and an error log listing validation error entries)
Examining Error Logs
Use the error logs to determine how an execution failed. If the execution completed without error,
the error log is blank.
Using the Monitor Tab
The Monitor tab lists the trace logs of all current or most recent executions of a job.
The traffic-light icons in the Monitor tab indicate the following:
• A green light indicates that the job is running.
Right-click and select Kill Job to stop a job that is still running. After selecting Kill Job, the job
icon becomes a yellow triangle with an exclamation point in the middle.
• A red light indicates that the job has stopped.
Right-click and select Properties to add a description for a specific trace log. This description
is saved with the log, which can be accessed later from the Log tab.
• A red cross indicates that the job encountered an error.
Using the Log Tab
Select the Log tab to view a job's log history.
To view log files from the project area:
1. In the project area, select the Log tab.
2. Select the job for which you want to view the logs.
3. In the workspace, in the dropdown list, select the type of log you want to view.
4. In the list of logs, double-click the log to view details.
72
© Copyr ight . All r ights reserved.
For Any SAP / IBM / Oracle - Materials Purchase Visit : www.erpexams.com OR Contact Via Email Directly At : sapmaterials4u@gmail.com
For Any SAP / IBM / Oracle - Materials Purchase Visit : www.erpexams.com OR Contact Via Email Directly At : sapmaterials4u@gmail.com
Lesson: Va lidating and Tracing Jobs
5. To copy log content from an open log, select one or more lines and use the key command
[CTRL+C].
Determining the Success of the Job
The best measure of the success of a job is the state of the target data. Always examine the data
to make sure that the data movement operation produces the results expected; a scripted check,
such as the sketch after the following list, can automate part of this.
Be sure that:
• Data is not converted to incompatible types or truncated.
• Data is not duplicated in the target.
• Data is not lost between updates of the target.
• Generated keys have been properly incremented.
• Updated values were handled properly.
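As an illustration, part of this checking can be scripted. The following is a minimal sketch in the Data Services scripting language: the datastore names (Alpha, Delta), the table names, and the global variables $SourceCount and $TargetCount are assumptions borrowed from this course's exercise environment, and the variables would have to be declared on the job.

    # Minimal post-load sanity check (a sketch, not a prescribed method).
    # sql() runs a statement against a datastore and returns the first value
    # of the result set; raise_exception() fails the job with a message.
    $SourceCount = sql('Alpha', 'SELECT COUNT(*) FROM CUSTOMER');
    $TargetCount = sql('Delta', 'SELECT COUNT(*) FROM ALPHA_NA_CUSTOMER');
    if ($SourceCount != $TargetCount)
    begin
        raise_exception('Row count mismatch: source=' || $SourceCount
            || ', target=' || $TargetCount);
    end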
If a job fails to execute, check the job server icon in the status bar to verify that the job service is
running. Check that the port number in the Designer matches the number specified in the Server
Manager. If necessary, use the Server Manager Resync button to reset the port number in the
Local Object Library.
Unit 4
Exercise 4
Set Traces and Annotations
Business Example
You are sharing your jobs with other developers during the project, so you want to make sure that
you identify the purpose of the job you created. You also want to ensure that the job is handling
the movement of each row appropriately.
1. Add an annotation to the workspace of the job you have already created.
2. Execute the Alpha_NACustomer_Job after enabling the tracing of rows.
Unit 4
Solution 4
Set Traces and Annotations
Business Example
You are sharing your jobs with other developers during the project, so you want to make sure that
you identify the purpose of the job you created. You also want to ensure that the job is handling
the movement of each row appropriately.
1. Add an annotation to the workspace of the job you have already created.
a) Open the Alpha_NACustomer_Job workspace.
b) In the tool palette, choose the Annotation icon.
c) To add the annotation, click the workspace beside the data flow.
d) In the text box, enter an explanation of the purpose of the job, for example: "The purpose
of this job is to move records of North American customers from the Customer table in the
Alpha datastore to a template table, Alpha_customers, in the Delta staging datastore".
e) To save your work, choose Save All.
2. Execute the Alpha_NACustomer_Job after enabling the tracing of rows.
a) Right-click the Alpha_NACustomer_Job and choose Execute.
b) In the Execution Properties dialog box, choose the Trace tab and choose Trace rows.
c) In the Value field, use the drop-down list to change the value from No to Yes.
d) In the Execution Properties dialog box, choose OK.
In the Trace log, you should see an entry for each row added to the log to indicate how it is
being handled by the data flow.
e) Close the trace window.
LESSON SUMMARY
You should now be able to:
• Validate jobs
• Trace jobs
Unit 4
Lesson 3
Debugging Data Flows
LESSON OVERVIEW
Use the Interactive Debugger to examine what happens to data after each transform or object in
a flow and to troubleshoot any issues that arise when executing your jobs.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Use the interactive debugger
The View Data Feature
Business Example
Your company has recognized how useful it can be to integrate people, information, and business
processes in a heterogeneous system landscape and would like to benefit from it. Practice has
shown, though, that loading large data sets makes considerable demands on hardware and
system performance. It is therefore necessary to examine if and how the data records can be
loaded into SAP NetWeaver Business Warehouse with a delta process. It is important to
understand the modes of operation and the different variants of a delta loading process.
Jobs can be debugged in Data Services by using the View data and interactive debugger features.
Use View data to view samples of source and target data for jobs. Use the interactive debugger to
examine what happens to the data after each transform or object in the flow.
After This Unit
After completing this unit, you will be able to:
• Use View data with sources and targets
• Set filters and breakpoints for a debug session
• Use the interactive debugger
Using View Data with Sources and Targets
As shown in the figure View Data in Data Flow, by using the View data feature, it is possible to
check the status of data at any point after metadata is imported for a data source. The data is
checked before or after data flows are processed. Check the data when jobs are designed and
tested to ensure that the design returns the results expected.
Figure 22: View Data in Data Flow (View data panes beneath a data flow workspace, comparing rows in the ods_customer source table and the cust_temp target table)
Data Details
View data allows you to see source data before a job is executed. Use data details to:
• Create higher quality job designs
• Scan and analyze imported table and file data from the Local Object Library
• View the data for those same objects within existing jobs
• Refer back to the source data after a job is executed
View data also allows for the checking of target data before a job is executed. Once the job is
executed, view the changed data. In a data flow, it is possible to use one or more View data panels
to compare data between transforms and within source and target objects.
View data displays data in the rows and columns of a data grid. The path for the selected object
displays at the top of the pane.
View Data in a Data Grid
The number of rows displayed is determined by a combination of several conditions:
• Sample size: the number of rows sampled in memory. The default sample size is 1000 rows for
imported sources, targets, and transforms.
• Filtering: the filtering options that are selected. If the original data set is smaller, or if filters are
used, the number of returned rows could be less than the default.
It is only possible to have two View data windows open at any time. When a third window is
selected to be opened, a prompt appears and one of the windows must be selected to be closed.
Source and Target Tables
To use View data in source and target tables:
• On the Datastore tab of the Local Object Library, right-click a table and select View Data from
the menu.
The View Data dialog box displays.
View Data Pane
To open a View data pane in a data flow workspace:
1. In the data flow workspace, select the magnifying glass button on a data flow object.
A large View Data pane appears beneath the current workspace area.
2. To compare data, select the magnifying glass button for another object.
A second pane appears below the workspace area, and the first pane area shrinks to
accommodate it.
When both panes are filled and another View Data button is selected, a small menu appears
containing window placement icons. The black area in each icon indicates the pane that you want
to replace with a new set of data. When a menu option is selected, the data from the latest
selected object replaces the data in the corresponding pane.
The Interactive Debugger
The Designer includes an interactive debugger that allows for the troubleshooting of jobs by
placing filters and breakpoints on lines in a data flow diagram. The interactive debugger enables
the examining and modifying of data, row by row, during a debug mode job execution.
The interactive debugger is also used without filters and breakpoints. Running the job in debug
mode and then navigating to the data flow while remaining in debug mode enables you to drill into
each step of the data flow and view the data.
When a job is executed in debug mode, the Designer displays several additional windows that
make up the interactive debugger: the Call Stack, Trace, Variables, and View Data panes, which
are shown in the figure The Interactive Debugger.
Figure 23: The Interactive Debugger (the Designer in debug mode, showing the additional Call Stack, Trace, Variables, and View Data panes)
View Data Pane
The left View Data pane shows the data in a source table, and the right pane shows the rows that
have been passed to the query up to the breakpoint.
Start the Interactive Debugger
1. In the project area, right-click the job and select Start Debug from the menu.
The Debug Properties dialog box displays.
2. Set properties for the execution.
You can specify many of the same properties as when executing a job without debugging. In
addition, specify the number of rows to sample in the Data Sample Rate field.
3. Select OK.
The debug mode begins. While in debug mode, all other Designer features are set to read-only.
A Debug icon is visible in the task bar while the debug is in progress.
4. If you have set breakpoints, in the interactive debugger tool bar, select Get Next Row to move
to the next breakpoint.
5. To exit the debug mode, from the Debug menu, select Stop Debug.
Filters and Breakpoints
It is possible to set filters and breakpoints on lines in a data flow diagram before starting a
debugging session, which allows for the examining and modifying of data row by row during a
debug mode job execution. This action is shown in the figure Setting Filters and Breakpoints.
A debug filter functions the same as a simple Query transform with a WHERE clause. Use a filter
to reduce a data set in a debug job execution. The debug filter does not support complex
expressions.
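For example, the debug session in Exercise 5 later in this lesson filters the customer source with the simple column-operator-value condition shown below; anything more complex, such as a function call, is not accepted as a debug filter.

    customer.COUNTRYID = 1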
A breakpoint is the location where a debug job execution pauses and returns control. A
breakpoint can be based on a condition, or it can be set to break after a specific number of rows.
Place a filter or breakpoint on the line between a source and a transform, or between two
transforms. If a filter and a breakpoint are set on the same line, Data Services applies the filter
first, which means that the breakpoint applies to the filtered rows only.
Figure 24: Setting Filters and Breakpoints (the Filter/Breakpoint dialog opened on the line between the ods_customer source and the Query transform, where conditions are built from a column, an operator, and a value)
Set Filters and Breakpoints
1. In the data flow workspace, right-click the line that connects two objects and select Set Filter/
Breakpoint from the menu.
2. In the Breakpoint window, in the Column dropdown list, select the column to which the filter or
breakpoint applies.
3. In the Operator dropdown list, select the operator for the expression.
4. In the Value field, enter the value to complete the expression.
The condition for filters/breakpoints does not use a delimiter for strings.
5. If you are using multiple conditions, repeat step 2 to step 4 for each condition and select the
appropriate operator from the Concatenate all conditions using dropdown list.
6. Select OK.
Unit 4
Exercise 5
Use the Interactive Debugger
Business Example
To ensure that your job is processing the data correctly, you want to run the job in debug mode. To
minimize the data you have to review in the interactive debugger, you set the debug options to
show only records with an individual COUNTRYID field value.
Note:
When the data values include XX, replace XX with the number that your instructor has
provided to you.
1. In the Cloud/WTS environment, the Designer will not allow multiple users to share the
interactive debugger port. Change the interactive debugger port in Designer options.
2. Execute the Alpha_NACustomer_Job in debug mode with a subset of records. In the
workspace for the Alpha_NACustomer_Job, add a filter between the source and the Query
transform to filter the records, so that only customers from the USA are included in the debug
session.
3. Once you have confirmed that the structure appears correct, you execute another debug
session with all records, breaking after every row.
Execute the Alpha_NACustomer_Job again in debug mode using a breakpoint to stop the
debug process after a number of rows.
Unit 4
Solution 5
Use the Interactive Debugger
Business Example
To ensure that your job is processing the data correctly, you want to run the job in debug mode. To
minimize the data you have to review in the interactive debugger, you set the debug options to
show only records with an individual COUNTRYID field value.
Note:
When the data values include XX, replace XX with the number that your instructor has
provided to you.
1. In the Cloud/WTS environment, the Designer will not allow multiple users to share the
interactive debugger port. Change the interactive debugger port in Designer options.
a) From the main menu, choose Tools → Options.
b) From the Designer options, choose Environment.
c) In the Interactive Debugger field, enter port number 60XX.
A dialog box with the message "Overwrite job server option parameters (BODI 1260099)"
opens. To continue, choose Yes.
d) To save changes, choose OK.
2. Execute the Alpha_NACustomer_Job in debug mode with a subset of records. In the
workspace for the Alpha_NACustomer_Job, add a filter between the source and the Query
transform to filter the records, so that only customers from the USA are included in the debug
session.
a) Open the workspace for the Alpha_NACustomer_DF.
b) Right-click the connection between the source table and the Query transform and choose
Set Filter/Breakpoint.
c) In the Filter window, select the Set checkbox.
d) In the Column field, from the drop-down list, choose customer.COUNTRYID.
e) In the Operator field, from the drop-down list, choose = (Equals operator).
f) In the Value field, enter 1.
This represents the country USA.
g) Choose OK.
h) In the Project Area, right-click the Alpha_NACustomer_Job and choose Start debug.
i) In the Debug Properties dialog box, choose OK.
Debug mode begins and all other Designer features are set to read-only. A Debug icon is
visible in the task bar while the debug is in progress.
You can specify many of the same properties as you can when executing a job without
debugging. In addition, you can specify the number of rows to sample in the Data sample
rate field.
When the job is finished, a dialog box opens with the question "Do you want to exit the
Debugger now?"
j) To stay in debug mode, choose No.
k) To close the trace window and return to the job workspace, in the tool bar, choose Back.
l) To open the data flow workspace, double-click the data flow.
m) Choose the magnifying glass between the Query transform and the target table.
You should see that only five records are returned to the template table.
n) Close the display.
o) To exit debug mode, from the menu, choose Debug → Stop Debug.
3. Once you have confirmed that the structure appears correct, you execute another debug
session with all records, breaking after every row.
Execute the Alpha_NACustomer_Job again in debug mode using a breakpoint to stop the
debug process after a number of rows.
a) In the workspace for the Alpha_NACustomer_DF, right-click the connection between the
source table and the Query transform, and choose Remove Filter.
b) Right-click the connection between the source table and the Query transform, and choose
Set Filter/Breakpoint.
c) In the Breakpoint window, select the Set checkbox.
d) To enable breaking the debug session during processing, select the checkbox Break after
number of rows.
e) In the field to the right of Break after number of rows, enter 20.
f) Choose OK.
g) In the Project Area, right-click the Alpha_NACustomer_Job and choose Start debug.
h) In the Debug Properties dialog box, choose OK.
i) Save your work.
Debug mode begins and stops after processing 20 rows.
j) In the data view, select the All checkbox.
You see 21 records.
k) Deselect the All checkbox.
You see only the 21st record.
l) To discard the record from the table, select it and choose Discard.
The record field values now appear as if a line has been drawn through each value.
m) To continue processing, choose Debug → Continue.
The next row is displayed.
n) Continue until you get a message that the job is finished.
o) To exit debug mode, choose Debug → Stop Debug.
p) To remove the breakpoint from the data flow, right-click the connection, and choose
Remove Breakpoint.
q) In the data flow workspace, choose the magnifying glass between the Query transform
and the target table to view the table records. Note that only 24 of 25 rows were returned,
because you rejected one record.
r) Close the display.
s) Save your work.
LESSON SUMMARY
You should now be able to:
• Use the interactive debugger
Unit 4
Lesson 4
Auditing Data Flows
LESSON OVERVIEW
Set up audit rules to ensure the correct data is loaded to the target when executing jobs.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Audit data flows
Auditing
It is possible to collect audit statistics on the data that flows out of any Data Services object, such
as a source, transform, or target. If a transform has multiple distinct outputs, for example, a
Validation or Case transform, audit each output independently.
Setting Up Auditing
When data flows are audited:
1. Define audit points to collect runtime statistics about the data that flows out of objects. These
audit statistics are stored in the Data Services repository.
2. Define rules with these audit statistics to ensure that the data extracted from sources,
processed by transforms, and loaded into targets is as expected.
3. Generate a runtime notification that includes the audit rule that failed and the values of the
audit statistics at the time of failure.
4. Display the audit statistics after the job execution to help identify the object in the data flow
that might have produced incorrect data.
Audit Points
An audit point represents the object in a data flow where statistics are collected. Audit a source, a
transform, or a target in a data flow.
When audit points are defined on objects in a data flow, specify an audit function. An audit
function represents the audit statistic that Data Services collects for a table, output schema, or
column.
Table 14: Audit Functions
Choose from these audit functions:

Count (table or output schema): Collects two statistics: a Good count for rows that were
successfully processed, and an Error count for rows that generated some type of error if you
enabled error handling. The datatype for this function is integer.

Sum (column): Sum of the numeric values in the column. This function only includes the good
rows and applies to columns with a datatype of integer, decimal, double, and real.

Average (column): Average of the numeric values in the column. This function only includes the
good rows and applies to columns with a datatype of integer, decimal, double, and real.

Checksum (column): Detects errors in the values of the column by using the checksum value.
This function applies only to columns with a datatype of varchar.
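As a hypothetical worked example: if an audit point on a table counts three good rows and one row rejected by error handling, and Sum and Average audit points sit on an ORDER_TOTAL column (an illustrative name, not from the exercises) whose good values are 10, 20, and 30, the functions evaluate as follows.

    Count:   good count = 3, error count = 1
    Sum:     10 + 20 + 30 = 60    (error row excluded)
    Average: 60 / 3 = 20          (error row excluded)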
Audit Labels
An audit label represents the unique name in the data flow that Data Services generates for audit
statistics. Audit labels are collected for each defined audit function. Use these labels to define
audit rules for the data flow, as shown in the figure Using Auditing Points Label and Functions.
Figure 25: Using Auditing Points Label and Functions (the Audit dialog showing Count audit functions on the audited objects and the generated labels, such as $Count_ods_customer and $CountError_ods_customer)
Count Audit Function
If the audit point is on a table or output schema, these two labels are generated for the Count
audit function:
$Count_objectname
$CountError_objectname
If the audit point is on a column, the audit label is generated with this format:
$auditfunction_objectname
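For instance, with Count on the ods_customer source table shown in the figure and, hypothetically, Sum on an ORDER_TOTAL column, the generated labels would follow this pattern:

    $Count_ods_customer        good row count for the ods_customer table
    $CountError_ods_customer   error row count for the ods_customer table
    $Sum_ORDER_TOTAL           sum over the (illustrative) ORDER_TOTAL column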
Note:
An audit label can become invalid if an object that had an audit point defined on it is
deleted or renamed. Invalid labels are listed as a separate node on the Labels tab. To
resolve the issue, re-create the labels and delete the invalid items.
Audit Rules
Use auditing rules when comparing audit statistics for one object against another object. For
example, use an audit rule to verify that the count of rows from the source table is equal to the
count of rows in the target table.
An audit rule is a Boolean expression, which consists of a left-hand side (LHS), a Boolean
operator, and a right-hand side (RHS). The LHS can be a single audit label, multiple audit labels
that form an expression with one or more mathematical operators, or a function with audit labels
as parameters. The RHS can be any of these, or a constant.
Examples of Audit Rules
• $Count_CUSTOMER = $Count_CUSTDW
• $Sum_ORDER_US + $Sum_ORDER_EUROPE = $Sum_ORDER_DW
• round($Avg_ORDER_TOTAL) >= 10000
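The RHS can itself be an expression over several labels. Exercise 6 later in this lesson uses exactly this form to check that two target tables together account for every source row:

    $Count_customer = ($Count_Alpha_NA_customer + $Count_Alpha_Other_customer)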
Audit Actions
Defining Audit Actions
Choose any combination of the actions listed for notification of an audit failure:
• E-mail to List:
Data Services sends a notification of which audit rule failed to the e-mail addresses listed in
this option. Use a comma to separate the list of mail addresses. You can specify a variable for
the mail list.
This option uses the smtp_to function to send an e-mail. Define the server and sender for the
Simple Mail Transfer Protocol (SMTP) in the Data Services Server Manager.
• Script:
Data Services executes the custom script created in this option (see the sketch after this
list).
• Raise exception:
When an audit rule fails, the Error Log shows the rule that failed. The job stops at the first audit
rule that fails.
This is an example of a message in the Error Log:
Audit rule failed <($Checksum_ODS_CUSTOMER = $Count_CUST_DIM)> for <Demo_DF>.
This action is the default. If the action is cleared and an audit rule fails, the job completes
successfully and the audit does not write messages to the job log.
If all three actions are chosen, Data Services executes them in the order presented.
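As an illustration of the Script action, the sketch below is a Data Services script that writes a message and sends a notification. The recipient address and wording are placeholders, and the smtp_to argument list shown (recipients, subject, body, number of trace-log lines, number of error-log lines) should be verified against the function reference for your version.

    # Sketch of a custom audit-failure script (placeholder names throughout).
    print('Audit rule failed for data flow Demo_DF');
    smtp_to('etl.admin@example.com',
        'Audit failure: Demo_DF',
        'Source and target row counts did not match.',
        0, 0);   # append 0 trace-log lines and 0 error-log lines to the mail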
Table 15: Audit Status
Audit status can be viewed in one of these locations, depending on the action on failure:

Raise an exception: Job Error Log, Metadata Reports
E-mail to list: E-mail message, Metadata Reports
Script: wherever the custom script sends the audit messages, and Metadata Reports
Defining Audit Rules and Actions
• Auditing can be enabled or disabled at execution time in the Job Execution Properties dialog
box.
• Set Trace Audit Data to Yes in the Job Execution Properties dialog box to view the results of
the audit in the log.
Figure 26: Defining Audit Rules and Actions (the Rule tab of the Audit dialog for Demo_DF, showing the Auditing Rules list and the Action on failure options: Raise exception, E-mail to list, and Custom script)
Define Audit Points and Rules in a Data Flow
1. On the Data Flow tab of the Local Object Library, right-click a data flow and select Audit from
the menu.
The Audit dialog box displays with a list of the objects that can be audited, with any audit
functions and labels for those objects.
2. On the Label tab, right-click the object you want to audit and select Properties from the menu.
The Schema Properties dialog box displays.
3. In the Audit tab of the Schema Properties dialog box, in the Audit Function dropdown list,
select the audit function to be used against this data object type. The audit functions
displayed in the dropdown menu depend on the data object type that is selected.
Default values are assigned to the audit labels, which can be changed if required.
4. Select OK.
5. Repeat step 2 to step 4 for all audit points.
6. On the Rule tab, under Auditing Rules, select Add.
The expression editor activates and the custom options become available for use. The
expression editor contains three dropdown lists where the audit labels for the audited objects
are specified; choose the Boolean expression to use between these labels.
7. In the left-hand-side dropdown list in the expression editor, select the audit label for the object
to be audited.
8. In the Operator dropdown list in the expression editor, select a Boolean operator.
9. In the right-hand-side dropdown list in the expression editor, select the audit label for the
second object to be audited.
If you want to compare audit statistics for one or more objects against statistics for multiple
other objects or a constant, select the Custom radio button and select the ellipsis button
beside functions. This opens the full-size smart editor, where you can drag different
functions and labels to use for auditing.
10. Repeat step 7 to step 9 for all audit rules.
11. Under Action on Failure, select the required action.
12. Select Close.
Trace Audit Data
1. In the project area, right-click the job and select Execute from the menu.
2. In the Execution Properties window, select the Trace tab.
3. Select Trace Audit Data.
4. In the Value dropdown list, select Yes.
5. Select OK.
The job executes and the job log displays the audit messages based on the audit function that
is used for the audit object.
Considerations when Choosing Audit Points
When choosing audit points, consider the following:
• The Data Services optimizer cannot push down operations after the audit point. Therefore, if
the performance of a query that is pushed to the database server is more important than
gathering audit statistics from the source, define the first audit point on the query or later in
the data flow.
For example, a data flow has a source, a Query transform, and a target. The query has a WHERE
clause that is pushed to the database server and that significantly reduces the amount of data
that returns to Data Services. Define the first audit point on the query, rather than on the source,
to obtain audit statistics on the results.
• If a pushdown_sql function is after an audit point, Data Services cannot execute it.
• The auditing feature is disabled when a job is run with the debugger.
• If the CHECKSUM audit function is used in a job that executes in parallel, Data Services
disables the Degree of Parallelism (DOP) for the whole data flow. The order of rows is
important for the result of CHECKSUM, and DOP processes the rows in a different order than
in the source.
• Select audit points carefully; they can affect pushdown.
Unit 4
Exercise 6
Use Auditing in a Data Flow
Business Example
You must ensure that all records from the Customer table in the Alpha database are being moved
to the Delta staging database, using the audit logs to verify the move.
In the Local Object Library, replicate the Alpha_NACustomer_DF data flow. Name the replicated
data flow Alpha_AuditCustomer_DF. Add the replicated data flow to a new job,
Alpha_AuditCustomer_Job. Set up auditing on the data flow Alpha_AuditCustomer_DF by
adding an audit rule to compare the total number of records in the source and target tables.
1. Replicate the Alpha_NACustomer_DF data flow.
2. Create a new batch job Alpha_AuditCustomer_Job.
3. Add the Alpha_AuditCustomer_DF to the Alpha_AuditCustomer_Job.
4. Add audit labels in the Alpha_AuditCustomer_DF data flow to count the total number of
records in the source and target tables.
5. Construct an audit rule that an exception must be entered into the log if the count from both
tables is not the same.
6. Enable auditing for the execution of the Alpha_AuditCustomer_Job.
7. Modify the data flow to send customers outside North America to a second template table,
Alpha_Other_customer.
8. Add an audit label for the new target table and create a custom audit rule to verify that the
sum of the count of the two target tables is equal to the count of the source table.
9. Save all changes and execute the job with auditing enabled and Trace Audit Data set to Yes.
10. Remove the audit feature from the data flow.
Unit 4
Solution 6
Use Auditing in a Data Flow
Business Example
You must ensure that all records from the Customer table in the Alpha database are being moved
to the Delta staging database, using the audit logs to verify the move.
In the Local Object Library, replicate the Alpha_NACustomer_DF data flow. Name the replicated
data flow Alpha_AuditCustomer_DF. Add the replicated data flow to a new job,
Alpha_AuditCustomer_Job. Set up auditing on the data flow Alpha_AuditCustomer_DF by
adding an audit rule to compare the total number of records in the source and target tables.
1. Replicate the Alpha_NACustomer_DF data flow.
a) In the Local Object Library Data Flow tab, right-click the Alpha_NACustomer_DF data
flow and choose Replicate.
b) Rename the copied data flow Alpha_AuditCustomer_DF.
2. Create a new batch job Alpha_AuditCustomer_Job.
a) Right-click the Omega project in the Project Area.
b) Choose New Batch Job.
c) Name the new job Alpha_AuditCustomer_Job.
3. Add the Alpha_AuditCustomer_DF to the Alpha_AuditCustomer_Job.
a) Drag the Alpha_AuditCustomer_DF from the Local Object Library to the
Alpha_AuditCustomer_Job workspace.
4. Add audit labels in the Alpha_AuditCustomer_DF data flow to count the total number of
records in the source and target tables.
a) In the Local Object Library, choose the Data Flow tab.
b) Right-click the data flow Alpha_AuditCustomer_DF and choose Audit.
The Audit dialog box displays with a list of the objects that you can audit with any audit
functions and labels for those objects.
c) On the Label tab, right-click the source table, customer, and choose Count.
d) On the Label tab, right-click the target table, Alpha_NA_customer, and choose Count.
5. Construct an audit rule that an exception must be entered into the log if the count from both
tables is not the same.
a) In the Rule tab, under Auditing Rules, choose Add.
The expression editor opens. It contains three drop-down lists where you specify the audit
labels for the objects that you want to audit and choose the expression to use between
these labels.
b) In the left drop-down list, choose the audit label $Count_customer for the source table.
c) In the operator drop-down list, choose the operator equal (=).
d) In the right drop-down list, choose the audit label $Count_Alpha_NA_customer for the
target table.
e) Under Action on failure, select the Raise exception checkbox.
f) Choose Close.
6. Enable auditing for the execution of the Alpha_AuditCustomer_Job.
a) Right-click the Alpha_AuditCustomer_Job.
b) Choose Execute.
c) In the Execution Properties dialog box, choose the Execution Options tab, and select the
Enable auditing checkbox.
d) Choose the Trace tab and choose Trace Audit Data.
e) In the Value field, use the drop-down list to change the value from No to Yes.
f) Choose OK.
You see the audit rule fail.
7. Modify the data flow to send customers outside North America to a second template table,
Alpha_Other_customer.
a) In the Designer workspace, open the Alpha_AuditCustomer_DF data flow.
b) In the tool palette, choose the Template table icon.
c) Click in the data flow workspace to add a new template table below the
Alpha_NA_customer target table.
d) Name the new template table Alpha_Other_customer.
e) Create the table in the datastore Delta.
f) Choose OK.
g) Add a second Query transform to the data flow.
h) Connect the source table to the second Query transform and the new target table
Alpha_Other_customer.
i) To open the Query Editor, double-click the new Query transform.
j) Map all columns from the source to the target.
k) In the Query Editor, define a where clause: not (customer.COUNTRYID in (1, 2, 11))
l) Save all changes.
8. Add an audit label for the new target table and create a custom audit rule to verify that the sum of the counts of the two target tables is equal to the count of the source table.
a) In the Local Object Library Data Flow tab, right-click the Alpha_AuditCustomer_DF data flow and choose Audit.
b) On the Label tab of the Audit dialog, right-click Alpha_Other_customer and choose Count.
c) In the Audit editor, choose the Rule tab.
d) To remove the existing audit rule, choose Delete.
e) Choose Add and select Custom.
f) Define the custom audit rule: $Count_customer = ($Count_Alpha_NA_customer + $Count_Alpha_Other_customer)
The Action on failure should be defined as Raise exception.
g) Choose Close.
9. Save all changes and execute the job with auditing enabled and Trace Audit Data set to Yes.
a) Right-click the Alpha_AuditCustomer_Job and choose Execute.
b) In the Execution Properties dialog box, in the Execution Options tab, select the Enable auditing checkbox.
c) In the Trace tab, choose Trace Audit Data.
d) In the Value field, using the drop-down list, change the value to Yes.
e) Choose OK.
f) Verify that the audit rule passes.
10. Remove the audit feature from the data flow.
a) In the Local Object Library, choose the Data Flow tab.
b) Right-click the data flow Alpha_AuditCustomer_DF and choose Audit.
c) In the Rule tab, under Auditing Rules, choose the auditing rule that you created, and choose Delete.
d) In the Label tab, right-click the source table customer and choose Count.
This action toggles the label off.
e) In the Label tab, right-click the target table Alpha_NA_customer and choose Count.
f) In the Label tab, right-click the target table Alpha_Other_customer and choose Count.
g) Close and save your work.
LESSON SUMMARY
You should now be able to:
• Audit data flows
Unit 4
Learning Assessment
1. When a repository object is imported and exported, the description is also imported or exported.
Determine whether this statement is true or false.
☐ True
☐ False
2. How do you add an annotation to the workspace?
Arrange these steps into the correct sequence.
☐ Add text to the annotation.
☐ Select the cursor outside of the annotation to commit the changes.
☐ Double-click the annotation.
☐ In the workspace, from the tool palette, select the Annotation icon and then select the workspace.
3. Select the correct statements about job validation.
Choose the correct answers.
☐ A You cannot validate objects while you are creating the job.
☐ B You cannot validate jobs automatically before execution.
☐ C Trace log files show error messages associated with the job server.
☐ D Trace settings can be changed temporarily.
4. Match the trace options with their descriptions.
Match the item in the first column to the corresponding item in the second column.
Trace options:
• SQL transforms
• SQL readers
• SQL loaders
Descriptions:
• Write a message using the Table Comparison transform about whether a row exists in the target table that corresponds to an input row from the source table.
• Write the SQL query block that a script, Query transform, or SQL function submits to the system and write the SQL results.
• Write a message when the bulk loader starts, submits a warning message, or completes successfully or unsuccessfully.
5. View data allows you to import table and file data from the Local Object Library.
Determine whether this statement is true or false.
☐ True
☐ False
6. It is possible to have multiple View data windows open at any time.
Determine whether this statement is true or false.
☐ True
☐ False
7. The interactive debugger places filters and breakpoints on lines in a data flow diagram.
Determine whether this statement is true or false.
☐ True
☐ False
8. What is an audit point?
9. Match these audit functions with their brief descriptions.
Match the item in the first column to the corresponding item in the second column.
Audit functions:
• Sum
• Average
• Checksum
Descriptions:
• Detects errors in the values of the column.
• Sum of the numeric values in the column.
• Average of the numeric values in the column.
Unit 4
Learning Assessment - Answers
1. When a repository object is imported and exported, the description is also imported or exported.
Determine whether this statement is true or false.
☒ True
☐ False
2. How do you add an annotation to the workspace?
Arrange these steps into the correct sequence.
[3] Add text to the annotation.
[4] Select the cursor outside of the annotation to commit the changes.
[2] Double-click the annotation.
[1] In the workspace, from the tool palette, select the Annotation icon and then select the workspace.
3. Select the correct statements about job validation.
Choose the correct answers.
☐ A You cannot validate objects while you are creating the job.
☐ B You cannot validate jobs automatically before execution.
☒ C Trace log files show error messages associated with the job server.
☒ D Trace settings can be changed temporarily.
4. Match the trace options with their descriptions.
Match the item in the first column to the corresponding item in the second column.
• SQL transforms: Write a message using the Table Comparison transform about whether a row exists in the target table that corresponds to an input row from the source table.
• SQL readers: Write the SQL query block that a script, Query transform, or SQL function submits to the system and write the SQL results.
• SQL loaders: Write a message when the bulk loader starts, submits a warning message, or completes successfully or unsuccessfully.
5. View data allows you to import table and file data from the Local Object Library.
Determine whether this statement is true or false.
☒ True
☐ False
6. It is possible to have multiple View data windows open at any time.
Determine whether this statement is true or false.
☐ True
☒ False
7. The interactive debugger places filters and breakpoints on lines in a data flow diagram.
Determine whether this statement is true or false.
☒ True
☐ False
8. What is an audit point?
An audit point represents the object in a data flow where statistics are collected.
9. Match these audit functions with their brief descriptions.
Match the item in the first column to the corresponding item in the second column.
• Sum: Sum of the numeric values in the column.
• Average: Average of the numeric values in the column.
• Checksum: Detects errors in the values of the column.
Functions, Scripts, and Variables
Lesson 1: Using Built-In Functions
Exercise 7: Use the search_replace Function
Exercise 8: Use the lookup_ext() Function
Exercise 9: Use Aggregate Functions
Lesson 2: Using Variables, Parameters, and Scripts
Exercise 10: Create a Custom Function
UNIT OBJECTIVES
• Use functions in expressions
• Use variables, parameters, and scripts
Unit 5
Lesson 1
Using Built-In Functions
LESSON OVERVIEW
Perform complex operations using functions and extend the flexibility and reusability of built-in
functions using other Data Services features.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Use functions in expressions
Operation of a Function
A function's operation determines where you can call the function. A function can only be called from a script or a conditional where the context supports how the function operates. Some functions can only be called from within a Query transform because the transform is the only context that supports the operation of the function.
Context Support
The following examples show that some functions can be called from custom functions or scripts but other functions cannot:
• An iterative function, for example, the lookup function, can only be called from within a Query transform. It cannot be used in or called from a custom function or script.
• An aggregate function, for example, max, requires a set of values to operate and can only be called from within a Query transform. It cannot be used in or called from a custom function or script.
• Stateless functions, for example, a conversion function like to_char, operate independently in each iteration. Stateless functions can be used anywhere that expressions are allowed (see the sketch below).
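For example, a stateless conversion function can be called directly from a script. The following is a minimal sketch, assuming a hypothetical variable name:

$L_RunDate = to_char(sysdate(), 'yyyy.mm.dd');
print('Job run date: [$L_RunDate]');

An iterative or aggregate function, such as lookup or max, in the same script would fail validation.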
Types of Functions
Table 16: Built-in Functions
The following table lists the operations performed by the built-in function types:
• Aggregate: Performs calculations on numeric values, dates, strings, and other data types.
• Conversion: Converts values to specific data types.
• Custom: Performs functions defined by the user.
• Database: Performs operations specific to databases.
• Date: Calculates and converts date values.
• Environment: Performs operations specific to your Data Services environment.
• Lookup: Looks up data in other tables.
• Math: Performs complex mathematical operations on numeric values.
• Miscellaneous: Performs various operations.
• String: Performs operations on alphanumeric strings of data.
• System: Performs system operations.
• Validation: Validates specific types of values.
Optional Functions
In addition to the built-in functions, you can use the following optional functions:
• Database and Application Functions: Import the metadata for database and application functions and use them in Data Services applications, for example, stored procedures from DB2, Microsoft SQL Server, Oracle, and Sybase databases, stored packages from Oracle, and stored functions from SQL Server.
• Custom Functions: Create your own functions by writing script functions using the Data Services scripting language.
• Cryptographic Functions: Encrypt and decrypt data using the AES algorithm. Specify the key length used for the encryption as a parameter (128, 192, or 256). Based on the passphrase, a key with the required length is generated. The passphrase is required to decrypt the data again.
• Gen_UUID Function: Generate universally unique identifiers that are unique across host space, process space, thread space, and time. The generated ID is a varchar that is 32 characters long.
Functions in Expressions
Use functions in expressions to map return values as new columns. New columns that were not in
the initial input data set are now specified in the output data set.
Function Columns
Add columns based on a value like a lookup function or on generated key fields. You can use
functions in the following ways:
• Transforms: The Query, Case, and SQL transforms support functions.
• Scripts: Single-use objects to call functions and assign values to variables in a work flow.
• Conditionals: Single-use objects to implement branch logic in a work flow.
• Custom Functions: Create your own functions as required.
A function's operation must make sense in the expression you are creating. For example, the max function cannot be used in a script or a conditional where there is no collection of values on which to operate. Use the Smart Editor or the Function wizard to add existing functions to an expression.
Smart Editor
The Smart Editor offers many options, including variables, data types, and keyboard shortcuts. Use the Smart Editor in the following applications:
• Query Editor
• Script Editor
• Conditional Editor
• Case Editor
• Function Wizard
Figure 27: Functions Smart Editor
Smart Editor Interface
The Smart Editor has a user-friendly interface, as shown in the figure Functions Smart Editor. Use drag and drop to move objects, and use the ellipsis icons (...) to select activation options. Follow these steps to use the Smart Editor:
1. Open the object in which you want to use an expression.
2. Select the ellipsis icon (...) to show the Smart Editor.
3. Select the Functions tab and expand a function category.
Function Wizard
Define parameters for complex functions with the Function Wizard. Follow these steps to use the
Function Wizard:
1. Open the object in which you want to use an expression.
2. Select Functions to open the dialog box.
3. Select a category from the list of functions in the dialog box.
Figure 28: Function Wizard
Function Wizard Interface
The Function Wizard's user prompt interface is shown in the figure Function Wizard. The interface is similar to the Smart Editor and contains the following features:
• Drop-down menus
• Parameters tailored for each function
• A Function button to activate functions
Aggregate Functions
Aggregate functions are one set of built-in functions within SAP Data Services.
Aggregate Functions Overview
• Perform calculations on numeric values and some non-numeric data.
• Require a set of values to operate.
• Generate a single value from a set of values.
• Are most commonly called from within a Query transform, not from custom functions or scripts.
• Use the data set specified by the expression in the Group By tab of a query.
Table 17: Built-in Aggregate Functions
The table lists the names and descriptions of the built-in aggregate functions:
• avg: Calculate the average of a given set of values.
• count: Count the number of values in a table column.
• count_distinct: Count the number of distinct non-null values in a table column.
• max: Return the maximum value from a list.
• min: Return the minimum value from a list.
• sum: Calculate the sum of a given set of values.
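For example, in a Query transform that lists ORDERID on the Group By tab (column names borrowed from the aggregate exercise later in this unit), the mapping expression sum(order_details.QUANTITY) returns one total quantity per order.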
The search_replace Function
Perform a simple search and replace based on a string value, word value, or an entire field.
Search_replace Overview
The function takes input parameters and replaces values by matching the criteria and values specified in a search table:
• Specify search and replace values with an internal table, an existing external table or file, or with a custom SQL command.
• Values are loaded into memory to optimize performance while performing the operation.
• Run search_replace as a separate process or as a sub data flow to improve performance if your data flow contains multiple resource-intensive operations.
• Use search_replace as a function call in the query output, or create a new blank output column and select Functions on the Mapping tab.
• Use the Function Wizard interface to fill in the function's parameters, and return to the wizard at any time to change the parameters.
Note:
While you can use the search_replace function in a script or in regular mapping, the syntax is difficult to read and to maintain. The Function Wizard is a user-friendly way to perform the search_replace function.
Lookup Functions
Use lookup functions to look up values in other tables and to populate columns in the target table.
The source table, Finance, contains the columns EmployeeID and Salary. The target table, SalaryReport, contains the related columns EmployeeName and Salary. The lookup function uses a third table, Employee (EmployeeID, EmployeeName), that translates the values from the source table to the values you want in the target table.
Figure 29: Using the Lookup Function
The figure Using the Lookup Function shows how values from the source table are populated in the target table.
Lookups are useful for values that rarely change, and they store reusable values in memory to speed up the process. The lookup, lookup_seq, and lookup_ext functions provide a specialized type of join, similar to an SQL outer join. While an SQL outer join may return multiple matches for a single record in the outer table, lookup functions always return exactly the same number of records that are in the source table.
Lookup Function Options
While all lookup functions return one row for each row in the source, they differ in how they choose which of several matching rows to return:
• lookup does not provide additional options for the lookup expression.
• Use lookup_ext to specify an order by column and a return policy, for example, MIN or MAX, to return the record with the lowest or highest value in a given field.
• Use lookup_seq in matching records to return a field from the record where the sequence column, for example, effective_date, is closest to but not greater than a specified sequence value, for example, a transaction date.
Lookup_ext Functionality
Retrieve a value in a table or file based on the values in a different source table or file:
• Return multiple columns from a single lookup.
• Choose from additional operators to specify a lookup condition.
• Specify a return policy for your lookup.
• Perform multiple lookups.
• Call lookup_ext in scripts and custom functions.
• Reuse the lookups packaged inside scripts.
• Define custom SQL using the SQL_override parameter to populate the lookup cache and to narrow large quantities of data to the sections relevant for your lookup.
• Dynamically execute SQL.
• Call lookup_ext in the Function Wizard query output mapping to return multiple columns in a Query transform.
• Design jobs without hard coding the translation file name at design time.
• Use lookup_ext with memory datastore tables.
Hint:
There are two ways to use the lookup_ext function in a query output schema.
1. Map to a single output column in the output schema. The lookup_ext is limited to
returning values from a single column from the lookup table.
2. Specify a New Output Function Call in the query output schema to open the
Function Wizard. Configure the lookup_ext to return multiple columns from the
lookup table from a single lookup.
Lookup_ext Expression
Follow these steps to create a lookup_ext expression:
1. Open the Query transform.
2. Ensure that the Query transform has at least one main source table and one lookup table, and
that it is connected to a single target object.
3. Select the output schema column for which the lookup function is being performed.
4. Select Functions in the Mapping tab to open the Select Function window.
5. Select Lookup Functions from the Function list.
6. Select lookup_ext from the Function name list.
7. Select Next.
The Lookup_ext - Select Parameters dialog box displays as shown in the figure Lookup_ext
Function Parameters.
Figure 30: Lookup_ext Function Parameters
The lookup_ext function sets the cache parameter to the value PRE_LOAD_CACHE by default. Data Services uses the records of the lookup table in the cache to improve the performance of the lookup job.
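For reference, a call generated by the wizard takes roughly the following shape. This is a sketch with hypothetical table and column names (they match the lookup exercise later in this unit), and the exact text that Data Services generates may differ by version:

lookup_ext([Alpha.dbo.country, 'PRE_LOAD_CACHE', 'MAX'], [COUNTRYNAME], [NULL], [COUNTRYID, '=', customer.COUNTRYID])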
The Decode Function
Use the decode function as an alternative to nested conditions like IF, THEN or ELSE.
Use the decode function to return an expression based on the first condition in the specified list of
conditions and expressions that evaluates as TRUE. Apply multiple conditions when you map
columns or select columns in a query, for example, to put customers into different groupings.
Decode Function Format
The syntax of the decode function uses the following format:
decode(condition_and_expression_list, default_expression)
When writing nested conditions, you must ensure that the parentheses are in the correct places, as in the following example:

ifthenelse((EMPNO = 1), '111',
 ifthenelse((EMPNO = 2), '222',
  ifthenelse((EMPNO = 3), '333',
   ifthenelse((EMPNO = 4), '444',
    'NO_ID'))))

Decode functions list conditions as shown in the following example:

decode((EMPNO = 1), '111',
 (EMPNO = 2), '222',
 (EMPNO = 3), '333',
 (EMPNO = 4), '444',
 'NO_ID')
Decode functions are less prone to error than nested functions. Data Services pushes the decode
function to the database server when possible to improve performance.
Unit 5
Exercise 7
Use the search_replace Function
Business Example
When evaluating the customer data for Alpha Acquisitions, you discover a data entry error. The contact title of "Account Manager" has been entered as "Accounting Manager". You must correct these entries before the data is moved to the data warehouse.
1. In the Alpha_NACustomer_DF workspace, delete the existing expression for the Title column in the Query transform.
2. Using the Function wizard, create a new expression for the column using the search_replace function found under the category of "String" functions.
3. Execute the Alpha_NACustomer_Job with the default execution properties after saving all of the objects that you have created.
Unit 5
Solution 7
Use the search_replace Function
Business Example
When evaluating the customer data for Alpha Acquisitions, you discover a data entry error. The contact title of "Account Manager" has been entered as "Accounting Manager". You must correct these entries before the data is moved to the data warehouse.
1. In the Alpha_NACustomer_DF workspace, delete the existing expression for the Title column in the Query transform.
a) In the Alpha_NACustomer_DF workspace, to open the Query Editor, double-click the Query transform.
b) In the Query Editor, in the output schema, choose the field CONTACTTITLE.
c) To delete the existing expression, in the Mapping tab, highlight the expression and press the Delete button on your keyboard.
2. Using the Function wizard, create a new expression for the column using the search_replace function found under the category of "String" functions.
a) In the Query Editor, in the Mapping tab, choose Functions.
b) In the Select Function dialog box, choose String Functions.
c) From the list of function names, select search_replace and choose Next.
d) In the Search_replace Select Parameters dialog box, select the drop-down arrow next to the field Input expression.
e) In the drop-down list, choose the field customer.CONTACTTITLE.
f) In the Search replace table, in the Search value column, enter Accounting Manager.
g) In the Replace value column, enter Account Manager.
h) Choose Finish.
3. Execute the Alpha_NACustomer_Job with the default execution properties after saving all of the objects that you have created.
a) In the Omega project, right-click the Alpha_NACustomer_Job.
b) Choose Execute.
Data Services prompts you to save any objects that have not been saved. To do so, select the OK button in the Save all changes and execute dialog box.
c) To use the default execution properties, choose OK.
d) To return to the job workspace, choose the Back icon in the toolbar.
e) To open the data flow workspace, double-click the data flow.
f) To view your data, right-click the target table and choose View Data.
Note that the titles for the affected contacts are changed.
g) Close the display.
Unit 5
Exercise 8
Use the lookup_ext() Function
Business Example
In the Alpha Acquisitions database, the country for a customer is stored in a separate table and referenced with a code. To speed up access to information in the data warehouse, this lookup should be eliminated.
Use the lookup_ext function to exchange the ID for the country name in the customers table for Alpha with the actual value from the country table.
1. In the Alpha_NACustomer_DF workspace, delete the existing expression for the Country column in the Query transform.
2. Use the Functions wizard to create a new lookup expression using the lookup_ext function.
3. Execute the Alpha_NACustomer_Job with the default execution properties after saving all objects you have created.
Unit 5
Solution 8
Use the lookup_ext() Function
Business Example
In the Alpha Acquisitions database, the country for a customer is stored in a separate table and referenced with a code. To speed up access to information in the data warehouse, this lookup should be eliminated.
Use the lookup_ext function to exchange the ID for the country name in the customers table for Alpha with the actual value from the country table.
1. In the Alpha_NACustomer_DF workspace, delete the existing expression for the Country column in the Query transform.
a) In the Alpha_NACustomer_DF workspace, to open the transform editor, double-click the Query transform.
b) In the Query Editor, in the output schema, choose the field Country.
c) In the Mapping tab for the country field, delete the existing expression.
2. Use the Functions wizard to create a new lookup expression using the lookup_ext function.
a) In the Mapping tab, choose Functions.
b) In the Select Function dialog box, choose Lookup Functions.
c) Choose the lookup_ext function and choose Next.
d) In the Lookup_ext - Select Parameters dialog box, enter the following parameters:
Lookup table: Alpha.dbo.country
Condition:
 Column in lookup table: COUNTRYID
 Op.(&): =
 Expression: customer.COUNTRYID
Output:
 Column in lookup table: COUNTRYNAME
e) To close the editor, choose Finish.
3. Execute the Alpha_NACustomer_Job with the default execution properties after saving all objects you have created.
a) Right-click the Alpha_NACustomer_Job in the Omega project and choose Execute.
Data Services prompts you to save any objects that have not been saved. Choose OK.
b) To use the default execution properties, choose OK.
c) To return to the job workspace, on the toolbar, choose the Back icon.
d) To open the data flow workspace, double-click the data flow.
e) Right-click the target table and choose View Data.
Note that the country codes are replaced by the country names.
f) Close the display.
Unit 5
Exercise 9
Use Aggregate Functions
Business Example
You must calculate the total value of all orders, including their discounts, for reporting purposes. Currently these details are found in different tables.
Use the sum and decode functions to calculate the total value of orders in the Order Details table.
1. Create a new batch job called Alpha_Order_Sum_Job with a data flow Alpha_Order_Sum_DF.
2. In the transform editor for the Query transform, propose a join between the two source tables.
3. In the Query transform, create a new output column TOTAL_VALUE, which will hold the new calculation.
4. Map the TOTAL_VALUE column using the sum function. The value is the product of the quantity from the order_details table and the cost from the products table, multiplied by the discount from the order_details table.
5. Now that the expression can calculate the total of the order values, use the Group By tab so that the Query applies the calculation to every order, from the first record through the end of the records in the table.
6. Execute the Alpha_Order_Sum_Job with the default execution properties after saving all of the objects that you have created.
Unit 5
Solution 9
Use Aggregate Functions
Business Example
You must calculate the total value of all orders, including their discounts, for reporting purposes. Currently these details are found in different tables.
Use the sum and decode functions to calculate the total value of orders in the Order Details table.
1. Create a new batch job called Alpha_Order_Sum_Job with a data flow Alpha_Order_Sum_DF.
a) In the Project area, right-click your Omega project and choose New batch job.
b) Enter the job name Alpha_Order_Sum_Job.
c) In the Alpha_Order_Sum_Job workspace, from the toolbar, choose the Data Flow icon.
d) To add the data flow to your new job, click in the workspace, and enter the name Alpha_Order_Sum_DF.
e) In the Local Object Library, choose the Datastores tab.
f) From the Alpha datastore, select the Order_Details table, drag it to the Alpha_Order_Sum_DF workspace, and choose Make Source.
g) From the Alpha datastore, select the product table, drag it to the Alpha_Order_Sum_DF workspace, and choose Make Source.
h) From the tool palette, choose the Template Table icon.
i) To place the template table, click in the Alpha_Order_Sum_DF workspace.
j) In the Create Template dialog box, in the Table name field, enter order_sum, and change the In datastore field to Delta.
k) From the tool palette, select the Query Transform icon.
l) To place the Query Transform, click in the Alpha_Order_Sum_DF workspace.
m) To connect the Order_Details table to the Query Transform, select the source table, hold down the mouse button, drag it to the Query Transform, and release the mouse button.
n) To connect the Product table to the Query Transform, select the source table, hold down the mouse button, drag it to the Query Transform, and release the mouse button.
o) To connect the Query Transform to the target, select the Query Transform, hold down the mouse button, drag it to the order_sum table, and release the mouse button.
2. In the transform editor for the Query transform, propose a join between the two source tables.
a) To open the Query Editor, double-click the Query.
b) Choose the Where tab or the From tab.
c) Choose the Propose Join button.
The Designer should enter the following code: product.PRODUCTID = order_details.PRODUCTID
3. In the Query transform, create a new output column TOTAL_VALUE, which will hold the new calculation.
a) To map the ORDERID column from the input schema to the same field in the output schema, select ORDERID and drag it to the output schema.
b) In the output schema, right-click ORDERID and choose New output column.
c) Choose Insert Below.
d) Enter the name TOTAL_VALUE with a data type of decimal, a precision of 10, and a scale of 2.
e) Choose OK.
4. Map the TOTAL_VALUE column using the sum function. The value is the product of the quantity from the order_details table and the cost from the products table, multiplied by the discount from the order_details table.
a) On the Mapping tab of the TOTAL_VALUE column, enter the expression:
sum((order_details.QUANTITY*product.COST)*order_details.DISCOUNT)
Note:
If you validate the expression, the validation will fail. Once you complete the
next step, the validation will pass.
5. Now that the expression can calculate the total of the order values, use the Group By tab so that the Query applies the calculation to every order, from the first record through the end of the records in the table.
a) In the Query Editor, select the Group By tab.
b) In the Schema In column, select the ORDERID field from the ORDER_DETAILS table and drag it to the Group By tab.
c) Close the Editor.
6. Execute the Alpha_Order_Sum_Job with the default execution properties after saving all of the objects that you have created.
a) In the Omega project, right-click the Alpha_Order_Sum_Job.
b) Choose Execute.
Data Services prompts you to save any objects that have not been saved. Choose OK.
c) To use the default execution properties, choose OK.
d) Return to the job workspace.
e) Open the data flow workspace.
f) Right-click the target table and choose View data.
g) Confirm that order 11146 has 204000.00 as a total value.
h) Close the display.
LESSON SUMMARY
You should now be able to:
• Use functions in expressions
Unit 5
Lesson 2
Using Variables, Parameters, and Scripts
LESSON OVERVIEW
Apply decision-making and branch logic to work flows. Use a combination of variables, parameters, and scripts to calculate and pass information between the objects in your jobs.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Use variables, parameters, and scripts
Scripts, Variables, and Parameters
Assign values to variables and call functions. Use standard string and mathematical operators to transform data and manage work flow with the Data Services scripting language.
Variables
A variable is a common component in scripts that acts as a placeholder to represent values with the potential to change each time a job is executed. To make variables easy to identify in an expression, variable names start with a dollar sign ($). They can be of any data type supported by Data Services.
Use variables in expressions in scripts or transforms to facilitate decision making or data manipulation. Use a variable in a LOOP or in an IF statement to check a variable's value and to decide which step to perform, as in the sketch below.
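The following is a minimal script sketch of this kind of decision making, assuming a hypothetical datastore, table, and variable name:

$L_RowCount = sql('DEMO_DS', 'SELECT COUNT(*) FROM STATUS_TABLE');
if ($L_RowCount > 0)
begin
   print('[$L_RowCount] status rows found.');
end
else
begin
   print('No status rows found.');
end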
Variables as File Names
Use variables to enable the same expression to be used for multiple output files. Variables can be used as file names for the following:
• Flat file sources and targets
• XML file sources and targets
• XML message targets executed in the Designer in test mode
• Document file sources and targets in an SAP ERP environment
• Document message sources and targets in an SAP ERP environment
In addition to scripts, you can use variables in a catch or a conditional.
A catch is part of a serial sequence called a try/catch block. The try/catch block allows you to
specify alternative work flows if errors occur while Data Services is executing a job.
A conditional is a single-use object available in work flows that allows you to branch the execution
logic based on the results of an expression. The conditional takes the form of an if/then/else
statement.
Parameters
A parameter is another type of placeholder that calls a variable. This call allows the value from the variable in a job or a work flow to be passed to the parameter in a dependent work flow or data flow. Parameters are most commonly used in WHERE clauses.
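For example, a data flow WHERE clause such as orders.ORDER_DATE >= $P_StartDate (hypothetical column and parameter names) restricts the rows that the data flow extracts.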
Variables versus Parameters
• Global Variables: Set at the job level. Available for assignment or reference in child objects. Can be assigned in job and/or execution properties. Job-level variable ($G_).
• Local Variables: Not available in referenced objects. Object-related variable ($L_).
• Parameters: Can be input or output, one way or two way. Assigned on the Calls tab of Variables and Parameters. Calls a variable ($P_).
Table 18: Naming Convention
Start all names with a dollar sign ($), and use the prefixes in the table as a naming convention to ensure consistency across projects:
• Global variable: $G_
• Local variable: $L_
• Parameter: $P_
Global versus Local Variables
Local variables are restricted to the job or work flow in which they are created. Use parameters to pass local variables to the work flows and data flows in the object. A local variable is included as part of the definition of the work flow or data flow, and so it is portable between jobs.
Global variables are also restricted to the job in which they are created. However, they do not require parameters to be passed to work flows and data flows in that job. You can reference the global variable directly in expressions for any object of the job.
Global variables can simplify your work. Set values for global variables in script objects or using external job execution or schedule properties. For example, during production, you can change values for default global variables at runtime from a job's schedule without having to open a job in the Designer.
Since a global variable is part of the definition of the job to which the work flow or data flow belongs, it is not included when the object is reused.
Whether you use global variables or local variables and parameters depends on how and where you need to use the variables. Create a global variable if you need to use the variable at multiple levels of a specific job.
Table 19: Variables and Parameters for Object Types
The table summarizes the variables and parameters you can create for each type of object:
• Job, global variable: Used by any object in the job.
• Job, local variable: Used by a script or conditional in the job.
• Work flow, local variable: Used by this work flow, or passed down to other work flows or data flows using a parameter.
• Work flow, parameter: Used by parent objects to pass local variables. Work flows may also return variables or parameters to parent objects.
• Data flow, parameter: Used by a WHERE clause, column mapping, or function in the data flow. Data flows cannot return output values.
Define a Global Variable, Local Variable, or a Parameter
1. Select the object in the project area.
The object must be a job for a global variable. The object can be a job or a work flow for a local variable. The object can be a work flow or a data flow for a parameter.
2. Select Variables from the toolbar.
The Variables and Parameters dialog box appears.
Create a relationship between a local variable and the parameter by specifying that the name of the local variable is the property value of the parameter in the Calls tab.
Define the Relationship between a Local Variable and a Parameter
1. Select the dependent object in the project area.
2. Select Variables from the Tools menu to open the Variables and Parameters dialog box.
3. Select the Calls tab. The Calls tab displays any parameters that exist in dependent objects.
4. Right-click the parameter and select Properties from the menu to show the Parameter Value
dialog box.
5. Enter the name of the local variable you want the parameter to call or a constant value in the
Value field. A variable must be of the same datatype as the parameter.
6. Select OK.
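For example, a data flow parameter $P_StartDate would typically call a work flow variable $L_StartDate of the same datatype (hypothetical names that follow the conventions above).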
Global Variables Using Job Properties
In addition to setting a variable inside a job using a script, you can set and maintain global variable
values outside a job using properties.
Values set outside a job are processed the same way as those set in a script. If you set a value for
the same variable both inside and outside a job, the value from the script overrides the value from
the property.
Values for global variables can be set as a job property or as an execution or schedule property.
All values defined as job properties are shown in the Properties window. By setting values outside
a job, you can rely on the Properties window for viewing values that have been set for global
variables and easily edit values when testing or scheduling a job.
Global Variable Value as Job Property
Follow these steps to set a global variable value as a job property:
1. Right-click a job in the Local Object Library or project area and select Properties from the
menu.
The Properties dialog box appears.
2. Select the Global Variable tab.
All global variables for the job are listed.
3. Enter a constant value or an expression in the Value column for the global variable.
4. Select OK.
You can also view and edit these default values in the Execution Properties dialog box of the
Designer. This allows you to override job property values at runtime. Data Services saves values
in the repository as job properties.
Substitution Parameters
Substitution parameters define parameters that have a constant value for one environment, but the value can be changed for use in other environments. Make a change in one location to affect all jobs. You can override the parameter for particular job executions.
The typical use case is for file locations that are constant in one environment but change when a job is migrated to another environment, for example, migrating a job from test to production.
As with variables and parameters, the name can include any alpha or numeric character or underscores, but cannot contain blank spaces. Follow the naming convention and begin the name for a substitution parameter with double dollar signs ($$) and an S_ prefix to differentiate from out-of-the-box substitution parameters.
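For example, the root directory of a flat file format can be set to a substitution parameter such as $$S_SourcePath (a hypothetical name following this convention). Changing the value in the substitution parameter configuration repoints every job that uses the file format, without editing any job.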
Table 20: Substitution Parameters and Variables
• Both global variables and substitution parameters ($$) can be used in transforms, functions, and scripts.
• For both, assigned default values can be overridden at execution time.
• Global variables are defined at the job level; substitution parameters are defined at the repository level.
• Global variables cannot be shared across jobs; substitution parameters are available to all jobs in a repository.
• Global variables are data type specific; substitution parameters are all strings with no data type.
• A global variable's value can change during execution; a substitution parameter has a fixed value set prior to job execution.
Select Substitution Parameter Configurations from the Tools menu to create a substitution parameter configuration.
Note:
When exporting a job to a file or a repository, the substitution parameter configurations are not exported with it. Export substitution parameters via a separate command to a text file and use this text file to import into another repository.
Scripts
A script is a single-use object that is used to call functions and assign values in a work flow.
Execute a script before data flows for initialization steps and use a script with conditionals to
determine execution paths. You may also use a script after work flows or data flows to record
execution information such as time, or to record a change in the number of rows in a data set.
Use a script to calculate values that are passed on to other parts of the work flow or to assign
values to variables and execute functions.
Script Statements
A script can include these statements:
• Function calls
• If statements
• While statements
• Assignment statements
• Operators
Data Services Scripting Language
With the Data Services scripting language, you can assign values to variables, call functions, and use standard string and mathematical operators. The syntax can be used in both expressions, such as WHERE clauses, and scripts.
Script Usage
• Job initialization
• File existence (see the sketch below)
• Email alerts
• Status table checks and updates
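For instance, a file-existence script might wait for a trigger file before allowing the job to continue. This is a minimal sketch with an assumed file path:

# Wait until the trigger file arrives, checking once per minute.
while (file_exists('C:\\temp\\trigger.txt') = 0)
begin
   sleep(60000);   # sleep takes milliseconds
end
print('Trigger file found; continuing the job.');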
Expressions are a combination of constants, operators, functions, and variables that evaluate to a value of a given datatype. Use expressions inside script statements or add expressions to data flow objects.
Basic Syntax
Follow these basic syntax rules when you are creating an expression with the Data Services scripting language:
• End each statement with a semicolon (;).
• Start variable names with a dollar sign ($).
• Enclose string values in single quotation marks (' ').
• Start comments with a number sign (#).
• Specify parameters in function calls, even if the function call does not use parameters.
• Substitute the value of the expression with square brackets, for example:
print('The value of the start date is: [sysdate() + 5]');
• Quote the value of the expression in single quotation marks with curly brackets, for example:
$StartDate = sql('demo_target', 'SELECT ExtractHigh FROM Job_Execution_Status WHERE JobName = {$JobName}');
Syntax for Column and Table References in Expressions
Expressions can be used inside data flow objects and can contain column names.
The Data Services scripting language recognizes column and table names without special syntax. For example, you can indicate the start_date column as the input to a function in the Mapping tab of a query as to_char(start_date, 'dd.mm.yyyy').
The column start_date must be in the input schema of the query. If there is more than one column with the same name in the input schema of a query, indicate which column is included in an expression by qualifying the column name with the table name. For example, indicate the column start_date in the table status as status.start_date.
Column and table names as part of SQL strings may require special syntax based on the RDBMS that the SQL is evaluated by. For example, select all rows from the LAST_NAME column of the CUSTOMER table as sql('ORACLE_DS', 'SELECT CUSTOMER.LAST_NAME FROM CUSTOMER').
Table 21: Operators
You can use the operators listed in this table in expressions. Operators are listed in order of precedence:
• + : Addition
• - : Subtraction
• * : Multiplication
• / : Division
• = : Comparison, equals
• < : Comparison, is less than
• <= : Comparison, is less than or equal to
• > : Comparison, is greater than
• >= : Comparison, is greater than or equal to
• != : Comparison, is not equal to
• || : Concatenate
• AND : Logical AND
• OR : Logical OR
• NOT : Logical NOT
• IS NULL : Comparison, is a NULL value
• IS NOT NULL : Comparison, is not a NULL value
Note:
When operations are pushed to an RDBMS to perform, the precedence is determined
by the rules of the RDBMS.
Quotation Marks
Quotation marks, escape characters, and trailing blanks can all have an adverse effect on your script if used incorrectly.
The type of quotation marks to use in strings depends on whether you are using identifiers or constants.
Identifiers
• An identifier is the name of an object like a table, column, data flow, or function.
• Use double quotation marks in identifiers if they contain non-alphanumeric characters.
• For example, use double quotation marks in the string "compute large numbers" because the string contains blanks.
Constants
• A constant is a fixed value used in computation.
• There are two types of constants: string constants, for example, 'Hello' or '2007.01.23', and numeric constants, for example, 2.14.
• Use single quotation marks in string constants and no quotation marks in numeric constants.
Table 22: Escape Characters
Special characters like a single quote or a backslash must be preceded by an escape character to be evaluated properly in a string. Data Services uses the backslash as the escape character, as shown in this table:
• Single quote ('): 'World\'s Books'
• Backslash (\): 'C:\\temp'
NULLs, Empty Strings, and Trailing Blanks
To conform to the ANSI VARCHAR standard when dealing with NULLs, empty strings, and trailing blanks, Data Services:
• Treats an empty string as a zero-length varchar value, instead of as a NULL value.
• Returns a value of FALSE when you use the operators Equal (=) and Not Equal (<>) to compare to a NULL value.
• Provides IS NULL and IS NOT NULL operators to test for NULL values.
• Treats trailing blanks as regular characters when reading from all sources, instead of trimming them.
• Ignores trailing blanks in comparisons in transforms and functions.
NULL Values
Type the word NULL to represent NULL values in expressions. For example, check whether a column (COLX) is NULL or not with these expressions:
COLX IS NULL
COLX IS NOT NULL
Use the function NVL to remove NULL values. Data Services does not check for NULL values in
data columns.
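For example, a minimal mapping sketch using nvl(); the column name REGION and the replacement value are illustrative:

nvl(REGION, 'UNKNOWN')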
NULL Values and Empty Strings
Data Services uses two rules with empty strings:
• When you assign an empty string to a variable, Data Services treats the value of the variable as a zero-length string. An error results if you assign an empty string to a variable that is not a varchar. Use the NULL constant to assign a NULL value to a variable of any type.
• As a blank constant (''), Data Services treats the empty string as a varchar value of zero length. Use the NULL constant for the NULL value.
Rules in Conditionals
Data Services uses these three rules with NULLs and empty strings in conditionals:
• Rule 1: The Equals (=) and Is Not Equal To (<>) comparison operators against a NULL value always evaluate to FALSE. This FALSE result includes comparing a variable that has a value of NULL against a NULL constant.
• Rule 2: Use the IS NULL and IS NOT NULL operators to test for the presence of NULL values, for example, when assuming a variable assignment $var1 = NULL; (see the sketch after this list).
• Rule 3: Test for NULL when comparing two variables. In this scenario, you are not testing a variable with a value of NULL against a NULL constant as in Rule 1. Test each variable and branch accordingly, or test in the conditional.
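The following script sketch illustrates Rules 1 and 2, assuming a varchar variable $var1:

$var1 = NULL;
# Rule 1: the = comparison against NULL evaluates to FALSE, so $Result1 is 'FALSE'.
$Result1 = ifthenelse($var1 = NULL, 'TRUE', 'FALSE');
# Rule 2: IS NULL correctly detects the NULL value, so $Result2 is 'TRUE'.
$Result2 = ifthenelse($var1 IS NULL, 'TRUE', 'FALSE');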
Script, Variable, and Parameter Combinations
Consider an example where you start with a job, work flow, and data flow. You want the data flow to update only those records that have been created since the last time the job executed.
1. Create a variable for the update time at the work flow level, and a parameter at the data flow level that calls the variable.
2. Create a script within the work flow that executes before the data flow runs. The script contains an expression that determines the most recent update time for the source table (see the sketch after these steps).
3. The script assigns the update time value to the variable, which identifies what that value is used for and allows it to be reused in other expressions.
4. Create an expression in the data flow that uses the parameter to call the variable and find out the update time. The data flow compares the update time to the creation date of the records and identifies which rows to extract from the source.
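A hedged sketch of steps 2 and 4; the datastore, table, and column names are illustrative:

# Work flow script: capture the most recent update time in the variable.
$G_LastUpdate = sql('demo_target', 'SELECT MAX(LAST_UPDATE) FROM ORDERS');

# Data flow WHERE clause: extract only rows created since that time.
# ORDERS.CREATE_DATE > $P_LastUpdate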
Custom Function Scripting
If the built-in functions that are provided by Data Services do not meet your requirements, you can create your own custom functions using the Data Services scripting language.
Create your own functions by writing script functions in the Data Services scripting language using the Smart Editor. Saved custom functions appear under the Custom Functions category in the Function Wizard and the Smart Editor. Custom functions are also displayed on the Custom Functions tab of the Local Object Library.
Edit and delete custom functions from the Local Object Library.
Custom Function Guidelines
Consider these guidelines when you create your own functions:
• Functions can call other functions.
• Functions cannot call themselves.
• Functions cannot participate in a cycle of recursive calls. For example, function A cannot call function B if function B calls function A.
• Functions return a value.
• Functions can have parameters for input, output, or both. However, data flows cannot pass parameters of type output or input/output.
You must know the input, output, and return values and data types before you create a custom function. The return value is predefined to be Return.
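As a minimal sketch, in the style of the exercise later in this lesson, the body of a hypothetical custom function with one input parameter $P_Amount might read:

# Returns 1 for amounts of 1000 or more, otherwise 0;
# Return is the predefined return variable.
If ($P_Amount >= 1000) Return 1;
Else Return 0;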
Custom Function Creation
Follow these steps to create a custom function:
1. On the Custom Functions tab of the Local Object Library, right-click the white space and select New from the menu. The Custom Function dialog box displays.
2. Enter a unique name for the new function in the Function name field.
3. Enter a description in the Description field.
4. Select Next.
5. Define the return type, parameter list, and any variables to be used in the function using the Smart Editor.
Stored Procedures
A stored procedure is an executable object, or named entity, that is stored in a database and can be invoked using input and output parameters.
A stored procedure is one or more precompiled SQL statements. By calling a stored procedure from within Data Services, you can invoke business logic you have already coded, and so develop data extraction and data management tasks quickly and conveniently.
Stored Procedure Usage
Use stored procedures to:
• Maintain business logic rules and provide a single point of control to ensure that rules are accurate and enforced.
• Significantly reduce network overhead in client/server applications, because procedures are stored on the database server and compiled execution plans for procedures are retained in the data dictionary.
Data Services supports stored procedures for DB2, ODBC, Oracle, Microsoft SQL Server, SAP HANA, SAP Sybase SQL Anywhere, SAP Sybase ASE, SAP Sybase IQ, and Teradata databases. Queries, scripts, conditionals, and custom functions can all be configured to include stored procedures, stored functions, and packages. Stored procedures must exist in the database before you can use them in Data Services.
Create a stored procedure in a database using the client tools provided with the database, such as Oracle SQL*Plus. When a stored procedure is created, it can be called by users who have execution privileges for the procedure. When stored procedures are imported into Data Services, they can be used in Data Services jobs like functions.
Stored procedures include parameters. Each parameter has a name, a data type, and a mode, for example IN, INOUT, or OUT. A stored procedure can use a NULL value or a default parameter value for its input and can produce more than one output parameter value.
Stored Procedure Requirements
To use stored procedures with Data Services:
• The client and server versions must match.
• Only user-defined stored procedures can be used. Data Services does not support stored procedures provided by a database system.
• The return type must be a Data Services supported data type, such as varchar, date, or int.
• The name of the stored procedure must be unique, because Data Services only imports the first procedure or function with a particular name.
Unit 5
Exercise 10
Create a Custom Function
Business Example
The Marketing department would like to send special offers to customers who have placed a specified number of orders. This can be done by creating a custom function that must be called when a customer order is placed. You want to create a custom function to accept the input parameters of the Customer ID and the number of orders required to receive a special offer, check the Orders table, and then create an initial list of eligible customers.
1. In the Local Object Library, create a new custom function called CF_MarketingOffer.
2. Create a new batch job and data flow, called Alpha_Marketing_Offer_Job and Alpha_Marketing_Offer_DF respectively, and a new global variable $G_Num_to_Qual.
3. In the job workspace, define a script to define the global variable and attach the script to the data flow.
4. Define the data flow with the customer table from the Alpha datastore as a source, a template table as a target, and two Query transforms between the source and target.
5. Execute Alpha_Marketing_Offer_Job with the default properties and view the results.
Unit 5
Solution 10
Create a Custom Function
Business Example
The Marketing department would like to send special offers to customers who have placed a specified number of orders. This can be done by creating a custom function that must be called when a customer order is placed. You want to create a custom function to accept the input parameters of the Customer ID and the number of orders required to receive a special offer, check the Orders table, and then create an initial list of eligible customers.
1. In the Local Object Library, create a new custom function called CF_MarketingOffer.
a) In the Local Object Library, choose the Custom Functions tab.
b) Enter the name CF_MarketingOffer, and choose Next.
c) In the Smart Editor, choose the Variables tab.
d) Right-click Parameters and choose Insert.
e) In the Parameter Properties dialog box, enter the name $P_CustomerID.
f) In the Data type field, enter int.
g) In the Parameter type field, enter Input.
h) Choose OK.
i) Right-click Parameters and choose Insert.
j) In the Parameter Properties dialog box, enter the name $P_Orders.
k) In the Data type field, enter int.
l) In the Parameter type field, enter Input.
m) Choose OK.
n) In the workspace of the Smart Editor, enter the following code on three separate lines:
If (SQL('Alpha', 'SELECT COUNT(*) FROM orders WHERE CUSTOMERID = [$P_CustomerID]') >= $P_Orders)
Return 1;
Else Return 0;
Note:
Do not use the ifthenelse function. Type in the if function.
This code defines the custom function as a conditional clause. The conditional clause specifies that, if the number of rows in the orders table for the Customer ID is greater than or equal to the value of the parameter $P_Orders, the function should return 1. Otherwise, it should return 0.
o) Choose Validate, and make any necessary corrections.
Note:
If your function contains syntax errors, Data Services displays a list of those errors in an embedded pane below the editor. To see where the error occurs in the text, double-click an error. The Smart Editor redraws to show you the location of the error.
p) Choose OK.
2. Create a new batch job and data flow, called Alpha_Marketing_Offer_Job and Alpha_Marketing_Offer_DF respectively, and a new global variable $G_Num_to_Qual.
a) In the project area, right-click the Omega project and choose New batch job.
b) Enter the name Alpha_Marketing_Offer_Job.
c) In the tool palette, select the Data Flow icon and click in the workspace.
d) Enter the name Alpha_Marketing_Offer_DF.
e) Select the job Alpha_Marketing_Offer_Job and choose Tools → Variables.
f) Right-click Global Variables and choose Insert.
g) Right-click the new variable and choose Properties.
h) In the Global Variable Properties box, enter the name $G_Num_to_Qual.
i) In the Data type field, enter int.
j) Choose OK.
k) Close the display.
3. In the job workspace, define a script to define the global variable and attach the script to the data flow.
a) In the project area, choose the Alpha_Marketing_Offer_Job.
b) From the tool palette, choose the Script icon.
c) To place the script, click in the workspace to the left of the data flow.
d) Name the script CheckOrders.
e) To open the script, double-click it.
f) Enter the expression $G_Num_to_Qual = 5;
This creates an expression that defines the global variable as five orders to qualify for the special marketing campaign.
g) Close the script and return to the job workspace.
h) To connect the script to the data flow, select it and, while holding the mouse button, drag it to the data flow. Release the button to create the connection.
4. Define the data flow with the customer table from the Alpha datastore as a source, a template table as a target, and two Query transforms between the source and target.
a) From the Local Object Library, choose the Datastores tab.
b) In the Alpha datastore, select the customer table, drag it into the data flow workspace, and choose Make Source.
c) In the tool palette, select the Template Table icon, and click in the workspace. This adds the template table to your data flow.
d) Name the table offer_mailing_list, choose the Delta datastore, and choose OK.
e) From the tool palette, select the Query Transform icon, and click in the data flow workspace.
f) From the tool palette, again select the Query Transform icon and click in the data flow workspace.
g) To connect the source table to the first query, select the table and, while holding down the mouse button, drag it to the query. Release the button to create the connection.
h) To connect the first query to the second query, select the first query and, while holding the mouse button, drag it to the second query. Release the button to create the connection.
i) To connect the second query to the target table, select the second query and, while holding the mouse button, drag it to the target table. Release the button to create the connection.
j) Open the Query Editor for the first query, and select the following input columns from the Schema In and drag them to the Schema Out on the Query node:
• CONTACTNAME
• ADDRESS
• CITY
• POSTALCODE
k) Right-click POSTALCODE, choose New Output Column, and choose Insert Below.
l) Enter the column name OFFER_STATUS.
m) In the Datatype field, enter int and choose OK.
n) On the Mapping tab of the OFFER_STATUS column, choose Functions.
o) In the Select Function dialog box, choose the category Custom Functions, select your custom function CF_MarketingOffer, and choose Next.
p) In the Define Input Parameter(s) dialog box, in the $P_CustomerID field, choose the CUSTOMER table and then choose OK.
q) From the list of table fields, select CUSTOMERID and choose OK. You will be returned to the Function Wizard.
r) In the $P_Orders field, choose the Smart Editor icon (the button with the three dots).
s) On the Variables tab, expand the node for Global Variables, and then the node for your job.
t) Right-click the global variable $G_Num_to_Qual and choose Enter.
u) To return to the Function Wizard, choose OK.
The expression should look like this:
CF_MarketingOffer(CUSTOMERID, $G_Num_to_Qual)
v) Close the Query transform.
w) Open the second query and, in the Query Editor, select the following input columns from the Schema In and drag them to the Schema Out:
• CONTACTNAME
• ADDRESS
• CITY
• POSTALCODE
x) In the WHERE tab, enter an expression to select only those records where OFFER_STATUS has a value of one.
y) From the Schema In, select the input column OFFER_STATUS, drag it into the WHERE tab workspace, and enter = 1.
The expression should be: Query.OFFER_STATUS = 1
This will select only those records where OFFER_STATUS has a value of one.
5. Execute Alpha_Marketing_Offer_Job with the default properties and view the results.
a) In the project area, select Alpha_Marketing_Offer_Job and choose Execute. If you have unsaved changes, a Save all changes and execute dialog box opens. To continue, choose Yes.
b) To accept the default execution properties, choose OK.
c) Return to the job workspace.
d) Open the data flow workspace.
e) Right-click the target table and choose View Data.
You should have one output record for contact Lev M. Melton in Quebec.
f) Close the display.
LESSON SUMMARY
You should now be able to:
• Use variables, parameters, and scripts
Unit 5
Learning Assessment
1. All functions can be called from custom functions or scripts.
Determine whether this statement is true or false.
[ ] True
[ ] False
2. Lookup_ext has more functionality than lookup.
Determine whether this statement is true or false.
[ ] True
[ ] False
3. What is a variable?
4. What is a script?
Unit 5
Learning Assessment - Answers
1. All functions can be called from custom functions or scripts.
Determine whether this statement is true or false.
[ ] True
[x] False
2. Lookup_ext has more functionality than lookup.
Determine whether this statement is true or false.
[x] True
[ ] False
3. What is a variable?
A variable is a common component in scripts that acts as a placeholder to represent values with the potential to change each time a job is executed.
4. What is a script?
A script is a single-use object that is used to call functions and assign values in a work flow.
Platform Transforms
Lesson 1: Using Platform Transforms
Lesson 2: Using the Map Operation Transform
Exercise 11: Use the Map Operation Transform
Lesson 3: Using the Validation Transform
Exercise 12: Use the Validation Transform
Lesson 4: Using the Merge Transform
Exercise 13: Use the Merge Transform
Lesson 5: Using the Case Transform
Exercise 14: Use the Case Transform
Lesson 6: Using the SQL Transform
Exercise 15: Use the SQL Transform
UNIT OBJECTIVES
• Describe platform transforms
• Use the Map Operation transform in a data flow
• Use the validation transform
• Use the merge transform
• Use the case transform
• Use the SQL transform
Unit 6
Lesson 1
Using Platform Transforms
LESSON OVERVIEW
A platform transform enables you to control how data sets change as they move from source to target in a data flow.
Transforms operate by changing input data sets or by generating new data sets. Add transforms as components to your data flow and specify the function of the transform. Edit the input data, the output data, and the parameters in a transform.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Describe platform transforms
Platform Transforms
Business Example
Your company extracts data from external systems using flat files. The data volume from the various external systems has increased continually in the recent past, making management of the jobs for flat file extraction difficult. You can optimize this process by using Data Services to extract data directly from an external system.
Transforms are optional objects in a data flow that allow data to be transformed as it moves from source to target, as shown in the figure Data Services Transforms.
Figure 31: Data Services Transforms
(The figure shows the platform transform icons: Case, Map Operation, Merge, Query, Row Generation, SQL, and Validation.)
Platform Transform Features
After completing this unit, you can:
• Explain transforms
• Describe the platform transforms available in Data Services
• Add a transform to a data flow
• Describe the transform editor window
Explaining Transforms
Transforms are objects in data flows that operate on input data sets by changing them or by generating one or more new data sets. The Query transform is the most commonly used transform.
Transforms are added as components to a data flow in the same way as source and target objects. Each transform provides different options that are specified based on the transform's function. Choose to edit the input data, output data, and parameters in a transform.
Some transforms, for example, the Date Generation and SQL transforms, are used as source objects, and do not have input options.
It is possible to use transforms in combination to create the output data set. For example, the Table Comparison, History Preserving, and Key Generation transforms are used as independent transforms, as well as being combined for slowly changing dimensions.
Transforms and Functions
Transforms are similar to functions in that they can produce the same or similar values during processing. However, transforms and functions operate on a different scale:
• Functions operate on single values, such as values in specific columns of a data set.
• Transforms operate on data sets by creating, updating, and deleting rows of data.
Figure 32: Comparison of Transforms and Functions
(The figure contrasts a function applied to single column values with a transform applied to a whole data set of name, sequence, month, and quantity rows.)
Table 23: Describing Platform Transforms
The following platform transforms are available on the Transforms tab of the Local Object Library:
Transform        Description
Case             Divides the data from an input data set into multiple output data sets based on IF-THEN-ELSE branch logic
Map Operation    Allows conversions between operation codes and mapping of non-normal rows
Merge            Unifies rows from two or more input data sets into a single output data set
Query            Retrieves a data set that satisfies conditions that are specified. A Query transform is similar to an SQL SELECT statement
Row Generation   Generates a column filled with integers starting at zero and incrementing by one to the end value specified
SQL              Performs the indicated SQL query operation
Validation       Specifies validation criteria for an input data set. Data that fails validation can be filtered out or replaced
LESSON SUMMARY
You should now be able to:
• Describe platform transforms
Unit 6
Lesson 2
Using the Map Operation Transform
LESSON OVERVIEW
Change the operation code for records with the Map Operation transform.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Use the Map Operation transform in a data flow
The Map Operation Transform
Business Example
Your company extracts data from external systems using flat files. The data volume from various external systems has increased continually in the recent past, making it difficult to manage jobs for flat file extraction. You can optimize this process by using Data Services to extract data directly from an external system. Control how the data is to be loaded into the target, and explore the capabilities of the Map Operation transform to control the target updating.
Using the Map Operation Transform
The Map Operation transform enables you to change the operation code for records.
Table 24: Describing Map Operations
The operation codes indicate how each row in the data set is applied to a target table when the data set is loaded into a target. The operation codes are:
Operation Code   Description
NORMAL           Creates a new row in the target. All rows in a data set are flagged as NORMAL when they are extracted from a source table or file. If a row is flagged as NORMAL when loaded into a target table or file, it is inserted as a new row in the target. Most transforms operate only on rows flagged as NORMAL
INSERT           Creates a new row in the target. Only History Preserving and Key Generation transforms can accept data sets with rows flagged as INSERT as input
DELETE           Is ignored by the target. Rows flagged as DELETE are not loaded. Only the History Preserving transform, when the update rows option is selected, can accept data sets with rows flagged as DELETE
UPDATE           Overwrites an existing row in the target table. Only History Preserving and Key Generation transforms can accept data sets with rows flagged as UPDATE as input
Explaining the Map Operation Transform
The Map Operation transform allows for the changing of operation codes on data sets to produce the desired output. For example, if a row in the input data set has been updated in some previous operation of the data flow, use this transform to map the UPDATE operation to an INSERT. The result could be to convert UPDATE rows to INSERT rows to preserve the existing row in the target.
Introduction to the Map Operation Transform
As shown in the figure Introduction to the Map Operation Transform:
• Map Operation transform:
  Explicit override of operation codes
  Discard rows by operation codes
• Operation codes:
  Associated with each row
  Determine how a row affects the target
• Why override?
  Subsequent transform compatibility
  Control effect on targets
Note:
You can use mapping expressions on only top-level schemas. They do not work on nested schemas.
Figure 33: Introduction to the Map Operation Transform
(The figure shows an input data set of customer rows with operation codes such as Normal and Update, and the output data set after the Map Operation transform remaps some of the codes, for example Normal to Insert.)
Pushing Map Operation Transforms to the Source Database
Data Services can push Map Operation transforms to the source database.
The next section gives a brief description of the function, data input requirements, options, and data output results for the Map Operation transform. Input for the Map Operation transform is a data set with rows flagged with any operation codes. It can contain hierarchical data.
Use caution when using columns of datatype real in this transform, as comparison results are unpredictable for this datatype. Output for the Map Operation transform is a data set with rows flagged as specified by the mapping operations.
The Map Operation transform enables the setting of the output row type option to indicate the new operations desired for the input data set. Choose from the following operation codes: INSERT, UPDATE, DELETE, NORMAL, or DISCARD.
Unit 6
Exercise 11
Use the Map Operation Transform
Business Example
Users of employee reports have requested that employee records in the data mart contain only records for current employees. You use the Map Operation transform to change the behavior of loading so the resulting target conforms to this business requirement, by removing any employee records that contain a value in the discharge date column of the source data.
1. Create a new batch job Alpha_Employees_Current_Job with a data flow
Alpha_Employees_Current_DF, which contains a Map Operation transform.
2. Add the Map Operation transform to the data flow, change the output operation code of
NORMAL to DELETE, save all objects and execute the job.
3. Save all objects and execute the Alpha_Employees_Current_Job.
Unit 6
Solution 11
Use the Map Operation Transform
Business Example
Users of employee reports have requested that employee records in the data mart contain only records for current employees. You use the Map Operation transform to change the behavior of loading so the resulting target conforms to this business requirement, by removing any employee records that contain a value in the discharge date column of the source data.
1. Create a new batch job Alpha_Employees_Current_Job with a data flow Alpha_Employees_Current_DF, which contains a Map Operation transform.
a) In the project area, right-click the Omega project, choose New Batch Job, and change the name to Alpha_Employees_Current_Job.
b) In the workspace for the job, from the tool palette, select the Data Flow icon and click in the workspace. Enter the name Alpha_Employees_Current_DF.
c) Open the data flow workspace and, from the Alpha datastore in the Local Object Library, select the Employee table, drag it into the workspace, and choose Make Source.
d) From the HR_datamart datastore, select the EMPLOYEE table, drag it into the workspace, and choose Make Target.
e) From the tool palette, choose the Query Transform icon and click in the workspace.
f) Connect the source table to the Query transform.
g) To open the Query Editor, double-click the Query.
h) To map the EMPLOYEEID and DISCHARGE_DATE columns from the input schema to the output schema, in the Schema In, select each column and drag it to the Schema Out.
i) Select the WHERE tab.
j) From the Schema In pane, drag the DISCHARGE_DATE column into the WHERE tab workspace.
k) Complete the expression by entering is not null.
The entire expression should be:
employee.discharge_date is not null
This will select only those rows where the discharge date field is not empty.
2. Add the Map Operation transform to the data flow, change the output operation code of NORMAL to DELETE, save all objects, and execute the job.
a) In the Local Object Library, select the Transforms tab, and open the node Platform.
b) Choose the Map Operation transform and drag it into the data flow workspace.
c) Connect the Query transform to the Map Operation transform, and connect the Map Operation transform to the target table.
d) Open the Map Operation Transform Editor and, on the Map Operation tab, change the settings so that rows with Input row type Normal have an Output row type Delete.
3. Save all objects and execute the Alpha_Employees_Current_Job.
a) In the project area, right-click Alpha_Employees_Current_Job and choose Execute.
A Save all changes and execute dialog box opens. To continue, choose Yes.
b) To use the default settings, in the Execution Properties dialog box, choose OK.
c) Return to the job workspace.
d) In the data flow workspace, choose the magnifying glass button on the source table.
A large View Data pane appears beneath the current workspace area.
e) Select the magnifying glass button on the target table.
Two rows were filtered from the target table. Both of these records have discharge_date field entries.
f) Close both displays.
LESSON SUMMARY
You should now be able to:
• Use the Map Operation transform in a data flow
Unit 6
Lesson 3
Using the Validation Transform
LESSON OVERVIEW
The Validation transform enables you to create validation rules and move data into target objects based on whether they pass or fail validation.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Use the Validation transform
The Validation Transform
Business Example
Your company extracts data from external systems using flat files. The data volume from the various external systems has increased continually in the recent past, making management of the jobs for flat file extraction difficult. You can optimize this process by using Data Services to extract data directly from an external system. Order data is stored in multiple formats with different structures and different information. You want to know how to use the Validation transform to validate order data from flat file sources and the database tables before merging it.
The Validation transform enables the creation of validation rules and the moving of data into target objects based on whether they pass or fail validation.
Explaining the Validation Transform
Use the Validation transform in your data flows to ensure that the data meets your criteria. For example, set the transform to ensure that all values:
• Are within a specific range
• Have the same format
• Do not contain a NULL value
Introduction to the Validation Transform
• Qualifies data sets:
  Based on rules
  Several rules per column
  Simple (column) or complex (expression)
  New validation report output
• Actions on Failure:
  Send to Pass schema
  Send to Fail schema
  Send to Both
  Optional substitution to Pass schema
• Collect statistics:
  Validation reports in Management Console
As shown in the figure Validation Transform Editor, the Validation transform allows you to define a reusable business rule to validate each record and column. The Validation transform qualifies a data set based on rules for input schema columns, and it filters out or replaces data that fails your criteria. The available outputs are Pass and Fail. For example, if you want to load only sales records for October 2010, set up a validation rule that states: Sales Date is between 10/1/2010 and 10/31/2010.
Data Services views the date field in each record to validate whether the data meets the requirements. If it does not, choose to pass the record into a Fail table, correct it in the Pass table, or do both.
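A hedged sketch of such a condition, as it might be entered in the rule editor; the column name SALES_DATE is illustrative:

SALES_DATE >= to_date('2010.10.01', 'yyyy.mm.dd') AND SALES_DATE <= to_date('2010.10.31', 'yyyy.mm.dd')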
Figure 34: Validation Transform Editor
(The figure shows the transform editor with the input schema, the list of validation rules per column, a separate window to add or edit rules, and the list of substitution values for Send to Pass and Send to Both.)
Validation Rule
The validation rule consists of a condition and an action on failure, as shown in the figure Add/Edit Rule Editor:
• Use the condition to describe what is required for valid data. For example, specify the condition IS NOT NULL if you do not want any NULLs in data passed to the specified target.
• Use the Action on Failure area to describe what happens to invalid or failed data. Continuing with the above example, for any NULL values, select the Send to Fail option to send all NULL values to a specified FAILED target table.
Figure 35: Add/Edit Rule Editor
(The figure shows the rule editor: a validation function can have multiple parameters, each parameter is bound to a column, and rules can also be defined without validation functions using built-in conditions such as IS NULL, IS NOT NULL, LIKE, BETWEEN, Match Pattern, Exists in table, and Custom Condition.)
Conflict of Validation Rules
• Two columns are validated
• Action on Failure for one column is Send to Pass
• Action on Failure of the other is Send to Fail
• What happens to these four rows?
Table 25: Validation Rules and Actions

Rules                                                      Action
Passes both rules                                          Sent to Pass
Passes the Send to Fail rule, fails the Send to Pass rule  Sent to Pass
Passes the Send to Pass rule, fails the Send to Fail rule  Sent to Fail
Fails both rules                                           Sent to Fail
Create a Custom Validation Function
Create a custom validation function and select it when you create a validation rule.
The next section gives a brief description of the function, data input requirements, options, and data output results for the Validation transform. Only one source is allowed as a data input for the Validation transform.
The Validation transform outputs up to two different data sets, based on whether the records pass or fail the validation condition specified. Pass and fail data can be loaded into multiple targets.
Pass and Fail Output Schema
The Pass output schema is identical to the input schema. Data Services adds two columns to the Fail output schema:
• The DI_ERRORACTION column indicates where failed data is sent: the letter B is used for data sent to both the Pass and Fail outputs, and the letter F is used for data sent only to the Fail output. If you choose to send failed data to the Pass output, Data Services does not track the results. It is advisable to substitute a value for failed data that is sent to the Pass output, because Data Services does not add columns to the Pass output.
• The DI_ERRORCOLUMNS column displays all error messages for columns with failed rules. The names of input columns associated with each message are separated by colons, for example, <VALIDATION TRANSFORM NAME> FAILED RULE(S): C1: C2.
If a row has conditions set for multiple columns, and the Pass action, Fail action, and Both actions are specified for the row, then the precedence order is Fail, Both, Pass. For example, if one column's action is Send to Fail and the column fails, then the whole row is sent only to the Fail output. Other actions for other validation columns in the row are ignored.
Table 26: Validation Transform - Creating the Validation Rule
When using the Validation transform, select a column in the input schema and create a validation rule in the validation transform editor. The Validation transform offers several options for creating this validation rule:

Option                      Description
Enable validation           Turns the validation rule on and off for the column
Do not validate when NULL   Sends all NULL values to the Pass output automatically. Data Services does not apply the validation rule on this column when an incoming value for it is NULL
Condition                   Defines the condition for the validation rule
Action on Fail              Defines where a record is loaded if it fails the validation rule: Send to Fail, Send to Pass, or Send to Both. If you choose Send to Pass or Send to Both, you can choose to substitute a value or expression for the failed values that are sent to the Pass output
As shown in the figure Rule Violation Statistics, the rule violation table lists all of the rules and columns failed for each record. The field DI_ROWID, which is also added to the Fail table, allows you to link back to the original data. In this example, rows 1 and 2 each failed one validation rule (ValidZIP and ValidPhone respectively). Row 3 failed both rules. Using the rule violation table, it is possible to create queries and reports that show all rows that failed a particular rule and count the number of failures per rule, as sketched after the figure.
Figure 36: Rule Violation Statistics
(The figure shows a Fail table with NAME, ZIP, and PHONE columns plus the added DI_ERRORACTION, DI_ERRORCOLUMNS, and DI_ROWID columns, and a rule violation table listing DI_ROWID, DI_RULENAME, and DI_COLUMNNAME for each failed rule, for example ValidZIP and ValidPhone.)
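A hedged script sketch of such a query using the sql() function; the datastore, table, and rule names are illustrative:

# Count the failures recorded for one rule in the rule violation table.
$FailCount = sql('demo_target', 'SELECT COUNT(*) FROM ORDERS_RULE_VIOLATION WHERE DI_RULENAME = \'ValidZIP\'');
Print('ValidZIP failures: [$FailCount]');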
Create a Validation Rule
1. Open the data flow workspace.
2. Add the source object to the workspace.
3. On the Transforms tab of the Local Object Library, select and drag the Validation transform to the workspace, to the right of the source object.
4. Add the target objects to the workspace.
A target object may be required for records that pass validation, and an optional target object for records that fail validation, depending on the option selected.
5. Connect the source object to the transform.
6. Double-click the Validation transform to open the transform editor and configure the validation rules.
Validation Reminders
• Action on Failure only applies when a row fails the validation rule
• Send to Fail takes precedence over the other actions
• Pass output can use substituted values
• Fail output adds two columns
• Collect statistics to be viewed in the Management Console; disable at execution time for better performance
Unit 6
Exercise 12
Use the Validation Transform
Business Example
Order data is stored in multiple formats with different structures and different information. You need to learn how to use the Validation transform to validate order data from flat file sources and the Alpha Orders table before merging it.
Create a flat file format
Create a flat file format called Order_Shippers_Format for flat files containing order delivery information.
1. Create a flat file format called Order_Shippers_Format.
2. Adjust the datatypes for the columns proposed by the Designer based on their content.
Create a new batch job
Create a new batch job called Alpha_Orders_Validated_Job and two data flows, named Alpha_Orders_Files_DF and Alpha_Orders_DB_DF, in the Omega project.
1. Create a new batch job Alpha_Orders_Validated_Job with a new data flow called Alpha_Orders_Files_DF in the Omega project.
2. Create a new data flow called Alpha_Orders_DB_DF in the Alpha_Orders_Validated_Job workspace.
Design the data flow Alpha_Orders_Files_DF
Design the data flow Alpha_Orders_Files_DF with file formats, a Query transform, a Validation transform, and target template tables.
1. In the workspace for Alpha_Orders_Files_DF, add the file formats Orders_Format and Order_Shippers_Format as source objects.
2. Create a new template table Orders_Files_Work in the Delta datastore as the target object.
3. Create a new template table Orders_Files_No_Fax in the Delta datastore as the target object.
4. Create a new template table Orders_Files_Rule_Violation in the Delta datastore as the target object.
5. Add the Query transform to the workspace and connect both sources to it.
6. Add the Validation transform to the workspace to the right of the Query transform and connect them.
7. Add a validation rule to re-assign orders taken by former employees to a current employee.
8. Add a validation rule for the shipper's fax to replace any NULL values with 'No Fax'.
9. Edit the source file formats in the data flow to use all three related orders and order shippers flat files.
10. Complete the data flow Alpha_Orders_Files_DF by connecting the pass, fail, and rule violation outputs from the Validation transform to the target template tables.
Design the data flow Alpha_Orders_DB_DF
Design the data flow Alpha_Orders_DB_DF with the Orders table from the Alpha datastore, a Query transform, a Validation transform, and target template tables.
1. In the workspace for Alpha_Orders_DB_DF, add the Orders table from the Alpha datastore as a source object.
2. Create a new template table Orders_DB_Work in the Alpha_Orders_DB_DF workspace as a target object.
3. Create a new template table Orders_DB_No_Fax in the Delta datastore as a target object.
4. Create a new template table Orders_DB_Rule_Violation in the Delta datastore as the target object.
6. Add the Validation transform to t he workspace to the right of the query and connect them.
7. Add a validation rule to assign orders to a current employee if not already assigned. To open
the Transform Editor, double-click the Validation . In the input schema, choose the field
ORDER ASSIGNED TO. In the Validation Rules area. choose Add. Enter the name
Orders_Assigned_To. In the Rules area. select the Enabled checkbox. if it is not already
selected.To open the Rule Editor. select the Column Validation radio button. In the Column:
f ield, choose Query. ORDERS_ASSIGNED_TO. In the Condition: f ield, choose Exists in
table. In the next f ield. choose the HR datamart datastore, and double-click it to see the
tables. Double-click the EMPLOYEE table. choose t he EMPLOYEEID field and choose OK.
You see the expression HR_Da tamart. dbo. EMPLOYEE . EMPLOYEE ID
•
In the Action on Fail field. choose Send to Both.
This sends the field both the Pass and Fail tables.
•
170
To close the Rule Editor. choose OK.In the If any rule fails and Send to Pass, substitute with:
section. select Enabled.In the Column field, use the drop-down list to select
QUERY. ORDERS_ASSIGNED_To. In the Expression field. select the ellipsis(... ) icon and in
the Smart Editor, enter the expression • 3Cla5 •
Note:
You must use the single quotation marks before and after the string.
8. Add a validation rule for the shipper's fax to replace any NULL values with 'No Fax'.
9. Complete the data flow Alpha_Orders_DB_DF by connecting the pass, fail, and rule violation outputs from the Validation transform to the target template tables.
10. Execute the Alpha_Orders_Validated_Job and view the differences between passing and failing records.
Unit 6
Solution 12
Use the Validation Transform
Business Example
Order data is stored in multiple formats with different structures and different information. You need to learn how to use the Validation transform to validate order data from flat file sources and the Alpha Orders table before merging it.
Create a flat file format
Create a flat file format called Order_Shippers_Format for flat files containing order delivery information.
1. Create a flat file format called Order_Shippers_Format.
a) In the Local Object Library, choose the Formats tab, right-click Flat Files, and choose New.
b) In the File Format Editor, in the Type field, enter Delimited.
c) In the Name field, enter Order_Shippers_Format.
d) In the Data File(s) section, in the Location field, enter Job Server.
e) In the Root Directory field, enter D:\CourseFiles\DataServices\Activity_Source.
f) In the File name(s) field, enter Order_Shippers_04_20_07.txt.
An SAP Data Services Designer message opens: "Overwrite the current schema with the schema from the file you selected?" To close the message, choose Yes.
g) In the Delimiters section, in the Column field, enter Semicolon.
h) When prompted to overwrite the schema, choose Yes.
i) In the Input/Output section, in the Skip row header field, enter Yes.
j) When prompted to overwrite the schema, choose Yes.
2. Adjust the datatypes for the columns proposed by the Designer based on their content.
a) In the Column Attributes pane, change these field datatypes:

Column               Datatype
ORDERID              int
SHIPPERNAME          varchar(50)
SHIPPERADDRESS       varchar(50)
SHIPPERCITY          varchar(50)
SHIPPERCOUNTRY       int
SHIPPERPHONE         varchar(20)
SHIPPERFAX           varchar(20)
SHIPPERREGION        int
SHIPPERPOSTALCODE    varchar(15)
b) Choose Save & Close.
Create a new batch job
Create a new batch job called Alpha_Orders_Validated_Job and two data flows. one named
Alpha_Orders_Files_DF and Alpha_Orders_DB_DF in the Omega project.
1. Create a new batch job Alpha_Orders_Validated_Job with a new data flow called
Alpha_Orders_Files_DF in the Omega project.
a) In the Project area, right-click the Omega project name and choose New Batch Job.
b) Enter the name Alpha_Orders_Validated_Job.
The job should open automatically. If it does not, double-click it.
c) In the tool palette, choose the Data Flow icon and click in the job workspace.
d) Enter the name Alpha_Orders_Files_DF.
2. Create a new data flow called Alpha_Orders_DB_DF in the Alpha_Orders_Validated_Job
workspace.
a) In the tool palette, choose the Data Flow icon and click in the job workspace where you
want to add the data flow.
b) Enter the name Alpha_Orders_DB_DF and, on your keyboard, press the Enter key.
Design the data flow Alpha_Orders_Files_DF
Design the data flow Alpha_Orders_Files_DF with file formats, a Query transform, a Validation
transform, and target template tables.
1. In the workspace for Alpha_Orders_Files_DF, add the file formats Orders_Format and
Order_Shippers_Format as source objects.
a) To open the Alpha_Orders_Files_DF workspace, double-click it.
b) In the Local Object Library, choose the Formats tab, select the file format Orders_Format,
drag it to the data flow workspace, and choose Make Source.
c) In the Formats tab, select the file format Order_Shippers_Format, drag it to the data flow
workspace, and choose Make Source.
2. Create a new template table Orders_Files_Work in the Delta datastore as the target
object.
a) To add a new template table to the data flow, in the tool palette, choose the Template
Table icon and click the workspace.
b) In the Create Template dialog box, enter the name Orders_Files_Work.
c) In the In datastore field, enter Delta.
d) Choose OK.
3. Create a new template table Orders_Files_No_Fax in the Delta datastore as the target
object.
a) To add a new template table to the data flow, in the tool palette, choose the Template
Table icon and click the workspace.
b) In the Create Template dialog box, enter the name Orders_Files_No_Fax.
c) In the In datastore field, enter Delta.
d) Choose OK.
4. Create a new template table Orders_Files_Rule_Violation in the Delta datastore as the
target object.
a) To add a new template table to the data flow, in the tool palette, choose the Template
Table icon and click the workspace.
b) In the Create Template dialog box, enter the name Orders_Files_Rule_Violation.
c) In the In datastore field, enter Delta.
d) Choose OK.
5. Add the Query transform to the workspace and connect both sources to it.
a) To add a query to the workspace, in the tool palette, choose the Query Transform icon and
click in the data flow workspace.
b) To connect the source file formats Orders_Format and Order_Shippers_Format to the
Query, select the sources, hold down the mouse button, drag the cursor to the Query
transform, and release the mouse button.
c) To open the Query Editor, double-click the Query.
d) In the Query Editor, choose the WHERE tab.
e) In the Schema In workspace, select the field ORDER_SHIPPERS_FORMAT.ORDERID and
drag it into the WHERE workspace.
f) Enter the equal sign =.
g) To complete the expression, in the Schema In workspace, select the field
ORDERS_FORMAT.ORDERID and drag it into the WHERE workspace.
The expression should be Order_Shippers_Format.ORDERID =
Orders_Format.ORDERID.
This joins the data in the two formats on the ORDERID values.
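In SQL terms, this WHERE clause corresponds roughly to an inner join of the two file formats (an illustrative sketch only; Data Services evaluates the clause itself when the job runs):

SELECT *
FROM Orders_Format, Order_Shippers_Format
WHERE Order_Shippers_Format.ORDERID = Orders_Format.ORDERID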
h) In the Query Editor, in the Schema In workspace, choose the following fields and drag them
to the Schema Out workspace:

Input Schema            Field               Output Schema
Orders_Format           ORDERID             ORDERID
Orders_Format           CUSTOMERID          CUSTOMERID
Orders_Format           ORDERDATE           ORDERDATE
Order_Shippers_Format   SHIPPERNAME         SHIPPERNAME
Order_Shippers_Format   SHIPPERADDRESS      SHIPPERADDRESS
Order_Shippers_Format   SHIPPERCITY         SHIPPERCITY
Order_Shippers_Format   SHIPPERCOUNTRY      SHIPPERCOUNTRY
Order_Shippers_Format   SHIPPERPHONE        SHIPPERPHONE
Order_Shippers_Format   SHIPPERFAX          SHIPPERFAX
Order_Shippers_Format   SHIPPERREGION       SHIPPERREGION
Order_Shippers_Format   SHIPPERPOSTALCODE   SHIPPERPOSTALCODE

This creates the necessary mapping.
i) In the Schema Out workspace, right-click the field ORDERDATE, choose New Output
Column, and choose Insert Above.
j) Enter the field name ORDER_TAKEN_BY, with a datatype of varchar and a length of 15,
and choose OK.
k) To map ORDER_TAKEN_BY to Orders_Format.EMPLOYEEID, in the input schema, select
Orders_Format.EMPLOYEEID and drag it to the ORDER_TAKEN_BY field in the output
schema.
l) In the Schema Out workspace, right-click the field ORDERDATE, choose New Output
Column, and choose Insert Above.
m) Enter the field name ORDER_ASSIGNED_TO, with a datatype of varchar and a length of
15, and choose OK.
n) To map ORDER_ASSIGNED_TO to Orders_Format.EMPLOYEEID, in the input schema,
select Orders_Format.EMPLOYEEID and drag it to the ORDER_ASSIGNED_TO field in the
output schema.
o) Close the editor.
6. Add the Validation transform to the workspace to the right of the Query transform and
connect them.
a) In the Local Object Library, choose the Transforms tab.
b) In the Platform node, select Validation and drag it to the right of the Query in the data flow
workspace.
c) To connect the Query transform to the Validation, choose the Query and hold down the
mouse button while dragging the cursor to the Validation. Release the mouse button.
7. Add a validation rule to re-assign orders taken by former employees to a current employee.
a) To open the Validation Editor, double-click the Validation.
b) In the input schema, select the field ORDER_ASSIGNED_TO, and in the Validation Rules
area, choose Add.
c) Enter the name Orders_Assigned_To.
d) Select the Enabled checkbox, if it is not already selected.
e) Select the Column Validation radio button.
f) In the Column: field, choose Query.ORDERS_ASSIGNED_TO.
g) In the Condition: field, choose Exists in table.
The Rules Editor opens.
h) In the field, select the HR datamart datastore and double-click to see its tables.
i) Double-click the table EMPLOYEE to see its fields, choose the EMPLOYEEID field, and
choose OK.
The resulting expression should be HR_DATAMART.DBO.EMPLOYEE.EMPLOYEEID.
j) In the Action on Fail field, set the action Send to Both.
This sends records to both the Pass and Fail tables.
k) To close the Rule Editor, choose OK.
l) In the If any rule fails and Send to Pass, substitute with: section, select Enabled.
m) In the Column field, use the drop-down list to choose the field
QUERY.ORDERS_ASSIGNED_TO.
n) In the Expression field, choose the ellipsis (...) icon and, in the Smart Editor, enter the
expression '3Cla5' and choose OK.
Note:
You must use the single quotation marks before and after the string.
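In SQL terms, the rule created in step 7 behaves roughly like the following membership check (an illustrative sketch only; Data Services generates and runs the actual lookup internally):

ORDER_ASSIGNED_TO IN (SELECT EMPLOYEEID FROM HR_DATAMART.DBO.EMPLOYEE)

Rows that fail the check are sent to both outputs, and the copy sent to the Pass output carries the substitute value '3Cla5'.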
8. Add a validation rule for the shipper's fax to replace any NULL values with 'No Fax'.
a) In the input schema area, select the field SHIPPERFAX and, in the Validation Rules area,
choose Add.
b) Enter the name Shipper_Fax.
c) Select the Enabled checkbox if it is not already selected.
d) To open the Rule Editor, select the Column Validation radio button.
e) In the Column: field, choose Query.SHIPPERFAX.
f) In the Condition: field, choose IS NOT NULL.
g) In the Action on Fail field, set the action Send to Both.
h) In the If any rule fails and Send to Pass, substitute with: section, select the Enabled
checkbox.
i) In the Column field, use the drop-down list to choose the field QUERY.SHIPPERFAX.
j) In the Expression field, choose the ellipsis (...) icon. In the Smart Editor, enter the
expression 'No Fax' and choose OK.
Note:
You must use the single quotation marks before and after the string.
k) Choose OK and close the editor.
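The combined effect of this rule and its substitution resembles the Data Services nvl() function (a hedged comparison only; unlike a plain nvl() mapping, the Validation transform also routes the original failing rows to the Fail output):

nvl(Query.SHIPPERFAX, 'No Fax')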
9. Edit the source file formats in the data flow to use all three related orders and order shippers
flat files.
a) Return to the Alpha_Orders_Files_DF data flow workspace.
b) To edit the Orders_Format source object, double-click it.
c) In the Data File(s) section, change the File name(s) field to orders*.txt.
Note:
The asterisk character acts as a wildcard.
d) In the Error handling section, change the Capture Data Conversion Errors option to Yes.
Note:
Do not change any other setting in the Error Handling section.
e) Close the editor.
10. Edit the source file formats in the data flow to use all three related orders and order shippers
flat files.
a) To edit the Order_Shippers_Format source object, double-click it.
b) In the Data File(s) section, change the File name(s) field to Order_Shippers*.txt.
Note:
The asterisk character acts as a wildcard.
c) In the Error handling section, change the Capture Data Conversion Errors option to Yes.
Note:
Do not change any other setting in the Error Handling section.
d) Close the editor.
11. Complete the data flow Alpha_Orders_Files_DF by connecting the pass, fail, and rule
violation outputs from the Validation transform to the target template tables.
a) Return to the data flow workspace.
b) Select the Validation transform and drag it to the target template table
Orders_Files_Work.
c) Release the mouse button and choose Pass.
d) Select the Validation transform and drag it to the target template table
Orders_Files_No_Fax.
e) Release the mouse button and choose Fail.
f) Select the Validation transform and drag it to the target template table
Orders_Files_Rule_Violation.
g) Release the mouse button and choose Rule Violation.
Design the data flow Alpha_Orders_DB_DF
Design the data flow Alpha_Orders_DB_DF with the Orders table from the Alpha datastore, a
Query transform, a Validation transform, and target template tables.
1. In the workspace for Alpha_Orders_DB_DF, add the Orders table from the Alpha
datastore, as a source object.
a) In the Local Object Library, choose the Datastores tab.
b) In the Alpha datastore, select the Orders table, drag it to the data flow workspace, and
choose Make Source.
2. Create a new template table Orders_DB_Work in the Alpha_Orders_DB_DF workspace as a
target object.
a) To add a new template table to the data flow, in the tool palette, choose the Template
Table icon and click in the workspace.
b) In the Create Template dialog box, enter the name Orders_DB_Work.
c) In the In datastore field, choose the Delta datastore as the template table destination.
d) Choose OK.
3. Create a new template table Orders_DB_No_Fax in the Delta datastore as a target object.
a) To add a new template table to the data flow, in the tool palette, choose the Template
Table icon and click in the workspace.
b) In the Create Template dialog box, enter the name Orders_DB_No_Fax.
c) In the In datastore field, choose the Delta datastore as the template table destination.
d) Choose OK.
4. Create a new template table Orders_DB_Rule_Violation in the Delta datastore as the
target object.
a) To add a new template table to the data flow, in the tool palette, choose the Template
Table icon and click in the workspace.
b) In the Create Template dialog box, enter the name Orders_DB_Rule_Violation.
c) In the In datastore field, enter Delta.
d) Choose OK.
5. Add the Query transform to the workspace and connect it to the source table.
a) To add a Query transform to the data flow, in the tool palette, select the Query Transform
icon and click in the workspace.
b) To connect the source table to the query, select the table and, holding down the mouse
button, drag the cursor to the query. Then release the mouse button.
c) To open the Query Editor, double-click the query.
d) In the Query transform, to map all of the columns, except for EMPLOYEEID, from the input
schema to the output schema, select each input schema field and drag it to the
corresponding output schema field.
e) In the Query Editor, change the names of the following output schema columns:
Old Column Name    New Output Name
SHIPPERCITYID      SHIPPERCITY
SHIPPERCOUNTRYID   SHIPPERCOUNTRY
SHIPPERREGIONID    SHIPPERREGION
f) In the output schema, right-click the field ORDERDATE, choose New Output Column, and
choose Insert Above.
g) Name the new field ORDER_TAKEN_BY and choose datatype varchar and length 15.
h) Select ORDERS.EMPLOYEEID in the input schema and drag it to the
ORDER_TAKEN_BY field in the output schema.
This maps the new ORDER_TAKEN_BY field to the ORDERS.EMPLOYEEID field.
i) In the output schema, right-click the field ORDERDATE and choose New Output Column,
and choose Insert Above.
j) Name the new field ORDER_ASSIGNED_TO and choose datatype varchar and length 15.
k) Select ORDERS.EMPLOYEEID in the input schema and drag it to the
ORDER_ASSIGNED_TO field in the output schema.
This maps the new ORDER_ASSIGNED_TO field to the ORDERS.EMPLOYEEID
field.
l) Close the editor.
6. Add the Validation transform to the workspace to the right of the query and connect them.
a) In the Local Object Library, choose the Transforms tab.
b) Drag the Validation transform from the Platform node to the data flow workspace to the
right of the query.
c) To connect the query to the Validation transform, select the query, hold down the mouse
button, drag the cursor to the Validation transform, and release the mouse button.
7. Add a validation rule to assign orders to a current employee if not already assigned.
a) To open the Transform Editor, double-click the Validation transform.
b) In the input schema, choose the field ORDER_ASSIGNED_TO. In the Validation Rules
area, choose Add.
c) Enter the name Orders_Assigned_To.
d) In the Rules area, select the Enabled checkbox, if it is not already selected.
e) To open the Rule Editor, select the Column Validation radio button.
f) In the Column: field, choose Query.ORDERS_ASSIGNED_TO.
g) In the Condition: field, choose Exists in table.
h) In the next field, choose the HR datamart datastore, and double-click it to see the
tables.
i) Double-click the EMPLOYEE table, choose the EMPLOYEEID field, and choose OK.
You see the expression HR_Datamart.dbo.EMPLOYEE.EMPLOYEEID.
j) In the Action on Fail field, choose Send to Both.
This sends records to both the Pass and Fail tables.
k) To close the Rule Editor, choose OK.
l) In the If any rule fails and Send to Pass, substitute with: section, select Enabled.
m) In the Column field, use the drop-down list to select QUERY.ORDERS_ASSIGNED_TO.
n) In the Expression field, select the ellipsis (...) icon and, in the Smart Editor, enter the
expression '3Cla5'.
Note:
You must use the single quotation marks before and after the string.
8. Add a validation rule for the shipper's fax to replace any NULL values with 'No Fax'.
a) In the input schema area, choose the field SHIPPERFAX.
b) In the Validation Rules area, choose Add.
c) Enter the name Shipper_Fax.
d) In the Rules area, select the Enabled checkbox if it is not already selected.
e) To open the Rule Editor, select the Column Validation radio button.
f) In the Column: field, choose Query.SHIPPERFAX.
g) In the Condition: field, choose IS NOT NULL.
h) In the Action on Fail field, choose Send to Both.
i) To close the Rule Editor, choose OK.
j) In the If any rule fails and Send to Pass, substitute with: section, select the Enabled
checkbox.
k) In the Column field, use the drop-down list to select the field QUERY.SHIPPERFAX.
l) In the Expression field, select the ellipsis (...) icon and, in the Smart Editor, enter the
expression 'No Fax'. Choose OK and close the editor.
Note:
You must use the single quotation marks before and after the string.
9. Complete the data flow Alpha_Orders_DB_DF by connecting the pass, fail, and rule violation
outputs from the Validation transform to the target template tables.
a) Return to the data flow workspace.
b) Select the Validation transform and drag it to the target template table Orders_DB_Work.
Release the mouse button and choose Pass.
c) Select the Validation transform and drag it to the target template table
Orders_DB_No_Fax.
Release the mouse button and choose Fail.
d) Select the Validation transform and drag it to the target template table
Orders_DB_Rule_Violation.
Release the mouse button and choose Rule Violation.
10. Execute the Alpha_Orders_Validated_Job and view the differences between passing and
failing records.
a) In the Omega project area, right-click on the Alpha_Orders_Validated_Job and choose
Execute.
Data Services prompts you to save any objects that have not been saved. Choose OK.
b) In the Execution Properties dialog box, choose OK.
Note:
The job should execute successfully, but several errors appear in the Error Log. These
errors come from records containing values that the Designer could not convert because
of faulty data. Opening the Error Log displays the values that could not be converted.
Consequently, these records are not moved to the target tables.
c) To return to the job workspace, choose Back.
d) To open the Alpha_Orders_DB_DF data flow workspace, double-click it.
e) Right-click the target tables and choose View data.
You see the differences between the passing and failing records.
f) Close the data displays, and return to the job workspace.
g) Open the Alpha_Orders_Files_DF data flow workspace.
h) Right-click the target tables and choose View data.
You see the differences between the passing and failing records.
i) Close the data displays.
LESSON SUMMARY
You should now be able to:
• Use the Validation transform
Unit 6
Lesson 4
Using the Merge Transform
LESSON OVERVIEW
Use the Merge transform to combine incoming data sets with the same schema structure. The
merge produces a single output data set with the same schema as the input data sets.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Use the Merge transform
The Merge Transform
Business Example
Your company extracts data from external systems using flat files. The data volume from the
various external systems has increased continually in the recent past, making management of the
jobs for flat file extraction difficult. You can optimize this process by using Data Services to
extract data directly from an external system.
Explaining the Merge Transform
The Merge transform combines incoming data sets with the same schema structure to produce a
single output data set with the same schema as the input data sets. For example, use the Merge
transform to combine two sets of address data as shown in the figure The Merge Transform.
Input 1:
Name      Address
Charlie   11 Crazy Street
Daisy     13 Nelson Avenue
Megan     21 Brand Avenue

Input 2:
Name      Address
Dolly     9 Park Lane
Joe       77 Miller Street
Sid       8 Andrew Crescent

Output:
Name      Address
Charlie   11 Crazy Street
Daisy     13 Nelson Avenue
Megan     21 Brand Avenue
Dolly     9 Park Lane
Joe       77 Miller Street
Sid       8 Andrew Crescent

Figure 37: The Merge Transform
Next Section
The next section gives a brief description of the function, data input requirements, options, and
data output results for the Merge transform.
Input/Output
The Merge transform performs a union of the sources. All sources must have the same schema,
including:
• Number of columns
• Column names
• Column data types
If the input data set contains hierarchical data, the names and datatypes must match at every
level of the hierarchy.
The output data has the same schema as the source data. The output data set contains a row for
every row in the source data sets. The transform does not strip out duplicate rows. If columns in
the input set contain nested schemas, the nested data is passed without change.
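Conceptually, the Merge transform behaves like a SQL UNION ALL over its inputs (an illustrative analogy only, chosen because duplicates are kept; Input1 and Input2 stand for the two source tables in the figure):

SELECT Name, Address FROM Input1
UNION ALL
SELECT Name, Address FROM Input2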
Hint:
If you want to merge tables that do not have the same schema, add the Query
transform to one of the tables before the Merge transform to redefine the schema to
match the other table.
The Merge transform does not offer any options.
Unit 6
Exercise 13
Use the Merge Transform
Business Example
Your company extracts data from external systems using flat files. The data volume from the
various external systems has increased continually in the recent past, making management of the
jobs for flat file extraction difficult. You can optimize this process by using Data Services to
extract data directly from an external system.
You want to use the Merge transform to combine incoming data sets with the same schema
structure to produce a single output data set with the same schema as the input data sets.
The Orders data has now been validated, but the output comes from two different sources, the files and
the database tables. The next step in the process is to modify the structure of those data sets so they
match, and then merge them into a single data set for further processing. You want to explore
using the Merge transform for this task.
Modify Column Names and Data Types
Use the Query transform to modify any column names and data types, and to perform lookups
for any columns that reference other tables. Use the Merge transform to merge the validated
orders data.
1. In the Omega project, create a new batch job called Alpha_Orders_Merged_Job, containing
a data flow called Alpha_Orders_Merged_DF.
2. In the workspace for Alpha_Orders_Merged_DF, add the orders_file_work and
orders_db_work tables from the Delta datastore as the source objects.
3. Add two Query transforms to the workspace connecting each source object to its own Query
transform.
4. In the Query Editor for the query connected to the orders_files_work table, create output
columns and map input columns to output columns.
5. For the SHIPPERCITY output column, change the mapping to perform a lookup of CITYNAME
from the City table in the Alpha datastore.
6. For the SHIPPERCOUNTRY output column, change the mapping to perform a lookup of
COUNTRYNAME from the Country table in the Alpha datastore.
7. For the SHIPPERREGION output column, change the mapping to perform a lookup of
REGIONNAME from the Region table in the Alpha datastore.
8. In the Query Editor for the query connected to the orders_db_work table, create output
columns and map input columns to output columns.
9. For the SHIPPERCITY output column, change the mapping to perform a lookup of CITYNAME
from the City table in the Alpha datastore.
10. For the SHIPPERCOUNTRY output column, change the mapping to perform a lookup of
COUNTRYNAME from the Country table in the Alpha datastore.
11. For the SHIPPERREGION output column, change the mapping to perform a lookup of
REGIONNAME from the Region table in the Alpha datastore.
Merge the data from the Query transforms
Merge the data from the Query transforms into a template table called Orders_Merged in the
Delta datastore using a Merge transform.
1. Add a Merge transform to the data flow and connect both Query transforms to the Merge
transform.
2. Execute the Alpha_Orders_Merged_Job with the default execution properties.
Unit 6
Solution 13
Use the Merge Transform
Business Example
Your company extracts data from external systems using flat files. The data volume from the
various external systems has increased continually in the recent past, making management of the
jobs for flat file extraction difficult. You can optimize this process by using Data Services to
extract data directly from an external system.
You want to use the Merge transform to combine incoming data sets with the same schema
structure to produce a single output data set with the same schema as the input data sets.
The Orders data has now been validated, but the output comes from two different sources, the files and
the database tables. The next step in the process is to modify the structure of those data sets so they
match, and then merge them into a single data set for further processing. You want to explore
using the Merge transform for this task.
Modify Column Names and Data Types
Use the Query transform to modify any column names and data types, and to perform lookups
for any columns that reference other tables. Use the Merge transform to merge the validated
orders data.
1. In the Omega project, create a new batch job called Alpha_Orders_Merged_Job, containing
a data flow called Alpha_Orders_Merged_DF.
a) In the Project area, right-click the Omega project name and choose New Batch Job.
b) Enter the job name Alpha_Orders_Merged_Job and, on your keyboard, press the Enter
key.
c) To add the data flow, in the tool palette, choose the Data Flow icon and click in the
workspace.
d) Enter the data flow name Alpha_Orders_Merged_DF and, on your keyboard, press the
Enter key.
e) To open the data flow workspace, double-click it.
2. In the workspace for Alpha_Orders_Merged_DF, add the orders_file_work and
orders_db_work tables from the Delta datastore as the source objects.
a) In the Local Object Library, choose the Datastores tab, and expand the Delta datastore.
b) Select the orders_file_work table, drag it to the data flow workspace, and choose Make
Source.
c) Select the orders_db_work table, drag it to the data flow workspace, and choose Make
Source.
Note:
It is not necessary to designate these template tables as sources, because
once they are loaded with data successfully, they can be used only as source
tables in other data flows.
3. Add two Query transforms to the workspace connecting each source object to its own Query
transform.
a) To add the query to the data flow, in the tool palette, choose the Query Transform icon and
click in the workspace.
b) Add a second query to the workspace.
c) To connect the source table orders_files_work to the first query, select the source
table, hold down the mouse button, drag the cursor to the query, and release the mouse
button.
d) Connect the source table orders_db_work to the second query.
4. In the Query Editor for the query connected to the orders_files_work table, create output
columns and map input columns to output columns.
a) To open the Query Editor, double-click the query.
b) In the Schema In workspace, select each field, and drag it to the Schema Out workspace.
This creates output columns, and also maps the input schema columns to output schema
columns.
c) In the Column Attributes pane, change the datatype for the following Schema Out columns:

Column              Type
SHIPPERCOUNTRY      varchar(50)
SHIPPERREGION       varchar(50)
SHIPPERADDRESS      varchar(100)
SHIPPERPOSTALCODE   varchar(50)
5. For the SHIPPERCITY output column, change the mapping to perform a lookup of CITYNAME
from the City table in the Alpha datastore.
a) In the output schema, select SHIPPERCITY, and, in the Mapping tab, delete the existing
expression by highlighting it and using the Delete button on your keyboard.
b) Choose the Functions... button.
The Select Function dialog box opens.
c) In the Function Categories field, choose Lookup Functions, in the Function Names field,
choose lookup_ext, and choose Next.
d) In the Lookup_ext - Select Parameters dialog box, enter the following parameters:
Field/Option                Value
Lookup table                Alpha.dbo.city
Condition:
Columns in lookup table     CITYID
Op.(&)                      =
Expression                  Orders_Files_Work.SHIPPERCITY
Output:
Column in lookup table      CITYNAME
e) Choose Finish.
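After you choose Finish, the wizard writes a lookup_ext expression into the Mapping tab. The generated call looks roughly like the following (a sketch only; the exact argument list, cache option, and table qualifiers vary by Data Services version and datastore naming):

lookup_ext([Alpha.dbo.city, 'PRE_LOAD_CACHE', 'MAX'], [CITYNAME], [NULL],
[CITYID, '=', Orders_Files_Work.SHIPPERCITY])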
6. For the SHIPPERCOUNTRY output column, change the mapping to perform a lookup of
COUNTRYNAME from the Country table in the Alpha datastore.
a) In the output schema, select SHIPPERCOUNTRY and, in the Mapping tab, delete the
existing expression.
b) Choose the Functions... button.
c) In the Function Categories field, choose Lookup Functions, in the Function Names field,
choose lookup_ext, and choose Next.
d) In the Lookup_ext - Select Parameters dialog box, enter the following parameters:
Field/Option                Value
Lookup table                Alpha.dbo.country
Condition:
Columns in lookup table     COUNTRYID
Op.(&)                      =
Expression                  Orders_Files_Work.SHIPPERCOUNTRY
Output:
Column in lookup table      COUNTRYNAME
e) Choose Finish.
7. For the SHIPPERREGION output column, change the mapping to perform a lookup of
REGIONNAME from the Region table in the Alpha datastore.
a) In the output schema, select SHIPPERREGION and, in the Mapping tab, delete the existing
expression.
b) Choose the Functions... button.
c) In the Function Categories field, choose Lookup Functions, in the Function Names field,
choose lookup_ext, and choose Next.
d) In the Lookup_ext - Select Parameters dialog box, enter the following parameters:
Field/Option                Value
Lookup table                Alpha.dbo.region
Condition:
Columns in lookup table     REGIONID
Op.(&)                      =
Expression                  Orders_Files_Work.SHIPPERREGION
Output:
Column in lookup table      REGIONNAME
e) Choose Finish.
f) Close the editor.
8. In the Query Editor for the query connected to the orders_db_work table, create output
columns and map input columns to output columns.
a) To open the Query Editor, double-click the query.
b) In the Schema In workspace, select each field and drag it to the Schema Out workspace.
This creates output columns, and also maps input schema columns to output schema
columns.
c) Change the datatype for the Schema Out columns as follows:

Column           Type
ORDERDATE        date
SHIPPERCITY      varchar(50)
SHIPPERCOUNTRY   varchar(50)
SHIPPERREGION    varchar(50)
SHIPPERFAX       varchar(20)

The SHIPPERFAX column is in a different position in the orders_db_work table than it is
in the Orders_File_Work table.
d) To move the SHIPPERFAX column, select the column in the Schema Out, right-click, and
choose Cut.
e) Right-click SHIPPERREGION, choose Paste, and choose Insert Above.
Note:
Ensure you cut the column and not the column name.
9. For the SHIPPERCITY output column, change the mapping to perform a lookup of CITYNAME
from the City table in the Alpha datastore.
a) In the output schema, select SHIPPERCITY, and, in the Mapping tab, delete the existing
expression.
b) Choose the Functions... button.
c) In the Function Categories field, choose Lookup Functions, in the Function Names field,
choose lookup_ext, and choose Next.
d) In the Lookup_ext - Select Parameters dialog box, enter the following parameters:
Field/Option                Value
Lookup table                Alpha.dbo.city
Condition:
Columns in lookup table     CITYID
Op.(&)                      =
Expression                  Orders_DB_Work.SHIPPERCITY
Output:
Column in lookup table      CITYNAME
e) Choose Finish.
10. For the SHIPPERCOUNTRY output column, change the mapping to perform a lookup of
COUNTRYNAME from the Country table in the Alpha datastore.
a) In the output schema, select SHIPPERCOUNTRY, and, in the Mapping tab, delete the existing
expression.
b) Choose the Functions... button.
c) In the Function Categories field, choose Lookup Functions, in the Function Names field,
choose lookup_ext, and choose Next.
d) In the Lookup_ext - Select Parameters dialog box, enter the following parameters:
Field/Option                Value
Lookup table                Alpha.dbo.country
Condition:
Columns in lookup table     COUNTRYID
Op.(&)                      =
Expression                  Orders_DB_Work.SHIPPERCOUNTRY
Output:
Column in lookup table      COUNTRYNAME
e) Choose Finish.
11. For the SHIPPERREGION output column, change the mapping to perform a lookup of
REGIONNAME from the Region table in the Alpha datastore.
a) In the output schema, select SHIPPERREGION, and, in the Mapping tab, delete the
existing expression by highlighting it and using the Delete button on your keyboard.
b) Choose the Functions... button.
c) In the Function Categories field, choose Lookup Functions, in the Function Names field,
choose lookup_ext, and choose Next.
d) In the Lookup_ext - Select Parameters dialog box, enter the following parameters:
Field/Option                Value
Lookup table                Alpha.dbo.region
Condition:
Columns in lookup table     REGIONID
Op.(&)                      =
Expression                  Orders_DB_Work.SHIPPERREGION
Output:
Column in lookup table      REGIONNAME
e) Choose Finish.
f) Close the editor.
Merge the data from the Query transforms
Merge the data from the Query transforms into a template table called Orders_Merged in the
Delta datastore using a Merge transform.
1. Add a Merge transform to the data flow and connect both Query transforms to the Merge
transform.
a) In the Local Object Library, choose the Transforms tab.
b) Expand the Platform node, select Merge, and drag it to the data flow workspace, to the
right of the Query transforms.
c) To connect both Query transforms to the Merge transform, select each query and, holding
down the mouse button, drag the cursor to the Merge transform and release the mouse
button.
d) To open the Transform Editor, double-click the Merge transform.
Note:
At this point, check that the fields in both input schemas are in identical order.
This is a prerequisite for the Merge transform to merge the schemas.
e) Close the editor.
f) To add a new Query transform to the data flow, in the tool palette, choose the Query
Transform icon and click in the data flow workspace.
g) Connect the Merge transform to the Query transform.
h) To open the Query Editor, double-click the query.
i) To map all the columns from the input schema to the output schema, in the Schema In,
select all columns and drag them to the Schema Out.
j) Close the editor.
k) To add a new template table to the data flow, choose the Template Table icon and click in
the data flow workspace.
l) In the Create Template dialog box, enter the table name Orders_Merged, choose the Delta
datastore as the template table destination, and choose OK.
m) Connect the Query transform to the target template table Orders_Merged.
2. Execute the Alpha_Orders_Merged_Job with the default execution properties.
a) In the Omega project area, right-click the Alpha_Orders_Merged_Job and choose
Execute.
Data Services prompts you to save any objects that have not been saved.
b) In the Save all changes and execute dialog box, choose Yes.
The Execution Properties dialog box appears.
c) To execute the job using default properties, choose OK.
d) Go back to the job workspace.
e) Open the data flow workspace, right-click the target table, and choose View data.
Note that the SHIPPERCITY, SHIPPERCOUNTRY, and SHIPPERREGION columns for the
approximately 360 records in the template table have names rather than ID values.
Note:
The number of records forwarded through the Validation transform
determines how many records will be merged.
f) Close the display.
LESSON SUMMARY
You should now be able to:
• Use the Merge transform
Unit 6
Lesson 5
Using the Case Transform
LESSON OVERVIEW
Use the Case transform to simplify branch logic in data flows by consolidating case or
decision-making logic into one transform. Divide a data set into smaller sets based on logical
branches.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Use the Case transform
The Case Transform
Business Example
Your company extracts data from external systems using flat files. The data volume from the
various external systems has increased continually in the recent past, making management of the
jobs for flat file extraction difficult. You can optimize this process by using Data Services to
extract data directly from an external system. You want to use the Case transform to simplify
branch logic in data flows by consolidating case or decision-making logic into one transform. The
transform allows you to divide a data set into smaller sets based on logical branches.
The Case transform supports separating data from a source into multiple targets based on
branch logic.
Explaining the Case Transform
Use the Case transform to simplify branch logic in data flows by consolidating case or
decision-making logic into one transform. The transform allows for the dividing of a data set
into smaller sets based on logical branches.
Case Transform Example
Use the Case transform to read a table that contains sales revenue facts for different regions. It
separates the regions into their own tables for more efficient data access:
• Simplifies data flow branching logic:
Uses a single transform
Separates input data rows into multiple output data sets
• Expression table:
Labels identify output paths and connect to target objects
Expressions associated with labels can be simple, such as REGIONID = 1, or complex, such as
SUBSTR(FIRST_NAME, 1, 3) = 'FIR'
• Additional parameters:
Default path
True for one case only
Preserves expression order
The next section gives a brief description of the functions, data input requirements, options, and
data output results for the Case transform.
Only one data flow source is allowed as a data input for the Case transform. Depending on the
data, only one of multiple branches is executed per row. The input and output schema are also
identical when using the Case transform.
The connections between the Case transform and objects used for a particular case are labeled.
Each output label in the Case transform is used at least once.
Connect the output of the Case transform with another object in the workspace. Each label
represents a case expression (WHERE clause).
Table 27: Comparison: Case and Validation
The table illustrates the comparison between the Case and Validation transforms, as both
transforms can result in more than one output data set:

Case Transform:
• Schema In = Schema Out
• Multiple outputs allowed
• Expressions created manually

Validation Transform:
• For the Fail output, two additional columns are added
• Only two outputs (Pass/Fail)
• Integrated with the dashboard in the Management Console
• Rules can be generated automatically
Table 28: Options
The Case transform offers several options:

• Label: Define the name of the connection that describes the path for data if the
corresponding Case condition is true.
• Expression: Define the Case expression for the corresponding label.
• Produce default option with label: Specify that the transform uses the expression in the label
when all other Case expressions are false.
• Row can be TRUE for one case only: Specify that the transform passes each row to the first
case whose expression returns true.
Create a Case Statement
1. Drag the Case transform to the workspace to the right of your source object.
2. Add the target objects to the workspace.
One target object is required for each possible condition in the case statement.
3. Connect the source object to the transform.
4. In the Parameters area of the transform editor, select Add to add a new expression.
5. In the Label field, enter a label for the expression.
6. Select and drag an input schema column to the Expression pane at the bottom of the window.
7. Define the expression of the condition.
8. To direct records that do not meet any defined conditions to a separate target object, select
the Produce default option with label option and enter the label name in the associated field.
9. To direct records that meet multiple conditions to only one target, select the Row can be TRUE
for one case only option.
In this case, records placed in the target are associated with the first condition that evaluates
as true.
Case Transform Reminders
• Default output is similar to CASE...ELSE (or OTHERWISE).
• All possible outputs are mapped downstream.
• If there is no default output, rows that do not meet any conditions are dropped.
• When a row can be true for more than one case, check Preserve expression order.
• Enable Preserve expression order to control the sequence of the tests when rows can match
overlapping logic.
Otherwise, tests are performed in an order determined by the Data Services optimization
engine and might not concur with the desired business rules.
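For example, consider two overlapping case expressions (hypothetical labels and column, for illustration only):

HIGH: Query.AMOUNT >= 1000
ALL: Query.AMOUNT >= 0

With Row can be TRUE for one case only and Preserve expression order enabled, a row with AMOUNT = 1500 goes only to the HIGH target, because HIGH is tested first. Without preserved order, the optimization engine might test ALL first and route the row there instead.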
Unit 6
Exercise 14
Use the Case Transform
Business Example
The Orders data has been validated and merged from two different sources, flat files and
database tables. Now the resulting data set must be partitioned by quarter for reporting
purposes. You must use the Case transform to set up the various conditions to partition the
merged data into the appropriate quarterly partitions.
1. In the Omega project, create a new batch job Alpha_Orders_By_Quarter_Job with a new
data flow called Alpha_Orders_By_Quarter_DF.
2. In the workspace for Alpha_Orders_By_Quarter_DF, add the Orders_Merged table from
the Delta datastore as the source object.
3. Add the Query transform to the data flow workspace between the source and target.
4. In the Query Editor, create output columns and map all columns from input to output.
5. Add the Case transform to the workspace to the right of the Query transform and connect
them.
6. In the Case Editor, create the labels and associated expressions for the partitioned fiscal
quarter 4 in the year 2006 and quarters 1-4 in the year 2007.
7. Add six template tables Orders_Q4_2006, Orders_Q1_2007, Orders_Q2_2007,
Orders_Q3_2007, Orders_Q4_2007, and default_output to the Delta datastore as
output tables for the Case transform and connect them to the Case transform.
8. Execute the Alpha_Orders_By_Quarter_Job with the default execution properties.
Unit 6
Solution 14
Use the Case Transform
Business Example
The Orders data has been validated and merged from two different sources, flat files and
database tables. Now the resulting data set must be partitioned by quarter for reporting
purposes. You must use the Case transform to set up the various conditions to partition the
merged data into the appropriate quarterly partitions.
1. In the Omega project, create a new batch job Alpha_Orders_By_Quarter_Job with a new
data flow called Alpha_Orders_By_Quarter_DF.
a) In the Project area, right-click the Omega project name and choose New Batch Job.
b) Enter the job name Alpha_Orders_By_Quarter_Job and, on your keyboard, press the
Enter key.
c) To open the job Alpha_Orders_By_Quarter_Job workspace, double-click it.
d) To add the data flow, in the tool palette, choose the Data Flow icon and click in the job
workspace.
e) Enter the data flow name Alpha_Orders_By_Quarter_DF and, on your keyboard, press
the Enter key.
f) To open the Alpha_Orders_By_Quarter_DF workspace, double-click it.
2. In the workspace for Alpha_Orders_By_Quarter_DF, add the Orders_Merged table from
the Delta datastore as the source object.
a) In the Local Object Library, choose the Datastores tab.
b) Expand the Delta datastore, expand Tables, select the Orders_Merged table, drag it to the
workspace, and choose Make Source.
3. Add the Query transform to the data flow workspace between the source and target.
a) To add a query to the data flow, in the tool palette, choose the Query Transform icon and
click in the Alpha_Orders_By_Quarter_DF workspace.
b) To connect the source table to the Query, select the source table, hold down the mouse
button, drag the cursor to the Query, and release the mouse button.
4. In the Query Editor, create output columns and map all columns from input to output.
a) To open the Query Editor, double-click the query.
b) In the Schema In workspace, to select all the fields, select the first field, hold down the
Shift key, and select the last field.
c) Drag the selected fields from the Schema In to the Schema Out workspace.
d) In the Schema Out workspace, right-click the last column field and choose New Output
Column.
e) In the dialog box, choose Insert Below, enter the column name ORDERQUARTER with Data
Type int, and choose OK.
f) In the Schema Out workspace, right-click ORDERQUARTER and choose New Output
Column.
g) In the dialog box, choose Insert Below, enter the item name ORDERYEAR with Data Type
varchar(4), and choose OK.
h) Select the field ORDERQUARTER in the output schema, and, in the Mapping tab, choose the
Functions button.
The Select Function dialog box opens.
i) In the Functions Categories field, choose Date Functions, in the Function name field,
choose quarter, and choose Next.
j) In the Define Input Parameters dialog box, select the dropdown arrow to the right of the
Input date field, and select the Orders_Merged table.
k) From the Orders_Merged table, select the ORDERDATE field, choose OK, and,
in the next dialog box, choose Finish.
l) Select the ORDERYEAR field in the output schema, and, in the Mapping tab, choose the
Functions button.
m) In the Functions Categories field, choose Conversion Functions, in the Function name field,
choose to_char, and choose Next.
n) In the Define Input Parameters dialog box, select the dropdown arrow to the right of the
Input date or number field, and select the Orders_Merged table.
o) From the Orders_Merged table, select the ORDERDATE field.
p) In the Format string field, enter 'YYYY' and choose OK.
Hint:
Remember to put in the single quotation marks before and after the string
YYYY.
q) In the next dialog box, choose Finish, and close the editor.
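The two mappings end up as simple function calls on the ORDERDATE column. The Mapping tab should now show expressions along these lines (a sketch; the exact table qualifier depends on your schema names):

ORDERQUARTER: quarter(Orders_Merged.ORDERDATE)
ORDERYEAR: to_char(Orders_Merged.ORDERDATE, 'YYYY')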
5. Add the Case transform to the workspace to the right of the Query transform and connect
them.
a) In the Local Object Library, choose the Transforms tab, and expand the Platform node.
b) To add the Case transform, select Case and drag it into the data flow workspace.
c) To connect the Query transform to the Case transform, select the Query transform, hold
down the mouse button, drag the cursor to the Case transform, and release the mouse
button.
6. In the Case Editor, create the labels and associated expressions for the partitioned fiscal
quarter 4 in the year 2006 and quarters 1-4 in the year 2007.
a) To open the Case Editor, double-click the Case transform.
b) To add a new expression, in the Case tab of the Case Editor, choose the Add button.
c) In the Label field, enter the label Q42006 for the expression.
d) In the input schema, select the ORDERQUARTER column, drag it to the Expression
workspace at the bottom of the window, and type = 4 and.
e) To complete the expression for the first condition, in the input schema, select the
ORDERYEAR column, drag it to the Expression workspace at the bottom of the window, and
type = '2006'.
The expression should appear as:
Query.ORDERQUARTER = 4 and Query.ORDERYEAR = '2006'
f) Repeat steps b to e for the following expressions:

Label    Expression
Q12007   Query.ORDERQUARTER = 1 and Query.ORDERYEAR = '2007'
Q22007   Query.ORDERQUARTER = 2 and Query.ORDERYEAR = '2007'
Q32007   Query.ORDERQUARTER = 3 and Query.ORDERYEAR = '2007'
Q42007   Query.ORDERQUARTER = 4 and Query.ORDERYEAR = '2007'
g) To direct records that do not meet any defined conditions to a separate target object,
confirm that the Produce default output with label checkbox is selected, and that the label
name default is entered in the associated field.
h) To direct records that might meet multiple conditions to only one target, confirm that the
Row can be TRUE for one case only checkbox is selected.
In this case, records are placed in the target associated with the first condition that
evaluates as true.
i) Return to the data flow workspace.
7. Add six template tables Orders_Q4_2006, Orders_Q1_2007, Orders_Q2_2007,
Orders_Q3_2007, Orders_Q4_2007, and default_output to the Delta datastore as
output tables for the Case transform and connect them to the Case transform.
a) To add a new template table to the data flow, in the tool palette, choose the Template
Table icon and click in the workspace.
b) In the Create Template dialog box, in the Table Name field, enter Orders_Q4_2006.
c) In the In datastore drop-down list, choose the Delta datastore as the template table
destination target and choose OK.
d) Repeat steps a to c for the next five tables, using the following data:
Label            Template Table Name   Datastore
Q42006           Orders_Q4_2006        Delta
Q12007           Orders_Q1_2007        Delta
Q22007           Orders_Q2_2007        Delta
Q32007           Orders_Q3_2007        Delta
Q42007           Orders_Q4_2007        Delta
default_output   default_output        Delta
e) Connect the output from the Case transform to the target template tables.
Repeat this step for each of the template tables.
8. Execute the Alpha_Orders_By_Quarter_Job with the default execution properties.
a) In the Omega project area, right-click the Alpha_Orders_By_Quarter_Job and choose Execute.
Data Services prompts you to save any objects that have not been saved.
b) In the Save all changes and execute dialog box, choose Yes.
The Execution Properties dialog box appears.
c) To execute the job using the default execution properties, in the Execution Properties dialog box, choose OK.
d) Return to the job workspace.
e) Open the data flow workspace.
f) Right-click the target table Orders_Q1_2007 and choose View Data.
g) Confirm that there are 103 orders that were placed in fiscal quarter one of 2007.
h) Close the data display.
LESSON SUMMARY
You should now be able to:
• Use the Case transform
Unit 6
Lesson 6
Using the SQL Transform
LESSON OVERVIEW
Use the SQL transform to submit SQL commands that generate data to be moved into target
objects.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Use the SQL transform
The SQL Transform
Business Example
Your company extracts data from external systems using flat files. The data volume from the various external systems has increased continually in the recent past, making management of the jobs for flat file extraction difficult. Optimize this process by using Data Services to extract data directly from an external system. Use the SQL transform to submit SQL commands that generate data to be moved into target objects when other transforms do not meet business requirements.
Explaining the SQL Transform
Use this transform to perform standard SQL operations when other built-in transforms cannot perform them. The SQL transform can run general SELECT statements and can also invoke stored procedures and views.
Use the SQL transform as a replacement for the Merge transform when dealing with database
tables. The SQL transform performs more efficiently because the merge is pushed down to the
database. However, this functionality cannot be used when source objects include file formats.
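As a hedged illustration of this pattern, the SQL text below unions two database tables inside one SQL transform instead of merging two sources in the data flow; the table and column names are assumptions, not objects from the course repository:

-- Union pushed down to the database in place of a Merge transform.
-- CUSTOMERS_US and CUSTOMERS_EU are hypothetical tables with identical schemas.
SELECT CUSTOMERID, FIRSTNAME, LASTNAME FROM CUSTOMERS_US
UNION ALL
SELECT CUSTOMERID, FIRSTNAME, LASTNAME FROM CUSTOMERS_EU

UNION ALL mirrors the Merge transform's behavior of keeping duplicate rows; use UNION instead if duplicates should be removed.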
The next section gives a brief description of the functions, data input requirements, options, and data output results for the SQL transform.
Inputs/Outputs
There is no input data set for the SQL transform. There are two ways of defining the output
schema for an SQL transform if the SQL submitted is expected to return a result set:
• Automatic: After you type the SQL statement, select Update schema to execute a select statement against the database that obtains the column information returned by the select statement and populates the output schema.
• Manual: Output columns are defined in the output portion of the SQL transform if the SQL operation returns a data set. The number of columns defined in the output of the SQL transform must equal the number of columns returned by the SQL query. The column names and data types of the output columns do not need to match the column names or data types in the SQL query.
Table 29: SQL Transform Options

Option            Description
Datastore         Specify the datastore for the tables referred to in the SQL statement.
Database type     Specify the type of database for the datastore where there are multiple datastore configurations.
Array fetch size  Indicate the number of rows retrieved in a single request to a source database. The default value is 1000.
Cache             Hold the output from this transform in memory for use in subsequent transforms. Use the Cache option only if the data set is small enough to fit in memory.
SQL text          Enter the text of the SQL query.
Create SQL Statement
1. On the Transforms tab of the Local Object Library, select and drag the SQL transform to the workspace.
2. Add your target object to the workspace.
3. Connect the transform to the target object.
4. Double-click the SQL transform to open the transform editor.
5. In the Parameters area, select the source datastore from the Datastore drop-down list.
6. In the SQL text area, enter the SQL statement. For example, to copy the entire contents of a table into the target object, use the statement: SELECT * FROM Customers
7. Select Update Schema to update the output schema with the appropriate values.
Unit 6
Exercise 15
Use the SQL Transform
Business Example
Your company extracts data from external systems using flat files. The data volume from the various external systems has increased continually in the recent past, making management of the jobs for flat file extraction difficult. You can optimize this process by using Data Services to extract data directly from an external system.
You use the SQL transform to submit SQL commands to generate data to be moved into target objects where other transforms do not meet business requirements.
The contents of the Employee and Department tables must be merged, so you use the SQL
transform to merge the tables.
1. In the Omega project, create a new batch job called Alpha_Employees_Dept_Job
containing a data flow called Alpha_Employees_Dept_DF.
2. Add an SQL transform to the data flow and connect it to the Emp_Dept table from the
HR_datamart datastore as the target object.
3. In the transform editor for the SQL transform, specify the source datastore and tables.
4. Execute the Alpha_Employees_Dept_Job with the default execution properties.
Unit 6
Solution 15
Use the SQL Transform
Business Example
Your company extracts data from external systems using flat files. The data volume from the various external systems has increased continually in the recent past, making management of the jobs for flat file extraction difficult. You can optimize this process by using Data Services to extract data directly from an external system.
You use the SQL transform to submit SQL commands to generate data to be moved into target
objects where other transforms do not meet business requirements.
The contents of the Employee and Department tables must be merged, so you use the SQL
transform to merge the tables.
1. In the Omega project, create a new batch job called Alpha_Employees_Dept_Job
containing a data flow called Alpha_Employees_Dept_DF.
a) In the Project area, right-click the project name and choose New Batch Job.
b) Enter the job name Alpha_Employees_Dept_Job, and, on your keyboard, press the
Enter key.
c) To open the job Alpha_Employees_Dept_Job, double-click it.
d) To add a new data flow to the Alpha_Employees_Dept_Job, in the tool palette, choose the Data Flow icon and click in the workspace.
e) Enter the data flow name Alpha_Employees_Dept_DF, and, on your keyboard, press the Enter key.
f) To open the data flow workspace, double-click the data flow.
2. Add an SQL transform to the data flow and connect it to the Emp_Dept table from the
HR_datamart datastore as the target object.
a) In the Local Object Library, select the Transforms tab, and expand the Platform node.
b) To add the SQL transform, select SQL and drag it to the data flow workspace.
c) In the Local Object Library, select the Datastores tab and expand the HR_Datamart
datastore.
d) Select the EMP_DEPT table, drag it to the data flow workspace, and choose Make Target.
3. In the transform editor for the SQL transform, specify the source datastore and tables.
a) To open the Transform Editor, double-click the SQL transform.
b) On the SQL tab, in the Datastore field, use the drop-down list to choose the Alpha datastore.
c) In the Database type field, use the drop-down list to choose Sybase ASE 15.x.
d) In the SQL text workspace, enter the following expression:
SELECT employee.EMPLOYEEID, employee.FIRSTNAME, employee.LASTNAME,
department.DEPARTMENTNAME
FROM employee, department
WHERE employee.DEPARTMENTID = department.DEPARTMENTID
This SQL statement selects the last name and first name of the employee from the Employee table, and the department to which the employee belongs. It looks up the value in the Department table based on the Department ID.
e) To create the output schema, choose the Update schema button.
This creates the output column fields.
f) Right-click the EMPLOYEEID column and choose Primary Key.
g) Close the editor.
h) To connect the SQL transform to the target table, select the SQL transform, hold down the mouse button, drag the cursor to the target table, and release the mouse button.
4. Execute the Alpha_Employees_Dept_Job with the default execution properties.
a) In the Omega project area, right-click the Alpha_Employees_Dept_Job and choose Execute.
Data Services prompts you to save any objects that have not been saved.
b) In the Save all changes and execute dialog box, choose Yes.
The Execution Properties dialog box appears.
c) To execute the job using default properties, choose OK.
d) Return to the job workspace.
e) To open the data flow workspace, double-click the data flow.
f) Right-click the target table and choose View data.
You should have 40 rows in your target table because 8 employees in the Employee table have department IDs that are not defined in the Department table, and the join excludes those rows.
g) Close the display.
LESSON SUMMARY
You should now be able to:
• Use the SQL transform
Unit 6
Learning Assessment
1. Which one of these statements about transforms is true?
Choose the correct answer.
☐ A Transforms are necessary objects in a data flow.
☐ B The Query transform is the most commonly used transform.
☐ C Transforms operate on single values, such as values in specific columns of a data set.
☐ D It is not possible to use transforms in combination to create an output data set.
2. What does the Map Operation transform enable you to do?
3 . You can use mapping expressions on top-level schemas only.
Determine whether this statement is true or false.
☐ True
☐ False
4. Match the Validation transform rules with their descriptions.
Match the item in the first column to the corresponding item in the second column.
Rules:
• Enable validation
• Do not validate when NULL
• Condition

Descriptions:
• Define the condition for the validation rule
• Send all NULL values to the Pass output automatically. Data Services does not apply the validation rule on this column when an incoming value for it is NULL
• Turn the validation rule on and off for the column
5. The Merge transform combines incoming data sets with the same schema structure to
produce a single output data set with the same schema as the input data sets.
Determine whether this statement is true or false.
☐ True
☐ False
6 . Which of these statements about Case transforms are correct?
Choose the correct answers.
☐ A The Case transform supports separating data from a source into multiple targets based on branch logic.
☐ B The Case transform allows for the dividing of a data set into smaller sets based on logical branches.
☐ C Multiple data flow sources are allowed as a data input for the Case transform.
☐ D The connections between the Case transform and objects used for a particular case are not labeled.
7. When would you use the SQL transform instead of the Merge transform?
Unit 6
Learning Assessment - Answers
1. Which one of these statements about transforms is true?
Choose the correct answer.
☐ A Transforms are necessary objects in a data flow.
☒ B The Query transform is the most commonly used transform.
☐ C Transforms operate on single values, such as values in specific columns of a data set.
☐ D It is not possible to use transforms in combination to create an output data set.
2. What does the Map Operation transform enable you to do?
The Map Operation transform enables you to change the operation code for records.
3 . You can use mapping expressions on top-level schemas only.
Determine whether this statement is true or false.
☒ True
☐ False
4. Match the Validation transform rules with their descriptions.
Match the item in the first column to the corresponding item in the second column.
Enable validation: Turn the validation rule on and off for the column.
Do not validate when NULL: Send all NULL values to the Pass output automatically. Data Services does not apply the validation rule on this column when an incoming value for it is NULL.
Condition: Define the condition for the validation rule.
5. The Merge transform combines incoming data sets with the same schema structure to
produce a single output data set with the same schema as the input data sets.
Determine whether this statement is true or false.
☒ True
☐ False
6 . Which of these statements about Case transforms are correct?
Choose the correct answers.
☒ A The Case transform supports separating data from a source into multiple targets based on branch logic.
☒ B The Case transform allows for the dividing of a data set into smaller sets based on logical branches.
☐ C Multiple data flow sources are allowed as a data input for the Case transform.
☐ D The connections between the Case transform and objects used for a particular case are not labeled.
7. When would you use the SQL transform instead of the Merge transform?
Use the SQL transform as a replacement for the Merge transform when dealing with database
tables. The SQL transform performs more efficiently because the merge is pushed down to
the database. This functionality cannot be used when source objects include file formats.
Error Handling

Lesson 1: Setting Up Error Handling
Exercise 16: Create an Alternative Work Flow
UNIT OBJECTIVES
• Explain the levels of data recovery strategies
Unit 7
Lesson 1
Setting Up Error Handling
LESSON OVERVIEW
Resolve issues if a Data Services job execution is not successful, for example, if a server failure prevents the completion of the job. Use recoverable work flows and try/catch blocks to recover data for sophisticated error handling.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Explain the levels of data recovery strategies
Recovery Mechanisms
Business Example
If a Data Services job does not complete properly, resolve the problems that prevented the
successful execution of the job.
Avoiding Data Recovery Situations
The best solution to data recovery situations is to avoid them. Some situations, such as server failures, are unavoidable, but others can easily be sidestepped by constructing jobs so that they take into account the issues that frequently cause them to fail.
One example is when an external file is required to run a job. In this situation, use the wait_for_file function, or a while loop with the file_exists function, to check that the file exists in a specified location before executing the job.
The while loop is a single-use object that is used in a work flow. The while loop repeats a sequence of steps as long as a condition is true.
Typically, the steps done during the while loop result in a change in the condition so that the condition is eventually no longer satisfied and the work flow exits from the while loop. If the condition does not change, the while loop does not end.
For example, you might want a work flow to wait until the system writes a particular file. Use a while loop to check for the existence of the file using the file_exists function. As long as the file does not exist, the work flow can go into sleep mode for a particular length of time before checking again.
As the system might never write the file, add another check to the loop, such as a counter, to ensure that the while loop eventually exits. In other words, change the while loop to check for the existence of the file and the value of the counter. As long as the file does not exist and the counter is less than a particular value, repeat the while loop. In each iteration of the loop, put the work flow in sleep mode and then increment the counter.
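A minimal sketch of this pattern in the Data Services scripting language, assuming a hypothetical file path and a counter variable defined for the job:

# Check for the file every 60 seconds, but give up after 10 attempts.
# The path and $G_Counter are assumptions for illustration only.
$G_Counter = 0;
while (file_exists('/data/inbound/orders.txt') = 0 and $G_Counter < 10)
begin
   sleep(60000);   # sleep() takes milliseconds
   $G_Counter = $G_Counter + 1;
end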
Describing Levels of Data Recovery Strategies
When a job fails to complete successfully during execution, data flows may not have completed. Some tables may have been loaded, partially loaded, or altered:
• Recover entire database
• Automatic recovery: recover from partially loaded jobs
• Table comparison: recover from partially loaded tables
• Validation transform: recover missing values or rows
• Alternative work flows: manage exceptions
Data movement jobs are designed so that you can recover data by rerunning the job and retrieving all the data without introducing duplicate or missing data.
Data Recovery and Recovery Strategies
There are different levels of data recovery and recovery strategies, which allow you to:
• Recover your entire database: Use the standard RDBMS services to restore a crashed data cache to an entire database. This option is outside the scope of this course.
• Recover a partially loaded job: Use automatic recovery.
• Recover from partially loaded tables: Use the Table Comparison transform, do a full replacement of the target, use the auto-correct load feature, or include a preload SQL command to avoid duplicate loading of rows when recovering from partially loaded tables.
• Recover missing values or rows: Use the Validation transform or the Query transform with WHERE clauses to identify missing values, and use overflow files to manage rows that could not be inserted.
• Define alternative work flows: Use conditionals, try/catch blocks, and scripts to ensure that all exceptions are managed in a work flow. Depending on the relationships between data flows in your application, you may use a combination of these techniques to recover from exceptions.
Note:
Some recovery mechanisms are for use in production systems and are not supported
in development environments.
Recovery Units
In some cases, steps in a work flow depend on each other and are executed together. When there is a dependency, designate the work flow as a recovery unit. This requires the entire work flow to complete successfully. If the work flow does not complete successfully, Data Services executes
the entire work flow during recovery, including the steps that executed successfully in prior work flow runs.
Conversely, you may need to specify that a work flow or data flow should only execute once.
When this setting is enabled, the job never re-executes the object. We do not recommend
marking a work flow or data flow as Execute only once if the parent work flow is a recovery unit.
Specify Work Flow as a Recovery Unit
1. In the Project area or on the Work Flows tab of the Local Object Library, right-click the work flow and select Properties from the menu.
The Properties dialog box displays.
2. On the General tab, select the Recover as a unit check box.
3. Select OK.
Specify Object Executes Only Once
1. In the Project area or on the appropriate tab of the Local Object Library, right-click the work
flow or data flow and select Properties from the menu.
The Properties dialog box displays.
2. On the General tab, select the Execute only once check box.
3. Select OK.
Recovery Mode
If a job with automatic recovery enabled fails during execution, you can execute the job again in recovery mode. During recovery mode, Data Services retrieves the results for successfully completed steps and reruns incomplete or failed steps under the same conditions as the original job.
In recovery mode, Data Services executes the steps or recovery units that did not complete
successfully in a previous execution. This includes steps that failed and steps that generated an
exception but completed successfully, such as those in a try/catch block. As in normal job
execution, Data Services executes the steps in parallel if they are not connected in the work flow
diagrams and in serial if they are connected.
For example, suppose a daily update job that runs overnight successfully loads dimension tables
in a warehouse. However, while the job is running, the database log overflows and stops the job
from loading fact tables. The next day, you truncate the log file and run the job again in recovery
mode. The recovery job does not reload the dimension tables in a failed job because the original
job, even though it failed, successfully loaded the dimension tables.
Fact Tables
To ensure that the fact tables are loaded with the data that corresponds properly to the data
already loaded in the dimension tables, ensure that:
• The recovery job uses the same extraction criteria that the original job used when loading the dimension tables.
If the recovery job uses new extraction criteria, such as basing data extraction on the current
system date, the data in the fact tables will not correspond to the data previously extracted
into the dimension tables. If the recovery job uses new values, the job execution may follow a
completely different path with conditional steps or try/catch blocks.
• The recovery job must follow the exact execution path that the original job followed. Data Services records any external inputs to the original job so that the recovery job can use these stored values and follow the same execution path.
Enable Automatic Recovery in a Job
1. In the Project area, right-click the job and select Execute from the menu.
The Execution Properties dialog box displays.
2. On the Parameters tab, select the Enable recovery check box.
If this check box is not selected, Data Services does not record the results from the steps during the job and cannot recover the job if it fails.
3. Select OK.
Recover from Last Execution
1. In the Project area, right-click the job that failed and select Execute from the menu.
The Execution Properties dialog box displays.
2. On the Parameters tab, select the Recover from last execution check box.
This option is not available when a job has not yet been executed, the previous job run
succeeded, or recovery mode was disabled during the previous run.
3. Select OK.
Partially Loaded Data
Executing a failed job again may result in duplication of rows that were loaded successfully during
the first job run.
Recoverable Work Flow
Within the recoverable work flow, several methods can be used to ensure that duplicate rows are not inserted:
• Include the Table Comparison transform (available in Data Services packages only) in the data flow when the table has more rows and fewer fields, such as fact tables.
• Change the target table options to replace the target table during each execution. This technique can be optimal when the changes to the target table are numerous compared to the size of the table.
• Change the target table options to use the auto-correct load feature when a table has fewer rows and more fields, such as dimension tables. The auto-correct load checks the target table for existing rows before adding new rows to the table. Using the auto-correct load option,
however, can slow jobs executed in nonrecovery mode. Consider this technique when the target table is large and the changes to the table are relatively few.
• Include an SQL command to execute before the table loads. Preload SQL commands can remove partial database updates that occur during incomplete execution of a step in a job. Typically, the preload SQL command deletes rows based on a variable that is set before the partial insertion step began, as in the sketch after this list.
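As a hedged sketch of the preload technique, a command like the following could run before the load; the table, column, and literal value are assumptions:

-- Hypothetical preload SQL command: remove rows left behind by an
-- incomplete load before the step reloads them.
DELETE FROM SALES_FACT
WHERE LOAD_DATE = '2008-03-01'  -- in practice, supplied by a variable set before the load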
Missing Values or Rows
Missing values that are introduced into the target data during data integration and data quality processes can be managed using the Validation or Query transforms.
Missing rows are rows that cannot be inserted into the target table. For example, rows may be missing in instances where a primary key constraint is violated. Overflow files help to process this type of data problem.
When you specify an overflow file and Data Services cannot load a row into a table, Data Services writes the row to the overflow file instead. The trace log indicates the data flow in which the load failed and the location of the file. Use the overflow information to identify invalid data in the source or problems introduced in the data movement. Every new run overwrites the existing overflow file.
Use an Overflow File in a Job
1. Open the target table editor for the target table in the data flow.
2. On the Options tab, under Error handling, select the Use overflow file check box.
3. In the File name field, enter or browse to the full path and file name for the file.
When you specify an overflow file, give a full path name to ensure that Data Services creates a unique file when more than one file is created in the same job.
4. In the File format drop-down list, select what you want Data Services to write to the file about the rows that failed to load:
• If you select Write data, you can use Data Services to specify the format of the error-causing records in the overflow file.
• If you select Write sql, you can use the commands to load the target manually when the target is accessible.
Defining Alternative Work Flows
Set up jobs to use alternative work flows that cover all possible exceptions and have recovery mechanisms built in. This technique allows you to automate the process of recovering results.
Figure 38: Alternative Workflow with Try/Catch Blocks. The figure shows a recovery work flow whose opening script sets the $G_Recovery_Needed global variable to the same value as the flag in the recovery status table: 1 if recovery is needed; 0 if recovery is not needed.
Alternative Work Flow Components
Alternative work flows consist of several components, as shown in the figure Alternative
Workflow with Try/Catch Blocks:
1. A script to determine when a recovery is required.
This script reads the value in a status table and populates a global variable with the same
value. The initial value is set to indicate that a recovery is not required.
2. A conditional that calls the appropriate work flow based on whether recovery is required.
The conditional contains an If/Then/Else statement specifying that work flows that do not require recovery are processed one way, and those that do require recovery are processed another way.
3. A work flow with a try/catch block to execute a data flow without recovery.
The data flow where recovery is not required is set up without the auto correct load option set.
This ensures that, wherever possible, the data flow is executed in a less resource-intensive
mode.
4. A script in the catch object to update the status table.
The script specifies that recovery is required if any exceptions are generated.
5. A work flow to execute a data flow with recovery and a script to update the status table.
The data flow is set up for more resource-intensive processing that will resolve the
exceptions. The script updates the status table to indicate that recovery is not required.
Conditionals
Conditionals are single-use objects used to implement conditional logic in a work flow, as shown in the figure Workflow with Conditional Decision.
Figure 39: Workflow with Conditional Decision. The figure shows a conditional that evaluates IF $i < 1: when the expression is TRUE, one branch sets $i = 10; when it is FALSE, the other branch sets $i = 0.
Table 30: Define a Conditional
When a conditional is defined, specify a condition and two logical branches:

Statement  Description
If         A Boolean expression that evaluates to TRUE or FALSE. Use functions, variables, and standard operators to construct the expression.
Then       Work flow element to execute if the IF expression evaluates to TRUE.
Else       Work flow element to execute if the IF expression evaluates to FALSE.

Both the Then and Else branches of the conditional can contain any object that you can have in a work flow, including other work flows, data flows, nested conditionals, try/catch blocks, scripts, and so on.
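Rendered in Data Services script syntax, the logic that a conditional implements graphically looks roughly like this sketch; the variable and the messages are assumptions:

# Branch on a recovery flag set earlier in the job.
if ($G_Recovery_Needed = 0)
begin
   print('Recovery not needed: run the standard work flow.');
end
else
begin
   print('Recovery needed: run the auto-correcting work flow.');
end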
Try/Catch Blocks
A try/catch block allows you to specify alternative work flows if errors occur during job execution.
Try/catch blocks catch classes of errors, apply provided solutions, and continue execution.
For each catch in the try/catch block, you can specify:
• One exception or group of exceptions handled by the catch. To handle more than one exception or group of exceptions, add more catches to the try/catch block.
• The work flow to execute if the indicated exception occurs. Use an existing work flow or define a work flow in the catch editor.
If an exception is thrown during the execution of a try/catch block, and if no catch is looking for that exception, then the exception is handled by normal error logic.
Using Try/Catch Blocks and Automatic Recovery
Data Services does not save the result of a try/catch block for reuse during recovery. If an exception is thrown inside a try/catch block, during recovery, Data Services executes the step that threw the exception and subsequent steps.
Since the execution path with the try/catch block might be different in the recovered job, using variables set in the try/catch block could alter the results during automatic recovery.
For example, suppose that you create a job that defines the value of variable $i within a try/catch block. If an exception occurs, set an alternate value for $i. Subsequent steps are based on the new value of $i.
During the first job execution, the first work flow contains an error that generates an exception, which is caught. However, the job fails in the subsequent work flow, as shown in the figure Workflow First Execution Captures Error.
Figure 40: Workflow First Execution Captures Error. An error occurs while processing the first work flow; an exception is thrown and caught in the first execution.
Conditional Changes Execution Path
Fix the error and run the job in recovery mode. During the recovery execution, the first work flow no longer generates the exception. Thus the value of variable $i is different, and the job selects a different subsequent work flow, producing different results, as shown in the figure Conditional Changes Execution Path.
Figure 41: Conditional Changes Execution Path. No exception is thrown in the recovery execution, so the execution path changes because of the results from the try/catch block.
To ensure proper results with automatic recovery when a job contains a try/catch block, do not
use values set inside the try/catch block or reference output variables from a try/catch block in
any subsequent steps.
Unit 7
Exercise 16
Create an Alternative Work Flow
Business Example
With the influx of new employees resulting from Alpha's acquisition of new companies, the Employee Department information needs to be updated regularly. Because this information is used for payroll, it is critical that there is no loss of records if a job is interrupted. You need to set up the job in a way that exceptions are always managed. This involves setting up a conditional that executes a less resource-intensive update of the table first. If that generates an exception, the conditional then tries a version of the same data flow that is configured to auto correct the load.
Set up a job Alpha_Employees_Dept_Recovery_Job with a try/catch block and conditional to catch exceptions in the execution of a data flow Alpha_Employees_Dept_DF. Exceptions cause the conditional to execute a different version of the same data flow, Alpha_Employees_Dept_AC_DF, configured with auto correction.
1. Replicate the data flow Alpha_Employees_Dept_DF as Alpha_Employees_Dept_AC_DF in the Local Object Library and reconfigure the target table in Alpha_Employees_Dept_AC_DF for auto correction.
2. In the Omega project, create a new batch job called
Alpha_Employees_Dept_Recovery_Job and a new global variable
$G_Recovery_Needed.
3. In the workspace of the Alpha_Employees_Dept_Recovery_Job, add a work flow called Alpha_Employees_Dept_Recovery_WF.
4. In the Alpha_Employees_Dept_Recovery_WF workspace, add a script called GetStatus and construct an expression to update the value of the global variable $G_Recovery_Needed to the same value as in the recovery_flag column in the recovery_status table in the HR_datamart.
5. In the work flow workspace, add a Conditional called Alpha_Employees_Dept_Con connected to the script.
6. Configure the Conditional as an "if" statement that determines which data flow to execute based upon the value of the global variable $G_Recovery_Needed.
7. Execute Alpha_Employees_Dept_Recovery_Job with the default properties.
Unit 7
Solution 16
Create an Alternative Work Flow
Business Example
With the influx of new employees resulting from Alpha's acquisition of new companies, the Employee Department information needs to be updated regularly. Because this information is used for payroll, it is critical that there is no loss of records if a job is interrupted. You need to set up the job in a way that exceptions are always managed. This involves setting up a conditional that executes a less resource-intensive update of the table first. If that generates an exception, the conditional then tries a version of the same data flow that is configured to auto correct the load.
Set up a job Alpha_Employees_Dept_Recovery_Job with a try/catch block and conditional to catch exceptions in the execution of a data flow Alpha_Employees_Dept_DF. Exceptions cause the conditional to execute a different version of the same data flow, Alpha_Employees_Dept_AC_DF, configured with auto correction.
1. Replicate the data flow Alpha_Employees_Dept_DF as Alpha_Employees_Dept_AC_DF in the Local Object Library and reconfigure the target table in Alpha_Employees_Dept_AC_DF for auto correction.
a) In the Local Object Library, select the Data Flows tab, right-click the Alpha_Employees_Dept_DF data flow, and choose Replicate.
b) To change the name of the replicated data flow to Alpha_Employees_Dept_AC_DF, right-click the data flow, choose Rename, enter the new name, and, on your keyboard, press the Enter key.
c) To open Alpha_Employees_Dept_AC_DF, in the Local Object Library, select the Data Flows tab, and double-click the Alpha_Employees_Dept_AC_DF data flow.
d) To open the Target Table Editor, double-click the target table Emp_Dept.
e) In the Target Table Editor, select the Options tab.
f) Change the value in the Auto correct load field from No to Yes.
g) Return to the data flow workspace.
2. In the Omega project, create a new batch job called
Alpha_Employees_Dept_Recovery_Job and a new global variable
$G_Recovery_Needed.
a) In the project area, right-click the Omega project, choose New batch job, and enter the job name Alpha_Employees_Dept_Recovery_Job.
b) In the project area, select the job Alpha_Employees_Dept_Recovery_Job, and, from the main menu, choose Tools → Variables.
c) Right-click Global Variables and choose Insert.
d) Right-click the new global variable, choose Properties, and, in the Global Variable Properties dialog box, in the Name field, enter $G_Recovery_Needed.
e) In the Data type dropdown list, choose int and choose OK.
f) Close the Variables and Parameters editor.
3. In the workspace of the Alpha_Employees_Dept_Recovery_Job, add a work flow called Alpha_Employees_Dept_Recovery_WF.
a) In the tool palette, select the Work Flow icon, click in the job workspace, and enter the name Alpha_Employees_Dept_Recovery_WF.
b) To open the work flow workspace, double-click Alpha_Employees_Dept_Recovery_WF.
4. In the Alpha_Employees_Dept_Recovery_WF workspace, add a script called GetStatus and construct an expression to update the value of the global variable $G_Recovery_Needed to the same value as in the recovery_flag column in the recovery_status table in the HR_datamart.
a) To add a script to the Alpha_Employees_Dept_Recovery_WF workspace, in the tool palette, choose the Script icon, and click in the workspace.
b) Name the script GetStatus.
c) To open the script, double-click it.
d) Type in the following expression:
$G_Recovery_Needed = sql('HR_Datamart', 'SELECT RECOVERY_FLAG FROM RECOVERY_STATUS');
This expression updates the value of the global variable to the value in the recovery_flag column in the recovery_status table in the HR_datamart.
e) Close the script and return to the work flow workspace.
5. In the work flow workspace, add a Conditional called Alpha_Employees_Dept_Con connected to the script.
a) In the tool palette, choose the Conditional icon, and click in the work flow workspace.
b) Enter the name of the Conditional, Alpha_Employees_Dept_Con, and, on your keyboard, press the Enter key.
c) To connect the script and the conditional, select the script, hold down the mouse button, drag it to the Conditional, and release the mouse button.
d) To open the Conditional Editor, double-click the Conditional.
6. Configure the Conditional as an "if" statement that determines which data flow to execute
based upon the value of the global variable $G_Recovery_Needed.
a) In the Alpha_Employees_Dept_Con Conditional Editor, in the "if" statement field, enter the expression: $G_Recovery_Needed = 0.
This "if" statement states that recovery is not required.
b) In the tool palette, choose the Try icon, click in the "Then" area of the Conditional Editor, and enter the Try name Alpha_Employees_Dept_Try.
c) In the Local Object Library, select the data flow Alpha_Employees_Dept_DF and drag the data flow into the "Then" pane of the Conditional Editor.
d) Connect the Alpha_Employees_Dept_Try to Alpha_Employees_Dept_DF.
e) In the tool palette, choose the Catch icon, click in the "Then" pane of the Conditional Editor, and enter the name Alpha_Employees_Dept_Catch.
f) Connect the Alpha_Employees_Dept_Catch to Alpha_Employees_Dept_DF.
g) To open the Catch Editor, double-click Alpha_Employees_Dept_Catch.
h) To add a script to the catch, in the tool palette, choose the Script icon, click in the lower pane of the Catch Editor workspace, enter the script name Recovery_Fail, and, on your keyboard, press the Enter key.
i) Double-click the Recovery_Fail script and enter the following expression:
sql('HR_Datamart', 'UPDATE RECOVERY_STATUS SET RECOVERY_FLAG = 1');
This expression updates the flag in the recovery status table to 1, indicating that recovery is needed.
j) Close the script.
k) In the Local Object Library, select the Data Flows tab, select Alpha_Employees_Dept_AC_DF, and drag it to the "Else" pane of the Conditional Editor.
l) In the tool palette, choose the Script icon, click in the "Else" pane of the Conditional Editor to the right of the data flow, and enter the script name Recovery_Pass.
m) Double-click the Recovery_Pass script and enter the expression:
sql('HR_Datamart', 'UPDATE RECOVERY_STATUS SET RECOVERY_FLAG = 0');
This expression updates the flag in the recovery status table to 0, indicating that recovery is not needed.
n) Close the script.
o) Connect Alpha_Employees_Dept_AC_DF to the script Recovery_Pass. The script should be downstream from the data flow.
7. Execute Alpha_Employees_Dept_Recovery_Job with the default properties.
a) In the project area, select Alpha_Employees_Dept_Recovery_Job and choose Execute.
Data Services prompts you to save any objects that have not been saved.
b) In the Save all changes and execute dialog box, choose Yes.
The Execution Properties dialog box appears.
c) To execute the job using default properties, choose OK.
Note:
The trace log indicates that the data flow generated an error, but the job completed successfully because of the try/catch block. An error log indicating a primary key conflict in the target table was generated.
d) Execute the Alpha_Employees_Dept_Recovery_Job a second time with the default properties.
Note:
The job succeeds and the data flow used was Alpha_Employees_Dept_AC_DF.
LESSON SUMMARY
You should now be able to:
• Explain the levels of data recovery strategies
Unit 7
Learning Assessment
1. Identify the correct data recovery strategy for the data you want to recover.
Match the item in the first column to the corresponding item in the second column.
Strategies:
• Recover your entire database
• Recover a partially loaded job
• Recover missing values or rows

Descriptions:
• Use the standard RDBMS services to restore a crashed data cache to an entire database.
• Use automatic recovery.
• Use the Validation transform or the Query transform with WHERE clauses to identify missing values, and use overflow files to manage rows that could not be inserted.
Unit 7
Learning Assessment - Answers
1. Identify the correct data recovery strategy for the data you want to recover.
Match the item in the first column to the corresponding item in the second column.

Recover your entire database: Use the standard RDBMS services to restore a crashed data cache to an entire database.
Recover a partially loaded job: Use automatic recovery.
Recover missing values or rows: Use the Validation transform or the Query transform with WHERE clauses to identify missing values, and use overflow files to manage rows that could not be inserted.
Changes in Data

Lesson 1: Capturing Changes in Data
Lesson 2: Using Source-Based Change Data Capture (CDC)
Exercise 17: Use Source-Based Change Data Capture (CDC)
Lesson 3: Using Target-Based Change Data Capture (CDC)
Exercise 18: Use Target-Based Change Data Capture (CDC)
UNIT OBJECTIVES
• Update data which changes slowly over time
• Use source-based change data capture (CDC)
• Use target-based CDC
For Any SAP / IBM / Oracle - Materials Purchase Visit : www.erpexams.com OR Contact Via Email Directly At : sapmaterials4u@gmail.com
Unit 8
Lesson 1
Capturing Changes in Data
LESSON OVERVIEW
The design of your data warehouse must take into account how you are going to handle changes in your target system when the respective data in your source system changes.
Slowly Changing Dimensions (SCD) are dimensions in data warehouses that have data which changes over time. Use Data Integrator transforms to manage this data.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Update data which changes slowly over time
Slowly Changing Dimensions (SCD)
Slowly Changing Dimensions (SCDs) are dimensions with data relationships that change over time.
Slowly Changing Dimensions Management
There are three ways to manage slowly changing dimensions:
• SCD Type 1: No history preservation
• SCD Type 2: Unlimited history preservation
• SCD Type 3: Limited history preservation
SCD Type 1: No History Preservation
Find and update the appropriate attributes on a specific dimensional record. For example, update one record in the SALES_PERSON_DIMENSION table to show a change to an individual's SALES_PERSON_NAME field.
This action updates the record for all fact records across time. In a dimensional model, facts have
no meaning until you link them with their dimensions. If you change a dimensional attribute
without appropriately accounting for the time dimension, the change becomes global across all
fact records.
Table 31: SCD Type 1 Before Data Change
The table shows the data before an SCD Type 1 change:

SALES_PERSON_KEY  SALES_PERSON_ID  NAME         SALES_TEAM
15                000120           Doe, John B  Northwest
Table 32: SCD Type 1 After Data Change
The table shows the data when the salesperson's name has been changed:

SALES_PERSON_KEY  SALES_PERSON_ID  NAME           SALES_TEAM
15                000120           Smith, John B  Northwest

Updating a dimensional record in place changes it for all previous facts as well. If, for example, the salesperson's sales team were updated this way, the salesperson would appear to have always belonged to the new sales team, which may cause issues in terms of reporting sales numbers for both teams. If you want to preserve an accurate history of who was on which sales team, SCD Type 1 is not appropriate because it does not preserve history.
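In SQL terms, a Type 1 change is a simple in-place update of the dimension row; a minimal sketch using the names from the tables above:

-- SCD Type 1: overwrite the attribute; no history is kept.
UPDATE SALES_PERSON_DIMENSION
SET NAME = 'Smith, John B'
WHERE SALES_PERSON_ID = '000120';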
SCD Type 2: Unlimited History Preservation
SCD Type 2 resolves most of the issues related to slowly changing dimensions:
• Generates new rows for significant changes.
• Requires the use of a unique key. The key relates to facts/time.
• Optional Effective_Date field. Set a start_time and end_time column.
• Optional IsActive field. Filter by IsActive = 'Y' to view current rows versus expired rows.
With a Type 2 change, you do not need to make structural changes to the SALES_PERSON_DIMENSION table. Instead, you add a record.
Table 33: SCD Type 2 Before Data Change
The table shows the data before an SCD Type 2 change:

SALES_PERSON_KEY  SALES_PERSON_ID  NAME         SALES_TEAM
15                000120           Doe, John B  Northwest
Table 34: SCD Type 2 After Data Change
When you implement a Type 2 change, two records appear, as shown in this table:

SALES_PERSON_KEY  SALES_PERSON_ID  NAME         SALES_TEAM
15                000120           Doe, John B  Northwest
133               000120           Doe, John B  Southeast
SCD Type 3: Limited History Preservation
To implement an SCD Type 3 change, change the dimension structure so that it renames the existing attribute and adds two attributes, one to record the new value and one to record the date of the change.
SCD Type 3 Disadvantages
A Type 3 implementation has three disadvantages:
• You can preserve only one change per attribute, such as old and new or first and last.
• Each Type 3 change requires a minimum of one additional field per attribute and another additional field if you want to record the date of the change.
• Although the dimension's structure contains all the data needed, the SQL code required to extract specific information can be complex.
SCD Type 3 can store a change in data, but cannot accommodate multiple changes or adequately serve the need for summary reporting.
Table 35: Transactional Source Table Before SCD Type 3 Data Change
The table shows the data in the transactional source table before the SCD Type 3 change:

SALES_PERSON_KEY  SALES_PERSON_ID  NAME            SALES_TEAM
15                000120           Smith, John B.  Northwest
Table 36: Source Table After SCD Type 3 Data Change
The table shows the data in the source table after the SCD Type 3 change:

SALES_PERSON_KEY  SALES_PERSON_ID  NAME            SALES_TEAM
15                000120           Smith, John B.  Southeast
Table 37: Transactional Target Table Before SCD Type 3 Data Change
The table shows the data in the transactional target table before the SCD Type 3 change:

SALES_PERSON_KEY  SALES_PERSON_ID  NAME            SALES_TEAM  OLD_TEAM  EFF_TO_DATE
15                000120           Smith, John B.  Northwest   NULL      NULL
Table 38: Target Table After SCD Type 3 Data Change
The table shows that the new columns have been added and the salesperson's sales team has been changed:

SALES_PERSON_KEY  SALES_PERSON_ID  NAME            SALES_TEAM  OLD_TEAM   EFF_TO_DATE
15                000120           Smith, John B.  Southeast   Northwest  Oct_31_2004
Change Data Capture (CDC)
When you have a large amount of data to update regularly and a small amount of system down time for scheduled maintenance on a data warehouse, you must choose the best method for a delta load or for updating your data over time.
Full Refresh and Changed Data Capture
You can choose to do a full refresh of your data, or you can extract only new or modified data to update the target system:
Full Refresh
Full refresh is easy to implement and easy to manage. This method ensures that no data is overlooked or left out due to technical or programming errors. Use full refresh to perform a delta load to a target system in an environment with a manageable amount of source data.
Changed Data Capture
When an initial load is complete, you can extract only new or modified data and update the target system. Identifying and loading only changed data is called Changed Data Capture (CDC). CDC is recommended for large tables.
Changed Data Capture Benefits
• Improves performance, because the job takes less time to process with less data to extract, transform, and load.
• The target system tracks change history, so data can be correctly analyzed over time.
History Preserving and Surrogate Keys
History preservation allows the data warehouse or data mart to maintain the history of data in dimension tables so that you can analyze it over time.
For example, if a customer moves from one sales region to another, simply updating the customer record to reflect the new region would give you misleading results in an analysis by region over time. All purchases made by the customer before the move would be attributed incorrectly to the new region.
The solution is a new record for the customer that reflects the new sales region so that you can preserve the previous record. In this way, accurate reporting is available for both sales regions. Data Services is set up to support this by treating all changes to records as INSERT rows by default.
You also need to manage the constraint issues in your target tables that arise when you have more than one record in your dimension tables for a single entity, such as a customer or an employee.
For example, with your sales records, the Sales Representative ID is the primary key and is used to link that record to all of the representative's sales orders. If you try to add a new record with the same primary key, it causes an exception. On the other hand, if you assign a new Sales Representative ID to the new record for that representative, you compromise your ability to report accurately on the representative's total sales.
To address this issue, create a surrogate key as a new column in the target table, as shown in the figure Source Based CDC Using Surrogate Keys. The surrogate key becomes the new primary key
for the records. At the same time, you change the properties of the former primary key so that it is simply a data column.
When a new record is inserted for the same representative, a unique surrogate key is assigned, allowing you to continue to use the Sales Representative ID to maintain the link to the representative's orders.
Create surrogate keys by using the gen_row_num or key_generation function in the Query transform to create a new output column that automatically increments whenever a new record is inserted, or by using the Key Generation transform, as shown in the sketch below.
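For illustration, a key_generation mapping expression for a surrogate key output column in the Query transform might look like the following sketch. The datastore, owner, table, and column names are assumptions for the example; the function reads the current maximum key value in the named table and increments it by 1 for each new row:

# Mapping expression for the surrogate key output column (names are illustrative)
key_generation('Omega.dbo.SALES_REP_DIM', 'REP_SURR_KEY', 1)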
Figure 42: Source Based CDC Using Surrogate Keys
(The figure shows a source Sales Rep table with REP_KEY, NAME, REGION, and LAST_UPDATE columns; a target Sales Rep Dimension table with surrogate key, original key, NAME, REGION, effective date, end date, and current flag columns; and a target Sales Fact table that uses the surrogate key. The original key is kept in the dimension table for lookup of historic data.)
Comparison of Source-Based and Target-Based CDC
Setting up a full CDC solution within Data Services may not be required. Many databases now have CDC support built into them, such as Oracle, SQL Server, DB2, and SAP Sybase.
Alternatively, you can combine surrogate keys with the Map Operation transform to change all UPDATE row types to INSERT row types to capture changes.
CDC Solutions
If you do want to set up a full CDC solution, you can choose source-based and/or target-based CDC:
• Source-based CDC evaluates the source tables to determine what has changed and extracts only changed rows to load into the target tables.
• Source-based CDC is preferable to target-based CDC for performance reasons.
• Some source systems do not provide enough information to make use of source-based CDC techniques.
• Target-based CDC extracts all the data from the source, compares the source and target rows using table comparison, and then loads only the changed rows into the target.
• You can use a combination of source-based and target-based techniques.
LESSON SUMMARY
You should now be able to:
•
Update data that changes slowly over time
Unit 8
Lesson 2
Using Source-Based Change Data Capture (CDC)
LESSON OVERVIEW
Use source-based Change Data Capture to reduce the amount of data when moving a delta load.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Use source-based change data capture (CDC)
Source-Based CDC
Source-based Changed Data Capture (CDC) is the preferred method of updating data because it improves performance by extracting the fewest rows. Source-based CDC, also referred to as incremental extraction, extracts only the changed rows from the source, as shown in the figure Source-Based CDC.
Figure 43: Source-Based CDC
(The figure shows a source Sales Rep table with REP_KEY, NAME, REGION, and LAST_UPDATE columns and the corresponding target Sales Rep Dimension table with REP_SURR_KEY, REP_KEY, NAME, REGION, EFF_DATE, END_DATE, and CURR_FLAG columns, before and after a delta load that extracts only the changed rows.)
Time Stamp and Change Logs
To use source-based CDC, your source data must have some indication of the change. The indication can be a time stamp or a log file change.
• Time Stamps:
Use the time stamps in your source data to determine which rows have been added or changed since the last time data was extracted from the source. Your database tables must have at least an update time stamp to support this type of source-based CDC. Include a create time stamp as well for optimal results.
• Change Logs:
Use the information captured by the RDBMS in the log files for the audit trail to determine what data has been changed.
Time-Based CDC
Some systems have time stamps with dates and times, some with just dates, and some with monotonically increasing generated numbers. You can treat dates and generated numbers in the same manner.
Time zones are important for time stamps based on real time. You can keep track of time stamps using the nomenclature of the source system and treat both temporal and logical time stamps in the same way.
Advantages for Time-Based CDC
Time stamp-based CDC is an ideal solution to track changes if:
• There are date and time fields in the tables being updated.
• You are updating a large table that has a small percentage of changes between extracts and an index on the date and time fields.
• You are not concerned about capturing intermediate results of each transaction between extracts, for example, if a customer changes regions twice in the same day.
Disadvantages for Time-Based CDC
Time stamp-based CDC is not a suitable solution to track changes if:
• There are no time stamp columns available in the source tables to track changes.
• You have a large table with a large percentage of it changing between extracts and there is no index on the time stamps.
• You need to capture physical row deletes.
• You need to capture multiple events occurring on the same row between extracts.
Time Stamp-Based Technique
To use time stamps, add a column to your source and target tables that tracks the time stamps of rows loaded in a job. When the job executes, this column is updated along with the rest of the data. The next job then reads the latest time stamp from the target table and selects only the rows in the source table for which the time stamp is later.
This example illustrates the time stamp technique of tracking changes:
• The last load occurred at 2:00 PM on January 1, 2008.
• At that time, the source table had only one row (key=1) with a time stamp earlier than the previous load.
• Data Services loads this row into the target table with the original time stamp of 1:10 PM on January 1, 2008.
• After 2:00 PM, Data Services adds more rows to the source table.
At 3:00 PM on January 1, 2008, the job runs again. The job:
1. Reads the Last_Update field from the target table as 01/01/2008 01:10 PM.
2. Selects rows from the source table that have time stamps later than the value of Last_Update. The SQL command to select these rows is:
SELECT * FROM SOURCE WHERE LAST_UPDATE > '01/01/2008 01:10 PM'
This operation returns the second and third rows (key=2 and key=3).
3. Loads these new rows into the target table.
For time stamp-based CDC, you must create a work flow that contains:
• A script that reads the target table and sets the value of a global variable to the latest time stamp.
• A data flow that uses the global variable in a WHERE clause to filter the data.
The data flow contains a source table, a query, and a target table. The query extracts only those rows that have time stamps later than the last update. A minimal sketch of the script and the WHERE clause follows.
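The following sketch assumes a datastore named Omega, a target table EMP_DIM with a LAST_UPDATE column, a source table named SOURCE, and a global variable $G_LastUpdate of type datetime; all of these names are illustrative, and the to_date format string depends on how your database returns the time stamp:

# In the script object: read the latest loaded time stamp into the global variable
$G_LastUpdate = to_date(sql('Omega', 'select max(LAST_UPDATE) from EMP_DIM'), 'MON DD YYYY HH:MI');

# On the WHERE tab of the Query transform in the data flow:
SOURCE.LAST_UPDATE > $G_LastUpdate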
Time Stamp-Based CDC Delta Job
Follow these steps to set up a time stamp-based CDC delta job:
1. In the Variables and Parameters dialog box, add a global variable with a datatype of datetime to your job.
2. Add a script in the job workspace.
3. Construct an expression in the script workspace to:
• Select the last time the job was executed from the last update column in the table.
• Assign the actual time stamp value to the global variable.
4. Add a data flow to the right-hand side of the script using the tool palette.
5. Add the source, query transform, and target objects to the data flow workspace and connect them.
6. Right-click the surrogate key column and select the Primary Key option in the menu.
7. On the Mapping tab for the surrogate key column, construct an expression that uses the key_generation function to generate new keys based on that column in the target table, incrementing by 1.
8. Construct an expression on the WHERE tab to select only those records with a time stamp that is later than the global variable.
9. Connect the script to the data flow.
Overlaps
There is a window of time when changes can be lost between two extraction runs. This overlap period affects source-based CDC because this capture relies on a static time stamp to determine changed data.
For example, if a table has 10,000 rows and a change is made to one of the rows after it was loaded but before the job ends, the second update can be lost.
There must be a strategy for overlaps. It may be possible to avoid overlaps, or it may be necessary to perform overlap reconciliation, for example, by using the database transaction logs. In some cases, pre-sampling may help with overlaps.
Overlap Reconciliation
Overlap reconciliation requires a special extraction process that re-applies changes that could have occurred during the overlap period. This extraction can be executed separately from the regular extraction.
For example, if the highest time stamp loaded from the previous job was 01/01/2008 10:30 PM and the overlap period is one hour, overlap reconciliation re-applies the data updated between 9:30 PM and 10:30 PM on January 1, 2008.
The overlap period is equal to the maximum possible extraction time. If it can take up to N hours to extract the data from the source system, an overlap period of N hours (or N hours plus a small increment) is recommended. For example, if it takes at most two hours to run the job, an overlap period of at least two hours is recommended.
There is an advantage to creating a separate overlap data flow. A regular data flow can assume that all the changes are new and make assumptions to simplify logic and improve performance. For example, rows flagged as INSERT are often loaded into a fact table, but rows flagged as UPDATE are rarely loaded into a fact table.
The regular data flow selects the new rows from the source, generates new keys for them, and uses the database loader to add the new facts to the target database. Because the overlap data flow is likely to apply the same rows again, it cannot blindly bulk load them or it creates duplicates.
The overlap data flow must check whether the rows exist in the target and insert only the ones that are missing. Because this lookup affects performance, perform it for as few rows as possible.
If the data volume is sufficiently low, you can load the entire new data set using this technique of checking before loading, avoiding the need to create two different data flows. The SQL sketch below illustrates the check-before-loading idea.
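As an illustration only, the overlap selection and existence check might look like the following SQL sketch. The table and column names (SOURCE, TARGET, REP_KEY, LAST_UPDATE), the literal cut-off value, and the one-hour window are assumptions for the example:

-- Re-apply changes from the overlap window, inserting only rows
-- that are not already present in the target
INSERT INTO TARGET
SELECT s.*
FROM SOURCE s
WHERE s.LAST_UPDATE > '01/01/2008 09:30 PM'  -- highest loaded time stamp minus the overlap period
  AND NOT EXISTS (SELECT 1 FROM TARGET t WHERE t.REP_KEY = s.REP_KEY);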
Unit 8
Exercise 17
Use Source-Based Change Data Capture (CDC)
Business Example
You need to set up a job to update employee records in the Omega data warehouse whenever they change. The employee records include time stamps to indicate when they were last updated, so you can use source-based CDC.
Construct and configure a batch job Alpha_Employees_Dim_Job, which updates employee table columns based on whether records are new or have been changed since the last time data was updated.
1. In the Omega project, create a new batch job and data flow called Alpha_Employees_Dim_Job and a new global variable $G_LastUpdate.
2. In the job Alpha_Employees_Dim_Job workspace, add a script called GetTimeStamp and construct an expression that selects the last time the job executed. If the time stamp is NULL, all records are processed. If the time stamp is not NULL, assign the value to the global variable $G_LastUpdate.
3. In the job Alpha_Employees_Dim_Job workspace, add a data flow Alpha_Employees_Dim_DF to the right of the script and connect it to the script.
4. Add the Employee table from the Alpha datastore as the source object and the EMP_DIM table from the Omega datastore as the target object of the data flow Alpha_Employees_Dim_DF. Connect them with a Query transform.
5. Map the Schema In fields of the Query transform to the Schema Out fields, as follows:

Schema In        Schema Out
EMPLOYEEID       EMPLOYEEID
LASTNAME         LASTNAME
FIRSTNAME        FIRSTNAME
BIRTHDATE        BIRTHDATE
HIREDATE         HIREDATE
ADDRESS          ADDRESS
PHONE            PHONE
EMAIL            EMAIL
REPORTSTO        REPORTSTO
LastUpdate       LAST_UPDATE
DISCHARGE_DATE   DISCHARGE_DATE
6. Create a mapping expression for the SURR_KEY column that generates new keys based on the EMP_DIM target table, incrementing by 1, by using the Functions wizard.
7. For the CITY output column, change the mapping to perform a lookup of CITYNAME from the City table in the Alpha datastore based on the city ID.
8. For the REGION output column, change the mapping to perform a lookup of REGIONNAME from the Region table in the Alpha datastore based on the region ID.
9. For the COUNTRY output column, change the mapping to perform a lookup of COUNTRYNAME from the Country table in the Alpha datastore based on the country ID.
10. For the DEPARTMENT output column, change the mapping to perform a lookup of DEPARTMENTNAME from the Department table in the Alpha datastore based on the department ID.
11. On the WHERE tab, construct an expression to select only those records with a time stamp that is later than the value of the global variable $G_LastUpdate.
12. View the data in the source and target tables before executing the job.
13. Execute Alpha_Employees_Dim_Job with the default properties.
Unit 8
Solution 17
Use Source-Based Change Data Capture (CDC)
Business Example
You need to set up a job to update employee records in the Omega data warehouse whenever they change. The employee records include time stamps to indicate when they were last updated, so you can use source-based CDC.
Construct and configure a batch job Alpha_Employees_Dim_Job, which updates employee table columns based on whether records are new or have been changed since the last time data was updated.
1. In the Omega project, create a new batch job and data flow called Alpha_Employees_Dim_Job and a new global variable $G_LastUpdate.
a) In the project area, right-click the Omega project, choose New batch job, and enter the name Alpha_Employees_Dim_Job.
b) Select the job Alpha_Employees_Dim_Job and, in the main menu, choose Tools → Variables.
c) Right-click Global Variables and choose Insert.
d) Right-click the new variable and choose Properties.
e) In the Global Variable Properties dialog box, enter the name $G_LastUpdate, select the data type datetime, and choose OK.
f) Close the Variables and Parameters window.
2. In the job Alpha_Employees_Dim_Job workspace, add a script called GetTimeStamp and construct an expression that selects the last time the job executed. If the time stamp is NULL, all records are processed. If the time stamp is not NULL, assign the value to the global variable $G_LastUpdate.
a) To add the script to the Alpha_Employees_Dim_Job workspace, in the tool palette, choose the Script icon, click in the workspace, and enter the name GetTimeStamp.
b) To open the GetTimeStamp script, double-click it.
c) In the script, enter the following expression:
$G_LastUpdate = to_date(sql('Omega', 'select max(LAST_UPDATE) from EMP_DIM'), 'MON DD YYYY HH:MI');
if ($G_LastUpdate is null)
    $G_LastUpdate = to_date('1901.01.01', 'yyyy.MM.DD');
else print('Last update was ' || $G_LastUpdate);
This expression updates the value of the global variable to the value of the last update column in the employee dimension table. The script:
a) Selects the last time the job was executed from the last update column in the employee dimension table.
b) If the last update column is NULL, assigns a value of January 1, 1901 to the $G_LastUpdate global variable. When the job executes for the initial load, this ensures that all records are processed.
c) If the last update column is not NULL, assigns the actual time stamp value to the $G_LastUpdate global variable and prints the value of the variable to the job's log file.
d) Close the Script and go back to the job workspace.
3. In the job Alpha_Employees_Dim_Job workspace, add a data flow Alpha_Employees_Dim_DF to the right of the script and connect it to the script.
a) To add the data flow, in the tool palette, choose the Data Flow icon, click in the job workspace, and enter the data flow name Alpha_Employees_Dim_DF.
b) To connect the GetTimeStamp script to the Alpha_Employees_Dim_DF data flow, select the script, hold down the mouse button, drag the cursor to the data flow, and release the mouse button.
c) To open the data flow workspace, double-click Alpha_Employees_Dim_DF.
4. Add the Employee table from the Alpha datastore as the source object and the EMP_DIM table from the Omega datastore as the target object of the data flow Alpha_Employees_Dim_DF. Connect them with a Query transform.
a) In the Local Object Library, select the Datastores tab.
b) From the Alpha datastore, select the Employee table, drag it into the data flow workspace, and choose Make Source.
c) From the Omega datastore, select the EMP_DIM table, drag it into the data flow workspace, and choose Make Target.
d) To add the query, in the tool palette, choose the Query Transform icon and click in the data flow workspace.
e) Connect the source table to the query and connect the query to the target table.
5. Map the Schema In fields of the Query transform to the Schema Out fields, as follows:

Schema In        Schema Out
EMPLOYEEID       EMPLOYEEID
LASTNAME         LASTNAME
FIRSTNAME        FIRSTNAME
BIRTHDATE        BIRTHDATE
HIREDATE         HIREDATE
ADDRESS          ADDRESS
PHONE            PHONE
EMAIL            EMAIL
REPORTSTO        REPORTSTO
LastUpdate       LAST_UPDATE
DISCHARGE_DATE   DISCHARGE_DATE
a) To open the Query Editor, double-click the query.
b) To map the columns in the Schema In pane to the columns in the Schema Out pane, select each column and drag it from Schema In to Schema Out.
6. Create a mapping expression for the SURR_KEY column that generates new keys based on the EMP_DIM target table, incrementing by 1, by using the Functions wizard.
a) In the Schema Out pane, choose the SURR_KEY column.
b) On the Mapping tab, choose the Function button.
c) In the Function Categories field, choose Database Functions, in the Function Name field, choose the Key_generation function, and choose Next.
d) In the Define Input Parameters dialog box, enter the parameters:

Field/Option    Value
Table           Omega.dbo.EMP_DIM
Key_column      SURR_KEY
Key_increment   1

e) Choose Finish.
You see the expression key_generation('Omega.dbo.EMP_DIM', 'SURR_KEY', 1).
7. For the CITY output column, change the mapping to perform a lookup of CITYNAME from the City table in the Alpha datastore based on the city ID.
a) In the Schema Out workspace, select the CITY field and, on the Mapping tab, delete the existing expression.
b) Choose the Functions button.
c) In the Function Categories field, choose Lookup Functions, in the Function Name field, choose lookup_ext, and choose Next.
d) In the Lookup_ext - Select Parameters dialog box, enter the following parameters:

Field/Option                         Value
Lookup table                         Alpha.dbo.city
Condition: Columns in lookup table   CITYID
Condition: Op.(&)                    =
Condition: Expression                employee.CITYID
Output: Column in lookup table       CITYNAME

e) Choose Finish.
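For reference, the wizard generates a lookup_ext expression of roughly the following shape; this is a sketch, and the exact generated syntax, cache policy ('PRE_LOAD_CACHE'), and return policy ('MAX') may differ by version and configuration:

# Approximate shape of the generated mapping expression (illustrative)
lookup_ext([Alpha.dbo.city, 'PRE_LOAD_CACHE', 'MAX'], [CITYNAME], [NULL], [CITYID, '=', employee.CITYID])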
8. For the REGION output column, change the mapping to perform a lookup of REGIONNAME from the Region table in the Alpha datastore based on the region ID.
a) On the Mapping tab for the output schema field REGION, delete the existing expression: highlight it and press the Delete key.
b) Choose the Functions button.
c) In the Function Categories field, choose Lookup Functions, in the Function Name field, choose lookup_ext, and choose Next.
d) In the Lookup_ext - Select Parameters dialog box, enter the parameters:

Field/Option                         Value
Lookup table                         Alpha.dbo.region
Condition: Columns in lookup table   REGIONID
Condition: Op.(&)                    =
Condition: Expression                employee.REGIONID
Output: Column in lookup table       REGIONNAME

e) Choose Finish.
9. For the COUNTRY output column, change the mapping to perform a lookup of COUNTRYNAME from the Country table in the Alpha datastore based on the country ID.
a) On the Mapping tab for the output schema field COUNTRY, delete the existing expression: highlight it and press the Delete key.
b) Choose the Functions button.
c) In the Function Categories field, choose Lookup Functions, in the Function Name field, choose lookup_ext, and choose Next.
d) In the Lookup_ext - Select Parameters dialog box, enter the parameters:

Field/Option                         Value
Lookup table                         Alpha.dbo.country
Condition: Columns in lookup table   COUNTRYID
Condition: Op.(&)                    =
Condition: Expression                employee.COUNTRYID
Output: Column in lookup table       COUNTRYNAME

e) Choose Finish.
10. For the DEPARTMENT output column, change the mapping to perform a lookup of DEPARTMENTNAME from the Department table in the Alpha datastore based on the department ID.
a) On the Mapping tab for the output schema field DEPARTMENT, delete the existing expression: highlight it and press the Delete key.
b) Choose the Functions button.
c) In the Function Categories field, choose Lookup Functions, in the Function Name field, choose lookup_ext, and choose Next.
d) In the Lookup_ext - Select Parameters dialog box, enter the parameters:

Field/Option                         Value
Lookup table                         Alpha.dbo.department
Condition: Columns in lookup table   DEPARTMENTID
Condition: Op.(&)                    =
Condition: Expression                employee.DEPARTMENTID
Output: Column in lookup table       DEPARTMENTNAME

e) Choose Finish.
11. On the WHERE tab, construct an expression to select only those records with a time stamp that is later than the value of the global variable $G_LastUpdate.
a) In the Query Editor, select the WHERE tab.
b) In the workspace, enter the following expression:
employee.LastUpdate > $G_LastUpdate
c) Close the editor.
12. View the data in the source and target tables before executing the job.
a) In the Data Flow workspace, choose the View Data icon on the employee source table.
b) Choose the View Data icon on the EMP_DIM target table.
c) Note the number of rows in each table.
d) Close both View Data windows.
13. Execute Alpha_Employees_Dim_Job with the default properties.
a) In the project area, select the Alpha_Employees_Dim_Job and choose Execute.
b) To save all the objects that you have created, choose Yes.
c) To execute the job using default properties, choose OK. According to the log, the last update for the table was on "2007.10.04".
d) Return to the job workspace.
e) To open the data flow workspace, double-click the data flow.
f) Right-click the target table and choose View data.
g) Sort the records by the LAST_UPDATE column.
h) Close the display.
LESSON SUMMARY
You should now be able to:
• Use source-based change data capture (CDC)
Unit 8
Lesson 3
Using Target-Based Change Data Capture (CDC)
LESSON OVERVIEW
Some of your data does not provide enough information, for example time stamps or logs, to perform source-based CDC. Use target-based CDC to compare the source to the target and to determine which records have changed.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Use target-based CDC
Target-Based Change Data Capture (CDC)
Target-based Change Data Capture (CDC) compares the source to the target to determine
which records have changed.
Source-based CDC evaluates the source tables to determine what has changed and only extracts changed rows to load into the target tables. Target-based CDC, by contrast, extracts all the data from the source, compares the source and target rows, and then loads only the changed rows into the target with new surrogate keys.
Use target-based CDC when source-based information is limited.
History Preservation
Preserve history by creating a data flow that contains:
• A source table that contains the rows to be evaluated.
• A Query transform that maps columns from the source.
• A Table Comparison transform that compares the data in the source table with the data in the target table to determine what has changed.
• A History Preserving transform that converts certain UPDATE rows to INSERT rows based on the columns in which values have changed. This transform produces a second row in the target instead of overwriting the first row.
• A Key Generation transform that generates new keys for the updated rows that are now flagged as INSERT.
• A target table that receives the rows. The target table cannot be a template table.
The steps in the target-based change data capture process are outlined in the figure Target-Based CDC.
Figure 44: Target-Based CDC
(The figure shows a data flow in which a CUSTOMERS source table feeds a Query transform, followed by a Table_Comparison transform, a History_Preserving transform, and a Key_Generation transform, which loads the CUST_DIM target table.)
Table 39: History Preserving Transforms
Three Data Services transforms support history preservation:

History Preserving: Converts rows flagged as UPDATE to UPDATE plus INSERT, so that the original values are preserved in the target. Specify the column in which to look for updated data.
Key Generation: Generates new keys for source data, starting from a value based on existing keys that you specify in the table.
Table Comparison: Compares two data sets and produces the difference between them as a data set with rows flagged as INSERT or UPDATE.
The Table Comparison Transform
Detect and forward changes that have occurred since the last time a target was updated with the
Table Comparison transform. This transform compares two data sets and produces the
difference between them as a data set with rows flagged as INSERT or UPDATE.
Input, Comparison, and Output
The transform compares the input table and the comparison table. The transform selects rows
from the comparison table based on the primary key values from the input data set. The
transform compares columns that exist in the schemas for both inputs.
The input data set must be flagged as NORMAL.
The output data set contains only the rows that make up the difference between the tables. The
schema of the output data set is the same as the schema of the comparison table. No DELETE
operations are produced.
If a column has a date datatype in one table and a datetime datatype in the other, the transform compares only the date section of the data. The columns can also be time and datetime data types, in which case Data Services compares only the time section of the data.
Transform Outcomes
There are three possible outcomes from the transform for each row in the input data set:
• An INSERT row is added:
The primary key value from the input data set does not match a value in the comparison table. The transform produces an INSERT row with the values from the input data set row.
If there are columns in the comparison table that are not present in the input data set, the transform adds these columns to the output schema and fills them with NULL values.
• An UPDATE row is added:
The primary key value from the input data set matches a value in the comparison table, and values in the non-key compare columns differ in the corresponding rows from the input data set and the comparison table. The transform produces an UPDATE row with the values from the input data set row.
If there are columns in the comparison table that are not present in the input data set, the transform adds these columns to the output schema and fills them with values from the comparison table.
• The row is ignored:
The primary key value from the input data set matches a value in the comparison table, but the comparison does not indicate any changes to the row values.
Table 40: Table Comparison Transform Options

Table name: Specifies the fully qualified name of the source table as datastore.owner.table, where datastore is the name of the datastore Data Services uses to access the key source table and owner depends on the database type associated with the table.
Generated key column: Specifies a column in the comparison table. When there is more than one row in the comparison table with a given primary key value, the transform compares the row with the largest generated key value of these rows and ignores the other rows.
Input contains duplicate keys: Provides support for input rows with duplicate primary key values.
Detect deleted rows from comparison table: Identifies rows that have been deleted from the source.
Comparison method: Accesses the comparison table by using row-by-row select, cached comparison table, or sorted input.
Input primary key columns: Specifies the columns in the input data set that uniquely identify each row. These columns must be present in the comparison table with the same column names and data types.
Compare columns: Improves performance by comparing only the subset of columns you drag into this box from the input schema. If no columns are listed, all columns in the input data set that are also in the comparison table are used as compare columns.
The History Preserving Transform
The History Preserving transform has its own data input requirements, data output results, and options.
Input/Output Data Sets
Input Data Set
This data set is the result of a comparison between two versions of the same data. Rows with changed data from the newer version are flagged as UPDATE rows and new data from the newer version is flagged as INSERT rows.
Output Data Set
This data set contains rows flagged as INSERT or UPDATE.
Table 41: History Preserving Options
The History Preserving transform offers the following options:

Valid from: Specify a date or datetime column from the source schema. Specify a Valid from date column if the target uses an effective date to track changes in data.
Valid to: Specify a date value in the format YYYY.MM.DD. The Valid to date cannot be the same as the Valid from date.
Column: Specify a column from the source schema that identifies the current valid row from a set of rows with the same primary key. The flag column indicates whether a row is the most current data in the target for a given primary key.
Set value: Define an expression that outputs a value with the same datatype as the value in the Set flag column. Use this value to update the current flag column. The new row in the target preserves the history of an existing row.
Reset value: Define an expression that outputs a value with the same datatype as the value in the Reset flag column. Use this value to update the current flag column in an existing row in the target when the row in the target includes changes in one or more of the compare columns.
Preserve delete rows as update rows: Converts DELETE rows to UPDATE rows in the target. If you previously set Valid from and Valid to values, this option sets the Valid to value to the execution date. Use this option to maintain slowly changing dimensions by feeding a complete data set through the Table Comparison transform first. Select the Detect deleted rows from comparison table option in the Table Comparison transform.
Compare columns: List the columns in the input data set to compare for changes.
• If the values in the specified compare columns in each version match, the transform flags the row as UPDATE. The row from the before version and the date and flag information is updated.
• If the values in each version do not match, the row from the latest version is flagged as INSERT when output from the transform. This adds a new row to the warehouse with the new values.
Updates to non-history-preserving columns update all versions of the row if the update is performed on the natural key, but update only the latest version if the update is on the generated key.
The Key Generation Transform
The Key Generation transform generates new keys before inserting the data set into the target, in the same way as the key_generation function.
When it is necessary to generate artificial keys in a table, this transform looks up the maximum existing key value from a table and uses it as the starting value to generate new keys. The transform expects the generated key column to be part of the input schema.
For example, the History Preserving transform produces rows to add to a warehouse, and these rows have the same primary key as rows that already exist in the warehouse. In this case, add a generated key to the warehouse table to distinguish these two rows that have the same primary key.
Input/Output Data Sets
Input data set
This data set is the result of a comparison between two versions of the same data. Changed data from the newer version is flagged as an UPDATE row and new data from the newer version is flagged as an INSERT row.
Output data set
This data set is a duplicate of the input data set, with the addition of key values in the generated key column for input rows flagged as INSERT.
Table 42: Key Generation Transform Options
The table outlines the options available with the Key Generation transform:

Table name: Specify the fully qualified name of the key source table from which the maximum existing key is determined. This table must already be imported into the repository. Table name is represented as datastore.owner.table, where datastore is the name of the datastore that Data Services uses to access the key source table and owner depends on the database type associated with the table.
Generated key column: Specify the column in the key source table containing the existing key values. A column with the same name must exist in the input data set. The new key is inserted in this column.
Increment value: Indicate the interval between generated key values.
Unit 8
Exercise 18
Use Target-Based Change Data Capture (CDC)
Business Example
You find that some of your data does not provide any time stamps or logs to support source-based CDC. You want to investigate using target-based CDC, comparing the source to the target to determine which records have changed.
You need to set up a job to update product records in the Omega data warehouse to capture change. The product records do not include time stamps to indicate when they were last updated. Use target-based change data capture to extract all records from the source and compare them to the target.
1. In the Omega project, create a new batch job called Alpha_Product_Dim_Job containing a data flow called Alpha_Product_Dim_DF.
2. In the workspace for Alpha_Product_Dim_DF, add the Product table from the Alpha datastore as the source object and the PRODUCT_DIM table from the Omega datastore as the target object.
3. Add a Query transform to the workspace, connecting it to the source and target objects. In addition, add the Table Comparison, History Preserving, and Key Generation transforms to the workspace.
4. In the transform editor for the Query transform, map input columns to output columns by dragging corresponding columns from the input schema to the output schema. After deleting the link between the Query transform and the target table, complete the connection of the remaining objects in the data flow workspace.
5. In the transform editor for the Table Comparison transform, use the PRODUCT_DIM table in the Omega datastore as the comparison table and set the field SURR_KEY as the generated key column.
6. In the transform editor for the Key Generation transform, set up key generation based on the SURR_KEY column of the PRODUCT_DIM table and increment the key by a value of 1. Do not configure the History Preserving transform.
7. In the data flow workspace, before executing the job, display the data in both the source and target tables.
8. Execute the Alpha_Product_Dim_Job with the default execution properties.
Unit 8
Solution 18
Use Target-Based Change Data Capture (CDC)
Business Example
You find that some of your data does not provide any time stamps or logs to support source-based CDC. You want to investigate using target-based CDC, comparing the source to the target to determine which records have changed.
You need to set up a job to update product records in the Omega data warehouse to capture change. The product records do not include time stamps to indicate when they were last updated. Use target-based change data capture to extract all records from the source and compare them to the target.
1. In the Omega project, create a new batch job called Alpha_Product_Dim_Job containing a data flow called Alpha_Product_Dim_DF.
a) In the Project area, right-click the Omega project name and choose New Batch Job.
b) Enter the job name Alpha_Product_Dim_Job and, on your keyboard, press the Enter key. If the Alpha_Product_Dim_Job does not open automatically, double-click the job to open it.
c) To add the data flow to the Alpha_Product_Dim_Job, in the tool palette, choose the Data Flow icon, click in the workspace, enter the name Alpha_Product_Dim_DF and, on your keyboard, press the Enter key.
d) To open the data flow workspace, double-click Alpha_Product_Dim_DF.
2. In the workspace for Alpha_Product_Dim_DF, add the Product table from the Alpha datastore as the source object and the PRODUCT_DIM table from the Omega datastore as the target object.
a) In the Local Object Library, choose the Datastores tab.
b) In the Alpha datastore, select the Product table, drag it to the data flow workspace, and choose Make Source.
c) In the Omega datastore, select the PRODUCT_DIM table, drag it to the data flow workspace, and choose Make Target.
3. Add a Query transform to the workspace, connecting it to the source and target objects. In addition, add the Table Comparison, History Preserving, and Key Generation transforms to the workspace.
a) To add the query to the Alpha_Product_Dim_DF, in the tool palette, choose the Query Transform icon and click in the workspace.
b) Connect the source table Product to the Query transform.
c) Connect the target table PRODUCT_DIM to the Query transform.
d) In the Local Object Library, choose the Transforms tab and expand the Data Integrator node.
e) Select Table Comparison and drag it to the data flow workspace, to the right of the Query transform.
f) Select History Preserving and drag it to the data flow workspace, to the right of the Table Comparison transform.
g) Select Key Generation and drag it to the data flow workspace, to the right of the History Preserving transform.
4. In the transform editor for the Query transform, map input columns to output columns by dragging corresponding columns from the input schema to the output schema. After deleting the link between the Query transform and the target table, complete the connection of the remaining objects in the data flow workspace.
a) Double-click the Query transform to open the Query Editor.
b) In the Schema In workspace, select the following fields and drag them to the corresponding fields in the Schema Out workspace:

Schema In     Schema Out
PRODUCTID     PRODUCTID
PRODUCTNAME   PRODUCTNAME
CATEGORYID    CATEGORYID
COST          COST

c) Select the output schema field SURR_KEY and, on the Mapping tab, enter the value NULL. This provides a value until a key can be generated.
d) Select the output schema field EFFECTIVE_DATE and, on the Mapping tab, enter the value sysdate(). This provides the current system date as the effective date.
e) Close the editor.
f) To delete the link between the Query transform and the target table, right-click the link and choose Delete.
g) To connect the Query transform to the Table Comparison transform, click the Query transform, hold down the mouse button, drag the cursor to the Table Comparison transform, and release the mouse button.
h) Repeat the above step to connect the following:
The Table Comparison transform and the History Preserving transform.
The History Preserving transform and the Key Generation transform.
The Key Generation transform and the target table.
5. In the transform editor for the Table Comparison transform, use the PRODUCT_DIM table in the Omega datastore as the comparison table and set the field SURR_KEY as the generated key column.
a) To open the Transform Editor, double-click the Table Comparison transform.
b) On the Table Comparison tab, use the drop-down list for the Table name field and select PRODUCT_DIM in the Omega datastore. PRODUCT_DIM is the comparison table from which the maximum existing key is determined.
c) Use the drop-down list for the Generated key column field and select SURR_KEY as the generated key column.
d) In the Schema In, select the PRODUCTNAME, CATEGORYID, and COST fields and drag them to the Compare columns field.
e) In the Schema In, select the PRODUCTID field and drag it to the Input primary key column(s) field.
f) Close the editor.
6. In the transform editor for the Key Generation transform, set up key generation based on the SURR_KEY column of the PRODUCT_DIM table and increment the key by a value of 1. Do not configure the History Preserving transform.
a) To open the Key Generation Transform Editor, double-click the Key Generation transform.
b) In the drop-down list for the Table name field, select PRODUCT_DIM in the Omega datastore. PRODUCT_DIM is the table from which the maximum existing key is determined.
c) In the drop-down list for the Generated key column field, select SURR_KEY as the generated key column.
d) In the Increment Value field, enter 1.
e) Close the editor.
7. In the data flow workspace, before executing the job, display the data in both the source and target tables.
a) In the data flow workspace, select the magnifying glass on the source table. A large View Data pane appears beneath the current workspace.
b) In the data flow workspace, select the magnifying glass on the target table. A large View Data pane appears beneath the current workspace.
c) Note that the "OmegaSoft" product has been added in the source but has not yet been updated in the target.
8. Execute the Alpha_Product_Dim_Job with the default execution properties.
a) In the Omega project area, right-click the Alpha_Product_Dim_Job and choose Execute.
b) In the Save all changes and execute dialog box, choose Yes.
c) To execute the job using default properties, in the Execution Properties dialog box, choose OK.
d) Return to the job workspace.
e) To open the data flow workspace, double-click the data flow.
f) Right-click the target table and choose View data. Note that there are new records for product IDs 2, 3, 6, 8, and 13, and that "OmegaSoft" has been added to the target.
g) Close the display.
LESSON SUMMARY
You should now be able to:
• Use target-based CDC
Unit 8
Learning Assessment
1. There are three ways to manage slowly changing dimensions. Match the SCD type with the correct description.
☐ Unlimited history preservation
☐ No history preservation
☐ Limited history preservation
Unit 8
Learning Assessment - Answers
1. There are three ways to manage slowly changing dimensions. Match the SCD type with the correct description.
SCD Type 2: Unlimited history preservation
SCD Type 1: No history preservation
SCD Type 3: Limited history preservation
Data Services Integrator Transforms

Lesson 1: Using Data Services Integrator Transforms   272
Lesson 2: Using the Pivot Transform   277
Exercise 19: Use the Pivot Transform   281
Lesson 3: Using the Data Transfer Transform   287
Exercise 20: Use the Data Transfer Transform   293
UNIT OBJECTIVES
• Use the Data Services Integrator transforms
• Use the Pivot transform
• Describe performance optimization
Unit 9
Lesson 1
Using Data Services Integrator Transforms
LESSON OVERVIEW
Use Data Services Integrator transforms to enhance your data integration projects beyond the
core functionality of the platform transforms. Perform key operations on data sets to manipulate
their structure as they are passed from source to target.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Use the Data Services Integrator transforms
Data Services Integrator Transforms
Business Example
Data Services Integrator transforms are used to enhance your data integration projects beyond the core functionality of Platform transforms. In your projects, you encounter XML data with repeated nodes, hierarchy data, or sources of data where there are either too many fields or not enough fields. You find that the Platform transforms do not provide enough flexibility, so you turn to the Data Services Integrator-specific transforms for assistance.
Data Services Integrator transforms perform key operations on data sets to manipulate their structure as they are passed from source to target, as shown in the figure Data Services Integrator Transforms.
Figure 45: Data Services Integrator Transforms
(The figure shows the icons for the Data Transfer, Date Generation, Effective Date, Hierarchy Flattening, Map CDC Operation, Pivot, Reverse Pivot, and XML Pipeline transforms.)
Table 43: Defining Data Services Integrator Transforms
These transforms are available in the Data Integrator branch of the Transforms tab in the Local Object Library:
Data Transfer: Allows a data flow to split its processing into two sub data flows and push down resource-consuming operations to the database server.
Date Generation: Generates a column filled with date values based on the start and end dates and increment you specify.
Effective Date: Generates an additional effective-to column based on the primary key's effective date.
Hierarchy Flattening: Flattens hierarchical data into relational tables so that it can participate in a star schema. Hierarchy flattening can be both vertical and horizontal.
Map CDC Operation: Sorts input data, maps output data, and resolves before and after versions for UPDATE rows. While commonly used to support Oracle or mainframe changed-data capture, this transform supports any data stream if its input requirements are met.
Pivot: Rotates the values in specified columns to rows.
Reverse Pivot: Rotates the values in specified rows to columns.
XML Pipeline: Processes large XML inputs in small batches.
Date Generation Transform
Use this transform to produce the key values for a time dimension target, as shown in the figure Date Generation Transform Editor. From this generated sequence, populate other fields in the time dimension (such as day_of_week) using functions in a query.
Example: To create a time dimension target with dates from the beginning of the year 1997 to the end of the year 2000, place a Date_Generation transform, a query, and a target in a data flow. Connect the output of the Date_Generation transform to the query, and the output of the query to the target.
Figure 46: Date Generation Transform Editor (the editor provides Start date, End date, and Increment fields, each accepting a value in yyyy.mm.dd format or a $variable, together with Join rank and Cache options)
Date_Generation Transform
Inside the Date_Generation transform, specify the following options:
• Start date: 1997.01.01 (a variable can also be used)
• End date: 2000.12.31 (a variable can also be used)
• Increment: Daily (a variable can also be used)
Time Dimension Values
Inside the query, create two target columns, BusQuarter and DateNum, and define a mapping for the following time dimension values:
• Business quarter: BusQuarter. Function: quarter(Generated_date)
• Date number from start: DateNum. Function: julian(Generated_date) - julian('1997.01.01')
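The output of this data flow can be sketched in plain SQL. The following is a minimal illustration only, not the transform's actual implementation, assuming a PostgreSQL-style dialect with recursive common table expressions (all table and column names are illustrative):

    -- Generate one row per day from 1997.01.01 to 2000.12.31, then
    -- derive the time dimension values that the query maps with the
    -- quarter() and julian() functions.
    WITH RECURSIVE dates(generated_date) AS (
        SELECT DATE '1997-01-01'
        UNION ALL
        SELECT generated_date + 1
        FROM dates
        WHERE generated_date < DATE '2000-12-31'
    )
    SELECT generated_date,
           EXTRACT(QUARTER FROM generated_date) AS BusQuarter,
           generated_date - DATE '1997-01-01'   AS DateNum
    FROM dates;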
Effective Date Transform
As shown in the figure Effective Date Transform Editor, an effective-to value is calculated for data that contains an effective date. The calculated effective-to date and an existing effective date produce a date range that allows queries based on effective dates to produce meaningful results.
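In SQL terms, the effective-to column amounts to the next effective date recorded for the same key. A hedged sketch, assuming an EMPLOYEE_CHANGES table keyed by PERSONNEL_NUMBER with BEGIN_DATE as the effective date (names follow the figure below; the literal date stands in for the editor's Default effective to date value, and the transform computes this in the engine, not in SQL):

    -- For each key, EFFECTIVE_TO_COLUMN is the next row's BEGIN_DATE,
    -- or the default effective-to date for the latest row.
    SELECT PERSONNEL_NUMBER,
           BEGIN_DATE,
           LEAD(BEGIN_DATE, 1, DATE '2006-11-30')
               OVER (PARTITION BY PERSONNEL_NUMBER ORDER BY BEGIN_DATE)
               AS EFFECTIVE_TO_COLUMN
    FROM EMPLOYEE_CHANGES;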
Figure 47: Effective Date Transform Editor (the editor shows the EMPLOYEE_CHANGES input schema, the generated EFFECTIVE_TO_COLUMN in the output schema, and fields for the Effective date column, Effective sequence column, Effective to column, and Default effective to date)
Map CDC Operation Transform
As shown in the figure Map CDC Operation Transform Editor, using input requirements (values for the Sequencing column and a Row operation column), the transform performs three functions:
1. Sorts input data based on values in the Sequencing column box and (optionally) the Additional Grouping column box.
2. Maps output data based on values in the Row Operation column box. Source table rows are mapped to INSERT, UPDATE, or DELETE operations before passing them on to the target.
3. Resolves missing, separated, or multiple before- and after-images for UPDATE rows.
Figure 48: Map CDC Operation Transform Editor (the editor shows a CDC source with DI_SEQUENCE_NUMBER and DI_OPERATION_TYPE columns, the CDC Columns settings for the Sequencing column and Row operation column, and a check box to indicate that the input is already sorted by the sequencing column)
Map CDC Operation Transform Editor
While commonly used to support relational or mainframe changed-data capture (CDC), this transform supports any data stream as long as its input requirements are met. Relational CDC sources include Oracle and SQL Server. This transform is typically the last object before the target in a data flow because it produces INSERT, UPDATE, and DELETE operation codes. Data Services produces a warning when other objects are used.
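The transform's first two functions can be sketched in SQL. This is a rough illustration only, assuming the DI_SEQUENCE_NUMBER and DI_OPERATION_TYPE columns shown in the figure and single-letter operation codes (the codes and the table name are assumptions; the transform performs this mapping in the engine, not in SQL):

    -- Sort by the sequencing column and map the row operation
    -- column to insert/update/delete operation codes.
    SELECT t.*,
           CASE t.DI_OPERATION_TYPE
                WHEN 'I' THEN 'INSERT'
                WHEN 'U' THEN 'UPDATE'
                WHEN 'D' THEN 'DELETE'
           END AS row_operation
    FROM cdc_source t
    ORDER BY t.DI_SEQUENCE_NUMBER;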
LESSON SUMMARY
You should now be able to:
• Use the Data Services Integrator transforms
Unit 9
Lesson 2
Using the Pivot Transform
LESSON OVERVIEW
Use the Pivot transform and the Reverse Pivot transform to convert columns into rows and to convert rows back into columns.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Use the Pivot transform
The Pivot Transform
The Pivot transform creates a new row for each value in a column that you identify as a pivot column.
Change the way the relationship between rows is displayed. For each value in each pivot column, Data Services produces a row in the output data set, as shown in the figure Pivot Transfer Concept. You can create pivot sets to specify more than one pivot column.
For example, produce a list of discounts by quantity for certain payment terms. List each type of discount as a separate record instead of displaying the list as a unique column.
Input:
Quantity | Net 10 | Net 20 | Net 30
1,000    | 5%     | 10%    | 15%
5,000    | 10%    | 15%    | 20%
10,000   | 15%    | 20%    | 25%
15,000   | 20%    | 25%    | 30%

Output:
Quantity | Type   | Discount
1,000    | Net 10 | 5%
1,000    | Net 20 | 10%
1,000    | Net 30 | 15%
5,000    | Net 10 | 10%
5,000    | Net 20 | 15%
5,000    | Net 30 | 20%
10,000   | Net 10 | 15%
10,000   | Net 20 | 20%
10,000   | Net 30 | 25%
15,000   | Net 10 | 20%
15,000   | Net 20 | 25%
15,000   | Net 30 | 30%

Figure 49: Pivot Transfer Concept
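A pivot of this kind can be sketched in plain SQL as a UNION ALL with one branch per pivot column. A minimal illustration, assuming a discounts table holding the input rows shown above (the table and column names are illustrative):

    -- Each pivot column becomes one branch; the column's name moves
    -- into the header column (Type) and its value into the data
    -- field column (Discount), as the Pivot transform does.
    SELECT Quantity, 'Net 10' AS Type, NET10 AS Discount FROM discounts
    UNION ALL
    SELECT Quantity, 'Net 20', NET20 FROM discounts
    UNION ALL
    SELECT Quantity, 'Net 30', NET30 FROM discounts
    ORDER BY Quantity, Type;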
Reverse Pivot Transform
The Reverse Pivot transform reverses the process, converting rows into columns.
• Data inputs include a data set with rows flagged as NORMAL.
• Data outputs include a data set with rows flagged as NORMAL. This target includes the non-pivoted columns, a column for the sequence number, the data field column, and the pivot header column.
Table 44: Pivot Transform Options
The table lists the options available with the Pivot transform:
Pivot sequence column: Assign a name to the sequence number column. For each row created from a pivot column, Data Services increments and stores a sequence number.
Non-pivot columns: Select columns in the source to show in the target without modification.
Pivot set: Identify a number for the pivot set. Define a group of pivot columns, a pivot data field, and a pivot header name for each pivot set.
Data field column: Specify the column to contain all of the pivot column values.
Header column: Specify the name of the column that contains the pivoted column names. The header column lists the names of the columns where the corresponding data originated.
Pivot columns: Select the columns to be rotated into rows. Describe these columns in the Header column and describe the data in these columns in the Data field column.
The figure Pivot Transform Editor shows the interface where you can specify your Pivot transform options.
Figure 50: Pivot Transform Editor (the editor shows the input schema alongside the Pivot sequence column, Non-Pivot Columns, and Pivot Columns areas, the Data field column and Header column settings, and Add and Delete buttons for managing pivot sets)
Pivot a Table
Follow these steps to pivot a table:
1. Open the data flow workspace:
• Add your source object to the workspace.
• On the Transforms tab of the Local Object Library, drag the Pivot or Reverse Pivot transform to the workspace, to the right of your source object.
• Add your target object to the workspace.
• Connect the source object to the transform.
• Connect the transform to the target object.
2. Open the Pivot Transform Editor:
• Drag columns that will not be changed by the transform from the input schema area to the Non-Pivot Columns area.
• Drag the columns to be pivoted from the input schema area to the Pivot Columns area. Click Add to create more than one pivot set if required.
• Change the values in the Pivot sequence column, Data field column, and Header column fields if required. These columns will be added to the target object by the transform.
• Select Back to return to the data flow workspace.
Unit 9
Exercise 19
Use the Pivot Transform
Business Example
Currently, employee compensation information is loaded into a table with separate columns for salary, bonus, and vacation days. For reporting purposes, you need each of these items to be a separate record in the HR_datamart.
Use the Pivot transform to create a separate row for each entry in a new employee compensation table.
1. In the Omega project, create a new batch job called Alpha_HR_Comp_Job containing a data flow called Alpha_HR_Comp_DF.
2. In the workspace for Alpha_HR_Comp_DF, add the hr_comp_update table from the Alpha datastore as the source object.
3. Add a Pivot transform to the data flow and connect it to the source table.
4. Add a Query transform to the data flow and connect it to the Pivot transform. Create a target template table Employee_Comp in the Delta datastore.
5. Specify in the Pivot transform that the fields EmployeeID and date_updated are non-pivot columns. Specify that the fields Emp_Salary, Emp_Bonus, and Emp_VacationDays are pivot columns.
6. In the editor for the Query transform, map all fields from the input schema to the output schema and add an expression in the WHERE tab to filter out NULL values for the Comp column.
7. Execute the Alpha_HR_Comp_Job with the default execution properties.
Unit 9
Solution 19
Use the Pivot Transform
Business Example
Currently, employee compensation information is loaded into a table with separate columns for salary, bonus, and vacation days. For reporting purposes, you need each of these items to be a separate record in the HR_datamart.
Use the Pivot transform to create a separate row for each entry in a new employee compensation table.
1. In the Omega project, create a new batch job called Alpha_HR_Comp_Job containing a data flow called Alpha_HR_Comp_DF.
a) In the Project area, right-click the Omega project name and choose New Batch Job.
b) Enter the job name Alpha_HR_Comp_Job and, on your keyboard, press the Enter key.
c) To open the job Alpha_HR_Comp_Job, double-click it.
d) To add the data flow, in the tool palette, choose the Data Flow icon, and click in the Alpha_HR_Comp_Job workspace.
e) Enter the name Alpha_HR_Comp_DF and, on your keyboard, press the Enter key.
f) To open the data flow workspace, double-click the Alpha_HR_Comp_DF.
2. In the workspace for Alpha_HR_Comp_DF, add the hr_comp_update table from the Alpha datastore as the source object.
a) In the Local Object Library, select the Datastores tab.
b) In the Alpha datastore, select the hr_comp_update table, drag it to the data flow workspace, and choose Make Source.
3. Add a Pivot transform to the data flow and connect it to the source table.
a) In the Local Object Library, select the Transforms tab.
b) Expand the Data Integrator node, select the Pivot transform, and drag it to the data flow workspace, to the right of the source table.
c) Connect the source table to the Pivot transform.
4. Add a Query transform to the data flow and connect it to the Pivot transform. Create a target template table Employee_Comp in the Delta datastore.
a) In the Local Object Library, select the Transforms tab.
b) In the tool palette, select the Query transform, and drag it to the data flow workspace, to the right of the Pivot transform.
c) Connect the Pivot transform to the Query transform.
d) To add the template table, in the tool palette, choose the Template Table icon and click in the workspace.
e) In the Create Template dialog box, enter the table name Employee_Comp.
f) In the In datastore drop-down list, select the Delta datastore as the template table destination target, and choose OK.
g) Connect the Query transform to the Employee_Comp table.
5. Specify in the Pivot transform that the fields EmployeeID and date_updated are non-pivot columns. Specify that the fields Emp_Salary, Emp_Bonus, and Emp_VacationDays are pivot columns.
a) To open the Pivot - Transform Editor, double-click the Pivot transform.
b) In the Schema In workspace, select the EmployeeID field, and drag it into the Non-Pivot Columns workspace.
c) Select the date_updated field, and drag it into the Non-Pivot Columns workspace, below EmployeeID.
d) Select the Emp_Salary, Emp_Bonus, and Emp_VacationDays fields, and drag them into the Pivot Columns workspace, ensuring that they appear in that order.
Hint:
Select and move all three columns together by holding down the Shift key.
e) In the Data field column field, enter the value Comp.
f) In the Header column field, enter the value Comp_Type.
g) Close the editor.
6. In the editor for the Query transform, map all fields from the input schema to the output schema and add an expression in the WHERE tab to filter out NULL values for the Comp column.
a) To open the Query Editor, double-click the Query transform.
b) To create the mapping, select fields from the Schema In and drag them to the corresponding fields in the Schema Out.
c) Select the WHERE tab.
d) In the Schema In, select the Comp column and drag it into the workspace of the WHERE tab.
e) Complete the expression by typing is not null.
The expression in the WHERE tab should read Pivot.Comp is not null.
f) Close the editor.
7. Execute the Alpha_HR_Comp_Job with the default execution properties.
a) In the Omega project area, right-click Alpha_HR_Comp_Job and choose Execute.
b) In the Save all changes and execute dialog box, choose Yes.
c) To execute the job using default properties, in the Execution Properties dialog box, choose OK.
d) Return to the job workspace.
e) To open the data flow workspace, double-click the data flow.
f) Right-click the target table and choose View data.
g) Close the display.
Related Information
• For more information on the Pivot transform, see "Transforms", Chapter 5 in the Data Services Reference Guide.
LESSON SUMMARY
You should now be able to:
• Use the Pivot transform
Unit 9
Lesson 3
Using the Data Transfer Transform
LESSON OVERVIEW
The Data Transfer transform allows a data flow to split its processing into two sub data flows and to push down resource-consuming operations to the database server.
LESSON OBJECTIVES
After completing this lesson, you will be able to:
• Describe performance optimization
Push Down Operations
Improve the performance of your jobs by pushing down operations to the source or target database. Reduce the number of rows and operations that the engine must retrieve and process. Data Services examines the database and its environment when determining which operations to push down to the database. There are full push-down operations and partial push-down operations.
Full Push-Down Operations
The Data Services optimizer always tries to do a full push-down operation. In a full push-down operation, all operations are pushed down to the databases and the data streams directly from the source database to the target database.
For example, Data Services sends an SQL INSERT INTO ... SELECT statement to the target database, and the SELECT retrieves the data from the source.
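A hedged sketch of what such a full push-down can look like, assuming source and target tables reachable from the same database (all names are illustrative):

    -- The entire data flow collapses into one statement executed by
    -- the database; no rows pass through the Data Services engine.
    INSERT INTO sales_summary (CUST_ID, TOTAL)
    SELECT CUST_ID, SUM(AMOUNT)
    FROM sales
    GROUP BY CUST_ID;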
Full Push-Down Conditions
Full push-down operations to the source and target databases are only possible when the following conditions are met:
• All of the operations between the source table and target table can be pushed down.
• The source and target tables are from the same datastore, or they are in datastores that have a database link defined between them.
Partial Push-Down Operations
When a full push-down operation is not possible, Data Services tries to perform a partial push-down by sending a SELECT statement to the source database.
Table 45: SELECT Statement Operations
The table lists operations within the SELECT statement that can be pushed to the database:
Aggregations: Used with a GROUP BY statement to produce a data set smaller than or the same size as the original data set.
Distinct rows: Data Services only outputs unique rows when you use distinct rows.
Filtering: Produces a data set smaller than or equal to the original data set.
Joins: Produce a data set smaller than or similar in size to the original tables.
Ordering: Ordering does not affect data set size. Data Services can efficiently sort data sets that fit in memory. Since Data Services does not perform paging by writing out intermediate results to disk, use a dedicated disk-sorting program such as SyncSort or the DBMS itself to order large data sets.
Projections: Produce a smaller data set because they only return columns referenced by a data flow.
Functions: Most Data Services functions that have equivalents in the underlying database are appropriately translated.
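For instance, a data flow that filters, joins, aggregates, and sorts might be partially pushed down as a single SELECT such as the following sketch (table and column names are illustrative):

    -- Filtering, join, aggregation, projection, and ordering all
    -- execute on the source database; only the result rows are
    -- returned to the engine.
    SELECT o.CUST_ID, SUM(d.QUANTITY) AS TOTAL_QTY
    FROM orders o
    JOIN order_details d ON d.ORDER_ID = o.ORDER_ID
    WHERE o.ORDER_DATE >= DATE '2000-01-01'
    GROUP BY o.CUST_ID
    ORDER BY o.CUST_ID;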
Data Services cannot push some transform operations to the database, for example:
• Expressions that include Data Services functions without database correspondents.
• Load operations that contain triggers.
• Transforms other than the Query transform.
• Joins between sources that are on different database servers without database links defined between them.
Not all operations can be combined into single requests. For example, when a stored procedure contains a COMMIT statement or does not return a value, you cannot combine the stored procedure SQL with the SQL for other operations in a query. You can only push operations supported by the RDBMS down to that RDBMS.
Note:
You cannot push built-in functions or transforms to the source database. Database-specific functions can only be pushed down to the database for execution.
The View Optimized SQL Feature
View the SQL generated by the data flow before running a job. Adjust your design to maximize the SQL that is pushed down to improve performance, and improve the data flow when necessary.
Data Services only shows the SQL generated for table sources. Data Services does not show the SQL generated for SQL sources that are not table sources, for example, the lookup function, the Key Generation transform, the key_generation function, the Table Comparison transform, and target tables.
The figure View Optimized SQL shows the Optimized SQL dialog box, which is opened from the Local Object Library.
Figure 51: View Optimized SQL (the Optimized SQL dialog box lists the datastores in the left pane and the generated SELECT statement for the selected datastore in the right pane)
SQL View
Follow these steps to view optimized SQL:
1. In the Data Flows tab of the Local Object Library, right-click the data flow and select Display Optimized SQL from the menu.
The Optimized SQL dialog box displays.
2. In the left pane, select the datastore for the data flow.
The optimized SQL for the datastore displays in the right pane.
Data Caching
Improve the performance of data transformations that occur in memory by caching as much data as possible. By caching data, you limit the number of times the system must access the database. Cached data must fit into available memory.
Administrators select a pageable cache location to save content over the 2 GB RAM limit. The pageable cache location is set up in the Server Manager, and the option to use pageable cache is selected on the Data Flow Properties dialog box.
Create persistent cache datastores by selecting Persistent Cache as the database type in the Create New Datastore dialog box. The newly created persistent cache datastore shows in the list of datastores and can be used as a source in jobs.
Process Slicing
Optimize your jobs with process slicing by splitting data flows into sub data flows. Sub data flows:
• Work on smaller data sets and/or fewer transforms, so each process consumes less virtual memory.
• Leverage more physical memory per data flow, as each sub data flow can access 2 GB of memory.
Process slicing is available on the Advanced tab of the Query transform. You can run each memory-intensive operation as a separate process, as shown in the figure Performance Process Slicing.
Figure 52: Performance Process Slicing (the Advanced tab provides check boxes to run DISTINCT, GROUP BY, JOIN, and ORDER BY each as a separate process)
The Data Transfer Transform
The Data Transfer transform allows a data flow to split its processing into two sub data flows and push down resource-consuming operations to the database server.
The Data Transfer transform moves data from a source or from the output of another transform into a transfer object. The transform then reads data from the transfer object.
Use the Data Transfer transform to push down resource-intensive database operations that occur anywhere within the data flow. The transfer type can be a relational database table, a persistent cache table, a file, or a pipeline.
Data Transfer Usage
• Push down operations to the database server when the transfer type is a database table. You can push down resource-consuming operations such as joins, GROUP BY, and sorts.
• Define points in your data flow where you want to split processing into multiple sub data flows. Use the Data Transfer transform to split the processing among multiple sub data flows, with each sub data flow using only a small portion of memory.
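Assuming a table transfer type, the effect can be sketched as two SQL steps around the split point. This is a rough illustration with assumed names; Data Services generates its own transfer table and SQL:

    -- Sub data flow 1: materialize the upstream rows in the transfer
    -- table on the database server.
    INSERT INTO transfer_order_details
    SELECT * FROM order_details_source;

    -- Sub data flow 2: resume from the transfer table, letting the
    -- database perform the resource-consuming GROUP BY.
    SELECT ORDERID, SUM(QUANTITY) AS TOTAL_QTY
    FROM transfer_order_details
    GROUP BY ORDERID;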
The figure Data Transfer Transform Editor shows the interface where you can push down operations or define points in your data flow to split processing.
Figure 53: Data Transfer Transform Editor (the editor shows the ORDER_DETAILS input schema with columns such as ORDERID, PRODUCTID, UNITPRICE, QUANTITY, and DISCOUNT, and General and Keys tabs where the primary key columns, such as PRODUCTID, are specified)
Data Input Requirements
• When the input data set for the Data Transfer transform is a table or a file transfer type, the rows must be flagged with the NORMAL operation code.
• When the input data set is a pipeline transfer type, the rows can be flagged as any operation code.
• The input data set must not contain hierarchical or nested data.
Data Output Requirements
Output data sets have the same schema and the same operation code as the input data sets.
• In the push-down scenario, the output rows are in the sort or GROUP BY order.
• The sub data flow names use the format dataflowname_n, where n is the number of the data flow.
• The execution of the output depends on the temporary transfer type:
Table or file temporary transfer types: Data Services automatically splits the data flow into sub data flows and executes them serially.
Pipeline transfer types: If you specify the Run as a separate process option in another operation in the data flow, Data Services splits the data flow into sub data flows and executes these sub data flows that use the pipeline in parallel.
Unit 9
Exercise 20
Use the Data Transfer Transform
Business Example
You have historical data concerning order details which may have referential integrity problems. The data may reside in flat files or database tables. Files and tables with no data problems should use an alternative data flow which does not use the Validation transform. You will create a data flow which uses the Data Transfer transform to push the data down to the database server. You will edit the create table DDL in the transform editor to add a foreign key to the order details table.
1. Create a new job, Alpha_Order_Details_Pushdown_Job, with two work flows, Orders_Products_WF and Order_Details_Pushdown_WF.
2. Create a data flow, Alpha_Orders_Products_DF, in the work flow Orders_Products_WF. Add the orders and products tables from the Alpha datastore as source tables, and add two template tables, orders_pushdown and product_pushdown, from the Delta datastore as target tables.
3. Save and execute the job to create the target tables in the Delta database.
4. Create the Alpha_Order_Details_Pushdown_DF data flow in the Order_Details_Pushdown_WF. Add a script in the work flow to execute before the data flow.
5. Add the orders_pushdown, product_pushdown, and order_details_good tables to the data flow. Add a Query transform and a target table to the data flow.
6. Edit the Query transform to define the output columns, aggregate the data, and join the three tables.
7. Add a Data Transfer transform to the Alpha_Order_Details_Pushdown_DF workspace and place it between the order_details_good table and the Query transform.
8. Edit the Data Transfer transform connected to the order_details_good table.
CREATE TABLE order_details_pushdown (
    ORDERID int not null references orders_pushdown (ORDERID),
    PRODUCTID int not null references product_pushdown (PRODUCTID),
    QUANTITY integer not null,
    DISCOUNT decimal(4, 2) null);
9. Connect the Data Transfer transform to the Query transform.
10. Open the Query Editor to confirm that the join has been updated to use the tables defined in the Data Transfer transform input.
11. Execute the job.
12. Edit the Alpha_Order_Details_Pushdown_DF data flow to replace the source table with the order_details_bad table from the Alpha datastore.
13. Execute the job.
Unit 9
Solution 20
Use the Data Transfer Transform
Business Example
You have historical data concerning order details which may have referential integrity problems. The data may reside in flat files or database tables. Files and tables with no data problems should use an alternative data flow which does not use the Validation transform. You will create a data flow which uses the Data Transfer transform to push the data down to the database server. You will edit the create table DDL in the transform editor to add a foreign key to the order details table.
1. Create a new job, Alpha_Order_Details_Pushdown_Job, with two work flows, Orders_Products_WF and Order_Details_Pushdown_WF.
a) In the Project area, right-click the Omega project and choose New Batch Job.
b) Enter the name of the job, Alpha_Order_Details_Pushdown_Job, and, on your keyboard, press the Enter key.
c) To add a work flow to the job, in the workspace tool palette, choose the Work Flow icon, and click in the workspace.
d) Name the work flow Order_Details_Pushdown_WF.
e) Add a second work flow, and name it Orders_Products_WF.
f) To connect the work flows, click the Orders_Products_WF, hold down the mouse button, drag the cursor to the Order_Details_Pushdown_WF, and release the mouse button.
The Orders_Products_WF is upstream from (to the left of) the Order_Details_Pushdown_WF.
2. Create a data flow, Alpha_Orders_Products_DF, in the work flow Orders_Products_WF. Add the orders and products tables from the Alpha datastore as source tables, and add two template tables, orders_pushdown and product_pushdown, from the Delta datastore as target tables.
a) To open the Orders_Products_WF workspace, double-click it.
b) To add a new data flow to the work flow, in the tool palette, choose the Data Flow icon, click in the workspace, and enter the data flow name Alpha_Orders_Products_DF.
c) Open the Alpha_Orders_Products_DF data flow.
d) In the Local Object Library, choose the Datastores tab.
e) To add the orders source table to the data flow, in the Alpha datastore, select the orders table, drag it to the data flow workspace, and choose Make Source.
f) Add the products table to the data flow, and choose Make Source.
g) To add the orders_pushdown template table, in the tool palette, choose the Template Table icon, click in the data flow workspace, and enter the table name orders_pushdown and In datastore Delta.
h) Add the product_pushdown template table in the Delta datastore.
i) Connect the orders source table to the orders_pushdown target table.
j) Connect the products source table to the product_pushdown target table.
3. Save and execute the job to create the target tables in the Delta database.
a) In the Project Area, right-click the Alpha_Order_Details_Pushdown_Job and choose Execute.
The Save all and execute dialog box appears.
b) To save all the objects that you have created, choose Yes.
c) To execute the job using the default settings, in the Execution Properties dialog box, choose OK.
4. Create the Alpha_Order_Details_Pushdown_DF data flow in the Order_Details_Pushdown_WF. Add a script in the work flow to execute before the data flow.
a) Open the Order_Details_Pushdown_WF work flow workspace.
b) To add a script to the work flow, in the tool palette, choose the Script icon, click in the workspace, and name the script sleep.
c) To open the Script Editor, double-click the script.
d) In the Script Editor workspace, enter sleep(15000);
This is the sleep([in] Milliseconds As int) function, which sleeps for 15 seconds.
e) Exit the script and return to the Order_Details_Pushdown_WF workspace.
f) Add a new data flow to the work flow; name the data flow Alpha_Order_Details_Pushdown_DF.
g) Connect the script to the data flow, with the script upstream.
h) Open the data flow.
5. Add the orders_pushdown, product_pushdown, and order_details_good tables to the data flow. Add a Query transform and a target table to the data flow.
a) In the Datastores tab, in the Alpha datastore, select the order_details_good table, drag it to the data flow workspace, and choose Make Source.
b) In the Delta datastore, select the orders_pushdown and product_pushdown template tables, drag each to the data flow workspace, and choose Make Source.
c) To add a Query transform to the data flow workspace, in the tool palette, choose the Query Transform icon and click in the workspace.
d) Connect all three source tables to the Query transform.
e) Add a template table named Orders_Fact_Test in the Delta datastore as the target table.
f) Connect the Query transform to the Orders_Fact_Test table.
6. Edit the Query transform to define the output columns, aggregate the data, and join the three tables.
a) To open the Query Editor, double-click the Query transform.
b) In the Schema In, select the orders_pushdown.ORDERID, orders_pushdown.ORDERDATE, and orders_pushdown.CUSTOMERID columns.
c) Drag the selected columns to the Schema Out.
d) To add a new column below the CUSTOMERID column in the Schema Out, right-click CUSTOMERID, choose New Output Column..., and choose Insert Below.
e) In the Column Properties dialog box, enter the column name NETSALE, and data type: Decimal with precision: 10 and scale: 2.
f) Select the NETSALE column and, in the Mapping tab, enter the following:
sum((order_details_good.QUANTITY * order_details_good.DISCOUNT) * product_pushdown.COST)
g) Choose the GROUP BY tab.
h) In the Schema In, select the following columns and drag them into the GROUP BY area:
orders_pushdown.ORDERID
orders_pushdown.ORDERDATE
orders_pushdown.CUSTOMERID
i) On the WHERE tab, to create the join condition, enter the following:
order_details_good.ORDERID = orders_pushdown.ORDERID and
order_details_good.PRODUCTID = product_pushdown.PRODUCTID
7. Add a Data Transfer transform to the Alpha_Order_Details_Pushdown_DF workspace and place it between the order_details_good table and the Query transform.
a) In the Local Object Library, choose the Transforms tab.
b) Expand the Data Integrator node, select the Data Transfer transform, and drag it into the workspace.
c) To disconnect the order_details_good table from the Query transform, select the connector line and, on your keyboard, press the Delete key.
d) Connect the order_details_good table to the Data Transfer transform.
8. Edit the Data Transfer transform connected to the order_details_good table.
a) Change the Transfer Type to Table.
b) In the Table Name field, choose the ellipsis button (...), choose the Delta datastore, and name the table order_details_pushdown.
c) On the Options tab, click the Generate Default DDL button.
d) In the Data Definition Language (DDL) workspace, edit the create table command so it appears as follows:
CREATE TABLE order_details_pushdown (
    ORDERID int not null references orders_pushdown (ORDERID),
    PRODUCTID int not null references product_pushdown (PRODUCTID),
    QUANTITY integer not null,
    DISCOUNT decimal(4, 2) null);
9. Connect the Data Transfer transform to the Query transform.
10. Open the Query Editor to confirm that the join has been updated to use the tables defined in the Data Transfer transform input.
11. Execute the job.
a) In the Project Area, right-click the Alpha_Order_Details_Pushdown_Job and choose Execute.
The Save all and execute dialog box appears.
b) To save all the objects that you have created, choose Yes.
c) To execute the job using the default settings, in the Execution Properties dialog box, choose OK.
The job should run successfully because of the overflow file; however, no rows will be written to the table due to the primary key constraint.
12. Edit the Alpha_Order_Details_Pushdown_DF data flow to replace the source table with the order_details_bad table from the Alpha datastore.
a) In the Local Object Library, choose the Datastores tab.
b) In the Alpha datastore, select the order_details_bad table and drag it to the data flow workspace.
c) Delete the order_details_good table from the Alpha_Order_Details_Pushdown_DF data flow workspace.
d) Connect the order_details_bad table to the Data Transfer transform.
13. Execute the job.
a) In the project area, right-click the Alpha_Order_Details_Pushdown_Job and choose Execute.
b) To save all changes, choose OK.
c) In the Execution Properties dialog box, choose OK.
The job should fail.
d) To view the error log, choose the red X in the Job Log.
Note that the failure was caused in the sub data flow by the foreign key constraint violation.
LESSON SUMMARY
You should now be able to:
• Describe performance optimization
Unit 9
Learning Assessment
1. Which description corresponds with these Data Integrator transforms?
Match the item in the first column to the corresponding item in the second column.
First column: Data Transfer, Map CDC Operation, Reverse Pivot
Second column:
[ ] Sorts input data, maps output data, and resolves before and after versions for UPDATE rows.
[ ] Allows a data flow to split its processing into two sub data flows and push down resource-consuming operations to the database server.
[ ] Rotates the values in specified rows to columns.
2. The Pivot transform creates a new row for each value in a column that you identify as a pivot column.
Determine whether this statement is true or false.
[ ] True
[ ] False
3. The Reverse Pivot transform reverses the pivot process, converting rows into columns.
Determine whether this statement is true or false.
[ ] True
[ ] False
4. You can only specify one pivot column to be rotated into a row.
Determine whether this statement is true or false.
[ ] True
[ ] False
5. Which one of these statements about push-down operations is true?
Choose the correct answer.
[ ] A Full push-down operations are possible when the source and target tables are from different datastores.
[ ] B Data Services performs a partial push-down with a SELECT statement to the source database.
[ ] C Data Services can push all transform operations to the database.
[ ] D You can push built-in functions and transforms to the source database.
6. Name two ways you can improve the performance of data transformations.
Unit 9
Learning Assessment - Answers
1. Which description corresponds with these Data Integrator transforms?
Match the item in the first column to the corresponding item in the second column.
Data Transfer: Allows a data flow to split its processing into two sub data flows and push down resource-consuming operations to the database server.
Map CDC Operation: Sorts input data, maps output data, and resolves before and after versions for UPDATE rows.
Reverse Pivot: Rotates the values in specified rows to columns.
2. The Pivot transform creates a new row for each value in a column that you identify as a pivot column.
Determine whether this statement is true or false.
[X] True
[ ] False
3. The Reverse Pivot transform reverses the pivot process, converting rows into columns.
Determine whether this statement is true or false.
[X] True
[ ] False
4. You can only specify one pivot column to be rotated into a row.
Determine whether this statement is true or false.
[ ] True
[X] False
5. Which one of these statements about push-down operations is true?
Choose the correct answer.
[ ] A Full push-down operations are possible when the source and target tables are from different datastores.
[X] B Data Services performs a partial push-down with a SELECT statement to the source database.
[ ] C Data Services can push all transform operations to the database.
[ ] D You can push built-in functions and transforms to the source database.
6. Name two ways you can improve the performance of data transformations.
Data caching and process slicing.