Best Practices for SAP HANA Modeling and SAP Data Services Data Loading
Dr. Berg
Comerit
© Copyright 2014
Wellesley Information Services, Inc.
All rights reserved.
In This Session
• We will explore SAP BusinessObjects Data Services and how to load information into SAP HANA
• You will learn how to create transformations, merges, and joins
• We will look at the best practices of modeling in SAP HANA
• We will see step-by-step how to create calculation, attribute, and analytic views
• At the end of this session you will know how to load data and create views to analyze the data
What We’ll Cover
• BusinessObjects Data Services
  - Data Services Overview
  - Creating Batch Jobs
  - Loading From Flat Files
  - Building Transforms and Using Functions
  - Creating Table Joins
  - Utilizing Data Merging
• SAP HANA
• Wrap-up
Data Services Overview
• SAP Data Services is a leading technology for enterprise information management, providing solutions for:
  - Data integration
  - Data quality
  - Data profiling
  - Text data processing
SAP Data Services transforms, refines, and delivers trusted data for the enterprise data warehouse (EDW).
What Are Data Services Batch Jobs?
Batch jobs are used to:
  - Extract data from one or many sources
  - Transform data to meet the organization’s business requirements
  - Load the processed data to a location for use
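The extract, transform, and load stages above can be sketched outside Data Services as three plain functions chained together. This is only an illustration in Python, not how Data Services implements a batch job; the in-memory source, the target list, and the `normalize_status` rule are all hypothetical.

```python
from typing import Callable, Iterable

Row = dict[str, str]


def run_batch_job(
    extract: Callable[[], Iterable[Row]],
    transform: Callable[[Row], Row],
    load: Callable[[list[Row]], None],
) -> int:
    """Run one extract-transform-load pass and return the number of rows loaded."""
    transformed = [transform(row) for row in extract()]
    load(transformed)
    return len(transformed)


# Hypothetical in-memory source and target, standing in for real systems.
source_rows = [
    {"case_id": "1", "status": " open "},
    {"case_id": "2", "status": "CLOSED"},
]
target: list[Row] = []


def normalize_status(row: Row) -> Row:
    """Example business rule (an assumption): trim and lower-case the status."""
    return {**row, "status": row["status"].strip().lower()}


loaded = run_batch_job(lambda: source_rows, normalize_status, target.extend)
```

Keeping the three stages as separate callables mirrors the way a batch job separates sources, transforms, and targets, so each stage can be swapped without touching the others.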
Step-by-Step: Creating Batch Jobs
1. Create a new project and give it a relevant project name
2. Right-click on the project to create a new batch job
Giving your projects and batch jobs relevant names helps keep your work organized.
Step-by-Step: Loading from Flat Files
1. Select the related batch job to enter its workspace
2. From the ‘Format’ category in the ‘Local Object Library’ panel, right-click on ‘Flat Files’ and select ‘New’
Examples of Other Available Data Sources
Many other data sources can be used in Data Services.
Use the local object library to find existing data sources under the ‘Datastore’ category.
You can upload more files under the ‘Format’ category.
Formatting the Flat File
3. In the pop-up ‘File Format Editor’, fill in the appropriate fields
The date format must match the format of the dates in the data.
‘Tab’ was chosen as the delimiter because the data fields were separated by tabs.
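The two settings called out above, the delimiter and the date format, are the ones most likely to break a flat file load. The Python sketch below shows the same idea outside Data Services: a tab-delimited file is read with an explicit delimiter, and dates are parsed with a format string that must match the data. The sample rows, the `CASE_ID` column, and the `%Y.%m.%d` format are assumptions made for illustration; `ODATE` and `CDATE` are the field names used later in this session.

```python
import csv
import io
from datetime import date, datetime

# Assumed date layout of the sample file; it must match the data exactly.
DATE_FORMAT = "%Y.%m.%d"

# Hypothetical tab-delimited content standing in for a real flat file.
SAMPLE_FILE = (
    "CASE_ID\tODATE\tCDATE\n"
    "1001\t2014.01.06\t2014.01.20\n"
    "1002\t2014.02.03\t2014.02.04\n"
)


def parse_date(text: str) -> date:
    """Parse a date string; raises ValueError if it does not match DATE_FORMAT."""
    return datetime.strptime(text, DATE_FORMAT).date()


def read_flat_file(handle) -> list[dict]:
    """Read tab-separated rows and convert each field to its declared type."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {
            "CASE_ID": int(record["CASE_ID"]),
            "ODATE": parse_date(record["ODATE"]),
            "CDATE": parse_date(record["CDATE"]),
        }
        for record in reader
    ]


rows = read_flat_file(io.StringIO(SAMPLE_FILE))
```

A date written in any other layout, such as `01/06/2014`, makes `parse_date` raise a `ValueError`, which is the scripted counterpart of a format mismatch in the File Format Editor.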
Defining Table Fields
4. Enter the field properties and note how the view updates
Preview Data
5. In the Repository under ‘Format’, right-click and select ‘View Data’ to preview the newly added data source
This lets you check that the source data populated without error before you use it.
Transforms Overview
• Transforms are built-in objects that process source data to bring about desired outputs
• The most commonly used transform is the Query transform
• The Query transform enables you to:
  - Filter and select data from a source
  - Join data from multiple sources
  - Map columns from input to output schemas
  - Perform data nesting and unnesting
  - Add new columns to the output schema
  - Assign primary keys to the output schema
Adding a Data Flow Object to the Workspace
The tool palette contains icons that allow the creation of new objects in the workspace.
1. Drag a data flow icon from the tool palette to the workspace
2. Double-click on the data flow to enter its workspace
When you create a reusable object, such as a data flow, it automatically appears in the local object library.
Adding a Data Source to the Workspace
1. Drag a data source (e.g., a flat file) from the local object library onto the workspace
2. Create a connection between the data source and the query
Query Editor Overview
3. Double-click on the Query transform to open the ‘Query Editor’
• The query editor is a graphical interface for carrying out query operations.
• It contains three areas:
  - Schema In area
  - Schema Out area
  - Parameters area
Setting Up the Output Table
4. Drag the desired output fields from ‘Schema In’ to ‘Schema Out’
You only need to drag the fields that you want to appear in the output; it is not necessary to drag them all.
Creating a New Output Column
1. Right-click on an output field and select ‘New Output Column’
2. Select where to insert the new column
New columns can be created to display results from calculations.
Defining Column Properties
3. The ‘Column Properties’ dialog will pop up for you to name the new column and define its properties
Give the column a descriptive name that properly identifies what the column is used for.
Using Functions
1. Double-click in the cell under ‘Mapping’
2. Click on ‘Functions’
3. Select the appropriate category and then the specific function
For this demo, we want to calculate the number of days a case was open
Setting Up the Function
4. Define the input parameters for the function
Use the dropdown list to select the input parameters; this avoids typos.
Notice the updated code for the NO_DAYS_CASE_OPEN column after the input parameters are defined. This formula will deliver the number of days from ODATE to CDATE, giving us a measurement of how long it takes to close a case.
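The calculation behind NO_DAYS_CASE_OPEN is a simple date difference. The Python sketch below shows the arithmetic only; it is not the Data Services function syntax, and the example dates are hypothetical.

```python
from datetime import date


def no_days_case_open(odate: date, cdate: date) -> int:
    """Number of days from the open date (ODATE) to the close date (CDATE)."""
    return (cdate - odate).days


# Hypothetical case opened on 6 January 2014 and closed on 20 January 2014.
days_open = no_days_case_open(date(2014, 1, 6), date(2014, 1, 20))
```

A case closed on the day it was opened yields 0, and a close date earlier than the open date yields a negative number, which is a useful signal of bad source data.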
Adding an Output Table to the Workspace
1. Drag and drop a template table into the workspace to be our output table
2. Link the query to the template table
A template table is an object that can be used as a target that is populated when a job executes successfully. It can also be saved in the object library for use as a data source at a later time.
A template table allows us to view the specific information we want without the risk of altering the source data. The data that populates the template table is based on the output schema defined in the query transform.
Executing a Job
1. Right-click on the job and select ‘Execute’
To analyze any issues that may occur during data loading, click on ‘Enable auditing’ and make sure that ‘Use collected statistics’ is checked.
Job Log Overview
• The log file displays a list of actions in the job execution. If any errors occur, the error icon will appear. Otherwise, ‘Job is completed successfully’ will be displayed
• The job log has five columns:
  - Pid: Process identification number of the executing thread
  - Tid: Thread identification number of the thread
  - Number: Number prefix of the error followed by a number
  - Time Stamp: Date and time the thread generated a message
  - Message: Error description of the thread
Job Log Overview Continued
A successful job execution displays ‘Job is completed successfully’ in the log.
A job with errors will show the error icon. Double-click on the error icon to view the list of errors.
How to Preview the Output Table
1. Click on the data flow to open its workspace
2. Click on the magnifying glass on the output table to view its data
Notice that the column created earlier is formatted correctly as a number and that the data is the result of the function defined.
Creating Table Joins
A join can be used to combine data from multiple sources into one target.
Use the Query transform FROM clause to join the two sources, here named Query (Source 1) and Join (Source 2).
In this example, Source 1 has the Car Description for the case, while Source 2 has the Solution to the case. The query transform will combine the data from the two sources in the schema out section to produce a result displaying the overall case solution.
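The effect of the join can be sketched in a few lines of Python. This is an illustration of an inner join on a shared key, not the Query transform itself; the `CASE_ID` key and the sample rows are hypothetical, since the slide does not show which field links the two sources.

```python
# Hypothetical rows: Source 1 holds the car description, Source 2 the solution.
source_1 = [
    {"CASE_ID": 1, "CAR_DESCRIPTION": "Sedan, engine stalls"},
    {"CASE_ID": 2, "CAR_DESCRIPTION": "Truck, brake noise"},
]
source_2 = [
    {"CASE_ID": 1, "SOLUTION": "Replaced fuel pump"},
    {"CASE_ID": 2, "SOLUTION": "Replaced brake pads"},
]


def inner_join(left: list[dict], right: list[dict], key: str) -> list[dict]:
    """Combine rows from both sources wherever the key values match."""
    index: dict = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in index.get(left_row[key], [])
    ]


joined = inner_join(source_1, source_2, "CASE_ID")
```

Each joined row now carries both the description and the solution, which corresponds to the combined schema out of the query transform.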
Result from a Table Join
1. Once the tables have been joined in the query transform, execute the job as discussed in earlier slides.
2. Enter the data flow workspace and click on the magnifying glass to view the results in the output table.
Notice in the output table how the Solution column from the Join source is now combined with the fields from the Query transform.
Merges Overview
You can merge rows from two or more sources into a single data set.
All sources must have the same schema to execute the Merge transform:
• Same number of columns
• Same column names
• Same data type for each column
How to Create a Merge
1. To merge two sources, add a query transform to each source to format the data identically in both sources
2. Connect the queries to a ‘Merge Transform’
3. When opening the ‘Merge Transform’, notice how all the fields and data types match for all output and input fields.
How to Avoid Creating Duplicated Data in Merges
4. To avoid duplicate rows, add a query transform to display distinct rows only
5. Execute the job to complete the merged table
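The merge steps can be sketched in Python as a schema check, a row-wise union that keeps duplicates, and a separate distinct step. This is an illustration under assumptions, not the Merge transform's implementation; the `east` and `west` sources and their columns are hypothetical, and each source is assumed to be non-empty.

```python
def schema_of(row: dict) -> dict:
    """Column names mapped to value types, used to compare schemas."""
    return {name: type(value) for name, value in row.items()}


def merge(*sources: list[dict]) -> list[dict]:
    """Append all rows from all sources; every row must share one schema."""
    expected = schema_of(sources[0][0])
    merged = []
    for rows in sources:
        for row in rows:
            if schema_of(row) != expected:
                raise ValueError(f"schema mismatch: {row}")
            merged.append(row)
    return merged


def distinct(rows: list[dict]) -> list[dict]:
    """Keep the first occurrence of each row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sources that overlap on case 2.
east = [{"CASE_ID": 1, "SOLUTION": "A"}, {"CASE_ID": 2, "SOLUTION": "B"}]
west = [{"CASE_ID": 2, "SOLUTION": "B"}, {"CASE_ID": 3, "SOLUTION": "C"}]

merged_rows = merge(east, west)
unique_rows = distinct(merged_rows)
```

The merge alone returns four rows because case 2 appears in both sources; the distinct step reduces that to three, which is why step 4 adds a query transform after the merge.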
Demo of Data Loading with Data Services
What We’ll Cover
• BusinessObjects Data Services
• SAP HANA
  - SAP HANA Overview
  - Creating Attribute Views
  - Building Analytic Views
  - Making Calculation Views
• Wrap-Up
SAP HANA – In Memory Options
• SAP HANA is sold as an in-memory appliance. This means that both software and hardware are included from the vendors
• Currently you can buy SAP HANA solutions from Cisco, Dell, Fujitsu, IBM, HP, NEC, Huawei, and others
• SAP HANA indexes and compresses data from a variety of sources, including ERP, and stores the data in memory.
Source: SAP AG, 2014
SAP HANA can radically change the way databases operate and make systems dramatically faster.
HANA Editions and Components
While HANA is sold as an appliance, there are many internal components, and the edition you buy may contain different licenses to these components.

Area: Platform Edition
Component ID          Component Name
BC-DB-HDB             SAP HANA database
BC-DB-HDB-ENG         SAP HANA database engine
BC-DB-HDB-PER         SAP HANA database persistence
BC-DB-HDB-SYS         SAP HANA database interface
BC-DB-HDB-DBA         SAP HANA database/DBA cockpit
BC-DB-HDB-POR         SAP HANA DB Porting
BC-DB-HDB-BAC         SAP HANA Backup and Recovery
BC-CCM-HAG            SAP Host agent
BC-DB-HDB-CCM         SAP HANA CCMS
BC-DB-HDB-CLI         SAP HANA Clients (JDBC/ODBC)
BC-DB-HDB-R           SAP HANA Integration with R
BC-DB-HDB-SCR         SAP HANA SQL scripts
BC-DB-HDB-MDX         MDX engine: Microsoft Excel client
BC-HAN-MOD            SAP HANA Studio - Information Modeler
BC-HAN-3DM            Information Composer
BC-HAN-SRC            SAP HANA UI toolkit
BC-DB-HDB-TXT         SAP HANA Text and Search features
BC-DB-HDB-DXC         SAP HANA Direct extraction connector
BC-DB-HDB-SEC         SAP HANA Security and User Mgmt
BC-DB-HDB-XS          SAP HANA Application Services
BC-DB-HDB-AFL         SAP HANA Advanced functions library
BC-DB-HDB-AFL-PAL     SAP HANA Predictive analysis library
BC-DB-HDB-AFL-SOP     SAP HANA Sales & Operations Planning
BC-DB-HDB-PLE         SAP HANA Planning Engine

Area: End User Clients
Component ID          Component Name
BI-BIP-CMC, BI-BIP    BI Platform
BI-RA-WBI             Web Intelligence
BI-RA-XL              Dashboard Designer
BI-RA-CR, BI-BIP-CRS  SAP Crystal reports
BI-RA-EXP             SAP BusinessObjects Explorer
BI-BIP-IDT            Information Design Tool (for universes)
BI-RA-AO-XLA          Microsoft Excel add-in

Area: Lifecycle Management
Component ID          Component Name
BC-HAN-SL-STP         SAP HANA unified installer
BC-HAN-UPD            Software Update Manager
BC-DB-HDB-INS         SAP HANA database installation
BC-DB-HDB-UPG         SAP HANA database upgrade

Area: Enterprise Edition (also has platform edition components)
Component ID          Component Name
BC-HAN-DXC            SAP HANA Direct Extractor Connection
EIM-DS                SAP Data Services: ETL-based
BC-HAN-LOA            SAP HANA Load Controller: log-based
BC-HAN-LTR            SAP Landscape Transformation (SLT): trigger-based
BC-HAN-REP            Sybase Replication Server: log-based
Some of the Hardware Options
Dell R920
Example: IBM 3850 X6
Hardware Options 2014 Onward
These systems are based on Intel's E7 Ivy Bridge processors with 15 cores per processor (the previous generation had only 10).
UPDATE: Hitachi servers and the Dell R920 are now also available
Attribute Views - Overview
• Master data reporting can be modeled using attribute views
• Can be regarded as master data tables
• Can be linked to fact tables in Analytic Views
• A measure, e.g. weight, can be defined as an attribute
• Table joins and properties:
  - Left outer, right outer, full outer, or text table
  - Cardinality 1:1, N:1, 1:N
  - Language column
• Some views and functions are shipped with HANA
Creating a New Attribute View
1. Open HANA Studio and expand the ‘Content’ folder
2. Right-click on the appropriate package in your system
3. Navigate to New > Attribute View…
Naming the New Attribute View
1. Give the view a name
2. Add a description
3. Finish and start adding and joining tables to the view
The name and description you provide should accurately describe the Attribute View you want to create.
Adding Tables to the Data Foundation
1. Open the ‘Catalog’ folder
2. Expand the system
3. Expand the ‘Tables’ folder
4. Drag the necessary table to the ‘Data Foundation’
Adding More Tables to the Data Foundation
Add more tables to the data foundation by dragging each one to the data foundation area.
The join type is set using the Properties panel.
The first table that was added will be on the left in the ‘Details’ panel.
Applying Filters to the View
Filters can be used to limit the data being displayed. Right-click on the attribute you want to filter on and select ‘Apply Filter’ from the context menu.
This example shows the creation of a filter on the ‘VALID_TO’ date field. Setting that value to ‘9999-12-31’ restricts the result set to records with an open-ended validity, i.e., records that remain valid indefinitely.
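The VALID_TO filter amounts to an equality test on one column. The Python sketch below shows that logic on a few hypothetical records; the `PRODUCT` column and the sample dates are assumptions, and this is not how HANA evaluates the filter internally.

```python
from datetime import date

# The open-ended validity date used in the filter on this slide.
OPEN_ENDED = date(9999, 12, 31)

# Hypothetical master data records with validity end dates.
records = [
    {"PRODUCT": "P1", "VALID_TO": date(2013, 12, 31)},
    {"PRODUCT": "P1", "VALID_TO": OPEN_ENDED},
    {"PRODUCT": "P2", "VALID_TO": OPEN_ENDED},
]


def apply_filter(rows: list[dict], column: str, value) -> list[dict]:
    """Keep only the rows whose column equals the given value."""
    return [row for row in rows if row[column] == value]


current = apply_filter(records, "VALID_TO", OPEN_ENDED)
```

The expired P1 record is dropped, so each product appears once with its currently valid version.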
Making Attributes Visible to End Users
1 & 2. To make an attribute visible to users, simply click the circle beside each attribute
3. An attribute can be set as a key or changed to a certain type of label
Save and validate once complete
Analytic View - Overview
• Logically very close to InfoCubes in BW
• Join one central fact table, which contains the measures for reporting, to master data
• Can contain calculated measures and variables
• Analytic Views do not store data
• Data is read from the column store table or view based on the Analytic View structure
An example of an analytic view might be sales by product, customer, and organizational entity.
Starting an Analytic View
• Analytic views are the most common views for reporting purposes
• They are the basic view type used as source data in the SAP BusinessObjects BI tools (or other front-end tools)
• We will join together sales data with product information
• This view will be quite simple, but analytic views can be as complex as you like
Analytic views do not have to make use of attribute views. They can simply be a join of master data tables and a fact table.
Adding a New Analytic View
1. Find the appropriate package
2. Right-click and choose ‘New’ > ‘Analytic View’
3. Provide a technical name and a description in the popup that follows
Ensure that the ‘View Type’ dropdown is set to ‘Analytic View’
Adding Fields to the Output
Add tables to the data foundation by clicking and dragging tables to it
You should also select which attributes will be shown in
the output by selecting the gray circles next to each item.
Setting Attributes and Measures
In the semantic layer, you classify each item that was selected for the output as an attribute or a measure. This is necessary for attributes and measures to be displayed and aggregated properly in the reporting layer.
Select which attributes will be shown in the output by selecting the gray circles next to each field.
Joining Tables
In the ‘Logical Join’, two or more tables must be joined together on fields that are identical or hold matching values.
1. Select the ‘Logical Join’ node
2. Drag another view or table into the node
3. Drag from one view to the other on the common field (e.g., Product to Product)
By default this creates a referential join of the table to the ‘Data Foundation’.
Creating a New Calculated Column
Now we will add a new calculated
field called ‘Net Sales’
Using the ‘Advanced’ tab you can set the type
of value such as currency or percentage.
Demo: Building Attribute and Analytic Views
Calculation View - Overview
• Bring together database tables, attribute views, analytic views, and other calculation views
• Provide one source of data for reporting tools
• You can also write SQL statements to make sure a set of fields matches the requirements of other output structures
Calculation views are used to satisfy complex business requirements. An example of a calculation view might be a comparison of actual sales with forecast sales.
Creating a New Calculation View
A calculation view will now be created to join together other tables and views and to use calculations and aggregations to analyze the data.
1. Right-click on the appropriate package
2. In the context menu, click ‘New’ > ‘Calculation View’
Naming the New Calculation View
Give the Calculation View a proper name and label
The ‘Copy From’ option can be used to copy and extend an existing calculation view without editing the original view or having to create a new one each time.
Propagate to Semantics
In the projection layer, right-click on the attributes you want to display in the semantic layer and choose ‘Propagate to Semantics’.
If you choose ‘Add to Output’ instead, that field will have to be activated manually in every node.
Creating a New Calculation in the View
Calculated columns are used to derive meaningful information, in the form of new columns, from existing columns.
1. Give the column a proper name
2. Set the ‘Data Type’
3. Choose a function
4. Select the text within the parentheses
5. Choose an element (or attribute in your table)
6. Validate the syntax
You can add your own calculations to the calculation view just as in the analytic view
Aggregation - Overview
• Aggregation node: columns will be rolled up, or aggregated, when placed in this layer.

Customer  Product  Amount
1         1        20
1         1        20
2         2        30
3         3        25
4         4        20

With an aggregated column on customer and amount, you would get a data set that looks like the following:

Customer  Amount
1         40
2         30
3         25
4         20

Customer 1’s amounts were added up, so there is one less row to display.
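The roll-up on this slide can be reproduced with a short group-and-sum in Python. The rows are the ones from the slide; the `aggregate` helper is a hypothetical name and only illustrates the arithmetic, not the aggregation node itself.

```python
from collections import defaultdict

# The five input rows from the slide.
rows = [
    {"Customer": 1, "Product": 1, "Amount": 20},
    {"Customer": 1, "Product": 1, "Amount": 20},
    {"Customer": 2, "Product": 2, "Amount": 30},
    {"Customer": 3, "Product": 3, "Amount": 25},
    {"Customer": 4, "Product": 4, "Amount": 20},
]


def aggregate(rows: list[dict], group_by: str, measure: str) -> dict:
    """Sum the measure for each distinct value of the group-by column."""
    totals: dict = defaultdict(int)
    for row in rows:
        totals[row[group_by]] += row[measure]
    return dict(totals)


totals = aggregate(rows, "Customer", "Amount")
# totals == {1: 40, 2: 30, 3: 25, 4: 20}
```

Five input rows become four output rows because the two rows for customer 1 collapse into a single total of 40.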
Adding a Calculated Column to the Aggregation Layer
In the aggregation node, calculated columns can be added as aggregated columns.
If calculations are not added in a projection layer before being sent to an aggregation node, the totals will not work properly in reporting.
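One way to read this warning is that a row-level calculation must be evaluated before the rows are rolled up. The Python sketch below, with hypothetical `Price` and `Quantity` columns, shows how the total changes when the multiplication happens after aggregation instead of before it.

```python
# Hypothetical order lines for a single customer.
rows = [
    {"Customer": 1, "Price": 10, "Quantity": 2},
    {"Customer": 1, "Price": 5, "Quantity": 4},
]

# Calculate per row first (the projection step), then aggregate:
# 10 * 2 + 5 * 4 = 40
row_level_total = sum(r["Price"] * r["Quantity"] for r in rows)

# Aggregate first, then calculate on the totals:
# (10 + 5) * (2 + 4) = 90
aggregated_first_total = (
    sum(r["Price"] for r in rows) * sum(r["Quantity"] for r in rows)
)
```

Only the first result, 40, is the true revenue; the second, 90, is the kind of incorrect total that appears in reporting when the calculation is applied to already aggregated values.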
Assigning Column Types to the View
In the semantics layer, each item needs to be assigned the type attribute or measure.
1. Click on the ‘Semantics’ node
2. Click the ‘Auto Assign’ button to automatically assign the ‘Type’
3. If any of the types are incorrect, you can manually adjust them
Once all assignments are complete, save and validate the view.
You can set each of these types manually, but the automatic assignments are usually correct
7 Key Points to Take Home
• SAP Data Services transforms, refines, and delivers trusted data for the Enterprise Data Warehouse
• Multiple data sources can be used for Data Services, including flat files, DTDs, XML schemas, Excel workbooks, and more
• Utilize built-in transforms, which are objects that process source data to bring about desired outputs
• SAP HANA indexes data from a variety of sources and stores the results on a dedicated server
• Attributes add details and can be modeled using Attribute Views
• Analytic Views are built around one central fact table containing measures for reporting and can include calculated measures and variables
• Calculation Views bring together database tables, attribute views, analytic views, and other calculation views
Where to Find More Information
• www.sap-press.com/products/SAP-HANA%3A-An-Introduction(2nd-Edition).html
  - Bjarne Berg and Penny Silvia, SAP HANA: An Introduction, SAP Press; 3rd edition (May 1, 2014)
• http://www.saphana.com/welcome
  - SAP’s main page for all SAP HANA related information
• http://www.saphana.com/community/try
  - Powered by HANA demos
• http://scn.sap.com/community/hana-in-memory
  - SAP HANA and In-Memory Computing by SAP HANA Community
Your Turn!
How to contact me:
Dr. Berg
bberg@comerit.com
Please remember to complete your session evaluation
Disclaimer
SAP, R/3, mySAP, mySAP.com, SAP NetWeaver®, Duet™®, PartnerEdge, and other SAP products and services mentioned herein as well as their
respective logos are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All other product and
service names mentioned are the trademarks of their respective companies. Wellesley Information Services is neither owned nor controlled by SAP.