OBIEE Data Federation Limor Fledel-Vagman BI Practice Manager Oracle Israel OBIEE Federation • Vertical Federation – Integrating two or more disparate data source having different levels of granularity joined by one or more conformed dimensions – Example: Essbase Sales cube to relational Sales detail data (drillthrough) • Horizontal Federation – Integrating two or more disparate data source having the same level of granularity joined by one or more conformed dimensions – Example: Essbase Sales cube with HR relational database LTS ??? Logical table sources (LTS’s) are a key feature within the OBIEE semantic model OBIEE Aggregate Navigation Why Aggregate Navigation? • Problem: Summary questions are slow – Example, “All sales worldwide for last 2 years by quarter by family” – This will potentially sum/group by millions of rows! – The disk spin time to scan all these rows will be very long • Solution: ETL to batch build high level aggregates – – – – Oracle BI EE has “aggregate navigation” to use them Aggregates can be in the same database as the source They can also be in a different relational source or Essbase Often orders of magnitude performance improvement Oracle BI EE “Architecture” Semantic Layer Metadata BI Server Relational Tables Oracle BI EE Aggregate Navigation Metadata describes aggregate mappings Semantic Layer Metadata BI Server At query time, BI Server queries the fastest source that has enough detail to satisfy the user request Performance of highly summarized requests is dramatically improved Summary Detail Relational Federation Enables External Aggregates Semantic Layer Metadata BI Server Aggregates can be in other federated sources Summary Summary Detail Relational External Aggregates Can Be in Essbase Semantic Layer Metadata BI Server Aggregate navigation can include multidimensional Summary Relational Detail Essbase External Aggregates Can Be Oracle OLAP Semantic Layer Metadata BI Server Aggregate navigation can include multidimensional Summary Oracle OLAP AW Relational Detail "This information is not a commitment to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described remains at the sole discretion of Oracle" How is Aggregate Navigation Set Up? • Database Design time: – DBA defines aggregate stars at various grain combinations Star Time Product Base Day SKU Agg 1 Month Brand Agg 2 Qtr - – Builds ETL to load the aggregates each night Mapping a Base-Level Physical Star Business Model and Mapping Layer Mappings Physical Layer D0 Time F0 Rev Base Measures D_Day D4 Product F_Bill D_Prod_SKU Adding a Moderately Aggregated Star Business Model and Mapping Layer Mappings Physical Layer D0 Time D_Month F_Bill_agg2 F0 Rev Base Measures D_Brand D_Day D4 Product F_Bill D_Prod_SKU Adding a Highly-Aggregated Star Business Model and Mapping Layer Mappings Physical Layer D_Qtr D0 Time F_Bill_agg1 D_Month F_Bill_agg2 F0 Rev Base Measures D_Brand D_Day D4 Product F_Bill D_Prod_SKU Essbase as an Aggregate Source Business Model and Mapping Layer Mappings Physical Layer D0 Time Month Billed Amt F0 Rev Base Measures Brand D_Day D4 Product F_Bill D_Prod_SKU Adding a Highly-Aggregated Star How does the queryand Business Model planner decide which Mapping Layer FJC to use as its source for a given request? D0 Time Mappings Physical Layer D_Qtr F_Bill_agg1 D_Month F_Bill_agg2 F0 Rev Base Measures D_Brand D_Day D4 Product F_Bill D_Prod_SKU Selecting an Aggregate or Base Star at Query Time How does the query planner decide which star to use as its source for a given LSQL request? • First, it eliminates sources that don’t have enough detail to answer the question – Query grain < star grain – Query grain = star grain – Query grain > star grain star not qualified star qualified star qualified • Second, it estimates the fastest/prioritized source How BI Server Knows the Grain of a Star • At query time, the BI Server checks the available mappings to find which ones are qualified by grain How Does It Know Which Star is Fastest? Number of elements • Be sure to correctly populate the Dimension Level’s “Number of elements at this level” OBIEE 10g How Does It Know Which Star is Fastest? LTS Priority Group Order • LTS Priority Group Order • Lower value = higher priority • LTS Priority Group becomes main decider in which LTS to use OBIEE 11g New Feature Aggregate Persistence Aggregate Persistence Feature • • • • Managing aggregates manually has high TCO Oracle BI EE automates creation of aggregates Leverages existing metadata Orders of magnitude performance boost; low TCO “Aggregate Persistence” Automates Create & Load Admin Tool: Aggregate build automation wizard Semantic Layer Metadata • Define agg navigation metadata BI Server • Run scripts • Create Tables • Aggregation Summary Detail Aggregate Persistence – Manual Option Admin Tool: Aggregate build automation wizard Semantic Layer Metadata • Define agg navigation metadata BI Server DBA • Create scripts • Create Table DDL • Aggregation Summary Detail • Run scripts • Create Tables • Aggregation Aggregate Persistence with Essbase & Oracle OLAP Admin Tool: Aggregate build automation wizard Semantic Layer Metadata • Define agg navigation metadata BI Server •Define cube schema Summary Detail "This information is not a commitment to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described remains at the sole discretion of Oracle" Essbase or AW Administration Steps 1 Design aggregate facts and grains 3 Wizard to create metadata 2 Target database for create/load Aggregate Persistence Wizard Aggregate Persistence Wizard • Automates the creation of physical aggregate tables and their corresponding objects in the repository Fact aggregate Use wizard to build script to generate aggregate tables. Script generates physical tables, repository objects, and mappings. Dimension aggregates Aggregate Persistence Wizard Steps 1. Open Aggregate Persistence Wizard. 2. Specify the file name and location. 3. Select business model and measures. 4. Select dimensions and levels 5. Select connection pool, container, and name. 6. Review aggregate definition. 7. View complete aggregate script. 8. Verify script is created. 9. Create and run a batch file. 10. Verify aggregates in the Physical layer. 11. Verify aggregates in the BMM layer. 12. Verify aggregates in the database. 13. Verify results in Answers. 1. Open Aggregate Persistence Wizard • Select Tools > Utilities > Aggregate Persistence Wizard and click the Execute button. 2. Specify File Name and Location • Specify a file and location where the output script should be saved. 3. Select Business Model and Measures Select business model. Select fact table. Select measures. 4. Select Dimensions and Levels • Select corresponding aggregate dimensions and levels. 5. Select Connection Pool, Container, and Name Select the database object. Select the schema. Select the connection pool. Name the aggregate table. 6. Review Aggregate Definition View the aggregate definition. 7. View Complete Aggregate Script Confirmation of script creation and location Script 8. Verify that the Script Is Created • Navigate to the directory where the file was saved and verify that the script was created as expected. 9. Create and Run a Batch File • Create and run a batch file with the following format to drive the aggregate creation process: nqcmd Oracle BI Server command utility -d Oracle BI Server data source name -u Repository username -p Repository password -s Path to the create aggregate SQL script 10. Verify Aggregates in the Physical Layer • Verify that the aggregates are created in the Physical layer of the repository as expected. Fact aggregate Dimension aggregates 11. Verify Aggregates in the BMM Layer • Verify that the aggregates are created in the Business Model and Mapping layer of the repository as expected. Dimension aggregate Fact aggregate 12. Verify Aggregates in the Database • Verify that the aggregates are created in the database. 13. Verify Results in Answers – Activate the aggregate tables: – Run a query in Answers: – Check the log and verify that the aggregate tables are accessed as expected: Considerations • Using aggregates comes with a price: – Additional time is required to build and load these tables. – Additional storage is necessary. • Build only the aggregates you need: – Look at query patterns and build aggregates to speed up common queries that require summarized results. – Ensure that enough data is combined to offset the cost of building aggregates. – Monitor and adjust to account for changing query patterns. Using Partitions and Fragments Business Challenge – Data is often partitioned into multiple physical sources for a single logical table. – Organizations need to seamlessly and efficiently access and process data from multiple sources to satisfy user requests. – Business applications must “know” where to go for what type of data and under what conditions. Business Solution: Oracle BI Server – Oracle BI repository can be configured so that Oracle BI Server handles the navigation to the appropriate source. – Oracle BI Server seamlessly and efficiently accesses and processes data from multiple sources to satisfy user requests. Partition – Is a database element that contains part of the data for a fact or a dimension – Combines with other data fragments as necessary – May be: • Fact-based • Value-based • Level-based • Complex Partitioning by Fact – Data is partitioned by fact when different fact data is stored in different tables. – Example: Actual sales versus quota targets Actual sales Sales Rep 1100 Product Sale 1000 Quota targets Sales Rep 1100 Product Quota 2000 Partitioning by Value – Data is partitioned by value when the data is split into separate tables according to the values of the data. – Example: Invoice data is stored separately for each region. Invoices for Central Region InvNbr Invoices for West Region InvNbr 1135293 114444 Dollars 1000 Region Central Dollars Region 200 West Partitioning by Level – Data is partitioned by level when the same facts are stored in separate tables at different levels of aggregation. – Example: Detailed sales data is summarized and stored by year and region. Sales detailed data Sales Rep Date Sales by year and region Product Sale Year Total Dollars Region 1100 19980105 10000 1998 200000 Central 1100 19981001 25000 1999 300000 Central 1100 19981010 10000 Complex Partitioning – Data is partitioned using more than one technique. – Example: Invoice sales data is partitioned by value and level. Invoices by month for Central Month Total Dollars Region 199801 10000 Central 199802 25000 Central Invoices by month for West Month Total Dollars Region 199801 300000 West 199802 350000 West Invoices by year for Central Year 1998 Total Dollars 200000 Region Central Invoices by year for West Year 1998 Total Dollars 3000000 Region West ABC Example: Fact-Based (Quota) • Allow users to query for actual sales data and quota data in a single query. Actual sales Sales Rep 1100 Product Sale 1000 Quota targets Sales Rep 1100 Product Quota 2000 ABC Example: Value-Based (Customer) • Replace the current, single source for customer data with two value-based partitions. Customers with names starting with letters A–M Customers with names starting with letters N–Z NewKey Name 1000 Clifton Lunch NewKey Name 1002 Tong’s Wok ABC Example: Value-Based (Inventory) • Build a business model for inventory data that is fragmented into multiple tables. Eight quarters of inventory data stored in eight separate tables Implementation Steps – – – – – Import physical sources. Create physical joins. Add sources to the Business Model and Mapping layer. Specify fragmentation content. New step Test the results. Specify Fragmentation Content – Use the Expression Builder to define the type of content that the fragment contains. – Set the flag to specify whether to combine this fragment with other data. Specifies that customer data from A–M is contained in this fragment Specifies combining this fragment with the other Summary Identify reasons for segmenting data and implement it – benefit is promised! Question: • The end user needs to know what data is contained in which partition before submitting a query. True or false? Real-time BI Federated OLTP/EDW Reporting Horizontal Federation for Real-Time BI • Using horizontal fragmentation in OBIEE, we can map a single logical fact table to multiple LTS’s. • Physical fact table in our EDW called SALES_FACT • Physical fact table in our source system to represent real-time data • One logical fact table in the BMM — called “Sales Fact Realtime” Horizontal Federation for Real-Time BI Fact table mapping Horizontal Federation for Real-Time BI Customer dimension mapping Horizontal Federation for Real-Time BI Use variable as the threshold between reporting against the EDW schema and the source system schema Horizontal Federation for Real-Time BI LTS mapping using threshold date Horizontal Federation for Real-Time BI OBIEE does the logical union Horizontal Federation for Real-Time BI When the BI Server has enough information to know that the entire result set will come from a single source, then the SQL will be issued against only one of the LTS’s. Oracle by Example and Relevant Blogs Web tutorial :Oracle by Example “Federating Essbase and Relational Data Sources in OBIEE” http://www.artofbi.com/index.php/2009/07/federation-in-obiee-what-are-you-talking-about/ http://odtugspconference.com/Slides/OBIEE%20Integration%20with%20Essbase.pdf http://www.rittmanmead.com/2011/05/real-time-bi-federated-oltpedw-reporting/