INTRODUCTION OF WEEK 14
ITEC 450, 2011 Fall

  Assignment Discussion
    Graded: 3-1-4 (Lab4: Query Optimization)
      Creating, reviewing, and interpreting execution plans are all important for this assignment.
      Analysis of execution plans: scans (full table, index range, index unique), order of execution, join operations (hash, nested loop)
      Compare the plans, not the statistics
      Plan complexity != query efficiency: I/O scans, sorting, and joins are expensive
      Understanding application query tuning
    Due: 12-1 (Database Storage), 12-2 (Bulk Data Movement)
    Working: 13-1 (Research paper: database metadata management)
    Working: 3-1-5 (Final Project Write-up)
  Review of previous week and module
    Metadata Management, Database Management Tools
    Oracle 10g Data Dictionary and Dynamic Performance Views
  Overview of this week – Module 5
    Data Warehouse Administration
    Course Summary
    Final Exam Discussion

MODULE 5
Metadata, Tools, and Data Warehousing
  Section 4: Data Warehouse Administration

DATA WAREHOUSE AND CHARACTERISTICS
  A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data that is designed for query and analysis rather than for transaction processing.
  Subject-oriented – data pertains to a particular subject instead of the many subjects pertinent to the company’s ongoing operations
  Integrated – consistent naming conventions, formats, and encoding structures; drawn from multiple data sources
  Time-variant – data is identified with a particular time period, so trends and changes can be studied
  Non-updatable (non-volatile) – data in a data warehouse is stable; once loaded, it should not be updated or removed

COMPARISON OF DATABASE CHARACTERISTICS
  [slide graphic not preserved]

DATA WAREHOUSE AND BUSINESS INTELLIGENCE
  A data warehouse usually contains historical data derived from transaction data and other sources.
  It enables an organization to consolidate data.
  It includes:
    An extraction, transportation, transformation, and loading (ETL) solution
    An online analytical processing (OLAP) engine
    Client analysis tools
    Reporting

ANALYTICAL VS. TRANSACTION PROCESSING
  Analytical processing – informational systems
    DSS – decision support system
    OLAP – online analytical processing
    Data mining – the discovery of new information, in the form of patterns or rules, from vast amounts of data
  Transaction processing – operational systems
    OLTP – online transaction processing

DATA WAREHOUSE DESIGN
  Star schema – a data modeling technique used to map multidimensional decision support data into a relational database. It is excellent for ad-hoc queries but poorly suited to online transaction processing. It contains four components:
    Fact table
    Dimension tables
    Attributes
    Attribute hierarchies
  Snowflake schema – a star schema in which the dimension tables have additional relationships to further dimension tables

STAR SCHEMA COMPONENTS
  [slide graphic not preserved]

STAR SCHEMA EXAMPLE
  [slide graphic not preserved]

DATA MOVEMENT – ETL PROCESS
  ETL – Extract, Transform, and Load
    Capture (extract) – obtain a snapshot of a chosen subset of the source data for loading into the data warehouse
    Scrub (data cleansing) – use pattern recognition and AI techniques to upgrade data quality
    Transform – convert data from the format of the operational system to the format of the data warehouse
    Load – place the transformed data into the warehouse and create indexes

DATA WAREHOUSE PERFORMANCE
  Perspectives of data warehouse performance
    Extract performance – how the ETL process performs
    Data management – database design and data quality
    Query performance – OLAP tuning
    Server performance – hardware support
  Automated summary tables
    Provide a proper set of aggregate information
    Commonly implemented with materialized views or tables populated by batch operations
  DBMS features to support data warehousing
    Materialized views – automatic creation of summaries
    Bitmap indexes – widely used in data warehousing, in addition to B-tree indexes
    Parallel execution – multiple processes work together simultaneously to run a single SQL statement

MODULE 5
Metadata, Tools, and Data Warehousing
  Section 5: DBA Rules of Thumb

THE RULES OF THUMB
  Personal DBA handbook
    Write down your own experiences
    Categorize them in a searchable note or repository
  Back up everything and plan for the worst all the time
    Before making any changes, ensure that you can recover from them
  Automate and share your knowledge
    Create a systematic way to troubleshoot problems
    Create, reuse, and share scripts
    Knowledge sharing will open many avenues for you
  Next levels
    Understand the business, not just the technology
    Keep up to date on technology

COURSE SUMMARY (YOUR LEARNING)
  DBA Roles and Responsibilities
  DBMS Architecture, Physical and Logical Structures
  DBMS Installation and Database Creation
  Database Connectivity and Network Components
  Database Security and Audit Capability
  Database Backup and Recovery
  Database Monitoring, DBMS System Tuning, Physical Configuration Optimization
  SQL Query Coding and Tuning, Data Loading
  Database Metadata, Data Dictionary
  Data Warehouse Characteristics and Overview

FINAL EXAM
  Midterm coverage (30%)
  Backup choices, recovery mechanisms, and high availability features
  Factors that influence performance, database performance tuning
  Optimizer overview and factors that influence the optimizer
  Oracle query optimizer processing, statistics collection, execution plans
  Oracle physical and logical database structures
  Space management, RAID technology
  The load utility, Data Pump export and import
  DBMS metadata, metadata types
  Oracle data dictionary, dynamic performance views
  Data warehouse: characteristic differences vs. an operational database, analytical (OLAP) vs. transactional (OLTP) processing
  Data warehouse database design (star schema), ETL

SCHEDULE REMINDER ONE MORE TIME
  The final exam can be taken between Thursday, Dec. 8 and Wednesday, Dec. 14 of the following week.
  The final exam must be completed on or before Wednesday of Week 15, not Sunday! Check with your proctor or test center.
  All assignments are due by Sunday, December 11, and no late assignments will be accepted after that date.
  Please review your grade book, and let me know about any missing grades right away.

THANK YOU AND GOOD LUCK
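
SUPPLEMENT: STAR SCHEMA, ETL, AND SUMMARY TABLE SKETCH
  The Data Warehouse Design, ETL Process, and Data Warehouse Performance slides can be tied together in a small runnable sketch. It is an illustration only, not course material: it uses Python's built-in sqlite3 module rather than Oracle, and every table name, column name, and sample row is hypothetical. The scrub step is plain rule-based normalization, not pattern recognition or AI. SQLite has no materialized views, so the summary is built as a batch-populated table, the other option named on the performance slide.

```python
import sqlite3

# Hypothetical operational rows: (sale_date, product_name, category, amount).
SOURCE_ROWS = [
    ("2011-11-28", " widget ", "Hardware", "19.99"),
    ("2011-11-28", "Gadget", "hardware", "5.00"),
    ("2011-12-01", "WIDGET", "Hardware", "19.99"),
    ("2011-12-01", "Manual", "Books", None),  # missing amount: dropped by scrub
]


def scrub(rows):
    """Scrub: drop rows with missing amounts and normalize text fields."""
    for sale_date, name, category, amount in rows:
        if amount is None:
            continue
        yield sale_date, name.strip().title(), category.strip().title(), float(amount)


def load_star_schema(conn, rows):
    """Transform cleansed rows into dimension keys, then load the fact table."""
    conn.executescript("""
        CREATE TABLE dim_time (
            time_id INTEGER PRIMARY KEY, sale_date TEXT UNIQUE,
            year INTEGER, month INTEGER);
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY, name TEXT UNIQUE, category TEXT);
        CREATE TABLE fact_sales (
            time_id INTEGER REFERENCES dim_time (time_id),
            product_id INTEGER REFERENCES dim_product (product_id),
            amount REAL);
    """)
    for sale_date, name, category, amount in rows:
        year, month = int(sale_date[:4]), int(sale_date[5:7])
        # Each dimension value is stored once; repeats are ignored.
        conn.execute(
            "INSERT OR IGNORE INTO dim_time (sale_date, year, month) VALUES (?, ?, ?)",
            (sale_date, year, month))
        conn.execute(
            "INSERT OR IGNORE INTO dim_product (name, category) VALUES (?, ?)",
            (name, category))
        # The fact row stores only dimension keys plus the measure.
        conn.execute("""
            INSERT INTO fact_sales (time_id, product_id, amount)
            SELECT t.time_id, p.product_id, ?
            FROM dim_time t, dim_product p
            WHERE t.sale_date = ? AND p.name = ?""", (amount, sale_date, name))
    # The load step also creates indexes on the fact table's foreign keys.
    conn.execute("CREATE INDEX idx_fact_time ON fact_sales (time_id)")
    conn.execute("CREATE INDEX idx_fact_product ON fact_sales (product_id)")


def build_summary(conn):
    """Batch-built summary table: total sales per month and category."""
    conn.execute("""
        CREATE TABLE sales_by_month_category AS
        SELECT t.year, t.month, p.category, SUM(f.amount) AS total_amount
        FROM fact_sales f
        JOIN dim_time t ON t.time_id = f.time_id
        JOIN dim_product p ON p.product_id = f.product_id
        GROUP BY t.year, t.month, p.category""")


def run_demo():
    """Run extract, scrub, transform, load, and summarize in memory."""
    conn = sqlite3.connect(":memory:")
    load_star_schema(conn, scrub(SOURCE_ROWS))
    build_summary(conn)
    return conn.execute("""
        SELECT year, month, category, ROUND(total_amount, 2)
        FROM sales_by_month_category
        ORDER BY year, month, category""").fetchall()


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

  Analytical queries read the small summary table instead of re-joining the fact and dimension tables each time. In a DBMS that supports materialized views, the database could maintain that summary automatically.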