InfoSphere Metadata Comprehensive Deck
Last Updated 9-8-08
© 2008 IBM Corporation
Information Management Software

Agenda
• 2008 Metadata Market Survey
• InfoSphere Information Server Metadata Approach
• InfoSphere Business Glossary
• InfoSphere Metadata Workbench
• Third Party Metadata
• Cognos Integration
• Case Studies

2008 Metadata Market Survey
IBM and Gavilan Research Associates (GRA)

IBM and Gavilan 2008 Survey Highlights: Turning your Data into Information with Metadata
Top 5 areas companies said they need to improve their metadata:
• Documenting business metadata (business rules & element descriptions) (80.8% of respondents)
• Support for Data Governance and Data Stewardship (67.5% of respondents)
• Documenting a Data Warehouse, Data Mart, or Analytical Application (64.8% of respondents)
• Support for Data Lineage reporting and/or Data Mapping analysis (64.9%)
• Implementing a corporate Data Dictionary or Data Glossary (64.7%)
Of those surveyed: less than 7% said their implementation was highly successful!
IBM & Gavilan Research Associates 2008 Metadata Survey – 287 respondents

What Are Your Top Metadata Management Issues?
Respondents could choose multiple answers, which created a broad spectrum. The “top 5” metadata management issues in 2008 are:
• documenting business metadata (business rules & element descriptions) (80.8% of respondents)
• support for Data Governance and Data Stewardship (67.5% of respondents)
• documenting a Data Warehouse, Data Mart, or Analytical Application (64.8% of respondents)
• support for Data Lineage reporting and/or Data Mapping analysis (64.9%)
• implementing a corporate Data Dictionary or Data Glossary (64.7%)
Respondent Comments:
• “My top issue is getting metadata out of existing toolsets into a repository, where it can be accessed and reported on.”
• “Change Management is our biggest issue!
How could you miss this point?”
Source: IBM / Gavilan Research Associates 2008 MetaData Market Survey

InfoSphere Information Server Metadata Approach

So what exactly is Metadata?
[Figure: a canned food product with callouts: with a trademark label; stored in a can; made by; since; type of food; with a special ingredient; with many varieties]
Metadata enables you to put context and meaning to things. It is generated and consumed by every organization and software product.

Metadata Primer for Business
B • Business Metadata
  Business rules, Stewardship, Business Definitions, Auditing Terminology, Glossaries, Algorithms and Lineage using business language. Audience: Business users.
T • Technical Metadata
  Defines Source and Target systems, their Table and Fields structures and attributes, Documentation for Auditing Derivations and Dependencies. Audience: Specific Tool Users – BI, ETL, Profiling, Modeling.
O • Operational Metadata
  Information about application runs: their frequency, record counts, component by component analysis and other statistics for auditing purposes. Audience: Operations, Management and Business Users.
Literally, “data about data” that describes your company’s information from both a business and a technical perspective.

Let’s put Metadata to an Example
Suppose you have these items of “data”:
• 552-90-3456
  - “Social Security Number: a unique identification number issued to citizens, permanent residents, and temporary (working) residents by the US Government ...”
  - pic XXX-XX-XXXX
• $233,567.00
  - “Profit Amount: a currency value that is calculated by combining data from the Customer Master database and Wholesale Inventory applications . . .”
  - Calculation included on Monthly Report
MetaData is what describes, defines, & inter-relates your key information assets.

Metadata helps answer important questions such as:
• What data or information exists?
• Where is it being used?
• What is its business definition?
• What other names has it been called or is being called?
• How is it inter-related to other information?
• Who is using it?
• Why do we need it?
• When was it last updated?
What is “Profit Amount”? How is it defined? How is it calculated? Where is it stored or used? Is it reliable? Accurate? If I make a change to “Profit Amount” – what will be impacted?

Metadata Business Drivers
What is happening in the Data Integration Market that requires Metadata?
Governance and Compliance Regulations are increasing
- How do organizations comply and meet documentation requirements?
- How can organizations ensure accountability and responsibility?
Business Competition continues to grow
- How do organizations individualize their customer experience?
- How can organizations get access to information to make correct decisions?
Costs and system complexities are expanding
- How can organizations drive optimization with integration?
- How do organizations manage complex software environments?

IBM Data Governance Maturity Model - measurable results
[Figure label: Business Transformation]

A Brief History of Metadata – Where did we come from?
[Figure: timeline with year markers 1970, 1980, 1990, 2000, 2002, 2007 and 2008; the placement of labels against years is not recoverable from the extracted text. Labels, in extracted order:]
• Information Server Launched
• IBM buys Unicorn
• Community Focused Repositories
• … to $100 million mainly $50 million
• IBM buys Ascential
• maintenance and most organizations decided to live with metadata definitions in multiple technologies, and manually synchronize
• Single Enterprise Repository
• IBM led AD Cycle Repositories
• ETL, DQ, DP Metadata Consolidation Solutions: consolidation to central repository using parsers, bridges and generators, e.g. Dovetail
• First Metadata Repositories: IBM Data Dictionary and Registries
• Data Warehouse
• Federated Repositories
• 1/3 failed miserably, 1/3 succeeded marginally, 1/3 demonstrated good ROI, e.g. Ascential Metastage
• Less-ambitious, more-scoped metadata management solutions, e.g. Unicorn
• Understand how software and data services relate to business processes and workflows, and the requirement to publish, reuse and govern these services
Steve Born

Where IBM Fits in the Metadata Life Cycle
[Figure labels: IBM; Clean, controlled metadata supply; Enterprise Repositories; InfoSphere Information Server Metadata]

Why Businesses Care

Metadata Business Challenges
Metadata is naturally a very complex subject which virtually all organizations address at some point in time, some successfully and some unsuccessfully.
Key challenges:
• Obtaining agreement on what metadata means
  - What does metadata mean to a specific organization or division?
  - What metadata is important to track and manage?
  - How does each group use metadata for their particular job?
• Selecting the correct metadata strategy for particular business requirements
  - How many and what kinds of silos of metadata exist today in the organization?
  - Are there revenue $ at risk, compliance issues, or regulatory rules which must be addressed?
  - Do we have the flexibility to assess the impact of changes with our current architecture?
• Promoting adoption of a metadata strategy and associated technology
  - How does our approach address our different users’ needs?
  - Is our approach one that is easy to use and facilitates adoption rather than hinders it?
  - Do we have business and technical sponsorship?

The IBM Solution: InfoSphere Information Server
IBM InfoSphere Information Server: Delivering information you can trust
Unified Deployment
• Discover, model, define, and govern information structure and content
• Standardize, merge, and correct information
• Combine and restructure information for new uses
• Synchronize, virtualize and move information for in-line delivery
Unified Metadata Management

InfoSphere Information Server: Unified Metadata Management
Unified Metadata Management: Business | Technical | Operational
• Active, centrally managed repository with secure access via services layer
• Define relationships, control extensibility, and link 3rd party metadata
• Share and deliver relevant information across the organization
Any User
• Data integration metadata repository – core to InfoSphere Information Server
• Facilitates change management & reuse
• Simplifies integration
• Increases trust and confidence in information
• Enables data governance initiatives

InfoSphere Information Server: Unified Metadata Management (continued)
Role-based Tools with Integrated Metadata
Enhance Collaboration & Alignment
• Business Users
• Subject Matter Experts
• Architects
• Data Analysts
• Developers
• DBAs

InfoSphere Business Glossary

Business Glossary Overview
• Applications
• Users
• Business Glossary
• Business Glossary Anywhere

Applications of a Business Glossary
Simply put, a Business Glossary is created to represent the language of the business, independent of technology.
Three Primary Applications:
1. Ownership
   - Identifying stewards
   - Managing content
2. Collaboration
   - Common, approved vocabulary
   - Sharing domain expertise - Business & IT
3. Auditability
   - Evolution of language
   - Centralized management
All are key enablers of regulatory compliance and support the IBM Data Governance Maturity Model.

Business Glossary: For Every Enterprise User
Examples of most common users include:
• General Business Users
• Business Analysts
• Data Analysts
• BI and ETL Developers
• Data Stewards
• Subject Matter Experts
• Business Glossary Authors
• Business Glossary Administrators

Understanding the value of Business Metadata
• In the language of the business, independent of technology
• Documents the business meaning of data & related technology assets
• Used to
  - define a shared meaning
  - standardize names
  - establish responsibility, accountability, and traceability
  - govern access
  - share insights & experiences among users
  - represent business hierarchies
  - document business descriptions, examples, abbreviations and synonyms
• Must be managed by those that understand the meaning and importance of the information assets to the business
• Better aligns the efforts of IT with the goals of the business

InfoSphere Business Glossary
Create and manage business vocabulary and relationships
[Figure labels: Steward; Business Users; Steward Console]
Features
• Facilitate business & IT communications by creating & managing a common business vocabulary
• Web based interface shared across enterprise business teams
• Allows creation of stewards & assignment of their responsibilities for terms & assets.
• Link business terms to information assets
Benefits
• Aligns the efforts of IT with the goals of the business
• Provides business context to information technology assets
• Establishes responsibility and accountability in accordance with data governance policies
[Figure label: Business Interface]

Populate: Import IBM Industry Models
Financial Services, Insurance, Healthcare
Available now!
Industrial, Telco, Retail
Jump-start your Business Glossary with thousands of industry standard terms from IBM’s Industry Models. Refined by real customer usage, the models draw on years of experience and expertise to provide the best foundation for your glossary.

Populate: Import InfoSphere MDM for PIM
Share your product category information structures from Master Data Management with Business Glossary, enabling access by all enterprise users.

Populate: Create and Assign Stewards
• Create and Assign Stewards to Terms, Categories or Assets
  - Who is responsible for this Term?
  - Who is responsible for this Category?
  - Who is responsible for this Asset?
• View Contact Information for Steward

Access: Simple to use – business browser
“I didn’t need any special training to use the business glossary. It’s laid out exactly as I expected.”
“I can set this up so that my people immediately see the most used business hierarchies.”

InfoSphere Business Glossary Anywhere
Real-time access to business glossary from any desktop application
[Figure labels: ANY User; From Any Application...]
Features
• Click on a term and view definition in a pop-up window without loss of context or focus
• Intelligent matching returns best candidates in a single search
• Search engine for terms and categories
• Access steward contact information directly
• Security enforced via the Information Server common security layer
Benefits
• Increased trust and acceptance of information by delivering definitions in context
• Expanded adoption of enterprise glossary outside of Information Platform technologies
• Improved information availability with multiple access mechanisms for electronically stored information (ESI)
Pop the Definition!

Business Glossary Anywhere with Cognos
1. Highlight Term Active and click to look up in Business Glossary
2. Candidate Terms returned.
3. Select Term to view full details.

Manage: Business and IT understanding
Business:
• Category: Costs
• Term: Tax Expense
• Full Name: Tax to be paid on Gross Income
• “The expense due to taxes …..”
• (John Walsh is responsible for updates. 90% reliable source)
• Status: CURRENT
IT:
• Database = DB2
• Schema = NAACCT
• Table = DLYTRANS
• Column = TAXVL
• data type = Decimal (14,2)
• Derivation: SUM(TRNTXAMT)
Achieve a common vocabulary between business & technical users!
[Figure labels: Shared Information Server metadata repository; InfoSphere DataStage; InfoSphere Business Glossary]

Business Glossary Business Benefits
• Enables data governance
  - Common language supports compliance regulations such as Basel II
  - Represent and expose business relationships
• Accountability and responsibility
  - Assign stewards as single point of contact for control
• Improved productivity
  - Administrators can tailor the tool to the needs of their business users
  - Access information you need when you need it
• Increased collaboration
  - Capture and share annotations between team members
  - Greater understanding of the context of information.
• Greater trust in information
  - Immediate, in-context access to terms and definitions improves decision making

InfoSphere Metadata Workbench

Metadata Workbench Overview
• Applications
• Users
• Metadata Workbench

Applications of Metadata Workbench
Simply put, Metadata Workbench allows you to understand where information came from and where it is used.
Three Primary Applications:
1. Trust
   - Data lineage for traceability
   - Understand relationships
2. Productivity
   - Assess impact of change
   - Maximize re-use
3. Governance
   - Visibility and reporting
   - Centralized management
All are key enablers of regulatory compliance and the IBM Data Governance Maturity Model.

Metadata Workbench: For Project Leads & Architects
Examples of most common users include:
• Project Leaders
• Data Architects and Modelers
• DataStage Developers
• BI and OLAP Developers
• Business Subject Matter Experts
• Business Users and Compliance Officers

InfoSphere Metadata Workbench
Exploration of assets generated and used by Information Server
[Figure labels: IT Developers; Administrators; Project Managers & DBAs]
Features
• Cross-suite view and administration of Information Server assets
• Explore, analyze and manage metadata graphically and textually
• Comprehensive reporting on data lineage
• Assess dependencies across Information Server & 3rd party tools
• Perform searches and build customized queries
Benefits
• Assess and mitigate change management risk
• Support compliance and governance initiatives requiring data lineage (e.g. Sarbanes-Oxley, Basel II)
• Proactively manage and administer complex data integration environments

Information Server: Optimizing Application Development
[Figure: tools linked through shared metadata, with arrows labelled Link, Populate, Deploy and Share. Pairing of captions to tools is reconstructed from the extracted order:]
• Rational Data Architect: Import Industry Data Models; Exchange Data Structures
• Information Services Director: Services Oriented Architecture
• Business Glossary: Common Enterprise Vocabulary
• Information Analyzer: Search and Profile Source Data
• FastTrack: Map Sources to Target Model
• DataStage and QualityStage: Transform and Cleanse
Unified metadata for active administration, management and reporting
Metadata Workbench offers complete visibility and control of metadata

Metadata Workbench Integration Asset Categories

Metadata Workbench Feature Overview
ADMINISTER: Manage Integration Assets to enable in-depth analysis
• Assign security roles
• Link together multiple viewpoints of design assets from ETL, business, BI and modeling with operational metadata
• Edit names and descriptions of Integration Assets
EXPLORE: Explore key Integration Assets: Jobs, Reports, Databases, Models, Terms, Stewards, Systems, Specifications, Quality Rules
• Easy navigation of key Integration Assets
• Simple and advanced search
• Access runtime information to enrich reporting
• Integrated cross-view of Information Server and 3rd party assets
• Import export manager for 3rd party integration
• Graphical view of Asset Relationships
ANALYZE: Analyze dependencies and relationships between key Integration assets, Business Intelligence Reports and data models
• Trace data movement to and from databases, jobs and reports for full lineage
• Understand business meaning of columns, tables, and other assets
• Assess the impact of change across Integration assets
• Robust query builder

Explore: Homepage
The homepage offers direct access to Discover key Information Assets, Find any Information Asset or execute Queries.

Explore: Graphically Browse
Browse Database, Files, DataStage/QualityStage Jobs, Table Definitions or other objects within their Folder and Project placement.
Flexible browsing capabilities enable users to find information more easily and better manage complex environments.

Explore: Textually Search
Where and how do I find out details about assets?
• Full search and explore capabilities
• View in textual list or graphical format
• Search multiple ways including by asset class, name or by asset property

Analyze: Data Lineage
Where does a Field of Data in this Report Come From?
• Import & Browse Full BI Report Metadata
• Navigate through report attributes
• Visually navigate through data lineage across tools
• Combines operational & design viewpoint
[Figure labels: IBM Information Server; Source Tables]

Analyze: Cross-Tool Impact Analysis
What Happens if I Change this Column?
• Show complete change impact in graphical or list form
• Includes impact on reports in BI tools
• Visually navigate through impacted objects across tools
• Allows impact analysis on any object type

Analyze: Business Meaning
What does this Field Mean?
• Show relationships between business terms, data model entities, and technical and report fields
• Provides cross-tool mapping of business terms
• Allows field meaning to be understood
• Allows business term relationships to be understood

Metadata Workbench Business Benefit
• Good exploitation and coordination of metadata across tools is very compelling
• Tool Integration & metadata collaboration saves time and money, and improves Quality of results.
• An organization’s ability to Govern Data is significantly improved.
• An organization is better able to Manage Change and is more Agile.
• A Shared, Common Vocabulary saves time and effort, helps development, and makes data more accessible & more understandable.
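
The two Analyze questions above ("Where does a field of data in this report come from?" and "What happens if I change this column?") can be pictured as walks over one directed graph of data flows: lineage follows the edges backwards, impact analysis follows them forwards. The sketch below is a minimal illustration of that idea only. It is not the Metadata Workbench implementation or API, and the `LineageGraph` class and every asset name in it are hypothetical.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Hypothetical lineage model: an edge a -> b means data flows from a to b."""

    def __init__(self):
        self._downstream = defaultdict(set)  # asset -> assets it feeds
        self._upstream = defaultdict(set)    # asset -> assets it reads from

    def add_flow(self, source, target):
        """Record that data moves from `source` to `target`."""
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    @staticmethod
    def _reach(start, edges):
        """Breadth-first walk returning every asset reachable from `start`."""
        seen = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        seen.discard(start)  # a cycle must not report the asset as its own ancestor
        return seen

    def lineage(self, asset):
        """Where does this asset come from? (walk upstream)"""
        return self._reach(asset, self._upstream)

    def impact(self, asset):
        """What is affected if this asset changes? (walk downstream)"""
        return self._reach(asset, self._downstream)


# Hypothetical assets: source columns, ETL jobs, warehouse columns, a BI report field.
graph = LineageGraph()
graph.add_flow("db:SALES.ORDERS.TAX_AMT", "job:load_daily_tax")
graph.add_flow("job:load_daily_tax", "db:DW.DAILY_TX.TAX_VALUE")
graph.add_flow("db:DW.DAILY_TX.TAX_VALUE", "report:Monthly Report/Tax Expense")
graph.add_flow("db:SALES.ORDERS.NET_AMT", "job:load_daily_net")
graph.add_flow("job:load_daily_net", "db:DW.DAILY_TX.NET_VALUE")

print(sorted(graph.lineage("report:Monthly Report/Tax Expense")))
print(sorted(graph.impact("db:SALES.ORDERS.TAX_AMT")))
```

Keeping both an upstream and a downstream index means the same traversal answers either question, which mirrors the deck's point that lineage and impact analysis are two views of the same shared metadata.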
The key is making the metadata capture and share process effortless.

Third Party Metadata

Import Export Manager for Information Server
Expand visibility of metadata touch-points in support of data integration projects
[Figure labels: IT Developers; IT Administrators]
Features
• Security enforced via Information Server common security layer as well as the 3rd party application security layer
• Metadata Bridges interchange metadata with each specific application and consist of a model, a decoder, and an encoder, which require no coding.
• Import capabilities for 3rd party BI tools (Cognos, Business Objects, MicroStrategy), data modeling tools (ERwin, RDA) and databases (ODBC connections to all major RDBMS)
• Support a variety of import formats including XMI, XML, UML, CWM and CSV metadata exchange formats
Benefits
• No manual interface coding required for 3rd party metadata visibility
• Visibility of data modeling to ETL to report layer minimizes risks of overlooking critical dependencies
• Leverage common metadata exchange environment for application development consistency

Meta Integration Technologies, Inc. (MITI)
• OEM of 3rd party metadata bridges for import
  - More than a dozen major vendors OEM MITI bridges*
• IBM and MITI jointly certified and tested bridges
• Additional (MITI) bridges
  - Many bridges are available “as-is” and can be easily enabled post installation.
* http://www.metaintegration.net/Partners/Directory.html

Fully Supported Bridges
IBM and MITI both perform QA testing independently
As of 8.1 release

Bridges Provided As-Is
MITI performs QA testing independently
As of 8.1 release

Bridges Provided As-Is (continued)
MITI performs QA testing independently
As of 8.1 release

The Areas of Metadata
[Figure labels: Business Glossary & IS Users; ETL Operational Metadata (Job Run Information); BI Reports; Physical Schemas; ETL Job Design]

The Areas of Metadata Connected
[Figure labels: Business Glossary & IS Users; ETL Operational Metadata (Job Run Information); BI Reports; Physical Schemas; ETL Job Design]

Case Studies

Melbourne Health (MH)
Advances clinical research with next generation genetic information infrastructure
Challenge
• Melbourne Health’s legacy I/T infrastructure had limited ability to access and share medical data.
• Their research data, diagnostic results, and clinical documentation were stored in mixed formats across mixed environments: MS SQL Server, MS Excel and MS Access
• They managed disparate information silos without being able to gain insight and meaning from them – key to unlocking the advancement of medical research and improving treatments.
Solution
• Melbourne Health is deploying their BioGrid Australia project in 3 phases from 2003-2009 in collaboration with hospitals and research institutes.
• The project utilizes the following key software:
  - IBM DB2
  - IBM InfoSphere Information Server: InfoSphere Business Glossary, IBM InfoSphere DataStage, and IBM InfoSphere Federation Server
  - SAS Enterprise Business Intelligence Server, SAS Web Report Studio, and SAS Enterprise Guide
Benefits
• The BioGrid Australia virtual platform currently links research databases across 16 organizations, a world first and a milestone achievement for the research community.
• BioGrid provides access to a comprehensive set of research data which ultimately leads to improved research accuracy and treatment quality, by:
  - Linking data across disparate clinical and biomedical data sources within and across institutions
  - Linking patient/subject records by assigning the data Unique Subject Identifiers (USIs)
  - Providing a uniform interface and access mechanism
• BioGrid is a building block of a larger national grid initiative to share research data for the priority disease group of cancer.
APPROVED FOR EXTERNAL USE

Melbourne Health Screenshot: BioGrid Australia

Melbourne Health Client Quotes
“At this stage, our researchers are already able to carry out previously impossible studies to stratify patients by genetic types. Researchers are already seeing the initial fruits of collaboration, which will stimulate further cooperation as BioGrid is expanded. Ultimately, BioGrid equips us with capabilities to accelerate clinical research through collaboration.” – Robert Merriel, Director of Business Development, Melbourne Health
“IBM InfoSphere Information Server is a key toolset for our multi-phase, multi-year Molecular Medicine Informatics Model project.
We are leveraging the capabilities of InfoSphere Business Glossary, a module of IBM InfoSphere Information Server, to share information about the business and technical metadata stored in our research repositories so that users can have the relevant information at their fingertips.” – Robert Merriel, Director of Business Development, Melbourne Health

Major Insurance Company in the UK – Metadata Workbench
Improved Trust of BI Reporting Through Lineage and Impact Analysis
Business challenge
Business intelligence reports are used to make some of this company's important business decisions. Understanding how these reports are created to ensure validity is a must-have requirement. Today, when errors or questions about reports come up, a manual, time-consuming, and error-prone discovery process is completed by DataStage Developers to track the lineage of the BI report field in question.
Solution
As a long-time DataStage user, this Customer saw immediate value in leveraging the Metadata Workbench to better understand lineage and impact analysis across thousands of DataStage jobs. The capabilities of the Metadata Workbench could be used in place of the tedious and error-prone documentation currently used to troubleshoot errors or questions about business intelligence reports. In addition, the creators of BI reports saw value in accessing the Metadata Workbench on their own to do their own research about reports in question.
Benefits
• Improved TRUST in BI reports
• Time saved on manual troubleshooting of errors in BI reports
• Automated the documentation process
“The Metadata Workbench helps this customer prove through lineage and impact analysis that their BI reports contain Trusted Information.” – Desiree RuizAmirgholi, IBM Lab Advocate and Product Manager for the Metadata Workbench

Summary

Information Server Metadata Differentiators
• Auditing and traceability – single data integration repository
• Enhanced collaboration – seamlessly share information across user roles
• Simplified development – reduce manual steps with integrated platform development
• Streamlined information access – promote enterprise adoption through ease of use
• Strategic integration with IBM brands – roadmap for success