DAMA-NY 2008
Implementing Enterprise Information Programs with MDM-CDI & SOA
Larry Dubov, Sr. Director, Sales Consulting & Architecture
New York, NY, May 15, 2008
LDubov@InitiateSystems.com
Copyright © 2008 Initiate Systems, Inc.

Agenda
• Definitions
• Why MDM and EDM Now and Key Challenges
• MDM and Data Hub Capabilities
• Data Stewardship Framework and Information Quality
• SOA and Data Services: Strengths and Weaknesses
• Information Management Methodology
• Lessons Learned and Accelerators
• What are the next disruptive things in EDM and MDM?

What is Master Data Management (MDM)?
• Master Data Management (MDM) is a framework of processes and technologies aimed at creating and maintaining an authoritative, reliable, sustainable, accurate, and secure data environment that represents a “single version of the truth”: an accepted system of record used both intra- and inter-enterprise across a diverse set of application systems, lines of business, and user communities.
• Master data are the data that are foundational to business processes and are usually widely distributed. When well managed, they contribute directly to the success of an organization; when not well managed, they pose the most risk.

Customer Data Integration (CDI) is the “Entry Point” to MDM
• CDI focus is on individual & organizational entities (Party): Customers, Prospects, Patients, People, Citizens, Employees, Vendors, Suppliers, Trading Partners
• MDM expands the problem to include new entities (Product, PIM): Products, Equipment, Financial Assets, Vessels, Containers, Weapons, Locations, Drugs, Vehicles

Key Challenges
• Very complex, multidimensional, and multi-disciplinary; can be risky
• Difficult to sell data initiatives to the business and executive management
• No single vendor provides a comprehensive solution
• These factors mandate development of, and reliance on, sound models, open integration standards, and methodologies in building holistic solutions from multiple best-of-breed components
Diagram labels: New Customer &
Relationship Centric Business Processes; Data Governance & Standards; Metadata Management; Customer Identification, Correlation & Grouping; External Data Providers; Visibility, Security, Confidentiality, Compliance; Service Oriented Architecture; Information Quality; Data Acquisition, Distribution & Synchronization (Batch & Real Time); CDI; Consuming Applications; Business & Operational Reporting; Exception Capture & Processing

Enterprise Customer Data Managed by Lines of Business
• Enterprises are organized by lines of business and manage customer information in a product-centric model with overlapping customer domains
Diagram: Business Line 1 through Business Line 5, each with its own Business Line Products and Business Line Customers

So, What’s the MDM-CDI Focus?
The Need to Transition from Product/Account to Party…it’s a Big Deal
Diagram, Current State: Account 123: Product 1, Account 456: Product 2, and Account 789: Product 3 each carry their own statement preference (E-Statement or Paper Statement), KYC, and Acct & Client Docs Approval; Joe and Mary appear separately under the accounts (Mary 1, Mary 2).
Diagram, Future State: a derived Household Client Grouping holds Joe and Mary (spouses), with statement preference (E-Statement, Paper Statement), KYC, and Client Doc Approval kept with the party; roles (Owner, Owner (Joint), Beneficiary) link the parties to the accounts, which keep account attributes only: Account 123: Product 1, Acct. Attr. Only; Account 456: Product 2, Acct. Attr. Only; Account 789: Product 3, Acct. Attr.
Only; Account Grouping

Drivers
Business Area: Business Development, Sales & Marketing
• Cross sell/up sell to existing customers
• Effectiveness of marketing campaigns
• Recurring revenue from existing customers
• Retain “good” customers by reducing attrition rates
• Recognize “bad” customers
Business Area: Risk, Privacy, Compliance & Control
• Risk Management
• Accurate Books & Records
• Compliance with AML & KYC Regulations
• Compliance with corporate standards and policies
• Regulatory fines and penalties
Business Area: Customer Service
• Account setup time
• Customer service time
• Customer intelligence and level of service
• Consolidated statements
Business Area: Operational Efficiency
• Account setup costs
• Customer acquisition costs
• Administrative overhead of redundant data entry
• Operation costs: duplication, redundancies, transaction errors, data processing errors & exceptions
• Failed tactical initiatives
• Reduces costs of planned initiatives due to CDI

MDM Is Adding Value When You…
• … Purchase Software
• … Pick Up Your Prescription
• … Apply for a Loan
• … Check In & Earn Points
• … Identify Risks
• … File Insurance Claims
• … Register a Patient

MDM and Data Hub Capabilities

Single Version of The Truth: Merge and Persist or Composite View
Source records:
• Sales: J. Jones (Name); 35 West 15th Street (Address); Toledo, OH (Address 2)
• Customer Support: James Jones (Name); 35 W. 15th Street (Address); Toledo, OH (Address 2)
• E.R.: Jim Jones (Name); M (Gender); 35 West 15th St. (Address); Toledo, OH (Address 2)
Single version of the truth:
• James (First); Jones (Last); 35 West 15th Street (Address); Toledo (City); OH (State); 30 (Age)

Single Version of Truth: Commercial Customers
• D&B: Name ABC Incorporated; Addr 9146 E VIA DEL SOL NETOWN, CA 45883; Cont. Joe Smith; Phone 480-473-5620
• Back Office: Name ABC Inc.; Cont. Will Jones; Phone 480-473-5620
• Diagram label: Trusted System of Record
• Further source record (Name, Cont., Phone): ABCC Incorp.,
Joseph Smithe, 304-473-5602
• AIU: Name AB&C; Addr 9146 VIA DEL SOL NETOWN, CA 45883; Phn 480-473-5620
• INFORMATION (merged record): Name ABC Inc.; Addr 9146 VIA DEL SOL NETOWN, CA 45883; Cont. Joseph Smith; Cont. Will Jones; Phone 480-473-5620
• Provides accurate, real-time access to complete customer or entity data across disparate sources, systems and networks

Why Do We Need MDM? (DW is Only as Good as Its Dimensions)
• Can we really ‘slice and dice?’
• Traditional deterministic ETL may not be sufficient…
• This is where probabilistic MDM enabled by Data Hubs comes in
Diagram labels: Product; Customer (each shown twice)

Star Schema Hubs
• If your MDM solution is BI driven, align your MDM solution with complex DW dimensions
Diagram labels: Facts; Time; Customer; Branch; Account; Product; Status; Customer Hub; Branch Hub; Account Hub; Product Hub

The Initiate MDS Solution in an Enterprise Architecture
Diagram labels: Initiate Master Data Service™; Transaction Support; Security & Audit Trail; Master Data Views; Performance; Scalability; Relationships; Hierarchies; Data Stewardship; Batch Support; Messaging; Matching Accuracy; Orchestrated Services; Loads & Extracts; Sales; Web APIs; CRM; Orders; Self-Service; Events & Alerts; Web Services; Call Center; Profiles; Marketing

Implementation Styles
Diagram labels: Batch; Near Real-Time; Real-Time; Consolidation Style; Link; Co-Exist; Combine; Registry/Slave; Hybrid; Transaction/Master; Ownership Style

More on Matching and Linking
Step 1: Optimizes data for statistical comparisons
• Normalizes & compacts data, creates derived data layer, source data remains intact
• Phonetic equivalences, tokenization, nicknames, etc.
Step 2: Finds all the potential matches
• Casts a wide net – all matches on current or historical attributes, prevents misses
• Partial matches, reversals, anonymous values, etc.
Step 3: Scores accurately via probabilistic statistics
• Compares attributes one-by-one and produces a weighted score (likelihood ratio) for each pair of records
• Frequency weights specific to your business
• Edit distance, proximity of match
• Allows custom deterministic rules, e.g. false positive filters
Step 4: Custom threshold settings
• Single or dual threshold models
• Link, don’t link, don’t know – “learns” from manual input
• Manage cost/quality trade-offs
• Manage the linkages, workflow review
Diagram: two score distributions (Should not be linked, Should be linked) along an axis from Lowest possible score to Highest possible score. Scores below the lower threshold fall in Don’t link, scores between the lower and upper thresholds go to Manual review, and scores above the upper threshold fall in Link.

Hierarchy Management
• The term hierarchy is used here only for a simple hierarchy: one and only one root, and only one parent for each node within one hierarchy
• Typically one hierarchy is selected as the Master and used as a foundation (e.g. D&B or custom)
• There is a notion of source precedence / tree of truth
• High-performance match to build the hierarchy: 40MM records in 15 minutes
• One original system record (member) or single version of truth record (entity) can belong to multiple hierarchies (e.g. corporate for D&B and geography with territories, regions, etc.)
• A data steward can edit the hierarchy manually (e.g. if a merger is known)
• When the merger update later arrives from the Master source, the data steward can reconcile the source merge with the node previously created manually
• Hierarchy query and navigation use methods that navigate to a node, to the node’s immediate children and the whole sub-tree below, or from the node up and across (to be checked)
• The product can export a hierarchy (e.g.
to build a DW dimension)

Hierarchy: Management & Services
Understand & Visualize Customer Relationships
• Establish business & consumer hierarchies
• Resolve logical master from multiple internal or third party source hierarchies
• Rule based hierarchy & relationship creation and management
• Maintain individual & organizational hierarchies through web application to support active data stewardship
Diagram labels: Hierarchy Source Data; Customer Data

Initial Hierarchy Harmonization: Original State of Customer Records
• Customer’s organization records are fragmented
• Utility of hierarchies housed in SAP & other apps is inconsistent
• Disclaimer: (Example Data Only)
Legend: Existing (SAP) Relationships; Shipping Location; Pricing. Diagram nodes: 10, 20, 30, 40, 50, 60, 70, 80, 90.
Records shown (ID, Source, Bill To, Name, Address, Phone):
• ID 10, Source SAP: Harley-Davidson Inc; 3700 W Juneau Ave, Milwaukee, WI 53208-2865; 4143424680
• ID 20, Source SAP: Harley-Davidson Motor Co; 3700 W Juneau Ave, Milwaukee, WI 53208-2865; 4145353500
• ID 40, Source SAP, Bill To 20: Andy’s Harley-Davidson; Business Highway 81 N, Grand Forks, ND 58203; 7017756098
• ID 60, Source SAP, Bill To 20: Shumate Harley Davidson; 6815 E TRENT AVE, Spokane Valley, WA 99212-1252; 5099286811
• ID 30, Source .COM: Buell Motorcycle Co; 2815 BUELL DR, East Troy, WI 53120-1366; 2626422020
• ID 80, Source SAP: Buell Motorcycles; 8272 Gateway Blvd E, El Paso, TX 79907-1511; 9155925804

Initial Hierarchy Harmonization: DNB Reference Source
Diagram nodes: 1, 2, 3, 4, 5, 6.
Records shown (ID, Source, Parent, Name, Address, Phone; several callouts overlap in the original slide):
• ID 1, Source DNB: Harley-Davidson Inc; 3700 W Juneau Ave, Milwaukee, WI 532082865
• ID 2, Source DNB, Parent 1: Harley-Davidson Financial Services, Inc; 150 S Wacker Dr, Chicago, IL 606064103
• ID 6, Source DNB, Parent 2: Harley-Davidson Credit Corp; 4150 W Technology Way, Carson City, NV 897062009; 7758863393
• ID 5, Source DNB, Parent 1: Harley-Davidson Motor Company Inc; 3700 Juneau Ave, Milwaukee, WI 532082865; 4143424680
• ID 3, Source DNB, Parent 1: Buell Motorcycle Company; 2815 Buell Dr, East Troy, WI 531201366; 2626422020
• ID 4, Source DNB, Parent 1: Harley-Davidson Europe LTD; 6000 Garsington Rd, Oxford, Oxfordshire OX4 2DQ;
1865719000
• ID 5, Source DNB, Parent 1: Harley-Davidson Motor Company Inc; 3700 W Juneau Ave, Milwaukee, WI 532082865; 4143424680
• Remaining phone labels from the overlapped callouts: Harley-Davidson Inc 4143424680 (shown twice); 3123689501

Initial Hierarchy Harmonization: Target State
Diagram: DNB nodes 1 to 6 and source nodes 10 to 90 combined into one hierarchy. The overlapped callouts show DNB records 1, 3, and 5 alongside source records 10 and 20 (SAP) and 30 (.COM): Harley-Davidson Inc. (3700 W. Juneau Ave., Milwaukee, WI 53208-2865, 4143424680); Harley-Davidson Motor Company Inc. / Harley-Davidson Motor Co. (3700 W. Juneau Ave., Milwaukee, WI, 4143424680 / 4145353500); Buell Motorcycle Company / Buell Motorcycle Co. (2815 Buell Dr., East Troy, WI 53120-1366, 2626422020).

Resolve & Rationalize Hierarchies for Immediate Impact
• Harley Davidson: Global Account Info: Customer Locations / Bus Units 9; Current Yearly Purchasing $900K; Added Potential $300K
• ID – Pricing Code: 20 – KFDRR; 70 – [missing]
• Accurate Relationships Will Drive Revenue: DNB members that are not yet customers will be properly included and targeted in marketing campaigns and in sales activity
• Misaligned Pricing or Territories: unassigned or incorrect track codes and rep assignments. Improved Customer Satisfaction, Improved Sales Coverage, Improved Sales Operations
• Incomplete Customer Profiles: matched & organized records required for accurate analytics. Deliver complete customer relationships to Data Warehouse and Marketing Analytics apps
Diagram nodes: 1 to 6 and 10 to 90.

New Account Creation Scenarios
Records shown (ID, Source, Parent, Name, Address, Phone):
• ID A, Source New: Andya Harley Davidson; HWY 81 N, Grand Forks, ND 58206
• ID B, Source New: Buell Motor Cycles; 8272 Gateway Boulevard, El Paso, TX 77907
• ID C, Source New: Schumate Harley Davidson; 7001 E Trent Ave, Spokane, WA 99212
• Phone shown: 592-5804

Improve Account Creation Process & Data Quality
• Duplicate prevention
• Pricing alignment
• Territory assignment
Diagram labels: Pricing Code: KFFDE; Sales Territory: 840123; new records A, B, C placed among nodes 1 to 6 and 10 to 90; Duplicate: DO NOT ADD!
Diagram labels: Assign Correct Track Code to New Account; Assign Correct Territory to New Account
Overlapped callouts (ID, Source, Parent, Name, Address, Track Code, Territory, Phone) compare existing and new records:
• Existing: Buell Motorcycles; 8272 Gateway Blvd. E., El Paso, TX 79907-1511
• Existing: Shumate Harley Davidson; 6815 E. TRENT AVE., Spokane Valley, WA 99212-1252
• Existing: Andy’s Harley-Davidson; Business Highway 81 N., Grand Forks, ND 58203; 7017756098
• New, ID B: Buell Motor Cycles; 8272 Gateway Boulevard, El Paso, TX 77907
• New, ID C: Schumate Harley Davidson; 7001 E. Trent Ave., Spokane, WA 99212
• New, ID A: Andya Harley Davidson; HWY 81 N., Grand Forks, ND 58206
• Other values shown: IDs 6, 90, 40; Sources SAP, .COM, SAP; Parents 2, 30, 20; Track Code KFFDE; Territory 840123; N/A

Relationship Management
• A relationship is a much looser construct than a hierarchy
• A relationship can be used to associate people with groups, products with categories, etc.
• Relationships support one-to-many and many-to-many associations
• Relationships can be symmetric or asymmetric
• For each relationship type, its cardinality (one-to-many or many-to-many) is defined along with its symmetry (symmetric or asymmetric)
• When a relationship type is defined, the types of records that can be related are also defined
Diagram labels: One-to-Many; Many-to-Many; Asymmetric; Symmetric

Information Quality and Data Stewardship Framework
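Backup sketch for the Relationship Management slide: one way the three properties of a relationship type (cardinality, symmetry, and the record types allowed on each side) could be represented and enforced. This is a minimal illustration only; the class names, the field names, and the in-memory store are assumptions made for this sketch and are not the Initiate product's data model or API.

```python
from dataclasses import dataclass, field
from enum import Enum


class Cardinality(Enum):
    ONE_TO_MANY = "one-to-many"
    MANY_TO_MANY = "many-to-many"


@dataclass(frozen=True)
class RelationshipType:
    """Hypothetical definition of a relationship type."""
    name: str
    cardinality: Cardinality
    symmetric: bool
    source_record_type: str  # record type allowed on the "from" side
    target_record_type: str  # record type allowed on the "to" side


@dataclass
class RelationshipStore:
    """Hypothetical in-memory store: type name -> set of (source_id, target_id)."""
    links: dict = field(default_factory=dict)

    def relate(self, rel_type, source, target):
        """source and target are (record_type, record_id) tuples."""
        allowed = (rel_type.source_record_type, rel_type.target_record_type)
        if (source[0], target[0]) != allowed:
            raise ValueError("record types not allowed for this relationship type")
        pairs = self.links.setdefault(rel_type.name, set())
        if rel_type.cardinality is Cardinality.ONE_TO_MANY:
            # One-to-many: a target may be attached to only one source.
            if any(t == target[1] and s != source[1] for s, t in pairs):
                raise ValueError("target is already related to another source")
        pairs.add((source[1], target[1]))
        if rel_type.symmetric:
            # Symmetric: the link holds in both directions.
            pairs.add((target[1], source[1]))

    def related_to(self, rel_type, record_id):
        pairs = self.links.get(rel_type.name, set())
        return sorted(t for s, t in pairs if s == record_id)


# Example types: a symmetric person-to-person link and an asymmetric
# category-to-product link (both invented for illustration).
SPOUSE = RelationshipType("spouse_of", Cardinality.MANY_TO_MANY, True, "person", "person")
CATEGORY = RelationshipType("categorizes", Cardinality.ONE_TO_MANY, False, "category", "product")

store = RelationshipStore()
store.relate(SPOUSE, ("person", "joe"), ("person", "mary"))
store.relate(CATEGORY, ("category", "bikes"), ("product", "p1"))
```

With these definitions the spouse link can be navigated from either person, while the category link can be navigated only from the category to the product, and a second category for the same product is rejected.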
Approaches to Information Quality
• “Upstream”: at the point of entry of customer information
  - Better validation
  - Change in business process is likely required
  - Change in applications, workflows and data flows
• “Downstream”: focus on ETL
  - Includes data stewardship
  - Less invasive: does not require changes in business processes
  - Less effective: always working against the flow of dirty data
• A combination of the “Upstream” and “Downstream” approaches is required to accomplish the best results

Moving Resolution of IQ Issues Closer to Point of Entry: Account Opening & Client On-boarding
Diagram (Account-centric vs. Customer-centric) labels: Data Entry (both sides); Customer Information File; Householding System; Data Hub; Data Enrichment Vendor; Product 1 to Product 3; Account 1 to Account 5, each with Account Attributes and Customer Attributes (both sides)

Enterprise Data Stewardship Framework
• Data Governance: policies, standards, processes, roles, responsibilities, metrics, & controls
• Data Stewards: perform ongoing data quality task resolution
• Information Technology: improves data entry validation; configures data quality task generation & reports
• Technology supports customizable data quality task resolution queues:
  - Validity of Identifiers
  - Updates overlaying identity
  - Link
  - Merge
  - Conflicts between match & hierarchy / relationship associations
  - Summary reports on data quality metrics
Diagram labels: Systems; Data Quality Improvement Loop

Initiate™ Inspector: High Level Capabilities

Initiate™ Inspector: Primary Tasks

Initiate™ Inspector: “Potential Linkage”

Initiate™ Inspector:
“Potential Linkage” – Select and Action (Page 1 of 2)

Service Oriented Architecture and Data Services: Strengths and Weaknesses

Service Oriented Architecture
• Software design and implementation architecture characterized by the following:
  - Logical View: abstraction of loosely coupled, reusable business needs and functions
  - Message Orientation: exchange between provider and requester
  - Description Orientation: machine-processable metadata to support interoperability
  - Granularity: combination of coarse-grained and fine-grained
  - Network Orientation: typically used over a network
  - Platform Neutral
• Software architecture in which software components are exposed as services on the network, which can be reused as necessary for different applications
• When implemented with web services, offers a standard foundation for functionality reuse and data access within a federated enterprise and for services provided externally
• What is in common between SOA initiatives and EDM or MDM? Data Services

Data Services: Format Agnostic & Source Agnostic Data Management
1. The end user interfaces with the system by requesting a data-source-agnostic business/data service. Users shown: Executive, Manager, Analyst. Example services: Find Client, List Reports, Report Premium Revenue, Create New Account, Create New Report.
2. Data Services Metadata translates the coarse-grain business service call into an orchestrated set of data-location-aware service calls. The metadata processes the parameters of the request as well as the security access eligibility and data visibility parameters of the requestor.
3. Get best source data and process it (Calculate, Transform Data, CRUD Record).
Note: Metadata insulates user applications and business services from data sources.
Thus the data sources can be changed or replaced seamlessly with no changes in user interfaces and user experience
4. Execute and return to requestor.
Diagram labels: Capture Exception; Data Sources

Data Hub: SOA Architecture Viewpoint (Illustrative)
• Service Consumers – Business Applications
• External (Exposed) Services Layer: Individual Identification & Recognition; Privacy Preference Capture & Notification; Identification of Associations, Roles & Relationships; Customer Grouping Management; Compliance Notification & Reporting; Customer Information Maintenance; Customer Insight Reporting
• Internal Services Layer (Key Services): Third-Party Data Interface Service; Data Synchronization & Queue Management; Data Archival & Versioning; Coordination & Orchestration; Visibility, Entitlements, Privacy & Security; Rules / Workflow Administration; Metadata Management; Transaction Logging & Auditing; Content Management & Caching; Event / Notification Management; Error Management
• Data Provider Service Interfaces (one per store): Legacy Data Store; CDI Customer Hub; Other Data Store

Reference Architecture
• Business Processes Layer: Contact Mgmt.; Campaign Mgmt.; Process Mgmt.; Orchestration; Relationship Mgmt.; Document Mgmt.
• Hub Data Management Layer: Client/Suspect Identification; Profile Mgmt.; Grouping Mgmt.; Party Mgmt.; Enrichment & Sustaining
• Hub Data Rules Layer: Identity Matching; Aggregation & Split Rules; Synchronization Rules; Visibility Rules; Transformation Rules; Rules Capture & Mgmt.
• Hub Data Quality Layer: GUID Mgmt.; Address Standardization; Data Quality Mgmt.; Transformation & Lineage; Reporting
• Hub Systems Services Layer: Security Visibility; Services Orchestration; Transaction & State Mgmt.;
Persistence; Synchronization; Legacy Connectivity
• Data Sources (bottom layer of the diagram)

Pros & Cons of SOA: When to Use or Not to Use SOA
Pros
• Reduced data redundancy
• Business, data governance and data stewards can define SLAs via services
• Data services provide a level of abstraction: no need to work at the data and data model levels
• Standardized interaction within the enterprise and with external and vendor-provided services
• Increased development productivity and agility to support evolving requirements
Cons
• Performance problems if multiple pieces of information are to be joined
• If implemented with web services, solutions do not support transactional integrity for synchronous processes; compensating transactions are required
• SOA initiatives don’t meet expectations if not supported by a data strategy
• Use of data services requires strong governance and a new culture for the data governors, data stewards, and testers
When implemented properly, SOA and data services provide significant benefits for MDM and EDM

Testing Data Hub (Not Just Data But Also Services)
• Testing SOAP messages
• Testing WSDL files and using them for test plan generation
• Web service consumer and producer emulation
• Testing the publish, find, and bind capabilities of a SOA Data Hub
• Testing the asynchronous capabilities of Web services
• Testing dynamic run-time capabilities of Web services
• Web services orchestration testing
• Web service versioning testing

Information Management Methodology
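Backup sketch for the Testing Data Hub slide: a minimal example of what "testing SOAP messages" with "consumer and producer emulation" might look like, using only the Python standard library. The `FindClient` operation, the `urn:example:datahub` namespace, the `ClientId` element, and the fake producer are all hypothetical stand-ins invented for this sketch; they do not describe any real Data Hub service contract. Only the SOAP 1.1 envelope namespace is standard.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
HUB_NS = "urn:example:datahub"                         # hypothetical service namespace


def build_find_client_request(last_name):
    """Consumer side: build a SOAP request for a hypothetical FindClient operation."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{HUB_NS}}}FindClient")
    ET.SubElement(operation, f"{{{HUB_NS}}}LastName").text = last_name
    return ET.tostring(envelope, encoding="unicode")


def fake_producer(request_xml):
    """Producer emulation: answers FindClient with one ID, or a Fault if no name is given."""
    request = ET.fromstring(request_xml)
    last_name = request.findtext(f".//{{{HUB_NS}}}LastName")
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    if not last_name:
        fault = ET.SubElement(body, f"{{{SOAP_NS}}}Fault")
        ET.SubElement(fault, "faultcode").text = "Client"
        ET.SubElement(fault, "faultstring").text = "LastName is required"
    else:
        response = ET.SubElement(body, f"{{{HUB_NS}}}FindClientResponse")
        ET.SubElement(response, f"{{{HUB_NS}}}ClientId").text = "C-001"
    return ET.tostring(envelope, encoding="unicode")


def check_response(xml_text):
    """Test helper: return the client IDs, or raise if the message is malformed or a Fault."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{SOAP_NS}}}Envelope":
        raise AssertionError("root element is not a SOAP Envelope")
    body = root.find(f"{{{SOAP_NS}}}Body")
    if body is None:
        raise AssertionError("SOAP Body is missing")
    fault = body.find(f"{{{SOAP_NS}}}Fault")
    if fault is not None:
        raise AssertionError("service returned a Fault: " + fault.findtext("faultstring", default=""))
    return [element.text for element in body.iter(f"{{{HUB_NS}}}ClientId")]


client_ids = check_response(fake_producer(build_find_client_request("Jones")))
```

In a real test suite the fake producer would be replaced by the service under test (or kept as a stub when testing consumers), and the structural checks would typically be driven by the service's WSDL rather than hard-coded element names.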
Importance of Information Management Methodology
• Implementation of enterprise information management projects (DW, ODS, MDM, CDI, etc.) requires a well-structured methodology
• Methodologies used for data-intensive projects are different from traditional application development methodologies

Methodology – Need and Overview
• MIKE2.0 (Method for an Integrated Knowledge Environment) is an open source methodology for Enterprise Information Management (www.openmethodology.org)
• Developed by BearingPoint
• Available to the open source community since December 2006
• Transition to the open source “Creative Commons” license completed in May 2007
• Over 2000 online pages and growing
• Contains Phases, Activities and Tasks
• Open source MIKE2.0 allows the global community to use this methodology right now for their Information Development initiatives
• Organizations and individuals can sign up to become contributing members of MIKE2.0

MIKE2.0 – Usage Model Details

MIKE2.0 SAFE Architecture
• SAFE (Strategic Architecture for the Federated Enterprise) provides the technology solution framework for MIKE2.0.
Lessons Learned and Accelerators

The Three Dimensional Socialization Roadmap
• Dimension: Level of Involvement
  - Awareness: becoming cognizant and developing a sense of appreciation for the change
  - Understanding: internalizing the concepts and ideas and grasping the implications of the change
  - Buy-in: agreement with the concepts and ideas & expressed support
  - Ownership: demonstrated commitment to the change and accountability
• Dimension: Lifecycle Phases/Releases: Planning; Development; Testing; Training
• Dimension: Stakeholders: Senior Management; Front Office; Back Office; Technology & Infrastructure; Legacy Systems; Security
• Helps Program Managers Build Communications Plan

Typical Implementation Work Streams: Organizing for Success & Breaking the Problem Down
It is much easier to discuss, define and plan MDM-CDI when the problem is broken down into more manageable areas and specialty domains:
• Master Entity Identification
• Entity groups & relationships
• Data governance, standards, quality, & compliance
• Data architecture
• Metadata management & administrative applications
• Initial data load
• Inbound data processing (batch & real-time)
• Outbound data processing (batch & real-time)
• Changes to legacy systems & applications
• Visibility & security
• Exception processing
• Infrastructure
• Data Hub applications
• Reporting requirements of a stratified user community
• Testing
• Release management
• Deployment
• Training
Helps Program Managers Build Project Plan

Complexity vs.
Manageability
Diagram labels: Manageability; Complexity; Plan a Release Here; Critical Point
• Helps Program Managers Define Phases & Releases

Fastest ROI
Diagram: Business Value & ROI (Low to High) plotted against Time in months (6, 12, 18, 24), from CDI Start through the Initial Phase and Initial ROI to the Potential End State. Labels along the way: Resolve… Relevant Relationships in the Data; Synchronize… Data, Systems, Processes & People; Master… Your Data; Start w/ Resolve; Start w/ Master.

Implementation Continuum
Diagram: a continuum from Pure Registry to Mastered, spanning Batch/Analytical and Real-time/Operational use.
• Cross Reference Management: create linkages amongst all records; prepare data for new systems; establish and maintain a trusted source for analysis
• Customer Data Access: provide people with on-demand search
• Customer Data Synchronization: provide bidirectional update between sources of customer information through messaging, APIs and other integration methods
• Customer Transaction Management: transactional applications built on top of the customer definition across sources; transference of record ownership (Hub owned and maintained)
• Customer Process Management: manage business processes associated with customer data/transaction management

Enterprise Information Program
Develop Repeatable Initiative On-boarding Processes & Templates
• Program Initiation: Business Case & Value Proposition; Business Requirements; Target State Solution; Detailed Roadmap; Data Governance, Standards, Data Quality; Architectural Principles
• Diagram labels: Program Initiated; Global Geography; Program Planning & Definition; On-boarding Initiative 1; On-boarding Initiative 2; On-boarding Initiative 3
• Work streams: DATA MODELING; DW DEVELOPMENT; CDI-MDM; DATA PROFILING & DATA QUALITY; DATA SERVICES FRAMEWORK; DATA GOVERNANCE & STEWARDSHIP

Develop Repeatable MDM Systems On-boarding Processes and Templates
• In the first year and first implementation phase, the number of legacy systems integrated in scope of MDM-CDI is limited (typically 2-3)
• How to accelerate on-boarding of new systems in the subsequent phases, given that it is not unusual that 20-50 systems can be in scope of
MDM-CDI integration?
• A well-defined set of system on-boarding standards and procedures determines common rules that each legacy system should comply with to be integrated into the evolving MDM-CDI solution
  - Enables a repeatable on-boarding process and sustainable, accelerated solution growth in terms of the number of systems and LOBs
  - Preserves integrity and consistency of the MDM-CDI solution
  - Improved data governance
  - Enables highly sustainable pace of

Two Schools of Thought on Hub Data Model
Hub with Out-of-the-box Data Model
• Pros
  - Seems attractive to have the “right” data model out-of-the-box
  - The product has some pre-built coarse-grain business transactions
• Cons
  - How flexible is it to support ongoing changes?
  - Overhead of having multiple entities and attributes that are never used by your specific solution
Data Model Agnostic Hub Product
• Pros
  - Flexible to accommodate any data model and its changes
  - Can generate fine-grain services on top of any data model
• Cons
  - Development work required to build coarse-grain services to support composite transactions
  - Possible performance and maintenance impact due to additional metadata lookups

Hub Implementation: Buy vs. Build vs. Data Enrichment Partner
• The traditional “buy or build” question is typically resolved in favor of “buy”
• An additional consideration is the use of an External Data Enrichment vendor
• Can we outsource the primary function of the CDI hub and do customer match externally?
• Use of an External Data Enrichment partner has its own pros and cons
Pros
• Higher match accuracy based on the Knowledge Base (US NCOA and other libraries)
• Ability to recognize new customers and prospects
• Additional data from the Knowledge Base: “data enrichment”
Cons
• Need to share customer data with an external vendor
• Capabilities and Knowledge Base quality depend on the country: domestic better than international
• Additional cost

Focus on Data Mapping
• Data mapping is an activity that “maps” the legacy system attributes to the new customer-centric model and vice versa
• This activity is performed by business analysts
• The produced data maps are used by ETL and EAI developers
• The mapping process is time-consuming, can cause numerous errors and can be on the critical path of development
• A data mapping vendor product can help accelerate delivery:
  - Drag-and-drop interface
  - Open source mapping metadata
  - Ability to integrate the mapping metadata with ETL, EAI and EII tools and share the metadata rules
  - Ability to reverse the transformation rules when possible

Creation and Protection of Test Data
Sensitive customer data must be anonymized (obfuscated, cloaked) to disguise it from unauthorized personnel in test and development environments. Some anonymization techniques are as follows:
• Masking Data
• Substitution
• Shuffling records
• Number Variance
• Gibberish Generation
• Encryption / Decryption
Key challenges in using data anonymization techniques include:
• Ability to preserve logic potentially embedded in data to ensure that application logic continues to function
• Need to provide consistent transformation outcome for the same data

Some Key Reasons for Project Failure to Avoid at Project Kick-off
• Lack of executive support and budgetary commitment
• Lack of cooperation and/or coordination between business and technology
• Lack of consuming applications – “if we build, they will come…”
• Lack of end-user adoption
• Underestimation of legacy impact
• Insufficient socialization throughout the enterprise to include all stakeholders at the right level
• Underestimation of the need for layered architecture provided by SOA
• Gaps in data governance, stewardship, and information quality strategy
• Miscalculated staffing needs

What are the Next Disruptive Things in MDM and EDM?
• Match and link evolution: from entities to relationships
• Integration of MDM with Business Rules Engines and Work Flows
• Data Stewardship Framework
• Metadata Integration
• Externalization of data visibility and security

Q&A
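Backup sketch for the Creation and Protection of Test Data slide: one possible way to meet the "consistent transformation outcome for the same data" challenge while keeping the shape of a value intact. The function below performs a keyed, deterministic substitution: digits stay digits, letters stay letters of the same case, and separators are kept, so format checks in applications keep working. The function name and the choice of HMAC-SHA256 are assumptions made for this illustration, not a technique named in the slides. It is not encryption, it is not reversible, and it does not preserve logic such as check digits.

```python
import hashlib
import hmac


def anonymize(value, secret_key):
    """Deterministically substitute letters and digits, keeping length,
    letter case, and separators. Same value + same key -> same output."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).digest()
    out = []
    for i, ch in enumerate(value):
        b = digest[i % len(digest)]  # digest bytes are reused for values longer than 32 chars
        if ch.isdigit():
            out.append(str(b % 10))
        elif ch.isalpha():
            base = "A" if ch.isupper() else "a"
            out.append(chr(ord(base) + b % 26))
        else:
            out.append(ch)  # keep separators such as "-" or " "
    return "".join(out)


key = b"test-environment-key"  # hypothetical key; keep real keys out of source control
masked_phone = anonymize("480-473-5620", key)
masked_name = anonymize("Jim Jones", key)
```

Because the same input and key always yield the same output, a customer anonymized in two source extracts still matches across them, which matters when the test data feeds a matching and linking engine. The trade-off is that near-duplicates such as "Jim Jones" and "James Jones" no longer resemble each other after substitution, so fuzzy-match test cases need a different technique.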