The EIM Reference Architecture
A Journey to the Center of the Enterprise
Presented by Luminita Vollmer - MBA, CDMP, CBIP
April 30, 2014 – Austin, TX

CONTENTS
• Introduction
• Presentation Goals
• Enterprise Architecture
• Enterprise Information Management
• Reference Architecture
• Drivers for Ref Arch
• References
• Closure

INTRODUCTION
Speaker Bio:
• IT professional for over 20 years
• Data Architecture / Enterprise Information
• Undergraduate studies in MIS/Business
• MBA – College of St. Thomas in St. Paul
• CDMP, CBIP certified
• Cyberfraud Analyst and AML – in training

INTRODUCTION
Speaker Contact:
• luminita.vollmer@gmail.com
• lvollmer@moneygram.com
• LinkedIn: Luminita Vollmer

The Presentation Goals
• Process Overview
• Introduce the Enterprise Architecture and Domains
• Standards that Influence IT Architectures
• Present the EIM Reference Architecture
• Introduce the templates
• The steps to the EIM RA
• Overview of how a RA is used in the enterprise
• Links to Resources available

Enterprise Architecture – What It Is
Enterprise Architecture is the process of turning business vision and strategy into a portfolio of capabilities, applications, information assets and technologies that enable the enterprise’s business vision and strategy through a clearly defined roadmap.
• Ensures alignment of business and IT strategies by identifying and enabling common business capabilities, ensuring quality and timely information and making certain that proper technology investments are selected to meet MGI’s business objectives.
• Helps to create a strategic roadmap with key programs/projects to attain the strategic business objectives and goals, and
• Helps to create the enterprise’s IT strategies and technology direction by analyzing the industry trends, best practices, and technology direction in the marketplace

What are Principles
Principles are general rules and guidelines, intended to be enduring and seldom amended, that inform and support the way in which an organization sets about fulfilling its mission.
Depending on the organization, principles may be established at any or all of three levels:
• Enterprise: These principles provide a basis for decision-making throughout an enterprise, and inform how the organization sets about fulfilling its mission. Such enterprise-level principles are commonly found in governmental and not-for-profit organizations, but are encountered in commercial organizations also, as a means of harmonizing decision-making across a distributed organization. In particular, they are a key element in a successful architecture governance strategy (see Architecture Governance).
• Information Technology (IT): These principles provide guidance on the use and deployment of all IT resources and assets across the enterprise. They are developed in order to make the information environment as productive and cost-effective as possible.
• Architecture: These principles are a subset of IT principles that relate to architecture work. They reflect a level of consensus across the enterprise, and embody the spirit and thinking of the enterprise architecture.
Architecture principles can be further divided into:
• Principles that govern the architecture process, affecting the development, maintenance, and use of the enterprise architecture

EA Governance Framework

Another view of Principles
• Principles are statements of an Enterprise’s Values and Policies, and how they relate to the IT Architecture.
• They provide links between business, organizational resources, and technology strategies; they represent the foundation for technical positions and templates.
• A Position is “a stake in the ground”.
• A Template is a blueprint.
• Principles created in a vacuum will not serve the Enterprise.

What Affects EA Principles
Many kinds of Principles drive or affect the IT architecture
• Information Systems Security Association’s Generally Accepted Information Security Principles (GAISP)
• GARP (Generally Accepted Recordkeeping Principles) is a framework for managing records in a way that supports an organization's immediate and future regulatory, legal, risk mitigation, environmental and operational requirements.
• ITIL is the most widely adopted approach for IT Service Management in the world. It provides a practical, no-nonsense framework for identifying, planning, delivering and supporting IT services to the business.
To develop an IT Architecture, a position on such principles has to exist, and it has to be taken at the Enterprise, domain, or project level, which is where opinions vary.

Principles Hierarchy
• Enterprise overall Principles
• EA Overall Principles
• EA Domain Principles
• EA Subdomain Principles

Enterprise Architecture Domains (example)

Enterprise Information Architecture Principles - Example
• Information is a corporate asset - Information is a shared asset that must be recognized and treated as a valuable asset to the enterprise and managed efficiently, consistently and securely to ensure its reliability and availability.
• Information must be defined - Information assets will be defined within models aligned to an organizing enterprise conceptual information framework.
• Information redundancy must be carefully managed - Information redundancy creates expense and risk and must be avoided where possible, minimized where necessary and properly architected using specific implementation patterns to ensure sound design is achieved.
• Information owners are accountable for information quality and appropriate usage - Ensuring information quality and appropriate usage is the accountability of the information owner.
• Management of Information is a joint Business and IT responsibility - Information Management will be orchestrated centrally but responsibilities will come from both IT and Business.

Enterprise Information Architecture Principles – Cont.
• Information is shared across organizational and business process boundaries - Within the constraints of Privacy and Security, Information captured by one business process will be shared as needed through a common architected approach.
• Develop and promote re-usable data services - Data services must use a common reusable model when consumed by applications, so that uniformity is ensured and data quality is not put in jeopardy.
• Information must be managed in real time - Have systems that manage information in real time to respond to changing business conditions.
• Information must be secure - Have systems that protect against unauthorized use and disclosure of information.

The IA Framework
• TOGAF 9.1 Architecture Framework
• Zachman Architecture Framework
• The DAMA Framework for EIM

The TOGAF9 Framework – The ADM Method
• TOGAF 9.1 as an example
• A STEP-BY-STEP approach
• A METHOD for developing EA
• THE CORE of TOGAF
• HELPS ESTABLISH an EA Framework

The Modules for TOGAF9

The TOGAF9 TRM

TOGAF9 - The Technical RA
A Model and Taxonomy of Generic platform services

TOGAF9 - Information Infrastructure RA

Information Management Technical Framework

Enterprise Information Management – What It Is
• Enterprise Information Management Architecture describes the principles, standards, patterns, and software components for managing information assets throughout the enterprise. It encompasses applications and services that integrate and manage data, content, and information.
• As an enterprise discipline, Enterprise Information Management is concerned with the management of structured and unstructured information and the delivery of information across the enterprise to business processes and their supporting applications. The key sub-domains of enterprise information management include: Master Data Management, Data Integration, Business Intelligence, Content Management, and Knowledge Management.
• Information Management (IM) is the sustaining organizational capability chartered to manage, govern, build and support BI across the organization.

Information Management – What It Is
The core capabilities of Information Management:
– Acquire and Understand Information – this capability is concerned with analyzing business processes and underlying data management practices to understand information needs and gaps. It includes conceptual and logical modeling of our information, driving semantic reconciliation, profiling data, uncovering data quality issues and gaps, identifying the system of record, and defining the sourcing mechanism.
– Integrate Information – a core capability of information management is integrating data and information.
This capability includes application integration capabilities, data integration capabilities, and master data management techniques.
– Store Information – the storage, retention, and management of data and information at rest. Concerned with structured, unstructured, and metadata storage and retention; data management practices for transactional and analytical repositories; relational and dimensional physical structures and supporting infrastructure.
– Publish and Present Information – is concerned with delivering information to both internal and external information consumers. This includes metadata (business, operational, and technical), reporting, analytics, and content.
– Govern and Secure Information – data governance & stewardship, data quality management, information development and delivery (SDLC), managing identity, controlling access, auditing access, and encrypting sensitive information.

The DAMA Framework for EIM

Reference Architecture Loosely

Reference Architecture – What It Is
• A Reference Architecture (RA) is a proven template solution for an architecture in a specific domain (e.g. Master Data Management, Business Intelligence Reference Architecture, Service Oriented Reference Architecture). It captures the essence of the existing architecture and the vision of future needs and evolution, and so provides guidance for developing the new architectures that are needed.
• From the EA Charter document: a Reference Architecture (RA) consists of information accessible to all project team members that provides a consistent set of architectural best practices.
• The Reference Architecture has several Viewpoints that together provide complete blueprints - Capabilities, Processes, Information, Components, Infrastructure
• It is enterprise focused rather than project focused
• Directed by business needs, goals, objectives – use of Capabilities and Standards

Reference Architecture – cont.
• Originates from Enterprise Architecture Principles (contained in the EA Charter)
• Describes the architecture layers, principles, major components, and patterns used
• Introduces a common vocabulary to all constituents
• Based on best practices within the industry and specific domains, as a strong, versatile baseline reference point for future projects
• Enables conformity of specific project solutions, leading towards a desired target state, governing and guiding architectural decisions – used as a blueprint.

Benefits of the Reference Architecture
• Communication – identifies and defines architecture components, patterns, and layers, increasing understanding across stakeholders
• Common point of arrival for projects – defines the vision state of the architecture and provides guidance across projects. Each project's solution architecture moves the current state architecture one step closer to the target state architecture
• Reuse – Solutions that leverage the reference architecture for guidance will also be able to leverage common architectural components and patterns of preceding solutions. This provides incremental value at a lower cost due to reuse.
• Strategic Planning and Roadmaps – provides a prescription for future work and facilitates the understanding of sequencing, dependencies, and gaps that exist within the architecture
• Risk Mitigation – Based on industry best practices, standards, patterns, and proven solutions. We do not need to re-invent; we only innovate.

The Viewpoints for RA
• Relationship between the viewpoints in the Reference Architecture
– Capabilities – What is provided by the Information Management architecture
– Processes – How capabilities and services are provided within the architecture
– Services – Interface to Capabilities and Processes; this viewpoint describes the interface channels and the publicly consumable products offered by the IM architecture (people, process, technology)
– Application and Technology – software components and technologies used to realize capabilities, processes, and services.
– Infrastructure – represents the runtime, operational, or physical realization of the capability, process, and technology.

The EIM Reference Architecture Capabilities

DW BI Reference Architecture - Capabilities

The Reference Architecture - Standards
Component                                   Standard Technologies
ETL                                         Information Server DataStage
Data Warehouse                              DB2 Enterprise Warehouse Edition
Data Mart                                   DB2 Enterprise Warehouse Edition
Metadata Management                         Information Server Metadata Repository, Workbench, Business Glossary
Data Modeling                               Sybase PowerDesigner
Data Profiling                              Ataccama DQ Analyzer
Integrated Reporting                        IBM Cognos, Actuate, Tableau
Dashboards                                  IBM Cognos
Advanced Analytics                          ???
Ad hoc Analysis                             PL/SQL
Software Configuration Management           Subversion
ODS, Master Data Management Repositories    Oracle RDBMS

IM RA – Component View

IM RA Component Technology Mapping

IM RA Environment View

IM RA Data Management Layers

IM RA Gap View

Related Reference Architectures

Related Reference Architectures – Metadata

Related Reference Architectures – Mediation

Data Integration
Data Integration Patterns
Context
Business capabilities and processes are often served by many heterogeneous applications. Optimization and automation of these capabilities and processes requires integration between applications.
The applications vary in complexity, function, level of access, custom or package, and technologies.
Problem
How do you integrate information systems that were not designed to work together?
Forces
• The Information Systems environment consists of multiple systems that were never designed to work together.
• The application you are integrating with does not expose a business logic layer to support the integration needs
• You need to minimize the disturbance to the existing application as much as possible, or changing the existing system is not an option (packaged application)
• Read-only access is required
• High volume/bandwidth (very large datasets) is required
• Commercial off-the-shelf tools (ETL, Data Federation, and Replication) exist and can be more cost effective and timely than custom development
Solution
Integrate applications at the data layer by allowing the data in one application (the provider) to be accessed by other applications (the consumers).
Implementation Choices:
• Shared Database or Data Federation Patterns
• Population or Move a Copy of Data Patterns
• Data Replication Pattern
• File Transfer Pattern
Connecting Technology
Through a SQL interface. ETL external data integration using ODBC/JDBC.

Data Integration
Data Integration Patterns (cont.)
Consequences
• Since this pattern bypasses all the business logic in the provider application, it is not recommended for add, update, and delete operations. See Functional Integration.
• Tight coupling to the underlying provider application's physical schema. Changes in the provider application will ripple down to the consumer applications. This can be mitigated by using Views into the data
• Direct access to the providing application's data may require the replication of security business rules in the consuming application. Security rules are typically implemented in the business logic layer.
• Most commercial off-the-shelf products do not publish their schemas and therefore reserve the right to change the underlying schema at any time.
• Data integration does not provide the business context of the data; the ability to understand the operation performed on the data (updated, added, etc.) is lost
Related Patterns
Data Integration is the high-level pattern, whereas the following are specific implementations of data integration. Integration efforts should select one of the following based on their requirements:
• Shared Database Pattern, which can be implemented with Data Federation
• Move Copy of Data Pattern (ETL), also known as the Population Pattern
• File Transfer Pattern
• Data Replication Pattern
The preferred data integration patterns are the File Transfer Pattern and the Move Copy of Data Pattern / Population Pattern, using the ETL technologies in which the enterprise has already invested.

Current State Capabilities

What is driving/guiding the Information Management Reference Architecture?
• Overarching reporting and analytic (BI) drivers influencing the reference architecture:
– Align to external product demand
– Improved quality of information
– Improved timeliness of information
– Improved availability of information
– Improved security of information
– Improved delivery of information
• Business Intelligence Goals for the Year (IM Program Scope)
– Build Solid Data Foundation (create a single source of data for consistency, create comprehensive Customer data, improve source data)
– Upgrade Self-Service Capability (improve data visualization, reduce query time to increase adoption rate)
– Decisions driven by Analytics at all levels

What is driving/guiding the Information Management Reference Architecture? – Cont.
• People, Process, Technology, and Data Gaps
• Principles
– Insert your list of principles
• Proven Solutions, Industry Best Practices, and Patterns
– Insert your solutions, selected best practices, patterns used

What is driving/guiding the Information Management Reference
Architecture? – Fundamental Data Needs
• ECIM Subject Areas:
– Member
– Person
– Partner
– Vendor
– Transaction
– Reference data
– Specific enterprise data domains for each enterprise (Insurance, Financial, Manufacturing, HealthCare, etc.)

What is driving/guiding the Information Management Reference Architecture? – Current Gaps Summary
• Gaps in existing data warehouse(s):
– DW is the “source of truth”
    • Inconsistent sourcing mechanisms; lacks a uniform sourcing strategy, patterns, and consistency
    • Several methods are used to bring data for the same data domain into the DW
    • Inconsistent semantics; lack of implementation models that are aligned with conceptual and logical models. Definitions are largely driven by IT with little business involvement. Similar concepts have different meanings in each DW.
    • Inconsistent structure; lacks conformed facts, dimensions, structure, and data types
    • Incomplete metadata (logical model and definitions, data lineage, transformation rules, operational metadata, data quality metrics)
– Timeliness of information: the current latency of the existing DWs is not sufficient for the needs of the business
– Missing historical changes: both warehouses perform destructive updates, resulting in the inability to compare as-was to as-is (type 1 DWs)
– Deviation from industry best practices and patterns. Neither warehouse follows a strict star schema design, which impairs usability and performance.
– Limited scalability, because the underlying technology is not leveraged to partition data

What is driving/guiding the Information Management Reference Architecture? – IM Program Scope
• Current Scope for the specified time period (2014)
– List of projects planned for execution
• Planned (Beyond 2014 – Roadmap still fluid)
– MDM Consumer and Partner
– Compliance
– ESB
– BR system
– ???
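
The "missing historical changes" gap in the Current Gaps Summary (destructive, type 1 updates that make as-was versus as-is comparison impossible) can be illustrated with a small Python sketch. It contrasts an overwrite with version-preserving ("type 2" style) updates, a common dimensional-modeling technique that the deck itself does not prescribe. All names here (`DimRow`, `apply_type1`, `apply_type2`, `as_was`) and the record layout are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DimRow:
    """One version of a dimension record (hypothetical layout)."""
    business_key: str
    attributes: dict
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_type1(rows: List[DimRow], key: str, attrs: dict, as_of: date) -> List[DimRow]:
    """Type 1: overwrite the record, so earlier values are lost."""
    kept = [r for r in rows if r.business_key != key]
    return kept + [DimRow(key, attrs, as_of)]


def apply_type2(rows: List[DimRow], key: str, attrs: dict, as_of: date) -> List[DimRow]:
    """Type 2: close the current version and append a new one."""
    out = []
    for r in rows:
        if r.business_key == key and r.valid_to is None:
            if r.attributes == attrs:
                return list(rows)  # nothing changed, keep history untouched
            r = replace(r, valid_to=as_of)  # end-date the old version
        out.append(r)
    out.append(DimRow(key, attrs, as_of))
    return out


def as_was(rows: List[DimRow], key: str, on: date) -> Optional[DimRow]:
    """Return the version of `key` that was valid on date `on`, if any."""
    for r in rows:
        if (r.business_key == key and r.valid_from <= on
                and (r.valid_to is None or on < r.valid_to)):
            return r
    return None


if __name__ == "__main__":
    history = apply_type2([], "M1", {"city": "Austin"}, date(2014, 1, 1))
    history = apply_type2(history, "M1", {"city": "Dallas"}, date(2014, 4, 30))
    print(as_was(history, "M1", date(2014, 2, 1)))  # the earlier version
    print(as_was(history, "M1", date(2014, 5, 1)))  # the current version
```

With the type 1 function the same two updates leave a single row, so a query for the earlier date finds nothing. That is the as-was versus as-is limitation the gap list describes.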

Projects Roadmap, Business, and Industry Trends
• Complexity of data security requirements will continue to increase, requiring more efficient management and integration with data-at-rest and data-in-transit security and underlying technologies. Requirement complexity will take the form of new regulatory requirements, client-requested partitioning, and growth in both the data content of the EDW and the underlying user base.
• The EDW will need to handle a diverse set of workloads and therefore technology advances in both the DBMS and workload monitoring solutions will need to be leveraged. The EDW will need to support: Client Reporting, Internal Reporting, Operational Reporting, Analytic processing, Embedded analytics, OLAP, and Ad hoc Use workloads. Appropriate layering and physical and logical partitioning will need to be leveraged. Mixed workloads will follow data subject builds as well as the sunsetting of existing warehouses.
• Infrastructure resource utilization will continue to grow: processor, memory, network IO, and disk utilization. Every new subject area addressed in the EDW build (IM program) will drive additional capacity needs across the EDW infrastructure, specifically within the ETL and DBMS components. The layering of the EDW will drive increased disk utilization to support log, scratch, and temp space for ETL processes as well as persisted staging areas, the data warehouse repository, and data marts.
• Adding data subjects adds metadata. Space requirements for metadata repositories will increase along with the EDW. Furthermore, these repositories will continue to be populated with business, operational, and technical metadata. Standing up metadata management services will increase load on the Information Server platform and increase the number of user transactions.

Projects Roadmap & Industry Trends
• Standing up capabilities within our BI platform such as Dashboards, additional Partner Portal features, Enterprise Search, OLAP, Crystal Reports, and MS Office Integration will drive capacity needs at this layer. A more diverse user population with various workloads will stretch the usage.
• Internal business users and our clients are demanding lower latency of information in the EDW. We are trending toward daily updates to the Data Warehouse Repository or Integration Layer.
• Our customized service model is driving the need for new capabilities: Ad Hoc access to the EDW, security, usage of analytic tools, embedded analytic tools, analytic services, and data exchange services
• Volume of data will continue to increase: we will be capturing information at a more atomic level than before, latency will be increased, architecture layering will require additional storage, our business model is based on driving volume, and additional audit information will need to be captured to support new health care regulatory requirements.

Projects Roadmap & Industry Trends
• Users are asking for increased self-service capabilities and data sandboxes to prove product and analytic value (user labs).
• Layered architecture will require additional administration and maintenance. Environment segmentation and standardized release management will also compound the issue. Retention policies will become more complex. Test environment data refresh processes will be needed.
• As embedded analytics become more pervasive at Prime, more and more of our front line applications will leverage analytic and BI services to enable operational processes. This will result in increased HA/DR requirements for EDW repositories supporting embedded analytics. This will drive capacity requirements for Enterprise Web and Application Server Architecture.
• As the S2E roadmap drives data management and master data management maturity, new transactional and operational data stores will be introduced in the architecture. These master data management and operational data stores will have higher HA/DR requirements than the EDW. They will support missing data and master data management functions within the enterprise and support application integration through services.

Projects Roadmap & Industry Trends
• Master Data Management and new Operational Systems will continue to drive new functional, process, and data integration requirements. Standardized platforms will be needed to orchestrate, coordinate, govern, develop, deliver, and test application integration and service-oriented components.

References
• Cliff Notes for GARP – http://www.cliffsnotes.com/more-subjects/accounting/accounting-principles-i/principles-ofaccounting/generally-accepted-accounting-principles
• Open Group – http://pubs.opengroup.org/architecture/togaf9-doc/arch/index.html
• Proven Solutions, Industry Best Practices, and Patterns
• Oracle – http://www.oracle.com/technetwork/topics/entarch/articles/info-mgmt-big-data-ref-arch-1902853.pdf

CLOSURE