NOT PROTECTIVELY MARKED

Enterprise Architecture 2010
Reference Architecture
Enterprise Metadata Management
June 2010

Version: 1.0
Editor: Mike Williams
Status: Issued
Date: 16 June 2010

HA Reference Architecture
_________________________________________________________________________

Document Control

Document Title:   Reference Architecture – Enterprise Metadata Management
Author:           Mike Williams
Owner:            Ivan Wells
Distribution:     General – SHARE and PartnerNET
Document Status:  Issued

Revision History

Version  Date                Description      Author
0.1      30th December 2009  First draft.     Mike Williams
1.0      16th June 2010      Baseline issue.  Mike Williams

Forecast Changes

Version  Date  Description

Reviewer List

Name           Role
Ominder Bharj  Reviewer
Ian Cornwell   Mott MacDonald (ITS Metadata Registry)
Dave Waite     Atos Origin

Approvals

Name        Title                      Date  Version
Ivan Wells  Strategy and Architecture

Document References

Ref.  Document Title                                    Document Links
[1]   Technology Policy L1 – Corporate Data Catalogue   Deprecated.
[2]   EA White Paper – Meta-models                      TBA

_________________________________________________________________________
08/03/2016 v1.0 Page 2 of 22

CONTENTS

1 INTRODUCTION
  1.1 PREAMBLE
  1.2 RELATIONSHIP TO REFERENCE MODELS
2 PLATFORM INDEPENDENT MODEL (PIM)
  2.1 DEFINITIONS
  2.2 REGISTRIES AND REPOSITORIES
  2.3 META-MODELS
  2.4 CATALOGING: KEY TO ARTIFACT DISCOVERY
3 PLATFORM SPECIFIC MODEL (PSM)
  3.1 INTRODUCTION
  3.2 ENTERPRISE METADATA MANAGEMENT OPEN INTEGRATION FRAMEWORK
  3.3 SOA REGISTRY REPOSITORY
  3.4 THE BUSINESS INTELLIGENCE DOMAIN
  3.5 THE EA MODELLING DOMAIN
  3.6 SUMMARY AND OVERVIEW

1 INTRODUCTION

1.1 Preamble

Reference architectures describe one or more Architecture Building Blocks for architectures in a particular domain. They also provide a common vocabulary with which to discuss implementations, often with the aim of stressing commonality. In Model-Driven Architecture (MDA) terms, they equate to Platform Independent Models (PIMs).

Building blocks represent (potentially re-usable) components of business, ICT, or architectural capability that can be combined with other building blocks to deliver architectures and solutions. Building blocks can be defined at various levels of detail, depending on which stage of architecture development has been reached.
For instance, at an early stage, a building block can consist simply of a name or an outline description in the architecture models, acting as a placeholder for subsequent specifications. Later on, a building block may be decomposed into multiple supporting building blocks that may then be accompanied by full specifications.

Reference Implementations are examples of software specifications. These are intended as a guide for Service Providers to develop concrete Solution Building Blocks (SBBs). In Model-Driven Architecture (MDA) terms, they equate to Platform Specific Models (PSMs). These PSMs are described as either Commercial-Off-The-Shelf (COTS) or Open Source Software (OSS).

In this respect, the HA Technology Policies are aligned with Cross-Government Enterprise Architecture (xGEA) Technical Policies. These specify that OSS components should be considered as viable building blocks wherever they can be shown to meet the business requirements and offer Value for Money (VfM). Therefore, actual product selections will generally be determined through procurements and their evaluations of the Most Economically Advantageous Tenders (MEAT).

Where such selections have already been made, the Reference Implementations will be superseded by Level 2 (Physical) Technology Policies which reinforce the use of those components. Some of these components will stem from a build out, through re-use, of the HA’s more recently acquired infrastructure assets and investments, such as in Business Intelligence. In all other cases, the PSMs will be based on OSS projects which implement the relevant Open Standards.

1.2 Relationship to Reference Models

This reference architecture refers to the Enterprise Metadata Management capability. This is positioned within EA as depicted in Figures 1-3 for the Enterprise Architecture Reference Model (EARM), Technical Reference Model (TRM) and Data Management Framework respectively.
Figure 1 - EARM Context

Figure 2 - TRM Context

Figure 3 - Data Management Framework context

This reference architecture supersedes Ref. [1] Technology Policy L1 – Corporate Data Catalogue.

2 PLATFORM INDEPENDENT MODEL (PIM)

2.1 Definitions

Metadata is "data about data". There are two distinct classes of metadata: Structural (or control) Metadata and Guide Metadata. Structural Metadata is used to describe the structure of data held in computer systems, such as tables, columns and indexes in a database. Guide Metadata is used to help humans find specific items and is usually expressed as keywords in a natural language search.

Types of metadata include:

• Information Technology and Software Engineering metadata
  o General IT metadata
  o IT metadata management products
  o Relational database metadata
  o Data warehouse metadata
  o Business Intelligence metadata
  o File system metadata
  o Program metadata
  o Existing software metadata
  o Document metadata
• Digital library metadata
• Image metadata
• Geospatial metadata
• Meta-metadata

Key structural domains for the HA are depicted in Figure 4 below.
Figure 4 - Metadata Domains

Metadata Management comprises the Planning (P), Development (D), Operations (O) and Control (C) activities necessary to enable easy access to high quality, integrated metadata. Specifically, it includes the following functions:

• Understand Metadata Requirements (P)
• Define the Metadata Architecture (P)
• Develop & Maintain Metadata Standards (P)
• Implement a Managed Metadata Environment (D)
• Create & Maintain Metadata (O)
• Integrate Metadata (C)
• Manage Metadata Repositories (C)
• Distribute & Deliver Metadata (C)
• Support Metadata Reporting and Analysis (O)

The first three of these start with this document.

The current issue of the Reference Architecture is focused mainly on the centre circle in Figure 4. However, some of the key domains are also included because they have been identified as priorities for current and planned projects:

1. Data Architecture and Modelling Tools as used by the EA and BI teams.
2. Federated Third-Party Data Exchange as used by the ITS Metadata Registry project and the Travel Information Highway (TIH) community.
3. Service-Oriented Architecture (SOA) as planned for the EA To-Be architecture.
4. Geospatial as planned for the forthcoming Geographical Information Systems (GIS) Strategy.
5. Data Warehouse and Business Intelligence as used by the BI Programme.

Other domains may be added as and when required, or covered in other related artifacts.

2.2 Registries and Repositories

2.2.1 Introduction

A metadata registry is a central location in an organisation where metadata definitions are stored and maintained in a controlled manner. Metadata registries are used whenever data must be used consistently within an organisation or group of organisations.
Examples of these situations include:

• Organisations that transmit data using structures such as XML, Web Services or EDI, for example, the Travel Information Highway (TIH) organisation employs the HA’s ITS Metadata Registry (www.itsregistry.org.uk).
• Organisations that need consistent definitions of data across time, between databases, between organisations or between processes, for example when the HA BI Programme builds a data warehouse.
• Organisations, like the HA, that are attempting to break down "silos" of information captured within applications or proprietary file formats.

A metadata registry typically has the following characteristics:

• It is a protected area where only approved individuals may make changes.
• It stores data elements that include both semantics and representations.
• The semantic areas of a metadata registry contain the meaning of a data element with precise definitions.
• The representational areas of a metadata registry define how the data is represented in a specific format such as within a database or an XML Schema.

There are a number of international standards in this area; these are covered in the following sub-sections.

2.2.2 ISO 11179 Metadata Registry (MDR)

The ISO 11179 standard provides some guidelines for the description of individual data elements. The ISO 11179 standard is broken into six parts:

• ISO 11179-1 Framework for the specification and standardisation of data elements: Provides an overall view of how data elements should be annotated and classified. It basically serves as a roadmap to how the other parts of the ISO specification should be used.
• ISO 11179-2 Classification for data elements: Defines an ontological classification mechanism for the clear and unambiguous classification of individual data elements (values).
• ISO 11179-3 Basic attributes of data elements: Provides a brief list of metadata that should be collected for data elements. These include information about the value itself (such as whether it is required or optional, how many times it may appear, and what the allowable values for the element are) as well as information about the meaning of the element (representational information from ISO 11179-2, keywords, and so on).
• ISO 11179-4 Rules and guidelines for the formulation of data definitions: Provides some best practice naming and definition approaches for individual data elements. This includes simple rules such as “element names should be singular”, as well as more sophisticated guidelines such as “element definitions should not contain circular references.”
• ISO 11179-5 Naming and identification principles for data elements: Provides some standardised mechanisms for the unique identification of data elements. It includes some guidelines for creating element names from the classification of those elements, as well as a mechanism for associating a universally unique identifier with each element.
• ISO 11179-6 Registration of data elements: Describes a methodology for the registration of individual data elements. It describes how data elements should be versioned over time, and provides for a way to uniquely identify a data element through a combination of its registry authority identifier, its own unique identifier within that registry, and its version number.

2.2.3 ISO 15000 - ebXML registry and repository (regrep)

The OASIS standards body is responsible for the Electronic Business using eXtensible Markup Language, or ebXML, standard (www.ebxml.org).
This standard actually consists of several more granular standards, including a standard for a Collaboration Protocol Profile and Agreement (CPP/A) describing an agreement between trading partners, a Message Service Specification (MSS), a Business Process Specification Schema (BPSS), and a Registry Information Model (RIM). This registry model was promoted to version 3.0 in May 2005, and provides a common mechanism for the sharing of technical information (XML Schemas), business process descriptors (BPSS documents), and collaboration protocol profiles and agreements (CPP/A documents) to fully describe the business relationships between two or more participants in an information sharing effort.

Because the ebXML registry effort is a natural outgrowth of ebXML’s other goals (including the BPSS documents and CPP/A documents), an ebXML registry is both a technical registry (so-called “green pages”) and a business registry (so-called “white” and “yellow” pages).

2.2.4 Universal Description, Discovery and Integration (UDDI)

UDDI is a platform-independent, XML-based registry for businesses worldwide to list themselves on the Internet. UDDI is an open industry initiative, sponsored by OASIS, enabling businesses to publish service listings, discover each other, and define how their services or software applications interact over the Internet. A UDDI business registration consists of three components:

• White Pages: address, contact, and known identifiers;
• Yellow Pages: industrial categorisations based on standard taxonomies;
• Green Pages: technical information about services exposed by the business.

UDDI was originally proposed as a core Web Service standard.
It is designed to be interrogated by SOAP messages and to provide access to Web Services Description Language (WSDL) documents describing the protocol bindings and message formats required to interact with the web services listed in its directory.

2.2.5 The Need for an Integrated SOA Registry-Repository

The HA’s future state (To-Be) architecture is predicated on a Service Oriented Architecture (SOA). The need for a point of control and governance within the SOA deployment demands that service information artifacts be stored and managed in a consistent manner that allows enforcement of organisational policies. This is precisely the role served by a registry-repository service within an SOA deployment.

Earlier SOA deployments recognised that the rising complexity and diverse nature of service information artifacts demanded more formal means of management, sharing, and tracking than simple Web sites. This led to the use of a UDDI registry to manage, share, and track service information artifacts. Though this is an improvement over the less formal web site approach, it has a major limitation: a registry can only store links or pointers to service information artifacts. The actual artifacts must reside outside the registry, using informal and inconsistent means such as web sites. This makes the actual artifacts ungovernable by the registry.

What is missing in this scenario is a repository that stores the actual artifacts. A registry-repository provides an integrated solution which is able to store metadata such as links or pointers to artifacts, as well as the actual artifacts themselves.

In practice, SOA deployments tend to span organisational and governance boundaries. This can be true even among different parts of the same enterprise. Organisations large and small sometimes prefer to have local autonomy over their SOA deployments, but equally they also need to seamlessly integrate their services with those in SOA deployments of other organisations.
For example, the HA will need to interoperate with Other Government Departments (OGDs) through the G-Cloud as well as with partner organisations such as MACs and TechMACs, Service Providers and DBFOs (Design, Build, Finance and Operate). In order to meet the requirement of local autonomy while providing seamless integration and interoperability globally, SOA deployments must federate with other SOA deployments using open standards.

The trend towards open standards-based, integrated registry-repositories is largely because they allow organisations to share and link information with other organisations in a secure manner. Federated information management allows multiple registry-repositories to federate together and appear as a single, virtual registry-repository, while allowing individual organisations to retain local control over their own registry-repositories.

This also leads to developments in the Semantic Web and linked data, culminating recently in the Government’s Open Data initiative, which employs standards such as the Web Ontology Language (OWL) and Resource Description Framework (RDF). These are catered for in the OASIS ebXML Registry and Repository 4.0 Profile for OWL-Lite (draft) standard.

2.2.6 Choosing a Registry-Repository

In evaluating which registry-repository to deploy as part of the SOA infrastructure, the choices fall into the following categories:

1. A proprietary registry-repository.
2. An ISO 11179 registry without a repository.
3. An ISO 11179 registry with a proprietary repository.
4. A UDDI registry without a repository.
5. A UDDI registry with a proprietary repository.
6. An ebXML registry-repository.
7. A combination of UDDI registry and ebXML registry-repository.
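The seven options above can be screened against the two requirements raised in section 2.2.5: federation through open standards, and an integrated repository that holds the artifacts themselves. The following is a minimal sketch of that screening, not a formal evaluation method; the `Option` class and the true/false flags are this sketch's own reading of the option descriptions and are assumptions rather than part of any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    number: int
    description: str
    standards_based: bool  # assumed: no proprietary component is involved
    has_repository: bool   # assumed: stores artifacts, not only pointers to them


# Flags reflect this sketch's reading of the seven categories listed above.
OPTIONS = [
    Option(1, "Proprietary registry-repository", False, True),
    Option(2, "ISO 11179 registry without a repository", True, False),
    Option(3, "ISO 11179 registry with a proprietary repository", False, True),
    Option(4, "UDDI registry without a repository", True, False),
    Option(5, "UDDI registry with a proprietary repository", False, True),
    Option(6, "ebXML registry-repository", True, True),
    Option(7, "UDDI registry combined with ebXML registry-repository", True, True),
]


def shortlist(options):
    """Return the numbers of options that are standards-based and include a repository."""
    return [o.number for o in options if o.standards_based and o.has_repository]


print(shortlist(OPTIONS))
```

Under these assumptions the shortlist is options 6 and 7, both of which are built on an ebXML registry-repository.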
As mentioned earlier, federated SOA deployments require a standards-based registry-repository. This rules out options 1, 3 and 5 above. The remaining standards-based choices involve three standards: ISO 11179, UDDI and ISO 15000 - ebXML registry and repository (regrep).

The ISO 11179 and UDDI registries offer only a subset of the capabilities offered by an ebXML Registry. In particular, they provide only a registry and no repository. What gets published in an ISO 11179 or UDDI registry are pointers to service artifacts such as WSDL documents. What gets published to the ebXML Registry are not just pointers to service artifacts, but also the actual artifacts themselves. Thus, an ebXML registry-repository can be used for governance of any type of service artifact throughout its life cycle.

For these reasons, the chosen solution for this reference architecture is the ISO 15000 ebXML regrep.

2.3 Meta-models

The EA Team have adopted the TOGAF 9 meta-model. There are, however, a number of anomalies in this standard as presented in the current manual. The first pertains to capabilities, shown in Figure 5 below. The original TOGAF 9 diagram shows the “Capability” as an “Organisation”.

Figure 5 - TOGAF “Capability”

The other anomaly relates to the SOA extension, which should appear as follows:

Figure 6 - SOA modification to TOGAF Metamodel

In the original manual the Data Entity and Technology Component are directly connected to the Business Service rather than the IS Service.

For further information on meta-models refer initially to Ref. [2]. Further considerations arise in terms of extensions such as ebRIM and OGC’s Catalogue Service for Web (CSW).
2.4 Cataloging: Key to Artifact Discovery

The cataloging of artifacts improves their discoverability and is essential in supporting artifact-specific queries; it is very similar in concept to the indexing of tables in a relational database. In each case, information is automatically converted to metadata at the time it is published, and the metadata is used to facilitate efficient discovery of the published information.

Metadata discovery is the process of using automated tools to discover the semantics of a data element in data sets. This process usually ends with a set of mappings between the data source elements and a centralised metadata registry. Metadata discovery is also known as metadata scanning. Without this automation capability the task of manually managing metadata would become impossible: a never-ending job, like painting the Forth Bridge!

The cataloging function also needs to support the following activities:

• Implement a Managed Metadata Environment (D)
• Create & Maintain Metadata (O)
• Integrate Metadata (C)
• Manage Metadata Repositories (C)
• Distribute & Deliver Metadata (C)
• Support Metadata Reporting and Analysis (O)

3 PLATFORM SPECIFIC MODEL (PSM)

3.1 Introduction

Since metadata is such a far-reaching topic with multiple domains, this issue of the reference architecture will focus on a few key areas, as follows:

• The top-level, Enterprise Metadata Management integration framework.
• The SOA registry-repository.
• The Business Intelligence domain.
• The EA modelling domain.

The last two have already been subject to some product selections. The remaining topics are just reference implementations and subject to confirmation via formal procurements and/or product selections with partners as appropriate.
3.2 Enterprise Metadata Management Open Integration Framework

The InfoLibrarian Metadata Open Integration Framework is the reference implementation for this top-level domain. It’s available both as a software product and as a hardware appliance. Subject to a value for money evaluation, the latter is considered most appropriate for the HA’s internal implementation whereas partner organisations may prefer a more limited adoption based on the software version.

Figure 7 - InfoLibrarian Framework

Framework features include:

• Source-code is provided for a generic, vendor neutral, object oriented meta-model schema (the Universal MetaMart).
• XML can be easily returned by querying the repository directly.
• The repository can be queried directly using very simple SQL commands.
• Component API fully documented and supports several popular languages.
• The meta-model schema can be deployed on virtually any RDBMS – Oracle is preferred in the HA context.
• The InfoLibrarian portal is web-based and open source.
• Maintenance of the metadata collection is 100% automated.
• Hundreds of scanners and templates are available for most environments and are completely plug-and-play.

Software features include:

• Advanced search and comparison functions.
• Drill-Through-Impact-Analysis.
• Lineage analysis (visual in the portal).
• Semantics management.
• Business rules management.
• Business term definitions.
• Subject area and category modeling.
• Historical views.
• Code and object comparison.
• Change management.
• Object modeler.
• Document linking.
• Multi-threaded auto-scanning.
• Auto-scanning command center with metrics and scanning progress indicators.
• Hundreds of scanners and templates available.
• Scanners are plug-and-play (drop in and scan).
• Automatic generation of graphical business views.
The InfoLibrarian Administrator is a PC based desktop application built on the robust InfoLibrarian Engine. It allows Administrators to:

• Navigate metadata.
• Manage metadata catalogues and add descriptions.
• Connect to different InfoLibrarian repositories.
• Delete scanned metadata catalogues or data sources.
• Set up scanning options.
• Add, remove, and modify business rule objects.
• Build and manage subject areas, categories and mappings.
• Add, remove, and modify custom object types.
• Add, remove, modify, and categorise linked documents to appear in the portal.
• Manage the Data Dictionary and synchronise it with scanned metadata.
• Search metadata descriptions.
• Configure portal, log, and repository settings.
• Register scanners.
• View and navigate scanning logs.
• Run scanner wizards and autoscan interactively.

The Universal MetaMart is an open and simple schema that enables users to quickly understand and have direct access to their metadata. While sophisticated standards such as OMG’s XMI, MOF and CWM are supported, the MetaMart provides a “cooked” view of the metadata for administrators. (OMG: Object Management Group. XMI: XML Metadata Interchange. MOF: Meta-Object Facility, a meta-model used to formally define the Unified Modeling Language (UML). CWM: Common Warehouse Metamodel.)

The Auto-Scanners refresh the metadata and maintain a history of changes as they occur over time. This catalogues and safeguards the core knowledge about the local environment. The technical metadata delivers immediate value by giving technical personnel reliable and accurate knowledge with which to manage and maintain complex ICT systems and interrelated processes more effectively. The foundational metadata repository is populated this way.
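The scanning idea described above can be illustrated with a minimal sketch that reads structural metadata (tables, columns, types) out of a database's own system catalogue. It does not use or represent the InfoLibrarian API; it uses Python's standard `sqlite3` module, and the `scan_sqlite` function and the `road_link` table are hypothetical names chosen for illustration only.

```python
import sqlite3


def scan_sqlite(conn):
    """Return structural metadata (table, column, type, primary-key flag) for a database."""
    catalogue = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields one row per column:
        # (cid, name, type, notnull, dflt_value, pk)
        for _cid, column, col_type, _notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            catalogue.append(
                {
                    "table": table,
                    "column": column,
                    "type": col_type,
                    "primary_key": bool(pk),
                }
            )
    return catalogue


# Hypothetical source system with a single table, held in memory for the example.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE road_link (link_id INTEGER PRIMARY KEY, name TEXT)")

catalogue = scan_sqlite(source)
for entry in catalogue:
    print(entry)
```

A production scanner would also record when each scan ran, so that successive scans can be compared to build the history of changes; that step is omitted here for brevity.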
The Object Modeller allows users to identify anything that needs to be documented and to define it. Additionally, users can organise it in a customised manner, for example by defining templates for the metadata to be managed within the enterprise. This can include:

• Hardware inventory
• Business processes
• Key Performance Indicators (KPIs)
• Technical processes
• Applications and components

There is a point-and-click Category Modeller which enables users to easily capture, define and map business metadata, automatically displaying and categorising it in the portal with terminology that is already familiar to users. The built-in visual mapper tool allows users to associate objects, define hierarchies and categories of objects, build subject areas, and link documents. The portal automatically renders any pre-defined subject areas graphically and lets users drill through the hierarchies to the finest level of detail.

The repository has been tested with the following databases: Oracle, SQL Server and DB2.

3.3 SOA Registry Repository

The WellGEO RegRep is the reference implementation for registry and repository since it has specialised extensions for the Geographical Information (GI) domain, which is of significant importance to the HA. It enables seamless, secure management of GI services, datasets and other content. Its extensible repository manages all types of GI and non-GI content. Its registry manages standardised metadata describing the content within its repository or in external repositories. It is built using open standards and interfaces and integrates well with existing ICT infrastructure wherever it is deployed.
Figure 8 - Wellfleet RegRep Architecture

Features include:

• GI Service Catalog based on OASIS ebXML Registry and Repository 4.0 (draft) and OGC WRS 1.0 Basic Package.
• Cataloging of GI services, datasets and content using ISO 19115, ISO 19119 and ISO 19139.
• Rapid discovery of GI services, datasets and content based on ISO 19115 and ISO 19139 metadata using flexible queries.
• Spatial query support using GML 3.2.1.
• Full-text indexing and search.
• WSDL management based on OASIS ebXML Registry and Repository 4.0 Profile for WSDL (draft).
• XML Schema management based on OASIS ebXML Registry and Repository 4.0 Profile for XML Schema (draft).
• Ontology management based on OASIS ebXML Registry and Repository 4.0 Profile for OWL-Lite (draft).
• Registry interfaces available with REST and SOAP bindings.
• Relationship management between managed content.
• Pluggable authentication modules allowing leverage of existing identity management and authentication solutions within the enterprise.
• Flexible role-based authorisation and access control based on OASIS XACML 1.0.
• Federation of multiple WellGEO RegRep instances to allow sharing and linking GI information across organisational boundaries.
• Seamless federated search across multiple WellGEO RegRep instances, even across organisational boundaries.
• GUI and API interfaces provide human and programmatic access to WellGEO RegRep services and content.

A Publish-Subscribe workflow is illustrated in Figure 9 below.

Figure 9 - Wellfleet Pub-Sub Workflow

A lifecycle workflow is illustrated in Figure 10 below.
Figure 10 - Lifecycle Workflow

The product is tested on the following open source platforms:

• Operating systems: Ubuntu Linux 7.10 and Red Hat Enterprise Linux 4
• Java platforms: JDK 6
• Application servers: Tomcat 5.5
• Databases: PostgreSQL 8.3
• Web browsers: Mozilla Firefox 2.0+

3.4 The Business Intelligence domain

3.4.1 Introduction

The HA’s BI Programme has already selected the Oracle stack through its BI Service Enablement Project. This includes the Oracle Data Integrator (ODI) product.

Oracle Data Integrator is organised around modular repositories. All runtime and design-time components are Java components that store their information in metadata repositories. The components of this architecture can be installed and run on any platform. The architecture is shown in Figure 11 below.

Figure 11 - ODI Architecture

3.4.2 User Interfaces

The five ODI graphical modules are:

• Designer: In this interface, users can define declarative rules for data transformation and data integrity. Database and application metadata can be imported or defined. Designer uses metadata and rules to generate scenarios for production. All project development is performed through this interface, and it is the main user interface for developers and metadata administrators at design time.
• Operator: In this interface, users can manage and monitor ODI jobs in production. It is designed for production operators and shows the execution logs with error counts, the number of rows processed, execution statistics, the actual code that is executed, etc. At design time, developers can also use Operator for debugging purposes. It is the main user interface at runtime.
• Topology Manager: In this interface, users can define the physical and logical architecture of the infrastructure.
  Servers, schemas, and agents are registered in the ODI master repository through this interface, which is primarily used by the administrators of the infrastructure or project.
- Security Manager: In this interface, administrators can manage user accounts and privileges. It can be used to give profiles and users access rights to Oracle Data Integrator objects and features. This interface is primarily used by security administrators.
- Metadata Navigator/Lightweight Designer: Business users, developers, operators, and administrators can use their Web browser to access Metadata Navigator or Lightweight Designer. Through these Web interfaces, they can see flow maps, trace the source of all data, and even drill down to the field level to understand the transformations used to build the data. They can also launch and monitor scenarios and edit data mappings through these Web interfaces.

This forms the basis of the strategic BI Data Catalogue or Data Dictionary and is separate from the data modelling tools, which are covered in §3.5.

3.4.3 Repositories

The Oracle Data Integrator repository is composed of a master repository and several work repositories, as shown in Figure 12 below. These repositories can be plugged into any relational database management system. All objects configured, developed, or used with ODI modules are stored in one of these two types of repository. There is usually only one master repository, which contains security information (user data and privileges), topology information (definitions of technologies and servers), and versions of objects.

Figure 12 - Structure of ODI Repositories

The work repository is where projects are stored. Several work repositories may coexist in a single ODI installation.
This is useful for maintaining separate environments or to reflect a particular versioning lifecycle, such as development, user acceptance tests, and production environments.

A work repository stores information related to:
- Models, including datastores, columns, data integrity rules, cross references, and data lineage (see footnote 3).
- Projects, including interfaces, packages, procedures, folders, knowledge modules, and variables.
- Runtime information, including scenarios, scheduling information, and logs.

Footnote 3: Data lineage is a foundation capability of metadata management which provides the functionality to determine where data comes from, how it is transformed, and where it is going. Data lineage metadata traces the lifecycle of information between systems, including the operations that are performed upon the data.

3.4.4 ODI User Roles

As Oracle Data Integrator relies on a centralised repository, different types of users need to access it. The list below describes how ODI is used within a BI/DW team based on best practice.

BI Business User
Business users have access to the final calculated key indicators through reports or ad-hoc queries. In some cases, they need to understand how the indicators are defined, how they are calculated and when they were updated. They also need to be aware of any data quality issues affecting the accuracy of their indicators.

BI Business Analyst
Business Analysts define key indicators. They know the source applications and specify business rules to transform source data into meaningful target indicators. They are in charge of maintaining translation data from operational semantics to the unified data warehouse semantics.
BI Developer
Developers implement the business rules in accordance with the specifications described by the Business Analysts. They release their work by providing executable scenarios to the production team. Developers must have both technical skills regarding the infrastructure and business knowledge of the source applications.

BI/DW Metadata Administrator
ODI Metadata Administrators reverse engineer source and target applications. They guarantee the overall consistency of metadata in the ODI Repository. They need to have an excellent knowledge of the structure of the sources and targets, and they should have participated in the data modelling of key indicators. In conjunction with Business Analysts, they enrich the metadata by adding comments, descriptions and even integrity rules (such as constraints). Metadata Administrators are usually responsible for version management.

BI/DW Database Administrator
Database Administrators are in charge of defining the technical database infrastructure that supports ODI. They create the database profiles to let ODI access the data. They create separate schemas and databases to store the Staging Areas. They make the environments accessible by describing them in the Topology.

BI/DW System Administrator
System Administrators are responsible for maintaining technical resources and infrastructure for the project. For example, they may:
- Install and monitor Scheduler Agents.
- Back up and restore Repositories.
- Install and monitor Metadata Navigator.
- Set up environments (development, test, maintenance etc.).

BI/DW Security Administrator
The Security Administrator is responsible for defining the security policy for the ODI Repository. He or she creates ODI users and grants them rights on models, projects and contexts.
BI/DW Service Operator
Operators are responsible for importing released and tested scenarios into the production environment. They schedule the execution of these scenarios. They monitor execution logs and restart failed sessions when needed.

ODI comes with the following standard user profiles:

Role                       User Profile
Business User              CONNECT, NG REPOSITORY EXPLORER
Business Analyst           CONNECT, NG REPOSITORY EXPLORER, NG DESIGNER
Developer                  CONNECT, DESIGNER, METADATA BASIC
Metadata Administrator     CONNECT, METADATA ADMIN, VERSION ADMIN
Database Administrator     CONNECT, DESIGNER, METADATA ADMIN, TOPOLOGY ADMIN
System Administrator       CONNECT, OPERATOR
Security Administrator     CONNECT, SECURITY ADMIN
Operator                   CONNECT, OPERATOR

3.5 The EA modelling domain

The EA Team has selected iServer Enterprise Architect from Orbus Software as the standard EA modelling tool. The server is hosted externally and is accessible on the same basis as PartnerNET so that external consultants and Service Providers can access it when necessary.

iServer is a multi-user repository and modelling environment for Microsoft Office and Visio documents, thus making it accessible to a wide range of users. iServer's central repository stores both modelling information (diagrams and shapes) and MS Office (Word, Excel and PowerPoint) documents in a single EA store. Modelling objects such as services, applications, interfaces, data elements and activities can be directly related to textual information such as principles, policies, goals or objectives stored in Word documents. Deliverables produced in PowerPoint, Excel and Word can also be stored, versioned and managed as individual artifacts (objects) in the same environment, allowing for easier control and publication (dissemination) to stakeholders.

iServer allows users to choose a meta-model suitable for their EA approach and maturity.
By using iServer's pre-packaged meta-models for key methods, such as TOGAF, EA initiatives can be up and running immediately; by making use of iServer's extensible environment, the EA Team can also customise these meta-models or build their own as required. Customisation includes choosing (and/or building) the artifacts that may exist, their meta-definitions and the relationship types between them; all of these elements are customisable. iServer provides out-of-the-box notations for Enterprise Architecture including TOGAF 9, but can also support any Visio stencils/notations and document templates.

Stakeholder needs can be met in a number of ways: targeted HTML publications, EA views, analysis information (matrices, comparisons), documents, spreadsheets and reports can be generated from iServer's repository, helping to ensure the dissemination and communication of EA content to the wider business audience. Traceability means that EA artifacts and proposed solutions can be traced back to stakeholder concerns and requirements.

Organisations often use a variety of tools to describe their Enterprise Architecture. The EA Team has a number of models from the BI and Technology Convergence Programmes in IBM/Telelogic System Architect. Similarly, the ITS Metadata Registry receives models in a number of formats, e.g. Datex II in Sparx Systems’ Enterprise Architect and UTMC UML in No Magic’s MagicDraw. Therefore, a key EA tool requirement is the ability to seamlessly transfer information between these different data sources. iServer Data Exchange provides a variety of import and synchronisation mechanisms, from a simple artifact Excel import to a daily synchronisation of artifacts stored in another tool such as InfoLibrarian.
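
The import and synchronisation idea described above can be illustrated with a minimal Python sketch. It does not use iServer Data Exchange or any real product API. The `Artifact` shape, the CSV column names (`id`, `name`, `type`) and the `synchronise` function are all assumptions made for illustration: artifacts exported from another tool are matched against the repository by identifier and either created, updated or left unchanged.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """One EA artifact as exchanged between tools (hypothetical shape)."""
    artifact_id: str
    name: str
    artifact_type: str


def read_export(csv_text: str) -> list[Artifact]:
    """Parse a CSV export with the assumed columns id, name and type."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [Artifact(row["id"], row["name"], row["type"]) for row in reader]


def synchronise(repository: dict[str, Artifact],
                incoming: list[Artifact]) -> dict[str, int]:
    """Upsert incoming artifacts into the repository, keyed by identifier.

    Returns how many artifacts were created, updated or left unchanged.
    Artifacts that are absent from the export are deliberately not deleted.
    """
    counts = {"created": 0, "updated": 0, "unchanged": 0}
    for artifact in incoming:
        existing = repository.get(artifact.artifact_id)
        if existing is None:
            counts["created"] += 1
        elif existing != artifact:
            counts["updated"] += 1
        else:
            counts["unchanged"] += 1
            continue  # nothing to write for an identical artifact
        repository[artifact.artifact_id] = artifact
    return counts
```

Running `synchronise` once per day against a fresh export would approximate the daily synchronisation case, while a single call corresponds to a one-off import. Keying on a stable identifier rather than on the artifact name is a design assumption that lets renamed artifacts be treated as updates rather than duplicates.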
3.6 Summary and Overview

The overall architecture is shown in Figure 13 below.

Figure 13 - Metadata Management Architecture

Eventually the building blocks can be connected through the Enterprise Service Bus (ESB), but in the interim they may be connected conventionally using standard network protocols.
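
As a closing illustration of the data lineage capability defined in footnote 3 of section 3.4.3 (where data comes from, how it is transformed, and where it is going), the following Python sketch models lineage as a directed graph. It is an illustration under stated assumptions only and does not read the ODI repository or use any ODI API. The `LineageGraph` class, its method names and the dataset names used with it are hypothetical.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Directed graph of data flows, each labelled with a transformation."""

    def __init__(self) -> None:
        self._downstream = defaultdict(list)  # source -> [(target, transformation)]
        self._upstream = defaultdict(list)    # target -> [(source, transformation)]

    def add_flow(self, source: str, target: str, transformation: str) -> None:
        """Record that `target` is built from `source` by `transformation`."""
        self._downstream[source].append((target, transformation))
        self._upstream[target].append((source, transformation))

    @staticmethod
    def _walk(start: str, edges) -> list[tuple[str, str]]:
        """Breadth-first walk; returns (dataset, transformation) in visit order."""
        seen = {start}
        queue = deque([start])
        result = []
        while queue:
            current = queue.popleft()
            # .get() avoids creating empty entries in the defaultdict
            for neighbour, transformation in edges.get(current, []):
                if neighbour not in seen:
                    seen.add(neighbour)
                    result.append((neighbour, transformation))
                    queue.append(neighbour)
        return result

    def origins(self, dataset: str) -> list[tuple[str, str]]:
        """Where the data comes from: every upstream dataset, nearest first."""
        return self._walk(dataset, self._upstream)

    def destinations(self, dataset: str) -> list[tuple[str, str]]:
        """Where the data is going: every downstream dataset, nearest first."""
        return self._walk(dataset, self._downstream)
```

For example, after registering a flow from a raw dataset to a staging dataset and from there to an indicator, `origins` on the indicator returns the staging and raw datasets together with the transformation recorded on each step, and `destinations` on the raw dataset returns the same chain in the forward direction.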