Project Title: UCPATH Data Dissemination Operational Data Store (DDODS)

Submitter: Albert Course, Senior Applications Manager, ITS Data Services
Albert.Course@ucop.edu | Phone (510) 987-0761 | Mobile (925) 348-4265

Project Leads: Micheal Schwartz, Enterprise Data Architect; Hooman Pejman, Data Services Data Architect

Team Members: Steve Hunter, Java Developer; Stephen Dean, SOA Architect/Developer; Ric Carr, Systems Analyst; Sanketh Sangam, ETL Developer; Jerome McEvoy, Enterprise Architecture Manager; Deborah Hill, Project Manager

The Data Dissemination Operational Data Store (DDODS) is a UCPath product. It is designed to cull information from a complex Human Capital Management & Payroll software package containing 20,000 tables, determine change data, and relay pertinent and concise information to multiple UC institutions. The primary objective of the DDODS is to provide HR & Payroll data for the data warehousing needs of each UC location (campuses and medical centers). The data provided to each UC location is structurally and semantically identical, uses a less complex data model consisting of approximately 200 tables (1% of the source), and establishes a set of consistent data definitions for this subject area across UC. This level of data consistency across all UC locations is a primary objective of the UCPath project. Secondary objectives of the DDODS pertain to consumption of DDODS data by local business applications, local operational reporting, and enabling unique or localized data-shaping needs (e.g. offsetting the need for new UCPath interfaces). The DDODS comprises a number of enabling technologies that support the processing steps required to determine data change, identify the data that has to be sent to each location, deliver that data consistently and securely, and process the change data into the local DDODS database repositories built on varying database platforms (Microsoft, IBM and Oracle).
The DDODS was a collaborative effort, with all of the campuses participating in the design reviews. The team participated in the development of a common data dictionary with approximately 200 tables and 5,000 data elements. The DDODS was delivered early so that locations could plan and design the data warehouses and interfaces that depend on it, and it is also being used to validate conversion data at Wave 1 locations.

What is it

The DDODS is a product and service that distributes PeopleSoft data to campuses nightly and will be supported and managed by ITS. The three main components of the product are the database, the loader application, and the data dictionary. ITS is responsible for the design of the database and for providing locations with DDL scripts to recreate the database locally on one of three database platforms: SQL Server, Oracle or DB2. The loader application is a Java utility that locations can use to populate their local DDODS. The data dictionary contains the published data definitions for all data elements in the DDODS.

How does it work

The DDODS starts with a nightly Change Data Capture (CDC) process. The CDC uses materialized views to snapshot the 200+ HCM tables and compares them with the prior day's snapshot using Informatica. The Informatica program evaluates each table and writes the changes to the ODS database at Oracle Managed Cloud Services (OMCS). The data includes an indicator for Insert, Update or Delete, the batch number, and a date and time stamp. A description of the CDC process can be found on GCDP\UC - Systems\ODS\May 9 Preparation Materials; the file name is “DDODS Change Data Capture.pptx”. Once the Informatica job completes the updates to the ODS database, the Business Process Execution Language (BPEL) process is launched. This process reads the newly loaded data and applies the Affiliation Rules to determine which location or locations the data needs to be sent to.
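The nightly snapshot comparison described above can be sketched as follows. This is a minimal illustration, not the production Informatica mapping; the class, method, and key names are hypothetical, and rows are reduced to a key-to-value map for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the change-data-capture comparison: yesterday's
// snapshot and today's snapshot of one table (keyed by primary key) are
// compared row by row, and each difference is emitted as an Insert, Update,
// or Delete record stamped with the batch number.
public class ChangeDataCapture {

    public enum Action { INSERT, UPDATE, DELETE }

    public record ChangeRow(String key, Action action, long batchNumber) { }

    public static List<ChangeRow> diff(Map<String, String> previous,
                                       Map<String, String> current,
                                       long batchNumber) {
        List<ChangeRow> changes = new ArrayList<>();
        // Rows present today but not yesterday are inserts;
        // rows present in both snapshots with different values are updates.
        for (Map.Entry<String, String> row : current.entrySet()) {
            String old = previous.get(row.getKey());
            if (old == null) {
                changes.add(new ChangeRow(row.getKey(), Action.INSERT, batchNumber));
            } else if (!old.equals(row.getValue())) {
                changes.add(new ChangeRow(row.getKey(), Action.UPDATE, batchNumber));
            }
        }
        // Rows present yesterday but missing today are deletes.
        for (String key : previous.keySet()) {
            if (!current.containsKey(key)) {
                changes.add(new ChangeRow(key, Action.DELETE, batchNumber));
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        Map<String, String> yesterday = Map.of("E100", "Analyst", "E200", "Manager");
        Map<String, String> today     = Map.of("E100", "Sr Analyst", "E300", "Developer");
        for (ChangeRow c : diff(yesterday, today, 42L)) {
            System.out.println(c);
        }
    }
}
```

In the real pipeline this comparison is performed by Informatica against materialized-view snapshots, but the essential logic per table is the same three-way classification.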
These Affiliation Rules take into account whether the data needs to go to all locations (lookup tables), to a single location based on the person's job, or to multiple locations due to data processing requirements (for example, UC Merced and UCOP data to UCLA). Once the BPEL processes generate all of the files necessary for a location, the data is SFTP'd to that location. The process includes a control file that identifies which files have been sent to a particular location and is used to launch the Java loader.

Because locations use a number of different databases, we used ERwin to generate the logical and physical models. After surveying the locations, three target databases were selected: Oracle, Microsoft SQL Server and IBM DB2. After a version of the DDODS is finalized, language-appropriate DDLs are generated and sent to each location to allow it to build its local copy of the ODS database. By managing the model centrally and delivering the DDLs, we ensure that each location's database matches the central ODS database and reduce the workload each location would otherwise incur to build these tables independently.

Locations use a number of different ETL tools for loading data, so we decided not to pursue a particular ETL tool but rather to develop a loader program, written in Java. As part of the installation, each location can control a number of parameters, including the source file directories and the target database. The loader program runs a background process that checks for the existence of the “Control File” and then begins the data load into the local ODS database. A log file keeps track of the success and/or failure of each data load, and the loader can be restarted after error resolution.

The DDODS data dictionary provides standard data definitions for over 5,000 data elements in both HCM and the DDODS.
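The affiliation routing described above can be sketched as follows. This is an illustrative reduction, not the actual BPEL rules: the location codes, the extra-recipient pairs, and the method names are assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the Affiliation Rules: a changed row is routed to
// every location (lookup tables), to a single home location (based on the
// person's job), or additionally to locations with downstream processing
// requirements (e.g. UC Merced and UCOP data also sent to UCLA).
public class AffiliationRouter {

    // Example location codes; the real system covers all UC locations.
    private static final Set<String> ALL_LOCATIONS =
            Set.of("UCLA", "UCM", "UCOP", "UCSC", "UCD");

    // Locations whose data must also be delivered elsewhere for processing.
    private static final Map<String, Set<String>> EXTRA_RECIPIENTS = Map.of(
            "UCM", Set.of("UCLA"),    // UC Merced data is also sent to UCLA
            "UCOP", Set.of("UCLA"));  // as is UCOP data

    public static Set<String> route(boolean isLookupTable, String homeLocation) {
        if (isLookupTable) {
            return ALL_LOCATIONS; // reference data goes everywhere
        }
        Set<String> targets = new HashSet<>();
        targets.add(homeLocation);
        targets.addAll(EXTRA_RECIPIENTS.getOrDefault(homeLocation, Set.of()));
        return targets;
    }
}
```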
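The loader's background check for the control file can be sketched in the same spirit. The directory layout, file names, and polling interval here are assumptions for illustration; the real loader's parameters are configured per location at installation time.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of the loader's background process: poll the
// configured source directory for the control file delivered over SFTP,
// then read the list of data files it names and load each one into the
// local ODS database (shown here as a placeholder print statement).
public class ControlFileWatcher {

    public static boolean controlFilePresent(Path sourceDir, String controlFileName) {
        return Files.exists(sourceDir.resolve(controlFileName));
    }

    public static void pollAndLoad(Path sourceDir, String controlFileName,
                                   long intervalMillis)
            throws IOException, InterruptedException {
        // Background loop: wake periodically and look for the control file.
        while (!controlFilePresent(sourceDir, controlFileName)) {
            Thread.sleep(intervalMillis);
        }
        // The control file lists which data files were sent to this location.
        for (String dataFile : Files.readAllLines(sourceDir.resolve(controlFileName))) {
            System.out.println("Loading " + dataFile + " into local ODS ...");
            // The real loader applies each Insert/Update/Delete row and logs
            // success or failure so the run can be restarted after errors.
        }
    }
}
```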
This data dictionary will be accessible and searchable via a web interface for all locations to use. The DDODS Data Dictionary is currently maintained on GCDP\UC Systems\ODS\DDODS Data Model and Dictionary; the file name is EA_DA_DDODS_DataDictionary_[date]_v9.0.xlsx, where the date indicates the as-of date within a given version. We hope this serves as the foundation for other systemwide data dictionaries.

Timeframe

The DDODS will go live with the Wave 1 implementation in July 2014; however, it is already in use at Wave 1 locations in support of conversion. The design and test data have been sent to Wave 2 and 3 locations so that they can begin work on the local systems and data warehouses that will use data from the DDODS. The DDODS will also be run as part of the SIT testing this fall.

Customer Comments

The DDODS plan to implement a standard solution for UCPath was effective and efficient. It is one of the best UC-wide work efforts I have experienced in my 27 years of service to UC. The principles of the plan were truly collaborative. Every UC location participated in the design of the UC-wide DDODS. Each location had the opportunity to represent its requirements and assure that they were met through the design process. Each location would implement its ODS from the DDODS; from there, the location had the flexibility to implement a data warehouse or report off the ODS. This saved UCLA significant time and resources. The DDODS team enforced best practice for ODS/DW design and build. It also enforces consistency across UC in standardized naming and business logic. Again, this saved UCLA time and money. UCLA could extend the build of the ODS into our DW. UCLA chose to implement the DDODS with little to no modification. This gives UCLA the opportunity to utilize the resources saved in the DW build and Tier 2 education. It also assisted UCLA in enforcing best-practice ODS guidelines, which has been a struggle in the past.
The UC DDODS Team has been wonderful to work with. They are extremely skilled, knowledgeable, and experienced in DW practices and major project implementations. They found a formula for success in building a standard ODS with standardized naming and business rules in a truly collaborative way. I highly recommend them for the Sautter Award.

Regards,
Donna M. Capraro
Director, Information and Data Strategy
UCLA IT Services, Campus Data Warehouse
donnacap@ucla.edu
(310) 206-1624