Oregon Department of Education
KIDS Processing Proposal

Current System

The current KIDS processing system has issues with reporting consistency, lacks a reliable process for communicating errors, and has an architecture that will make statewide validation very challenging, if possible at all, given the tight timelines for processing data.

To start with, data is submitted to or pulled into the Regional Data Warehouse (RDW) from the affiliated districts’ student information systems (SIS). The RDW transforms the data to ensure it meets defined business rules. In the process, errors in the submitted data are communicated back to the affiliated district for correction. Once the data is ready for submission, the RDW extracts files for its districts and sends them to ODE for processing.

At ODE the data is first loaded to a staging area for examination. Files and records that do not meet the minimum qualifications are rejected. The main minimum qualifications are:

1) An SSID must be present.
2) The format of the data provided must be correct.
3) The time period the data is submitted for must be valid.

Further business rules are then applied to ensure that additional required data is present and that the data provided conforms to the values expected for each data element. Key checking is also done to ensure that data provided in separate files is properly linked. If a file contains records for a student with no corresponding student record, a key-related error is produced. Currently this error reporting is at a summary level. The report is combined with any rejects encountered during data staging and emailed to the corresponding RDW. From there the RDW manages the distribution of errors to its affiliated districts, which then correct their student system data.
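The staging checks described above can be illustrated with a minimal sketch. It is not ODE's implementation: the field names (`ssid`, `period`), the list of valid periods, the digits-only format rule, and both function names are assumptions made for the example.

```python
from typing import Optional

# Hypothetical reporting periods; the real list would come from ODE.
VALID_PERIODS = {"2007-2008", "2008-2009"}


def reject_reason(record: dict) -> Optional[str]:
    """Return why a record fails the minimum qualifications, or None if it passes."""
    ssid = record.get("ssid") or ""
    if not ssid:
        return "SSID missing"
    # Assumed format rule for illustration: an SSID contains digits only.
    if not ssid.isdigit():
        return "bad format: ssid"
    if record.get("period") not in VALID_PERIODS:
        return "invalid time period"
    return None


def orphan_keys(student_records: list, related_records: list) -> list:
    """Key check: SSIDs used in a related file that have no student record."""
    known = {r["ssid"] for r in student_records}
    return sorted({r["ssid"] for r in related_records} - known)
```

A record that returns a reason from `reject_reason` would be rejected at staging, while each SSID returned by `orphan_keys` would produce a key-related error.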
Data that passes the validations is posted from staging to the operational data store (ODS). Following the statewide validation process, data is posted from the ODS to the ODE data warehouse (KIDS DW in the diagram).

OSAT data is presently extracted from the Student Centered Staging transactional database, which houses all assessment test results, and placed in a folder for the RDW to download. This extract may be scheduled to run daily or weekly. Taking the data from the transactional system is problematic because that system is not designed to keep the audit trail needed to properly communicate changes to the RDW. For example, if a record is deleted, there is no notification of the deletion.

9/30/2008 Page 1 of 5

Proposed Vision

ODE proposes to streamline the process to aid statewide validations and to reduce the duplicated effort it takes to maintain six individual database systems. Using a Central ODS eliminates the need for cross-database calls. Ensuring data quality across six separate databases is difficult. It results in much slower processing and occasional instability, which introduces unacceptable risk.

The present validation process will be replaced by a run-time application we are calling the Core Validation Application. This program will read in the file set currently produced and perform the validations for format, minimum requirements and more. Essentially, this application will be responsible for performing all the basic validation rules, such as:

- Required columns are present.
- Code values provided exist in the list of valid values.
- Numeric fields contain properly formatted numeric data.
- Date fields contain valid dates.
- Valid key data for linking tables is provided.

Following the processing of data, the valid data, errors and rejects are sent to ODE so that they are available for review by the RDW’s districts.

The extraction of OSAT data will be from the new Student Centered ODS rather than from Student Centered Staging.
The Student Centered ODS has all the auditing columns needed to properly reproduce reports and can send data to the RDW when a record is deleted, updated or inserted. The process will be much cleaner than the present extraction. Data will be extracted in a format very similar to the tables within the database, rather than as several large flat rows as is done now in the adjustment format. Synchronizing data between the RDW and ODE will be greatly simplified.

Data from the Student Centered ODS will also be posted nightly to the OSAT DW, in the same area as the KIDS warehouse data. This will be the publicly consumable OSAT data which, like the KIDS data warehouse, will be stripped of student-identifying information.

Vision Detail

The process starts off much like the current system, with data coming from the district SIS to the RDW. The RDW transforms the data and extracts it for shipment from its warehouse to ODE. The Core Validation Application is then called to apply the basic validation rules. The application starts by checking with ODE for new metadata or validation rules to add to the application. Any changes made will be logged so that there is an audit trail of the changes applied should a problem arise. The application will then produce the following files, which can be sent to ODE for processing:

- A valid data file
- An invalid data file with related errors
- A file with rejected data and related reasons for rejection

At this point we know we have properly formatted data. The files are sent to ODE for processing. The checks performed externally will not be repeated internally. Instead, each record will be checksummed when it is validated, and the checksum will be computed again as the data is read into Staging to ensure that the row has not changed since it was validated. The invalid and rejected data will be stored for display to the district. The valid data will then be loaded to the Central ODS.
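The checksum comparison described above can be sketched as follows. The proposal does not name a hash algorithm or a record layout, so SHA-256, the field separator, and both function names are assumptions chosen for illustration only.

```python
import hashlib

# Assumed separator: the ASCII unit separator, unlikely to appear in field data.
# Joining with a separator keeps ["ab", "c"] distinct from ["a", "bc"].
FIELD_SEPARATOR = "\x1f"


def row_checksum(fields: list) -> str:
    """Hash a row's fields in order; changing any field changes the digest."""
    joined = FIELD_SEPARATOR.join(fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def unchanged_since_validation(fields: list, recorded_checksum: str) -> bool:
    """Recompute the checksum at load time and compare it with the recorded one."""
    return row_checksum(fields) == recorded_checksum


# The validator records a checksum alongside each valid row (hypothetical values).
row = ["1234567890", "2007-2008", "12"]
recorded = row_checksum(row)

# When the row is read into Staging, the checksum is recomputed and compared.
row_is_intact = unchanged_since_validation(row, recorded)
```

If the comparison fails, the row changed after validation and would need to be validated again rather than loaded.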
Statewide error resolution will then be performed, surfacing errors that could not be determined locally. Statewide and complex cross-column validations will be done at ODE. Additional errors found will be stored in the same table as those found by the RDW earlier. Data that passes the statewide and complex validations will be posted that evening to the warehouse (KIDS DW).

Error Distribution Detail

This is the area where the system processing improves the most. Following the statewide and complex validations, the errors and rejects will be broken down by topic area and reported to the appropriate staff (as chosen by the RDW, district or school). In this way, staff with specific knowledge of the data can be assigned to deal with the related errors. In addition, ODE can communicate directly with staff, resulting in faster resolution of errors.

A summary report containing information about the errors for a specific district will be created and sent via email with a link back to ODE. When district staff click the link, they will be directed to ODE’s Central Login site and prompted for a user name and password. From there the district can navigate to the KIDS errors and see the errors relevant to them. While reviewing the errors, the district can correct the underlying data in its student system.

The RDW will also receive emails for its districts with more information (essentially a summary of all the errors) and may review them online in the same fashion. An error may stem not from an entry error in the district’s student system but from a transformation error in the process of pushing the data to the RDW and on to ODE. The RDW will be able to resolve these errors by reviewing the data. Errors should remain in the system until the same data is processed successfully.
There may also be a need to clear certain error types manually in special circumstances.

Summary

Benefits

- Communication improves by targeting key staff by content area with errors related to the data they maintain.
- Core validation performed uniformly, with everyone using the same rules, promotes consistency.
- Leveraging ODE staff to maintain the core validation reduces the burden on Regional Data Warehouse providers to provide the same functionality on multiple platforms and keep business rules in sync.
- A Central ODS makes it possible to do statewide validations in a timely and stable fashion.
- Using Central Login to keep all the errors behind secure communication lines eliminates the need to share raw data via email.
- Less time is spent overall on communication and more time on resolving data problems and improving the system.

Issues

- ODE needs the core validation to run on a computer at the Regional Warehouse Providers that has access to the internet.
- How data will be accepted for promotion to the warehouse is still under discussion. Right now a nightly promotion is proposed.
- Staff involved in the communication process need to be set up in Central Login with an account and granted access to error and rejected data.

Cost

- Mainly time from ODE to develop the core validation application, statewide validation routines and a web application for the review of errors.
- Some time from the Regional Warehouse Provider to set up the core validation routine in its system and aid in the identification of staff that need access to the errors and rejects.
- Very little time from the school districts and schools to aid in the identification of staff that need access to the errors and rejects.
- Training in the form of online demonstrations and walkthroughs that explain what is expected of districts that receive emails to resolve errors in the KIDS system.