Data Management Best Practices - EdTelligence

Data Management Best Practices:
1. The longitudinal data repository should use a dimensional model, built using common dimensions
and facts as star schema data marts organized by process, not organized by source systems.
2. The longitudinal data warehouse contains detailed atomic data. (The data repository should also
have performance enhancing summary data, but must allow drill-down to the most granular data
captured for each business process.)
3. The longitudinal data repository uses a bus architecture with shared, common dimensions and facts.
4. Data is loaded into the data repository using managed load processes, not transactional updates.
5. Data quality is maintained through the implementation of enterprise-wide data governance
policies and practices. These policies recognize data as an asset of the organization rather than of a
specific program area or I.T. function.
6. The longitudinal data repository and extract, transform, load (ETL) processes include metadata
(data about the data) that supports technical, administrative, business, and data governance
needs.
7. An operational data store (ODS) is used to collect and store data from multiple sources prior to
feeding the staging area, and to deliver specific pre-built edits and business rule validation
reports. (The ODS is not designed for ad-hoc queries, performance-enhancing aggregations,
longitudinal analysis, or descriptive attributes, which should be left to the data warehouse.)
8. The central repository maintains the official historical record. The data warehouse supports
correction of incorrect historical data through a tightly controlled process, maintaining the
data both as originally reported and certified and as corrected.
9. The data repository dimensional model and ETL processes support incremental loading of new or
changed data over time. The entire longitudinal data set does not need to be reloaded for each
load cycle.
10. Records in a fact table represent a measurement or measurements at a single grain. All
facts in the fact table intersect with the same set of dimensions (e.g., day, assessment item,
student, location), and those dimensions define the scope of the measurement.
11. Facts in the fact table are usually numeric and additive. Facts that are non-additive or
semi-additive (additive across some dimensions, but not others) are exceptions, and special
consideration is given to ensure proper end-user access/use.
12. Dimension tables include textual descriptions, e.g., verbose education terminology rather than
cryptic codes, that provide rich meaning for users of the decision support system and support
robust analytical slicing and dicing.
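Several of the practices above (a star schema with a declared grain, incremental loading, and care with non-additive facts) can be illustrated with a small sketch. This is only an illustration under stated assumptions, not a reference design: the table names (dim_student, dim_date, fact_assessment), the columns, and the sample rows are all hypothetical, and SQLite stands in for whatever database the warehouse actually uses. The load step here simply overwrites a changed row; a real warehouse following practice 8 would also preserve the originally reported value.

```python
import sqlite3

# Hypothetical star schema: one fact table at the grain
# (student, day, assessment item), plus two descriptive dimensions.
SCHEMA = """
CREATE TABLE dim_student (
    student_key      INTEGER PRIMARY KEY,
    student_id       TEXT UNIQUE,
    grade_level_desc TEXT              -- verbose description, not a cryptic code
);
CREATE TABLE dim_date (
    date_key    INTEGER PRIMARY KEY,   -- e.g. 20240115
    school_year TEXT
);
CREATE TABLE fact_assessment (
    student_key     INTEGER REFERENCES dim_student(student_key),
    date_key        INTEGER REFERENCES dim_date(date_key),
    item_code       TEXT,
    points_earned   INTEGER,           -- additive fact
    points_possible INTEGER,           -- additive fact
    PRIMARY KEY (student_key, date_key, item_code)
);
"""

def load_increment(conn, rows):
    """Load only new or changed fact rows; a row at an existing grain is replaced."""
    conn.executemany(
        "INSERT OR REPLACE INTO fact_assessment VALUES (?, ?, ?, ?, ?)", rows
    )
    conn.commit()

def percent_correct_by_student(conn):
    """Percent correct is non-additive: compute it as a ratio of sums,
    never as a sum or average of per-row percentages."""
    cur = conn.execute(
        """
        SELECT s.student_id, SUM(f.points_earned), SUM(f.points_possible)
        FROM fact_assessment AS f
        JOIN dim_student AS s ON s.student_key = f.student_key
        GROUP BY s.student_id
        ORDER BY s.student_id
        """
    )
    return {sid: earned / possible for sid, earned, possible in cur}

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO dim_student VALUES (?, ?, ?)",
    [(1, "S-001", "Grade 4"), (2, "S-002", "Grade 4")],
)
conn.executemany("INSERT INTO dim_date VALUES (?, ?)", [(20240115, "2023-2024")])

# First load cycle: three fact rows.
load_increment(conn, [
    (1, 20240115, "ITEM-A", 1, 1),
    (1, 20240115, "ITEM-B", 2, 4),
    (2, 20240115, "ITEM-A", 0, 1),
])
# Second cycle: one new row and one changed row; nothing else is reloaded.
load_increment(conn, [
    (2, 20240115, "ITEM-B", 4, 4),
    (2, 20240115, "ITEM-A", 1, 1),
])

row_count = conn.execute("SELECT COUNT(*) FROM fact_assessment").fetchone()[0]
result = percent_correct_by_student(conn)
```

The composite primary key on the fact table is what enforces the single grain: a second row for the same student, day, and item cannot be added alongside the first. Because points_earned and points_possible are additive, they can be summed across any dimension, while the percentage is derived only after summing.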
Celero Partners Corporation, Inc.
www.celerocorp.com