CSU’s Data Architecture and Governance
Nina Clemson
Enterprise Architecture Symposium, 2006

Where it all began
• Information architecture issue papers, 2000
  – Reliability
  – Complexity
  – Scalability
• External architectural review and recommendations
  – Technical only
• Information Strategy and principles
  – Aimed to educate the CSU community to accept the importance of information

Constellar
• First middleware solution
  – Point-to-point transfers via hub
• A limited success
  – Enabled decoupling and improved reliability
  – Technical limitations
• Revealed other dimensions to the information architecture problem
  – “Dirty” data
  – “The chicken and the egg”: who’s the real owner?
  – Different perspectives

Data Architecture Project
• A top-down review of our architecture, including non-IT components, was recommended
• Had to take a pragmatic approach
• Standardised enterprise objects mapped to underlying sources
• Three streams
  – Technical
  – Integration
  – Data analysis

DAP – data analysis
• Reverse engineered from existing sources
• Review of current data flows
  – What to define
• Review of existing standards
  – The design
• Leveraged other project work
  – Some of the sources
• Examination and comparison of content
  – Leverage “common knowledge”
  – Revealed issues

Characteristics of a data standard
• Definitions
  – Scope
• Ontology & taxonomy
  – Relationships and classifications
• Authoritative Source
  – Most correct source
  – To the attribute level
  – May change over the lifecycle
• Unique identifiers
  – Shared or mappable
  – Contributors, consumers and legacy
• Stakeholders
  – Creator, system owner and others with a significant interest

Data Issues
• 40+ identified
• Categorised into five types
  – Competing sources of data
  – Currency and applicability
  – Inconsistent formats
  – Structural
  – Multiple sources of data
• What happens if you share data and don’t fix these problems?

Too many cooks
• Two systems store subject information
• One system creates subject information, the other uses it for administration purposes
• Both systems contain active and inactive subjects
• When queried for the current set of active subjects, the results are completely different
• Question – if a new system arrives tomorrow and wants subject data, which system is the best source?

Data Governance – towards a solution?
• Storing data for the enterprise
• Possible to change, but is it worth it?
  – What is the benefit?
  – Departmental vs enterprise optimisation
• The cost of inaction
  – The de facto standard
• This is where we are now

CSU Data Governance Board
• Membership
  – Senior divisional managers
  – Executive Director and Architecture staff
• Terms of Reference include:
  – “The Data Governance Board has the responsibility of ensuring the means by which data assets are defined, controlled, used and communicated for the benefit of CSU”
• Prioritisation
  – Project versus issue matrix
  – Environmental scan

Lessons learned
• Data governance is hard
  – This isn’t about technology, it’s about organisational change
• Where data sharing exists, there must also be data governance
  – No standard is a de facto standard
• Technology is not a substitute for management
  – Garbage in, garbage out: it’s a cliché, but it’s true
• The content of a standard is not important, the agreement is
• Standards are not cast in stone
  – Things also change
  – Understanding is a collaborative and iterative process that occurs over time
  – Data governance is the process that manages this change
• Don’t underestimate the value of education
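
Illustration: the "Too many cooks" scenario
The situation described under "Too many cooks" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two systems, the function names and the subject codes are all hypothetical and are not CSU data or CSU code. It shows how listing the active subjects from each system makes the disagreement visible.

```python
# Illustrative only: system names and subject codes are invented, not CSU data.

def active_subjects(records):
    """Return the set of subject codes flagged as active in one system."""
    return {code for code, is_active in records.items() if is_active}


def compare_active_sets(source_a, source_b):
    """Summarise how two systems disagree about the active subject set."""
    a = active_subjects(source_a)
    b = active_subjects(source_b)
    return {
        "agreed": sorted(a & b),      # active in both systems
        "only_in_a": sorted(a - b),   # active only in the first system
        "only_in_b": sorted(b - a),   # active only in the second system
    }


# Hypothetical subject records: subject code -> active flag.
creator_system = {"ABC101": True, "ABC102": True, "ABC103": False}
admin_system = {"ABC101": True, "ABC102": False, "ABC104": True}

report = compare_active_sets(creator_system, admin_system)
print(report)
```

The comparison does not say which system is right. It only exposes the differences, so that the question of the authoritative source can be settled by agreement rather than by default.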