A realistic look at open government data
Sharon Dawes

Open data philosophy
• If government
  – publishes its data in structured, machine-readable form
  – provides easy one-stop public access to all data from all departments
  – without fees or other restrictions on access or use
• Then
  – social, economic, and democratic benefits will flow to society

A built-in problem
Supply-side perspective + Myths + Ambiguity = Disappointing results

Supply-side perspective
Focus is mainly on what government does or should do:
• Adopt open data policies
• Devise and implement open data practices
• Publish the data
• Step aside and great things will happen
Problem 1: a government-centric view
Problem 2: ignores the limits of resources and capabilities
Problem 3: promises value will be created by someone else

Myths (Janssen, Charalabidis, & Zuiderwijk, 2012)
• Data publication = data use = public benefits
• All government needs to do is publish data
• All constituents can use published data
• All data should be published without restriction
• Open data will produce open government
M. Janssen, Y. Charalabidis & A.
Zuiderwijk (2012)
Problem 1: simplistic
Problem 2: naïve
Problem 3: magical thinking

Ambiguity
• About the purpose of open data (Yu & Robinson, 2012):
  – Transparency and accountability (to see what the government is doing and how it is doing it)
  – Economic and social development (to create new products and services for society)
• About who is able to use open data:
  – Rhetoric is about “citizens”
  – Actual users are expert analysts and application developers
Problem 1: sends mixed messages to the public
Problem 2: sends mixed messages to administrators

Result: disappointment
• In uptake by citizens and businesses
• In the type of applications produced
• In the sustainability and economic value of applications in the market
• In the effect on openness and democracy

In a nutshell . . .
Sidney Harris, 2012

An OGD miracle depends on whether. . .
• The policies and purposes are clear
• The practices are effective
• The data are desirable and good quality
• The users (analysts and developers) are capable
• The public wants and can use what the users create

A different approach to the data dimension of OGD
• The value of open data lies in data use.
• This value depends on the perspective and capabilities of data users and consumers outside the government.
• The value generated depends on the quality of the data for a given use by a given user/consumer.
• Consequently, there can be no one standard for data quality; rather, data need to be “fit for use” (Wang & Strong, 1996; Ballou & Pazer, 1995).

Data quality challenges
• Conventional wisdom
• Provenance
• Practices
• Consequences
  – Underuse
  – Misuse
  – Non-use
  – Shifting costs and responsibilities

Conventional wisdom aka “untested assumptions”
• Quantitative data is “better” than qualitative data
• Digital data is “better” than other formats
• The data you need
  – is available and sufficient
  – is objectively neutral
  – is understandable
  – is relevant for your purpose
• Government organizations record and organize their data in the same, predictable way

Provenance or “where do open data come from?”
• Administrative systems
• Embedded in program or service operations
• Governed by specific policies and laws
• Gathered in particular contexts for certain internal purposes
• By people with different kinds and levels of knowledge and expertise

Practices & processes that produce data
• Data definition
• Data collection
• Data management & maintenance
• Documentation
• Audit and quality control
• Change management
• Security
• Priorities & capabilities for all the above

Three examples
Example 1: Give me shelter
Example 2: Cadastral records
Example 3: Where does the money go?

Data quality = fitness for use
• Matters most from the user’s point of view
• Depends on the user’s purpose
• Usually involves trade-offs
  – e.g., timeliness vs.
completeness (Wang & Strong, 1996; Ballou & Pazer, 1995)

Dimensions of data quality (Pipino et al., 2002)
Accessibility: Extent to which data is available, or easily and quickly retrievable
Appropriate Amount of Data: Extent to which the volume of data is appropriate for the task at hand
Believability: Extent to which data is regarded as true and credible
Completeness: Extent to which data is not missing and is of sufficient breadth and depth for the task at hand
Concise Representation: Extent to which data is compactly represented
Consistent Representation: Extent to which data is presented in the same format
Ease of Manipulation: Extent to which data is easy to manipulate and apply to different tasks
Free-of-Error: Extent to which data is correct and reliable
Interpretability: Extent to which data is in appropriate languages, symbols, and units, and the definitions are clear
Objectivity: Extent to which data is unbiased, unprejudiced, and impartial
Relevancy: Extent to which data is applicable and helpful for the task at hand
Reputation: Extent to which data is highly regarded in terms of its source or content
Security: Extent to which access to data is restricted appropriately to maintain its security
Timeliness: Extent to which the data is sufficiently up-to-date for the task at hand
Understandability: Extent to which data is easily comprehended
Value-Added: Extent to which data is beneficial and provides advantages from its use

Metadata
637 pages

Data quality “tools”
• For open data providers
  – Appreciate data as an asset, a source of value
  – Adopt information policies to preserve and enhance usability
  – Create and maintain metadata to support unknown users
  – Adopt stewardship practices
• For open data users
  – Be skeptical, ask questions
  – Demand good quality metadata
  – Understand the nature and context of the data
  – Use data sets with caution
  – Combine data sets with great caution

What else can governments do?
• Engage with the civic technology community about data needs and data problems
• Set publication priorities around known data needs of potential users and beneficiaries
• Support topical data communities
• Direct contests or challenges toward solving specific public problems (e.g., affordable housing, transportation congestion, neighborhood safety)

[Figure labels: context, data, policy, technology, management]

Thank you
@ssdawes
www.ctg.albany.edu