A realistic look at open government data Sharon Dawes

Open data philosophy
• If government
– publishes its data in structured, machine readable form
– provides easy one-stop public access to all data from all
departments
– without fees or other restrictions on access or use
• Then
– social, economic, and democratic benefits will flow to
society
A built-in problem
Supply-side perspective + Myths + Ambiguity
= Disappointing results
Supply-side perspective
Focus is mainly on what government does or should do:
• Adopt open data policies
• Devise and implement open data practices
• Publish the data
• Step aside and great things will happen
Problem 1: a government-centric view
Problem 2: ignores the limits of resources and capabilities
Problem 3: promises value will be created by someone else
Myths
• Data publication = data use = public benefits
• All government needs to do is publish data
• All constituents can use published data
• All data should be published without restriction
• Open data will produce open government
(Janssen, Charalabidis, & Zuiderwijk, 2012)
Problem 1: simplistic
Problem 2: naïve
Problem 3: magical thinking
Ambiguity
• About the purpose of open data: (Yu & Robinson, 2012)
– Transparency and accountability
(to see what the government is doing and how it is doing it)
– Economic and social development
(to create new products and services for society)
• About who is able to use open data
– Rhetoric is about “citizens”
– Actual users are expert analysts and application developers
Problem 1: sends mixed messages to the public
Problem 2: sends mixed messages to administrators
Result: disappointment
• In uptake by citizens and businesses
• In the type of applications produced
• In the sustainability and economic value of
applications in the market
• In the effect on openness and democracy
In a nutshell . . .
Sidney Harris, 2012
An OGD miracle depends on whether . . .
• The policies and purposes are clear
• The practices are effective
• The data are desirable and good quality
• The users (analysts and developers) are capable
• The public wants and can use what the users create
A different approach to the data
dimension of OGD
• The value of open data lies in data use.
• This value depends on the perspective and capabilities
of data users and consumers outside the government
• Value generated depends on the quality of the data for
a given use by a given user/consumer.
• Consequently, there can be no single standard for data
quality; rather, data need to be “fit for use”
(Wang & Strong, 1996, Ballou & Pazer, 1995)
Data quality challenges
• Conventional wisdom
• Provenance
• Practices
• Consequences
– Underuse
– Misuse
– Non-use
– Shifting costs and responsibilities
Conventional wisdom
aka “untested assumptions”
• Quantitative data is “better” than qualitative data
• Digital data is “better” than other formats
• The data you need
– is available and sufficient
– is objectively neutral
– is understandable
– is relevant for your purpose
• Government organizations record and organize their data in
the same, predictable way
Provenance or “where do open data come from?”
• Administrative systems
• Embedded in program or service operations
• Governed by specific policies and laws
• Gathered in particular contexts for certain internal purposes
• By people with different kinds and levels of knowledge and expertise
Practices & processes that produce data
• Data definition
• Data collection
• Data management & maintenance
• Documentation
• Audit and quality control
• Change management
• Security
• Priorities & capabilities for all the above
Three examples
Example 1: Give me shelter
Example 2: Cadastral records
Example 3: Where does the money go?
Data quality = fitness for use
• Matters most from the user’s point of view
• Depends on the user’s purpose
• Usually involves trade-offs
– e.g., timeliness vs. completeness
(Wang & Strong, 1996, Ballou & Pazer, 1995)
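The timeliness vs. completeness trade-off can be made concrete with a small sketch. Everything in it is an assumption for illustration, not material from the talk: the `Release` and `Need` classes, the two users, the dates, and the thresholds are all hypothetical. It only shows how the same data set can be fit for one user's purpose and unfit for another's.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Release:
    """One published version of a data set (hypothetical example)."""
    name: str
    published: date
    expected_records: int  # records the full data set should hold
    actual_records: int    # records this release actually contains


@dataclass
class Need:
    """What one user requires of the data (hypothetical thresholds)."""
    user: str
    max_age_days: int        # timeliness requirement
    min_completeness: float  # completeness requirement, 0.0 to 1.0


def completeness(release: Release) -> float:
    """Share of the expected records that the release contains."""
    return release.actual_records / release.expected_records


def is_fit(release: Release, need: Need, today: date) -> bool:
    """A release is fit for use only if it meets both of the user's needs."""
    age_days = (today - release.published).days
    return (age_days <= need.max_age_days
            and completeness(release) >= need.min_completeness)


today = date(2024, 7, 1)
# A recent but partial release, and an older but nearly complete one.
preliminary = Release("preliminary", date(2024, 6, 20), 1000, 700)
final = Release("final", date(2024, 1, 15), 1000, 990)

journalist = Need("journalist", max_age_days=30, min_completeness=0.6)
auditor = Need("auditor", max_age_days=365, min_completeness=0.95)

for need in (journalist, auditor):
    fits = [r.name for r in (preliminary, final) if is_fit(r, need, today)]
    print(need.user, "->", fits)
```

Under these assumed thresholds the journalist can use only the preliminary release and the auditor only the final one. Neither release is "high quality" in the abstract.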
Dimensions of data quality (Pipino, et al., 2002)
• Accessibility: extent to which data is available, or easily and quickly retrievable
• Appropriate Amount of Data: extent to which the volume of data is appropriate for the task at hand
• Believability: extent to which data is regarded as true and credible
• Completeness: extent to which data is not missing and is of sufficient breadth and depth for the task at hand
• Concise Representation: extent to which data is compactly represented
• Consistent Representation: extent to which data is presented in the same format
• Ease of Manipulation: extent to which data is easy to manipulate and apply to different tasks
• Free-of-Error: extent to which data is correct and reliable
• Interpretability: extent to which data is in appropriate languages, symbols, and units, and the definitions are clear
• Objectivity: extent to which data is unbiased, unprejudiced, and impartial
• Relevancy: extent to which data is applicable and helpful for the task at hand
• Reputation: extent to which data is highly regarded in terms of its source or content
• Security: extent to which access to data is restricted appropriately to maintain its security
• Timeliness: extent to which the data is sufficiently up-to-date for the task at hand
• Understandability: extent to which data is easily comprehended
• Value-Added: extent to which data is beneficial and provides advantages from its use
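Some of these dimensions can be roughly quantified. The sketch below scores two of them, completeness and consistent representation, as simple ratios over a handful of records. The records, field names, and the choice of ISO dates as the "same format" are hypothetical, and the ratio approach is one common convention assumed here, not a method prescribed in the slides.

```python
import re

# Hypothetical inspection records; None marks a missing value.
records = [
    {"id": "A1", "inspected_on": "2023-04-02", "result": "pass"},
    {"id": "A2", "inspected_on": "04/05/2023", "result": "fail"},
    {"id": "A3", "inspected_on": None, "result": "pass"},
    {"id": "A4", "inspected_on": "2023-06-11", "result": None},
]

# Assumed target format for dates: YYYY-MM-DD.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def completeness(rows, fields):
    """Share of expected values that are present (not None)."""
    total = len(rows) * len(fields)
    present = sum(1 for row in rows for f in fields if row.get(f) is not None)
    return present / total


def consistent_representation(rows, field, pattern):
    """Share of present values in `field` that match one agreed format."""
    values = [row[field] for row in rows if row.get(field) is not None]
    if not values:
        return 1.0  # nothing present, so nothing is inconsistent
    matching = sum(1 for v in values if pattern.fullmatch(v))
    return matching / len(values)


print(round(completeness(records, ["id", "inspected_on", "result"]), 2))
print(round(consistent_representation(records, "inspected_on", ISO_DATE), 2))
```

A score like this says nothing on its own about fitness for use. Whether 0.83 completeness is acceptable still depends on the user and the task.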
Metadata
637 pages
Data quality “tools”
• For open data providers
– Appreciate data as an asset, a source of value
– Adopt information policies to preserve and enhance usability
– Create and maintain metadata to support unknown users
– Adopt stewardship practices
• For open data users
– Be skeptical, ask questions
– Demand good quality metadata
– Understand the nature and context of the data
– Use data sets with caution
– Combine data sets with great caution
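One practical form of "great caution" when combining data sets is to check how well the join keys line up before merging anything. The sketch below is a minimal illustration under stated assumptions: the two key lists, the `normalize` rule, and the `match_report` helper are all hypothetical, and real data sets usually need cleanup rules drawn from their own metadata.

```python
def normalize(name: str) -> str:
    """Assumed cleanup rule: trim, lowercase, drop a trailing ' county'."""
    key = name.strip().lower()
    if key.endswith(" county"):
        key = key[: -len(" county")]
    return key


def match_report(left_keys, right_keys):
    """Summarize how many keys from the left data set appear in the right."""
    left, right = set(left_keys), set(right_keys)
    matched = left & right
    return {
        "matched": sorted(matched),
        "only_left": sorted(left - right),
        "only_right": sorted(right - left),
        "match_rate": len(matched) / len(left) if left else 0.0,
    }


# Hypothetical keys from two agencies that name the same places differently.
shelters = ["Albany County", "Erie County", "Kings County"]
spending = ["ALBANY", "erie ", "New York"]

raw = match_report(shelters, spending)
clean = match_report(map(normalize, shelters), map(normalize, spending))

print("raw match rate:", raw["match_rate"])
print("after cleanup:", clean["matched"], "unmatched:", clean["only_left"])
```

Without cleanup nothing matches. With it, two of the three keys match and one is still left over. The leftover is the case that calls for a question to the data provider, not a silent drop.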
What else can governments do?
• Engage with the civic technology community about
data needs and data problems
• Set publication priorities around known data needs
of potential users and beneficiaries
• Support topical data communities
• Direct contests or challenges toward solving
specific public problems (e.g. affordable housing,
transportation congestion, neighborhood safety)
(Diagram labels: context, data, policy, technology, management)
Thank you
@ssdawes
www.ctg.albany.edu