Provenance & Evidence-Based Policy Research

Pete Edwards, Computing Science, University of Aberdeen
Lorna Philip, Geography & Environment, University of Aberdeen
{p.edwards, l.philip}@abdn.ac.uk

Overview
• PolicyGrid
• Evidence-Based Policy Research
• Provenance – Requirements & Scope
• Developing a (Partial) Solution
• Supporting Communication & Collaboration
• Limitations & Challenges

PolicyGrid
• Research Node of the National Centre for eSocial Science (NCeSS).
  – UK Economic & Social Research Council (2006 – 2012).
• Collaboration between computer scientists, human geographers and others.
• Phase I (2006-2009) Objectives:
  – Support creation of an audit trail to allow assumptions underlying policy conclusions and recommendations to be understood.
  – Record information about the resources (data sets, documents, survey instruments, etc.) used in policy based research and how those resources were created and used.
  – Develop infrastructure and tools to support storage of resources, creation of metadata annotations, metadata browsing, formulation of policy arguments, collaboration support for researchers.
• www.policygrid.org

Evidence-Based Policy Assessment
• Evidence based policy and policy assessment are central themes in UK Government.
Key requirements of policy assessment:
  – sufficient evidence to support policy recommendations;
  – an audit trail to allow assumptions underlying conclusions and recommendations to be understood;
  – sufficient information to support any later evaluation.

Evidence-Based Policy Assessment
[Diagram labels: Review and evaluate policy; Develop new policy; Stakeholders; Policy ‘questions’; Conclusions and recommendations; Data and analysis]

Key Questions
• Was the evidence sifted and graded for quality?
• Were the inclusion and exclusion criteria explicit?
• Is the evidence easy to understand?
• Has the strength of the evidence been assessed?

Provenance & eSocial Science
How were data created? How were data analysed/interpreted? How were conclusions drawn?
  – characteristics of secondary data and why they were used;
  – primary data collection methods;
  – the analytical/interpretative process;
  – what conclusions were drawn?
  – publications and dissemination materials.

Provenance & eSocial Science
Solution needs to accommodate:
Quantitative data
• Uses numbers and categories; data collection, analysis and the identification of significant findings follow well established conventions; the quantitative researcher is objective and detached from the subjects of study; in principle, re-analysis of quantitative data would produce the same findings.
Qualitative data
• Non-numeric, uses words and images; samples are illustrative; generalisation is not an objective, and often discouraged; subjectivity, reflexivity and personal engagement with research subjects and data; data collection and analysis is an iterative process.
Combined methods approaches

The Role of Provenance
[Diagram: provenance as the interface between producers and users of evidence; the links between groups are labelled "interaction", and one element is labelled "Filter".]
• Producers of Evidence
  – Academics, consultants, government employed researchers, various stakeholder groups
  – Facilitates working in research teams & multi-centre research
• Users of Evidence
  – Academics, consultants
  – Government policy teams
  – Senior government advisors
  – Other stakeholders
  – Users can examine the evidence leading to conclusions and recommendations
• Decision makers
  – Politicians / Minister
  – Manifesto commitments
  – Influence of senior civil servants

Construction of Evidence (supported by provenance)
• Research questions
• Data collection methods
• Data collection instruments
• Data (data set, transcripts etc)
• Analysis and interpretation
• Conclusions – linked to data
• Recommendations – linked to data
• Further research

Developing a Solution
• Initial (straw man) OWL ontology derived from the UK Data Archive (www.data-archive.ac.uk).
• Alignment with Open Provenance Model (Artifact, Process, Agent) – Moreau et al, 2007
• Focus groups for user requirements gathering identified:
  – research methods and connected resource types;
  – important properties to describe these concepts.
• Analysis of DDI (Data Documentation Initiative) schema to enhance compatibility with wider metadata schemas (www.ddialliance.org/).
• Gathering and testing on use cases provided by social scientists
  – Can these particular research projects be described using our framework?

Use Case
[Image: whiteboard description of Philip and MacMillan’s CV market stall case study.]

Use Case
Modelling Provenance
[Diagram: RDF graph of the use case. Edge labels: dc:creator, ssad:used, ssad:hasRespondents, ssad:hasInterviewee, ssad:hasInterviewer, ssad:created, foaf:knows]

Provenance “In the wild”
• Our model allows us to capture (in RDF):
  – Information about people and organisations;
  – Provenance of resources and activities;
  – An audit trail.
• What happens when we embed this framework into a VRE?
  – Users want to maintain blogs, post messages, annotate resources/activities with comments.
  – Social relationships.
• How do we capture this additional layer of provenance?
  – The New eScience (De Roure, 2007)
• www.ourspaces.net

Supporting Communication & Collaboration
[Diagram: RDF graph combining provenance and social annotations.
 Nodes: foaf:person “John Farrington”; foaf:person “Colin Hunter”; foaf:person “Lorna Philip”; ssad:interview “Policy Assessment”; ssrd:transcript “Interview Notes”; sioc:user “lp123”; sioc:post “Where is the audio recording of the interview?”
 Edge labels: ssad:hasInterviewer, ssad:hasInterviewee, ssad:created, rel:workWith, sioc:related_to, foaf:holdsAccount, sioc:creator_of
 Note on diagram: REL Relationship Vocabulary]

Benefits of Provenance for Evidence-Based Policy Research
• Links data and other digital artefacts (questionnaires, transcripts …) together and allows research methods to be scrutinised.
• Third parties can assess the robustness or truthfulness of data collection, analysis and other processes.
• Missing evidence can be flagged.
• What worked, what didn’t work, and why can be formally recorded.
• Ensures that contextual information is formally recorded and provides a clear audit trail for policy making and/or academic research.

Limitations/Challenges
• Lab books and very formal record keeping are not part of the social science way of doing research.
• Can/should the qualitative research process be formally recorded?
• Will policy makers want to, or be able to, use our tools?
• Ethical/moral drivers
• Philosophical positions
• Is a data collection and analysis event unique or is extension through generalisation appropriate?
• The re-use of data

Limitations/Challenges
• Capturing research goals, conclusions and arguments
• Scientist’s Intent (Pignotti et al, 2008)
• Trust and reputation
• Do social annotations make this easier or harder?
• Integrative research and multidisciplinary evidence bases
• Temporal and spatial context
• User-interface issues
• Creating provenance metadata, querying, browsing.

Thanks
• UK ESRC eSocial Science Programme
• UK National Centre for eSocial Science
• Members of the PolicyGrid team

Browsing Provenance
• How do we support users to construct RDF descriptions, and to query and browse such representations?
• Kaufmann (2007) suggested that casual users of the Semantic Web prefer a natural language interface.
• LIBER - An ontology driven natural language interface.
  – Hielkema et al., 2008
  – Support for creation of descriptions, query construction (SPARQL), presentation of RDF.
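
Illustration (not part of the original slides): a minimal sketch, in Python, of how the interview example from the Supporting Communication & Collaboration slide might be held as subject-predicate-object triples and browsed by pattern matching. The ssad:, ssrd:, foaf: and sioc: terms and the quoted labels come from the slides. Everything else is an assumption made for illustration: the "ex:" identifiers, the use of rdf:type, rdfs:label and foaf:name, the direction of each edge (the original diagram is not recoverable), and the helper names match and audit_trail. Plain tuples stand in for an RDF store and SPARQL, so this is not the PolicyGrid or LIBER implementation.

```python
# Triples as (subject, predicate, object).
# ASSUMPTIONS: "ex:" identifiers are hypothetical, and the edge directions are
# one possible reading of the flattened slide diagram.
TRIPLES = [
    ("ex:interview1", "rdf:type", "ssad:interview"),
    ("ex:interview1", "rdfs:label", "Policy Assessment"),
    ("ex:interview1", "ssad:hasInterviewer", "ex:john"),
    ("ex:interview1", "ssad:hasInterviewee", "ex:colin"),
    ("ex:interview1", "ssad:created", "ex:transcript1"),
    ("ex:transcript1", "rdf:type", "ssrd:transcript"),
    ("ex:transcript1", "rdfs:label", "Interview Notes"),
    ("ex:john", "foaf:name", "John Farrington"),
    ("ex:colin", "foaf:name", "Colin Hunter"),
    ("ex:lorna", "foaf:name", "Lorna Philip"),
    ("ex:lorna", "foaf:holdsAccount", "ex:lp123"),
    ("ex:lp123", "sioc:creator_of", "ex:post1"),
    ("ex:post1", "sioc:related_to", "ex:transcript1"),
    ("ex:post1", "rdfs:label", "Where is the audio recording of the interview?"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def audit_trail(triples, resource):
    """List (predicate, subject) for every statement whose object is resource."""
    return [(p, s) for s, p, _ in match(triples, o=resource)]


# Which activity created the transcript, and who conducted it?
creators = [s for s, _, _ in match(TRIPLES, p="ssad:created", o="ex:transcript1")]
interviewers = [
    o
    for activity in creators
    for _, _, o in match(TRIPLES, s=activity, p="ssad:hasInterviewer")
]

# Which posts comment on the transcript (the social annotation layer)?
posts = [s for s, _, _ in match(TRIPLES, p="sioc:related_to", o="ex:transcript1")]

print(creators, interviewers, posts)
print(audit_trail(TRIPLES, "ex:transcript1"))
```

In a SPARQL setting each call to match would correspond to one triple pattern in a query; the sketch only shows that provenance statements and social annotations can sit in one graph and be traversed together.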

Accessibility Policy Assessment Tool
[Diagram: elements numbered 1 to 12, linking POLICY ANALYSIS (TOP DOWN) with COMMUNITY (BOTTOM UP) and grouped into Sections 1, 2 and 3.
 Step labels: Analyse policy (CM1); Investigate impacts and delivery; Evaluate policy fit with needs and behaviour; Evaluate policy effectiveness; Evaluate policy alternatives; Develop alternatives; Establish and analyse needs and behaviour; Investigate impacts; Cost benefit analysis; Redesign policy; CM2.
 Data sources: Initial CM analysis; Local Authority interviews; Policy maker interviews; Service provider focus groups; User focus groups; Questionnaires; Phone survey.]