Social Machines
Three Shifts in Digital Scholarship
David De Roure
6TH UCL BLOOMSBURY CONFERENCE

Provocations to Frame Discussion
Three shifts in scholarly practice
1. Hypothesis-led – data-centric
2. Scientist – public
3. Human – computer

The Problem
[Figure labels: INT., VERSE, VERSE, BRIDGE, VERSE, BRIDGE, VERSE, OUT.]

Datascopes
telescopes for the naked mind
[Image credit: NRAO/AUI/NSF]

From Signal to Understanding
BioEssays, 26(1):99–105, January 2004
http://research.microsoft.com/en-us/collaboration/fourthparadigm/

Harnessing advances in digital technology and practice to achieve world-class social research with maximum impact
[Figure labels: Current Nodes, DE Hubs, DE DTCs, Demonstrators & Sustainability, NCRM phase 2, NCRM phase 3, PolicyGrid, DAMES, highwire, NeISS, CQeSS, Genesis, MoSeS, Obesity e-Lab, HUB, DReSS, Horizon, MiMeG, OeSS, GeoVUE, eStat, ncrm, LifeGuide, Rural communities, Social Inclusion, e-Social Science, Creative Industries, Finance, Healthcare, Entertainment, Web Science, Media]
www.digitalsocialresearch.net

[Figure: network diagrams over nodes A to F with + and - ties, labelled Theories of Self interest, Theories of Exchange, Theories of Balance, Theories of Collective Action, Theories of Homophily, Theories of Cognition; further labels: Novice, Expert]

Web Observatory Community Group
• Will articulate the business and technical requirements for the Web Observatory
• Need all observatory stakeholders
• Starting by identifying existing observatories, current practice and use cases
• Preliminary discussions: need to describe observatories, datasets, data flows, policies, and to share results and methods
• Announced at WWW2012 and WEBIST 2012
• http://www.w3.org/community/webobservatory/

[Figure labels: method, data]
http://www.myexperiment.org/

Three Provocations for Framing
Boundaries changing between
1. Hypothesis-led and data-centric
2. Scientist and public
3. Human and computer

The Zooniverse principles
1. Telling people about the purpose of the research and about its context is a good thing
2. Treat participants as collaborators not as subjects
3. Do not waste people’s time
4. All volunteers, and their contributions, are of equal value to the project

Versus…
• The Deficit model – the layperson is irrational, ignorant, and even intellectually vacuous
• Human-based computation – a computer science technique in which a computational process performs its function by outsourcing certain steps to humans

http://www.bodleian.ox.ac.uk/bodley/library/special/projects/whats-the-score

[Figure labels: 23,000 hours of recorded music, Digital Music Collections, Student-sourced ground truth, Music Information Retrieval Community, Community Software, Supercomputer, Linked Data Repositories]

SOCIAM
The Theory and Practice of Social Machines

The order of social machines
Real life is and must be full of all kinds of social constraint – the very processes from which society arises. Computers can help if we use them to create abstract social machines on the Web: processes in which the people do the creative work and the machine does the administration… The stage is set for an evolutionary growth of new social engines.
Berners-Lee, Weaving the Web, 1999

Some Social Machines?

Social Machines in Context
[Figure: quadrant diagram with axes "More machines" and "More people"; regions labelled Big Data / Big Compute, Social Machines, Conventional Computation, Social Networking]
The users of a website, the website, and the interactions between them, together form our fundamental notion of a “machine”

Three Provocations for Framing
Boundaries changing between
1. Hypothesis-led and data-centric
2. Scientist and public
3. Human and computer

http://force11.org/

Paul’s Pack
[Figure: Paul’s Pack shown as a Research Object. Node labels: QTL, Workflow 16, Workflow 13, Results, Logs, Metadata, Slides, Paper, Common pathways. Edge labels: produces, Included in, Published in, Feeds into]

The R dimensions
• Reusable. The key tenet of Research Objects is to support the sharing and reuse of data, methods and processes.
• Replayable. Studies might involve single investigations that happen in milliseconds or protracted processes that take years.
• Referenceable. If research objects are to augment or replace traditional publication methods, then they must be referenceable or citeable.
• Repurposeable. Reuse may also involve the reuse of constituent parts of the Research Object.
• Repeatable. There should be sufficient information in a Research Object to be able to repeat the study, perhaps years later.
• Reproducible. A third party can start with the same inputs and methods and see if a prior result can be confirmed.
• Revealable. Third parties must be able to audit the steps performed in the research in order to be convinced of the validity of results.
• Respectful. Explicit representations of the provenance, lineage and flow of intellectual property.
“Replacing the Paper: The Twelve Rs of the e-Research Record” on http://blogs.nature.com/eresearch/

The Executable Thesis
[Figure labels: new data, PhD Student, executable thesis, new results]

Computational Research Objects
Research Objects that are
1. The research record for repeatable, reproducible, ... etc
2. Describe process (method) for enactment/execution
3. Usable by machines as well as humans
   – Social Objects
   – Semantically described
   – Programmatically accessible
   – Designed for assistance and automation
   – Designed for scale and heterogeneity
4. Composable with a distributed computational model?

• Notifications and automatic re-runs
• Autonomic Curation
• Self-repair
• New research?
Machines are users too

Knowledge Infrastructures
Knowledge infrastructures comprise robust networks of people, artifacts, and institutions that generate, share, and maintain specific knowledge about the human and natural worlds

So what about impact?
• Filter then publish -> publish then filter
• Scholarly publishing is a Social Machine*
   • Science is already crowd-sourced
   • Authors, Readers, Reviewers, …
• If outreach and impact are measured by scale of reuse, we have mechanisms for scaling
* merry-go-round?

david.deroure@oerc.ox.ac.uk
www.oerc.ox.ac.uk/people/dder
blogs.nature.com/eresearch
@dder
Slide credits: Christine Borgman, Ichiro Fujinaga, Malcolm Atkinson, Noshir Contractor, Nigel Shadbolt, Paul Fisher

www.myexperiment.org/packs/286
• D. De Roure, C. Goble and R. Stevens. The Design and Realisation of the myExperiment Virtual Research Environment for Social Sharing of Workflows. Future Generation Computer Systems 25, pp. 561-567. http://dx.doi.org/10.1016/j.future.2008.06.010
• S. Bechhofer, I. Buchan, D. De Roure et al. Why linked data is not enough for scientists. Future Generation Computer Systems. http://dx.doi.org/10.1016/j.future.2011.08.004
• D. De Roure and C. Goble. Anchors in Shifting Sand: the Primacy of Method in the Web of Data. WebSci10, April 26-27th, 2010, Raleigh, NC, US.
• D. De Roure, S. Bechhofer, C. Goble and D. Newman. Scientific Social Objects. 1st International Workshop on Social Object Networks (SocialObjects 2011).
• D. De Roure, K. Belhajjame, P. Missier et al. Towards the preservation of scientific workflows. 8th International Conference on Preservation of Digital Objects (iPRES 2011). Singapore.
• D. De Roure, K. R. Page, B. Fields, T. Crawford, J. S. Downie and I. Fujinaga. An e-Research Approach to Web-Scale Music Analysis. Phil. Trans. R. Soc. A, 28 August 2011, vol. 369, no. 1949, 3300-3317. doi: 10.1098/rsta.2011.0171
• C. A. Goble, D. De Roure and S. Bechhofer. Accelerating scientists’ knowledge turns. Will be available at www.springerlink.com http://ora.ox.ac.uk/objects/uuid:17de32c4-518f-4be6-bf78-1ecd6c761b81