Caught Between Worlds: My Life as a Digital (and Not So Digital) Curator

Presented by: Rob Guralnick
Who is:
- Associate Professor, Ecology and Evolutionary Biology
- Curator, CU Museum of Natural History
Funding support: Global Biodiversity Information Facility, Defense Advanced Research Projects Agency, National Science Foundation.

Briefest Outline
Part 1: Very brief history of curation
Part 2: Natural history curation from specimens to portals and other aggregators
Part 3: Broader perspectives on curation. How do we establish expert knowledge?
Part 4: Some models and closing thoughts

Curator
The Anglican or Catholic “curate” was the “keeper of souls”.
- With the rise of the museum came the rise of professional curation and curators.
- For hundreds of years, curators were “keepers of heritage”, whether cultural or natural.
- Typically a “content specialist”
- Almost always, curators deal with a collection of tangible objects
- “Professional curator” has layers of meaning (salary, authority, etc.)
- There have always been amateur curators

Curator
In the 21st century, the nature of Curator is changing rapidly as society itself is transformed
- There are still physical collections of immense worth (and thus still Curators)
- Professional Curators remain content specialists
- Their work often still mixes practical care of the collections with research based on those collections
- Curators have also always managed assets using technology. Ledgers have given way to databases

Curator
Physical Collections
Specimen ledgers

Curator
With the rise of digital objects come new kinds of professional curation:
- Digital curation: research on, and management of, digital objects.
- In the biological realm, a new type of curator called the Biocurator:
  - Genome data Curators: focus on storing, managing, sharing, annotating genomics resources
  - Biodiversity data Curators: focus on storing, managing, sharing, annotating biodiversity data

BIOCURATION EXAMPLE
Where do we get primary data about global biodiversity?
Biodiversity Data Index
Taxonomic Name Service (ECAT)
Biodiversity Databases

THE PRESENT - FROM VIRUSES TO WHALES
PORTALS LIKE GBIF HELP PROVIDERS DISTRIBUTE BIODIVERSITY OCCURRENCE DATA AT ALL SPATIAL, TEMPORAL, TAXONOMIC SCALES
Compiling all biodiversity data for all species

Curator
BUT AREN’T WE ALL NOW (AMATEUR) CURATORS?
- How many of us are collecting, sharing, tagging and creating new information and knowledge?
- How many are collating heritage from our lives, lives of families, neighborhoods, cities, countries?

Curator 2.0
“Meta-utopia or Meta-garbage”?
“Capturing and maintaining the correct metadata is increasingly being viewed as perhaps the key to the reuse and preservation of digital objects.” (DCC Digital Curation Manual Instalments, Michael Day)
BUT …
- People lie
- People are lazy
- People are stupid
- People don’t accurately describe their behaviors
- Schemas aren’t neutral
- Metrics influence results
- There is more than one way to describe things [1]

IT IS THIS FUNDAMENTAL ISSUE ABOUT INDIVIDUAL/COLLECTIVE OUTCOMES THAT LEADS TO A DISCUSSION ABOUT “CONTENT EXPERTISE” AND ITS RELATION TO “TRUTH”.

[1] Cory Doctorow, Putting the torch to seven straw men of the meta-utopia…

A Mixed Model for Curation
All communities should share knowledge, but there is still a role for experts
From: http://swiki.cs.colorado.edu/CreativeIT/uploads/286/gerhard-slides-panel.pdf

A Mixed Model for Curation
- So what exactly is an output filter?
- And how strong should such filters be to balance collaborative content against reliability?

Case study: How does it work in Encyclopedia of Life?
Curators must use their real names and offer credentials [1] publicly on a profile page. If these credentials cannot be verified, curatorial privileges will be rescinded.
[1] Credentials may include one or more of the following:
a) An affiliation with a relevant department at a university or college
b) Membership in a professional society
c) Published peer reviewed work
d) Reference from a credentialed individual.

How does it work in EOL?
• To oversee and manage multiple curators, master curators may be appointed by the Species Pages Group in consultation with professional societies.
• Curators will examine content available for a species, particularly unvetted content. Unvetted content will typically come from the public or from non-authenticated large resources such as Flickr, or be uploaded directly. This content needs to be “approved” to appear on an authenticated page.

How does it work in Wikipedia and Wikispecies?
Three core content policies:
1. "No original research"
2. Neutral point of view
3. Verifiability.
Movement towards stronger “editorial” control on Wikipedia/Wikispecies content?

Collective opinion versus absolute fact
“Everyone's entitled to their own opinions, but not to their own facts”

Putting the pieces together:
1. Data are raw collections from the natural or cultural world
2. Data quality is always an issue
3. Any data “fixes” done after the initial collection are “inferential”
4. Data accuracy is essential for ultimately establishing true understanding of the world
5. Bad data means incorrect information, and the error feeds up the DIKW (data, information, knowledge, wisdom) hierarchy

Collective opinion versus absolute fact
Putting the pieces together:
1. Information, knowledge and wisdom are generated from data
2. As much as possible, these “inferential steps” should be based on best practices.
3. Knowledge and wisdom have a factual (data) basis but can ultimately represent multiple different perspectives

CONCLUSIONS 1
INTERESTING SIMILARITIES/DIFFERENCES ACROSS THE SPECTRUM:
- Does Wikipedia ever deal with data?
- Curators often deal with raw data (sequences, observations of animals, specimens)
- Importance of reliability and accuracy. Most essential for data?

CONCLUSIONS 1
- So maybe the strength of input and output filters should vary depending on whether the source is primary or derived, and on intent?
- If collecting “stories” about a culture, how vital is historical accuracy?
- If predicting biodiversity for conservation of a region, how important is accuracy?

More Questions than Answers
Caveats:
- Idiosyncratic view as Curator and curator.
- Grasp of literature on this issue is tenuous
- Appreciate the chance to “think out loud”

Thanks to:
- Elisa, for being Elisa
- Gerhard, for the invite
Note: The digital nautilus somehow seems appropriate