Caught Between Worlds

My Life as a digital (and not so digital) Curator
Presented by: Rob Guralnick
Who is: Associate Professor, Ecology and Evolutionary Biology
Curator, CU Museum of Natural History
Funding support: Global Biodiversity Information Facility, Defense Advanced
Research Projects Agency, National Science Foundation.
Briefest Outline
Part 1 – Very brief history of curation
Part 2 – Natural history curation from
specimens to portals and other aggregators
Part 3 – Broader perspectives on curation.
How do we establish expert knowledge?
Part 4 – Some models and closing thoughts
Curator
The Anglican or Catholic “curate” was the “keeper of souls”.
With the rise of the museum came the rise of
professional curation and curators.
Curators for hundreds of years were
“keepers of heritage” – whether cultural
or natural.
Typically a “content specialist”
Almost always, curators deal with a
collection of tangible objects
“Professional curator” has layers of
meaning (salary, authority, etc)
There have always been amateur curators
Curator
In the 21st century, the nature of the Curator is changing rapidly as
society itself is transformed
There are still physical collections of
immense worth (and thus still Curators)
Professional Curators remain content
specialists
Their work often still mixes practical
care of the collections with research
based on those collections
Curators have also always managed
assets using technology.
Ledgers have given way to databases
Curator
Physical Collections
Specimen ledgers
Curator
With the rise of digital objects come
new kinds of professional curation:
Digital curation – research on, and
management of, digital objects.
In the biological realm, a new type of curator, the
Biocurator:
Genome data Curators – focus on
storing, managing, sharing, annotating
genomics resources
Biodiversity data Curators – focus on
storing, managing, sharing, annotating
biodiversity data
BIOCURATION EXAMPLE
Where do we get primary data about global biodiversity?
Biodiversity Data Index
Taxonomic Name Service (ECAT)
Biodiversity Databases
THE PRESENT - FROM VIRUSES TO WHALES
PORTALS LIKE GBIF HELP PROVIDERS
DISTRIBUTE BIODIVERSITY OCCURRENCE DATA
AT ALL SPATIAL, TEMPORAL, TAXONOMIC SCALES
Compiling all biodiversity data for all species
Curator
BUT AREN’T WE ALL NOW (AMATEUR) CURATORS?
How many of us are collecting, sharing, tagging
and creating new information and knowledge?
How many are collating heritage from our lives,
lives of families, neighborhoods, cities, countries?
Curator 2.0
“Meta-utopia or Meta-garbage”?
“Capturing and maintaining the correct metadata is increasingly being
viewed as perhaps the key to the reuse and preservation of digital
objects.” (DCC Digital Curation Manual Instalments, Michael Day)
BUT …
People lie
People are lazy
People are stupid
People don’t accurately describe their behaviors
Schemas aren’t neutral
Metrics influence results
There is more than one way to describe things¹
¹ Cory Doctorow, “Putting the torch to seven straw men of the meta-utopia…”
IT IS THIS FUNDAMENTAL ISSUE ABOUT INDIVIDUAL
/COLLECTIVE OUTCOMES THAT LEADS TO A DISCUSSION ABOUT
“CONTENT EXPERTISE” AND ITS RELATION TO “TRUTH”.
A Mixed Model for Curation
All communities should share knowledge, but still a role for experts
From: http://swiki.cs.colorado.edu/CreativeIT/uploads/286/gerhard-slides-panel.pdf
So what exactly is an output filter?
And how strong should such filters be to maximize
collaborative content versus reliability?
Case study: How does it work in
Encyclopedia of Life?
Curators must use their real names and offer
credentials¹ publicly on a profile page. If these
credentials cannot be verified, curatorial
privileges will be rescinded.
¹ Credentials may include one or more of the following:
a) An affiliation with a relevant department at a university or college
b) Membership in a professional society
c) Published peer-reviewed work
d) A reference from a credentialed individual.
How does it work in EOL?
• To oversee and manage multiple curators, master
curators may be appointed by the Species Pages Group
in consultation with professional societies.
How does it work in EOL?
• Curators will examine content available for a species,
particularly unvetted content. Unvetted content will
typically come from the public or nonauthenticated large
resources, such as from Flickr, or be uploaded directly.
This content needs to be “approved” to appear on an
authenticated page.
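One way to picture the EOL rules above is as a small output filter. This is only a sketch under stated assumptions: the class names, field names and credential labels below are hypothetical illustrations of the slide text, not EOL's actual data model or software.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the four credential types listed on the slide.
CREDENTIAL_TYPES = {
    "affiliation",          # relevant university or college department
    "society_membership",   # membership in a professional society
    "peer_reviewed_work",   # published peer-reviewed work
    "reference",            # reference from a credentialed individual
}


@dataclass
class Curator:
    real_name: str
    # Credential types that have been verified for this curator.
    credentials: set = field(default_factory=set)

    def has_privileges(self) -> bool:
        # Privileges require at least one verified credential of a listed type.
        return bool(self.credentials & CREDENTIAL_TYPES)


@dataclass
class ContentItem:
    species: str
    source: str           # e.g. "Flickr" or "direct upload"
    vetted: bool = False  # public content starts out unvetted


def approve(item: ContentItem, curator: Curator) -> bool:
    """Mark the item as vetted only if the curator holds privileges."""
    if not curator.has_privileges():
        return False
    item.vetted = True
    return True


def authenticated_page(items, species):
    """Output filter: only vetted content for the species is shown."""
    return [i for i in items if i.species == species and i.vetted]
```

In this sketch the filter is binary (vetted or not). A weaker filter could show unvetted content with a warning label instead of hiding it, which is the collaboration versus reliability trade-off in miniature.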
How does it work in Wikipedia
and Wikispecies?
Three core content policies:
1. No original research
2. Neutral point of view
3. Verifiability
Movement towards stronger “editorial”
control on Wikipedia/Wikispecies
content?
Collective opinion versus absolute fact
“Everyone's entitled to their own opinions, but not to their own facts”
Putting the pieces together:
1. Data are raw collections from
the natural or cultural world
2. Data quality is always an issue
3. Any data “fixes” done after the
initial collection are “inferential”
4. Data accuracy is essential for
ultimately establishing a
true understanding of the world
5. Bad data means incorrect
information – feeds up the DIKW
(data, information, knowledge, wisdom)
hierarchy
Putting the pieces together:
1. Information, knowledge and
wisdom are generated from data
2. As much as possible, these
“inferential steps” should be based
on best practices.
3. Knowledge and wisdom have a
factual (data) basis but can
ultimately represent multiple
different perspectives
CONCLUSIONS 1
INTERESTING SIMILARITIES/
DIFFERENCES ACROSS THE
SPECTRUM:
Does Wikipedia ever deal with
data?
Curators often deal with
raw data (sequences,
observations of animals,
specimens)
Importance of reliability and
accuracy. Most essential
for data?
So maybe the strength of input
and output filters should vary
depending on whether primary
or derived source and on
intent?
If collecting “stories” about a
culture, how vital is historical
accuracy?
If predicting biodiversity for
conservation of a region, how
important is accuracy?
More Questions than Answers
Caveats:
- Idiosyncratic view as Curator and
curator.
- Grasp of literature on this issue is
tenuous
- Appreciate the chance to “think
out loud”
Thanks to:
- Elisa, for being Elisa
- Gerhard, for the invite
Note: The digital nautilus somehow
seems appropriate