Technology, workflow, and protocols in collaboratively edited digital editions

Juan Garcés
British Library
eIS
20 June 2007
Overview
• Technology
– XML
• Workflow
– quality control
– quality improvement
• Protocols
– author attribution
– identification and retrieval
• What is ‘text’?
Technology
XML
• Text Encoding Initiative
– open standard and guidelines
– de facto standard for Humanities texts
– crucial: consistency (ODD), separation of critical perspective (?)
• challenge: the OHCO (ordered hierarchy of content objects) data model only allows one hierarchy
• encoding disagreement
• texts are more complex
Desideratum
• simple editing environment that allows:
– encoding of heterogeneous aspects of the text
– multiple instances of the same ‘layer’ (disagreement)
– analysis of interrelation between instances and layers
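One way to picture this desideratum is stand-off annotation, where each layer is a set of character ranges over the same base text rather than a single XML tree. The sketch below is only an illustration under that assumption; the names `Span`, `overlaps`, and `disagreements`, the layer names, and the sample ranges are hypothetical and are not part of TEI or of any tool named in these slides.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Span:
    layer: str    # hypothetical layer name, e.g. "verse" or "page"
    editor: str   # who asserted this span
    start: int    # inclusive character offset into the base text
    end: int      # exclusive character offset
    label: str


def overlaps(a: Span, b: Span) -> bool:
    """True when two spans cross without one containing the other."""
    intersect = a.start < b.end and b.start < a.end
    nested = (a.start <= b.start and b.end <= a.end) or (
        b.start <= a.start and a.end <= b.end
    )
    return intersect and not nested


def disagreements(spans):
    """Intersecting spans in the same layer, by different editors, with different ranges."""
    return [
        (a, b)
        for a, b in combinations(spans, 2)
        if a.layer == b.layer
        and a.editor != b.editor
        and (a.start, a.end) != (b.start, b.end)
        and a.start < b.end
        and b.start < a.end
    ]


base_text = "In the beginning was the word"
spans = [
    Span("verse", "editor1", 0, 16, "v1"),
    Span("page", "editor1", 7, 29, "p2"),
    Span("verse", "editor2", 0, 20, "v1"),
]

# Pairs that cross each other and so could not share one XML hierarchy.
crossing = [(a.label, b.label) for a, b in combinations(spans, 2) if overlaps(a, b)]
```

Here the page span crosses both verse spans, which a single hierarchy cannot express, and the two editors' verse spans count as one disagreement within the same layer.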
Workflow
Quality control: peer review/refereeing
• uphold standards of academic disciplines
• stricter application since the middle of the twentieth century
• anonymity (seldom ‘double-masked’) and independence
• criticisms:
– slow process (sometimes iterative process)
– susceptible to control by elites and to personal jealousy
– lacks accountability
– may be biased and inconsistent
– failure to catch all fundamental errors
– fraud
Quality control: Wikipedia model
• mass-publication tool converted into mass-authoring tool
• everyone can edit contents
• mistakes are eradicated by community
• advantages:
– timeliness
– impressive workforce
– democracy
• problems:
– susceptible to spam and vandalism
– always a work in progress
– downplays individual contribution
– deters participation by scholars
Quality control: hybrids
• alternatives to traditional peer review:
– open peer review (reviewers’ names made known)
– parallel open peer review
– voluntary peer review (publication first)
– extended peer review (beyond publication date)
• true hybrids:
– content-appropriate marriage of community-oriented, collaborative editing and scholarly editorial process
Quality improvement: sequential print publication
[Diagram: working from the manuscript/surrogate, Editor 1 produces Edition 1, Editor 2 produces Edition 2, and Editor 3 produces Edition 3, one after another]
Quality improvement: simultaneous digital publication
[Diagram: Editor 1, Editor 2, and Editor 3 work on the manuscript/surrogate at the same time, each producing an improved edition]
Protocols
Author attribution
• social, legal, and technological genealogy
– social: the 18th century introduced a new concept of individualised authorship based on the idea of a creative genius working alone, the “privileged moment of individualization in the history of ideas, knowledge, literature, philosophy, and the sciences” (Foucault)
– legal: “1710 Copyright Act”, or “Act for the Encouragement of Learning and the Securing the Property of Copies of Books to the Rightful Owners Thereof”
– technological: coincides with the perfection of the movable-type printing press
• essential for evaluating professional output of Humanists (grant application, tenure, etc.)
• solutions for collaborative ‘authoring’:
– hierarchy of authors (lead, assistant, etc.; pre-assigned?)
– editing profile (contribution broken down into modular or granular input; how to quantify quality?)
– peer assessment
• any solution needs to be accepted in professional evaluation scenarios!
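An ‘editing profile’ could, for instance, be derived from a log of who changed what. The sketch below is a hypothetical illustration only: the log format, the editor and unit names, and the choice to measure contribution in changed characters are all assumptions made for the example, and it measures quantity, not quality, so the question raised above stays open.

```python
from collections import Counter

# Hypothetical edit log: (editor, unit edited, characters changed).
edit_log = [
    ("editor1", "line-1", 40),
    ("editor2", "line-1", 10),
    ("editor1", "line-2", 25),
    ("editor3", "line-2", 25),
]


def editing_profile(log):
    """Return each editor's share of all changed characters (fractions summing to 1)."""
    totals = Counter()
    for editor, _unit, chars in log:
        totals[editor] += chars
    grand_total = sum(totals.values())
    return {editor: chars / grand_total for editor, chars in totals.items()}


profile = editing_profile(edit_log)
```

A granular profile of this kind could sit alongside a pre-assigned hierarchy of authors or peer assessment rather than replace them.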
The Canonical Text Services (CTS) Protocol
• developed by Neel Smith in conjunction with the Center for Hellenic Studies (Washington, DC)
• defines a network service for identifying and working with texts
• permanence and citability of scholarly published works: they are “works possessing an explicitly identified edition and explicitly identified citation scheme, that can be irrevocably and identically replicated”
• digital library: distributed objects accessible via a suite of network services (simple identification and retrieval)
The Canonical Text Services (CTS) Protocol
• hierarchical TextInventory (following FRBR, includes identification of how to validate a document):
– TextGroup+ (author, collection)
– Work+ (notional)
– Edition/Translation* (specific versions)
– Exemplar* (specific physical copies)
• hierarchical model for citation of sections of a work (recursively nesting <citation>, mapping XPath expression)
• requests
– requests expressed as URL parameters
– replies formatted as well-formed XML
– requests: GetCapabilities, GetWorks, GetValidReff, GetDocumentMetadata, GetPassage, DownloadText
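The inventory hierarchy and the URL-parameter request style can be sketched roughly as follows. This is an illustration, not the protocol's reference implementation: the base URL, the parameter names `request` and `urn`, and the sample identifier are assumptions made for the example, and the class names simply mirror the terms on the slide, with `+` read as "one or more" and `*` as "zero or more".

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode

# Request names as listed on the slide.
REQUESTS = {
    "GetCapabilities", "GetWorks", "GetValidReff",
    "GetDocumentMetadata", "GetPassage", "DownloadText",
}


@dataclass
class Version:
    """An Edition or Translation (a specific version of a work)."""
    label: str
    exemplars: list = field(default_factory=list)  # Exemplar*: zero or more


@dataclass
class Work:
    title: str
    versions: list = field(default_factory=list)   # Edition/Translation*: zero or more


@dataclass
class TextGroup:
    name: str
    works: list                                    # Work+: at least one

    def __post_init__(self):
        if not self.works:
            raise ValueError("a TextGroup needs at least one Work")


def build_request(base_url, request, **params):
    """Express a request as URL parameters (parameter names are assumed)."""
    if request not in REQUESTS:
        raise ValueError(f"unknown request: {request}")
    query = urlencode({"request": request, **params})
    return f"{base_url}?{query}"


group = TextGroup("Example author", [Work("Example work", [Version("Edition A")])])
url = build_request("http://example.org/cts", "GetPassage", urn="hypothetical-id:1.1")
```

The reply to such a request would be well-formed XML, which a client could then parse with any standard XML library.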
Desiderata
• impermanence (time stamps, editions)
• new entities (data repository vs. VRE scenario)