PIDs – a service for users - DARIAH-DE

PIDs – a service for users
Natasa Bulatovic
Research and Development
Max Planck Digital Library
April 2012
This work is licensed under a Creative Commons Attribution 2.0 Germany License
http://creativecommons.org/licenses/by/2.0/de/
PIDs at the Max Planck Digital Library
MPG Landscape:
Several million objects to identify (maintained
centrally by specialized repositories)
Many more to identify (maintained elsewhere)
Name
19.07.2012
A PID or a Persistently Identified Dilemma?
We mostly agree about
What to identify ?
When to identify?
But we discuss a lot about
Why to identify (why not use just Cool URIs) ?
Where to identify?
To merge in my data? How? (a lot of work)
What to do if it is already identified?
Does a specialized PID system offer us benefits
or just further headaches?
Which system is better?
Does the PID syntax allow me to ..?
So what do many of us users do?
Leave the status quo and hope to come back to it at a later
stage
Just assign URLs and service them from our domains
Sometimes we even use PURLs (e.g. for metadata schema
profiles)
Publishers use mostly DOIs, we keep these in our metadata
We start building a new system: make thorough analysis of
current PID systems in place
We agree on Handle and use it, but only for some data
We talk to researchers, Go to step 1 again
What is our problem?
Do we have a PID API for users?
Simple
Well documented and understandable
Leverage the power of HTTP (REST style)
Provide additional value: client libraries and other services
Note: not a global resolver for various PID systems
A practical example
APSR SWORD 1.2-compatible OJS Plugin
http://pkp.sfu.ca/support/forum/viewtopic.php?f=28&t=3877
screencast
arXiv 1.3-compliant endpoint:
http://arxiv.org/help/submit_sword
Feedforward – personal information environment with a SWORD interface, among other
features.
SWORD Widget – For Netvibes, IGoogle and embedding in web pages
The Depot – SWORD-compliant.
Foresite – using SWORD to deposit ORE resource maps describing journals within JSTOR
into a DSpace repository.
Biomedcentral’s Open Repository – implementing a SWORD interface
Intrallect – desktop drag and drop tool based on SWORD
Microsoft Article Authoring Add-in for Word 2007 – allows repository deposit direct from
Word.
Microsoft Zentity Research Output Repository platform supports SWORD deposit
Microsoft Client code – Microsoft Office SWORD deposit plugin http://www.codeplex.com/OfficeSWORD
Microsoft eJournal Service (Alpha) and Research Output Repository Platform (Beta):
http://www.microsoft.com/mscorp/tc/scholarly_communication.mspx
SOURCE project
BibApp – SWORD Ruby Client http://code.google.com/p/bibapp/
Facebook client http://fb.swordapp.org/
ICE-TheOREM – has demonstrated ‘ORE-over-SWORD’
TARDIS, the Australian Repository for Diffraction Images is implementing a SWORD
interface: http://tardis.edu.au/wiki/index.php/TARDIS2
PublicationsList.org is using SWORD for deposit into EPrints http://publicationslist.org/
EM-Loader project has used SWORD for batch deposit http://publicationslist.org/emloader/emloader-report-sword-experiences.html
Max Planck Digital Library’s eSciDoc solution, ‘PubMan’ has implemented SWORD
http://colab.mpdl.mpg.de/mediawiki/PubMan_Sword
Windows SWORD desktop client created by Hrvoje Jerković http://dspace-depositapp.blogspot.com/
CUNY, The City University of New York Libraries are using SWORD for deposit into DSpace
The National Strategies are using SWORD to deposit from their K-Int tagging tool into Drupal
The Collections Trust‘s Culture Grid is using SWORD for ‘RESTful deposit’. Technical
specification (PDF)
SWORD interfaces in various installations of DSpace, EPrints, Fedora and IntraLibrary
The YODL-ING project at York is developing a SWORD-based ‘one-stop’ deposit client
EU PEER Project will be implementing SWORD for deposit http://www.peerproject.eu/
The CLASM project will be developing a SWORD plugin for Moodle
http://dablog.ulcc.ac.uk/category/projects/clasm/
The National STEM Centre is implementing SWORD for client and bulk deposit
…
• Different technologies
• Different repositories
• Code Libraries for easy adoption
by repositories
• Easy to use
• Easy to understand
An Idea
Agree on common API
Plenty of experience already exists among PID providers
Implement additional service interface (no need to modify
already established ones)
the interface must cover basic PID system functionality
the interface may cover particular PID system (or community)
specifics
RESTful - a PID is a Resource as well – that makes it different
from a URL or just any string used to identify something
Clear definition e.g. PID services ontology
Think about value-added services
Actors
Service provider
The system manages the PID
Resource provider
The user who acquires PID for own resources
HTTP Requests
GET retrieve information
POST create new resource
PUT update existing resource
DELETE – not used for now
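The verb-to-operation mapping above can be summarised as a small lookup table. This is only a minimal sketch: the names PID_API_METHODS and describe_method are illustrative assumptions, not part of any existing PID service.

```python
# Hypothetical lookup table mirroring the slide; not an existing API.
PID_API_METHODS = {
    "GET": "retrieve information",
    "POST": "create new resource",
    "PUT": "update existing resource",
}


def describe_method(method: str) -> str:
    """Return the intended semantics of an HTTP method in the proposed API."""
    key = method.upper()
    if key not in PID_API_METHODS:
        # DELETE is deliberately absent: the proposal does not use it for now.
        raise ValueError(f"{key} is not used by the proposed API")
    return PID_API_METHODS[key]
```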
Ontology
•Service Provider
A user perspective
•Context – specific dimension offered by the
Service Provider (e.g. naming authority, DOI profile etc.). At least 1 must exist
•PID – the persistent identifier
•Type – the type of the persistent identifier (e.g.
Handle, URN, URL…)
•Resolution – a resolution to a particular
representation of a resource.
•Metadata profiles- metadata profiles
maintained within the context. Different contexts
may have different metadata profiles. Metadata
may be associated with PIDs depending on
context.
•Extensions (optional) - specific extensions
offered within a context. Within a service,
different contexts may have different extensions
On Resolution
A Resolution could be associated with the following attributes:
The URL to resolve to
Default (true, false) – if a particular resolution is default or not
Category (e.g. metadata, content, fragment, …)
Format (user defined URL to link to the representation format
related to the resolution)
And perhaps some Linked Data support in addition?
Outgoing Accept header definition (to facilitate content
negotiation at the Resource provider side)
Incoming Accept header definition (to facilitate content
negotiation at the Service provider side)
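The attributes listed above could be modelled as a simple record. The following is a sketch under assumptions: the class name, the field names, and the rule that exactly one resolution is marked as default are illustrative choices, not something the proposal fixes.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Resolution:
    """One resolution of a PID (field names are hypothetical)."""
    url: str                               # the URL to resolve to
    default: bool = False                  # is this the default resolution?
    category: Optional[str] = None         # e.g. "metadata", "content", "fragment"
    format_url: Optional[str] = None       # link describing the representation format
    outgoing_accept: Optional[str] = None  # Accept header sent to the resource provider
    incoming_accept: Optional[str] = None  # Accept header matched at the service provider


def default_resolution(resolutions: Sequence[Resolution]) -> Resolution:
    """Pick the default resolution; assumes exactly one is flagged as default."""
    defaults = [r for r in resolutions if r.default]
    if len(defaults) != 1:
        raise ValueError("expected exactly one default resolution")
    return defaults[0]
```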
Service level operations
<text> - placeholder for a concrete value
italic text - example name of the operation
Usually implemented or not
GET <service>
Resolves to the home page of the service presentation
Example: GET http://handle.gwdg.de
GET <service>/explain
Resolves to a service description i.e. “service document”
The service document is a formatted document (e.g. RDF/XML) which informs the
user agent about the type of the PID system, supported formats, system operations,
available contexts, metadata profiles, accept headers, and content negotiation
support
Note
•
The second form of the interface could optionally be avoided by content
negotiation or simply by using LD principle such as: GET <service>/data.
Same for all further usage of “explain”
PID retrieval/resolution operations
GET <service>/<PID>
Resolves to the default resolution and default representation format of a resource
associated with this <PID>
Example: GET http://pid.gwdg.de/11858%2F00-001Z-0000-0001-41F3-C
Note: the basic PID retrieval does not have to be structured by the context (unless the
context is mandatory part of the PID value itself), as this has to be inherently handled by
the PID service provider
GET <service>/<PID>/explain
Returns an XML/RDF document containing all-in-one service provider
information about this particular <PID>
PID properties e.g. type, context, links to extensions, links to the metadata
profiles , links to the resolutions (and information about delivered format,
potentially fragment pattern, description by the resource provider, accept
headers)
GET <service>/<PID>/policy
See also http://tools.ietf.org/html/draft-kunze-ark-15#section-5.1.1
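A client would need to percent-encode the PID before placing it in the request path, as the example URL above does for the slash in the Handle. The helper below is a minimal sketch; the function name pid_url is an assumption, and only the URL patterns come from the slides.

```python
from urllib.parse import quote


def pid_url(service: str, pid: str, operation: str = "") -> str:
    """Build <service>/<PID>[/<operation>] with the PID percent-encoded."""
    # safe="" forces "/" inside the PID to be encoded as %2F.
    encoded = quote(pid, safe="")
    url = f"{service.rstrip('/')}/{encoded}"
    return f"{url}/{operation}" if operation else url


if __name__ == "__main__":
    pid = "11858/00-001Z-0000-0001-41F3-C"
    print(pid_url("http://pid.gwdg.de", pid))
    print(pid_url("http://pid.gwdg.de", pid, "explain"))
    print(pid_url("http://pid.gwdg.de", pid, "policy"))
```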
PID retrieval/resolution operations
GET <service>/<PID>/<formatId>
Resolves to the representation of a resource related to a particular format ID (e.g. “xml”,
“rdf”, “html” …)
<resolutions>
<format>
<id>xml</id>
<operation>
http://pid.gwdg.de/11858%2F00-001Z-0000-0001-41F3-C/xml
</operation>
<format-description>
The XML representation of the identified resource
</format-description>
</format>
<url>http://pubman.mpdl.mpg.de/item/escidoc:12345</url>
<accept-header>text/xml</accept-header>
</resolutions>
Useful to facilitate content negotiation
Useful to resolve to particular format of resource by resource provider
Perhaps fragment patterns can be treated as a specific format
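A client could read a resolutions document of this shape with a standard XML parser. The sketch below assumes the element names shown in the example; the function name and the keys of the returned dictionary are illustrative only.

```python
import xml.etree.ElementTree as ET

# Sample document shaped like the slide's example (element names assumed from it).
SAMPLE = """<resolutions>
  <format>
    <id>xml</id>
    <operation>
      http://pid.gwdg.de/11858%2F00-001Z-0000-0001-41F3-C/xml
    </operation>
    <format-description>
      The XML representation of the identified resource
    </format-description>
  </format>
  <url>http://pubman.mpdl.mpg.de/item/escidoc:12345</url>
  <accept-header>text/xml</accept-header>
</resolutions>"""


def parse_resolutions(xml_text: str) -> dict:
    """Extract the format ID, operation URL, target URL and accept header."""
    root = ET.fromstring(xml_text)

    def text(path: str):
        node = root.find(path)
        # Strip the whitespace that surrounds values spread over several lines.
        return node.text.strip() if node is not None and node.text else None

    return {
        "format_id": text("format/id"),
        "operation": text("format/operation"),
        "description": text("format/format-description"),
        "url": text("url"),
        "accept_header": text("accept-header"),
    }
```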
Create/update a PID
POST <service>/pid
Input: takes input in defined format (e.g. XML, RDF/XML or JSON). The input
contains necessary properties and metadata that are to be associated with the PID.
Creates a new PID for the provided input and returns the same result the user
would get when invoking the GET <service>/<PID>/explain operation for the newly
created PID
PUT <service>/<PID>
Input: takes input in defined format (e.g. XML, RDF/XML or JSON). The input
contains the complete properties and metadata, with modified values (or old
values) that are to be associated with the PID.
Updates the existing PID with the provided input and returns the same result
the user would get when invoking the GET <service>/<PID>/explain operation
for the updated PID
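The two operations above could be prepared on the client side as follows. This is a sketch under assumptions: JSON is picked from the formats the slide lists, the property names in the payload are invented for illustration, and the requests are only constructed here, not sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def _json_request(url: str, properties: dict, method: str) -> Request:
    """Build (but do not send) a JSON request for the PID service."""
    body = json.dumps(properties).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(url, data=body, headers=headers, method=method)


def create_pid_request(service: str, properties: dict) -> Request:
    """POST <service>/pid: ask the service to mint a new PID."""
    return _json_request(f"{service.rstrip('/')}/pid", properties, "POST")


def update_pid_request(service: str, pid: str, properties: dict) -> Request:
    """PUT <service>/<PID>: send the complete, updated properties."""
    url = f"{service.rstrip('/')}/{quote(pid, safe='')}"
    return _json_request(url, properties, "PUT")


if __name__ == "__main__":
    # "url" and "default" are hypothetical property names.
    props = {"url": "http://pubman.mpdl.mpg.de/item/escidoc:12345", "default": True}
    req = create_pid_request("http://pid.gwdg.de", props)
    print(req.get_method(), req.full_url)
```

Sending a complete property set on PUT, rather than a partial patch, follows the slide's wording that the input contains the complete properties with modified or old values.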
User retrieval/maintenance operations
GET <service>/me
GET <service>/user/<id>
Returns information about the user agent
E.g. id, name, basic information, contexts allowed, could also be some
derived metrics such as: no of PIDs created etc.
PUT <service>/me
PUT <service>/user/<id>
Input: agreed input format containing data user can modify (e.g. RDF/XML,
XML or JSON)
Metadata information retrieval operations
GET <service>/<PID>/metadata
Returns links to the available metadata records the service
provider system maintains locally for the provided <PID>
<metadata>
<profile>
http://pid.gwdg.de/11858%2F00-001Z-0000-0001-41F3-C/metadata/profile-1
</profile>
</metadata>
GET <service>/<PID>/metadata/<profileId>
Returns the available metadata record with <profileId> for the
provided <PID>
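A client might turn the list of profile links into a map from profile ID to link. The sketch below assumes the last path segment of each link is the <profileId>, which matches the URL pattern above but is not stated explicitly; the function name is illustrative.

```python
import xml.etree.ElementTree as ET

# Sample response shaped like the slide's example.
SAMPLE_METADATA = """<metadata>
  <profile>
    http://pid.gwdg.de/11858%2F00-001Z-0000-0001-41F3-C/metadata/profile-1
  </profile>
</metadata>"""


def profile_links(xml_text: str) -> dict:
    """Map each profile ID (assumed: last path segment) to its metadata link."""
    root = ET.fromstring(xml_text)
    links = {}
    for node in root.findall("profile"):
        link = (node.text or "").strip()
        if link:
            links[link.rsplit("/", 1)[-1]] = link
    return links
```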
Searching
GET <service>/search?query=<query>
Allows for searching for PIDs according to the search criteria
supported by querying standards e.g. SRW/CQL
GET <service>/search/explain
Provides a list of search fields that can be used as criteria in
the query
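Because a CQL-style query contains spaces, quotes and equals signs, it has to be URL-encoded before being passed as the query parameter. A minimal sketch follows; the function name and the search field used in the usage example are hypothetical, since the real fields would come from GET <service>/search/explain.

```python
from urllib.parse import urlencode


def search_url(service: str, query: str) -> str:
    """Build GET <service>/search?query=<query> with the query URL-encoded."""
    return f"{service.rstrip('/')}/search?{urlencode({'query': query})}"


if __name__ == "__main__":
    # 'type' is a hypothetical search field used only for illustration.
    print(search_url("http://pid.gwdg.de", 'type = "Handle"'))
```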
Extensions (summary)
Extensions could include all PID-system-specific
operations (i.e. available on context or service level) and user community
extensions (i.e. available on context or PID level)
Depending on the purpose and aim of the extension, operations may
differ; however, operations could be
GET/PUT/POST <service>/extensions/<extension-id>
For PID service-level specific extension operations
GET/PUT/POST <service>/<pid>/extensions/<extension-id>
For PID-level specific extension operations
GET/PUT/POST <service>/<context-id>/extensions/<extension-id>
For context-level specific extension operations
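The three extension levels differ only in the path segment that precedes /extensions. A sketch of a single URL builder covering all three follows; the function and parameter names are assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import quote


def extension_url(service: str, extension_id: str,
                  pid: Optional[str] = None,
                  context_id: Optional[str] = None) -> str:
    """Build a service-, PID- or context-level extension URL."""
    if pid is not None and context_id is not None:
        raise ValueError("give either pid or context_id, not both")
    base = service.rstrip("/")
    # At most one scope segment is inserted before /extensions.
    scope = pid if pid is not None else context_id
    if scope is not None:
        base = f"{base}/{quote(scope, safe='')}"
    return f"{base}/extensions/{quote(extension_id, safe='')}"
```

Note that PID-level and context-level URLs share the same shape, so a service would have to distinguish a <pid> from a <context-id> by value; the slides leave that question open.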
Value added service for users
Statistics and metrics
http://de.wikipedia.org/wiki/QR-Code
Where is my resource identified, linked, cited, mentioned?
Communication with other PID systems
Google search
Which versions are out there
My PID is also a resource
User friendly interface for researchers
Google search
Client libraries to use the API
Sustainability
Focus on service and support i.e. best practices, how-tos
Developing long-term relationships with user communities
A user view only, however
We need an API and an easy-to-use system
An API shall be
Feasible for adoption
Complete for basic operations (from end-user aspect)
Clearly defined messages (XML, RDF or JSON)
Used
Thank you for your attention!
Natasa Bulatovic, Max Planck Digital Library
bulatovic@mpdl.mpg.de