metadata harvesting on ayurveda and the open archives initiative in

METADATA HARVESTING ON AYURVEDA AND THE OPEN ARCHIVES INITIATIVE IN PRESENT ELECTRONIC ENVIRONMENT AT CCRAS LIBRARY

Dr. G. Gnana Sekari
Library and Information Officer, Central Council for Research in Ayurvedic Sciences, 61-65 Institutional Area, Janakpuri, New Delhi-58. Ph: 011-28524906; Mob: 7838146013
ggsek@yahoo.com

Miss Shweta Dhingra
Library Consultant, Central Council for Research in Ayurvedic Sciences, 61-65 Institutional Area, Janakpuri, New Delhi-58. Ph: 011-28524906; Mob: 9891426244
Shweta1610@gmail.com

Abstract: This paper gives a brief history of the OAI, examines the protocol itself, and lists some current projects, including BioMed Central and the Open Archives Initiative, and future directions. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a collaborative effort that provides an application-independent interoperability framework based on metadata harvesting. Though the OAI-PMH is a very recent development, it is regarded as an important step towards information discovery in the digital library arena. The Open Archives Initiative (OAI) is an evolving protocol and philosophy regarding interoperability for digital libraries (DLs). The OAI is a move away from distributed searching, focusing on the arguably simpler model of ‘metadata harvesting’. Perhaps the strongest and most distinguishing feature of OAI is its simplicity: by being ‘smaller’ than previous interoperability projects, it actually allows for more powerful and adaptable configurations and deployments.

Keywords: OAI-PMH, Institutional Repository, Metadata Harvesting, Open Archives Initiatives, Protocol for Metadata Harvesting, Metadata Providers, Metadata Service Providers

1.0 Introduction
In the digital environment, new methodologies of information management and access, coupled with advancements in digital information systems, have to a great extent transformed the ways and means of information management.
Metadata, the systematic arrangement of data elements, aids the identification and location of information resources, thereby facilitating improved access to them. However, the availability, accessibility and authenticity of digital objects remain unpredictable. Many search mechanisms retrieve a plethora of information resources, but the majority lack effectiveness and comprehensiveness. [1] The solution for this, named the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), has rapidly become known worldwide. [2] At the same time, institutional repositories and digital libraries are adopting OAI-PMH to expose their holdings of white papers, some of which are indexed by search engines. Academic and research institutions are expending enormous efforts to digitize their collections of theses, white papers, technical reports, maps, images, and historical documents to make them available in institutional repositories or digital libraries. [3] Today many people use the expression open archives to mean repositories of digital information that provide a machine interface for making their content available to external services, and view the OAI-PMH as a mechanism to achieve interoperability among different types of repositories. This paper briefly introduces the Open Archives Initiative (OAI) approach and surveys its application within the CCRAS Library and the Ayurvedic archival community. The paper concludes by presenting our view of the future role that OAI-PMH can play at the CCRAS Library in supporting collaboration between Ayurvedic organisations and archival institutions globally [4]. It must be emphasised that OAI-PMH is not a search engine, a search tool or a database. It only provides a set of rules for moving the metadata (not the content) of a digital resource from one repository to another. The content remains in the source repository.
A repository can act both as a service provider (harvester) and as a data provider, or in only one of these roles. The protocol is not restricted to supporting simple metadata (unqualified Dublin Core), but can support any metadata schema that can be provided in an XML format. [5]

1.1 Genesis of OAI-PMH
The roots of OAI lie in the development of e-print repositories (so-called archives). E-print repositories were established in order to communicate the results of ongoing scholarly research prior to peer review and journal publication, a practice that began in high energy physics, mathematics, nonlinear sciences and computer science. [6] There are, however, a number of other established efforts (CogPrints, NCSTRL, RePEc), which collectively demonstrate the growing interest of scholars in using the Internet and the Web as vehicles for immediate dissemination of research findings. Different interfaces were designed for different repositories, so end users were forced to learn diverse interfaces in order to access the various repositories and finding aids. Finally, the economic model of scholarly publishing has been severely strained by rapidly rising subscription prices and relatively stagnant research library budgets. At the October 1999 meeting in Santa Fe of what was then called the UPS (Universal Preprint Service), two key interoperability problems were identified: end users were faced with multiple search interfaces, making resource discovery harder, and there was no machine-based way of sharing the metadata. It was suggested that a solution would be to get all the metadata records together in one place. The UPS prototype brought to the Santa Fe meeting demonstrated a cross-archive digital library providing services based on a collection of metadata harvested from multiple archives. The participants at the Santa Fe meeting decided that a low-barrier solution was critical to widespread adoption among e-print providers.
The meeting therefore adopted an interoperability solution known as metadata harvesting. This solution allows e-print (content) providers to expose their metadata via an open interface, with the intent that this metadata be used as the basis for value-added service development. The result of the meeting was a set of technical and organisational agreements known as the Santa Fe Convention. The technical aspects included the agreement on a protocol for metadata harvesting based on the broader Dienst protocol, a common metadata standard for e-prints (the Open Archives Metadata Set), and a uniform identifier scheme. [7]

1.2 Basic OAI concept
The essence of the open archives approach is to enable access to Web-accessible material through interoperable repositories for metadata sharing, publishing and archiving. It arose out of the e-print community, where a growing need for a low-barrier interoperability solution for access across fairly heterogeneous repositories led to the establishment of the Open Archives Initiative (OAI). As the OAI mission statement says, ‘The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content.’

1.3 Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
The OAI Protocol for Metadata Harvesting (OAI-PMH) defines a mechanism for harvesting records containing metadata from repositories. The OAI-PMH gives data providers a simple technical option for making their metadata available to services, based on the open standards HTTP (Hypertext Transfer Protocol) and XML (Extensible Markup Language). The metadata that is harvested may be in any format that is agreed by a community (or by any discrete set of data and service providers). Thus, metadata from many sources can be gathered together in one database, and services can be provided based on this centrally harvested, or ‘aggregated’, data.
It simply makes it possible to bring the data together in one place. In order to provide services, the harvesting approach must be combined with other mechanisms. Perhaps most readily achievable are the goals of surfacing 'hidden resources' and low-cost interoperability. [8]

1.4 Examples of Open Archive Software Tools
Arc Source, CDSware, DSpace, EPrints, Greenstone, i-Tor, MyCoRe, etc. OAI Data Provider Script HUBerlin, OAIHarvester, OAI Implementation for Windows, OAI Java Implementation for Linux/Windows, oai-perl library, OAIster, OMNIS2, Open Video, OPUS, VTOAI-PMH Perl Implementation, XMLFile, ZMARCO, etc.

1.5 Structure: Data and Service Providers
The UPS architecture identified two logical roles: ‘Data Providers’ and ‘Service Providers’. In simple terms, Data Providers handle the deposit and publishing of resources in a repository and ‘expose’ for harvesting the metadata about resources in the repository. They are the creators and keepers of the metadata and repositories of resources. Service Providers harvest metadata from Data Providers. They use the harvested metadata for the purpose of providing one or more services across all the data. The types of services that may be offered include a search interface, a peer-review system, etc. Note that one 'provider' organisation can play both roles, offering both data for harvesting and end-user services. The key architectural shift was the move away from supporting only human end-user interfaces for each repository, to supporting both human end-user interfaces and machine interfaces for harvesting.

1.6 OAI-PMH: Structure Model
Fig. 1. OAI-PMH: Structure Model (Source: http://www.oaforum.org/tutorial/english/page3.htm)
The OAI-PMH protocol is based on HTTP. Request arguments are issued as GET or POST parameters. OAI-PMH supports six request types (known as ‘verbs’), e.g., http://archive.org?verb=ListRecords&from=2002-11-01
Responses are encoded in XML syntax.
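The request and response handling described in Section 1.6 can be illustrated with a minimal Python sketch. It is not part of the CCRAS implementation: the function names `build_request_url` and `parse_list_records`, the identifier `oai:example.org:1` and the sample response are hypothetical and exist only for illustration. The verb names, argument names and XML namespaces are those of OAI-PMH version 2.0; note that version 2.0 also expects a `metadataPrefix` argument (for example `oai_dc`) on a real `ListRecords` request, and uses a `resumptionToken` element for flow control.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# The six request types ("verbs") defined by OAI-PMH version 2.0.
OAI_VERBS = ("Identify", "ListMetadataFormats", "ListSets",
             "ListIdentifiers", "ListRecords", "GetRecord")

# XML namespaces used by OAI-PMH 2.0 responses and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def build_request_url(base_url, verb, arguments=None):
    """Return a GET request URL for the given verb and optional arguments."""
    if verb not in OAI_VERBS:
        raise ValueError("unknown OAI-PMH verb: " + verb)
    params = {"verb": verb}
    params.update(arguments or {})  # e.g. from, until, set, metadataPrefix
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Extract identifier, datestamp and titles from a ListRecords response.

    Returns (records, resumption_token); the token is None when the
    response carries no (or an empty) resumptionToken element.
    """
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "datestamp": rec.findtext("oai:header/oai:datestamp", namespaces=NS),
            "titles": [t.text for t in rec.findall(".//dc:title", NS)],
        })
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)


# Hypothetical response, shortened to a single record for illustration.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2002-11-05</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hypothetical catalogue record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>token-123</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(build_request_url("http://archive.org", "ListRecords",
                            {"from": "2002-11-01"}))
    print(parse_list_records(SAMPLE_RESPONSE))
```

A harvester built along these lines would repeat the request with the returned `resumptionToken` until no token is left, and would store the parsed records in its own database rather than printing them.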
OAI-PMH supports any metadata format encoded in XML. Dublin Core is the minimal format specified for basic interoperability. Error messages are HTTP-based. Data Providers may define a logical set hierarchy to support levels of granularity for harvesting by Service Providers. Date stamps flag the last change of the metadata set, and thus provide further support for granularity of harvesting. OAI-PMH supports flow control. [9]

1.7 Technical aspects of the OAI approach to improving scholarly communication
We now look at the technical side of OAI. The OAI has laid down a minimal set of requirements for interoperability. The OAI protocol is primarily about the exchange of metadata. Though motivated at its inception by the need to find electronic resources, the protocol expects as a minimum something like Dublin Core metadata. The Santa Fe recommendations on interoperability were restricted to interoperability at the level of metadata harvesting. For this they simply described three things: a set of metadata elements, to enable ‘coarse granularity document discovery among archives; the agreement to use a common syntax, XML, to represent and transport both the Open Archives Metadata Set (OAMS) and archive-specific metadata sets; and thirdly, the definition of a common protocol (the Open Archives Dienst Subset) to enable extraction of OAMS and archive-specific metadata from participating archives’. The Santa Fe Convention presents a technical framework that is designed to facilitate the discovery of content stored in distributed e-print archives. Because the technical recommendations have been implemented by a number of institutions, it is now possible to access the data from e-print archives through end-user services.
At present this is mainly via harvesting services, which provide Web interfaces to the aggregated metadata exposed by data providers. [10]

1.8 Flexible deployment of OAI-PMH
Because OAI-PMH is a simple protocol based on HTTP and XML, it enables rapid, flexible deployment. Three different types of toolkits are available, and OAI-PMH can be used between closed groups, for metadata sharing and in commercial applications. Fig. 2 shows that multiple Service Providers can harvest from multiple Data Providers. Fig. 3 illustrates the position of aggregators between data and service providers, whereas Fig. 4 shows harvesting based on OAI-PMH combined with searching through Z39.50 or SRW. [11]
Fig. 2. Multiple Service Providers can harvest from multiple Data Providers
Fig. 3. Aggregators can sit between Data Providers and Service Providers
Fig. 4. The harvesting approach can be complemented with searching based, e.g., on Z39.50 or SRW (Source: http://www.oaforum.org/tutorial/english/page2.htm)

2.0 APPLICATION OF OAI-PMH

2.1 International Studies: Service Providers
The following are examples of Service Providers at the international level.

2.1.1 BioMed Central and Open Archives Initiative
An increasing number of journals are being archived on the Internet. The development of common search standards to sift through this wealth of academic information will be crucial to making it both accessible and fully searchable for the academic community. BioMed Central fully supports the OAI Metadata Harvesting Protocol. Metadata for all the articles they publish is made available via their OAI interface. This data is already harvested and used by Citebase, myOAI, NASA Technical Reports and other services.
Additionally, BioMed Central's open access policy is such that repositories may also use their OAI interface to obtain the full text XML of any open access research article published by BioMed Central in agreement with Chemistry Central and SpringerOpen, available at http://www.biomedcentral.com/ [12]

2.1.2 InTech and Open Archives Initiative
InTech supports the OAI Metadata Harvesting Protocol (OAI-PMH Version 2.0). All publications are thereby more widely accessible, with resulting benefits for scholars, researchers, students, libraries, universities and other academic institutions. Through this means of exposing metadata, InTech enables citation indexes, scientific search engines, scholarly databases, and scientific literature collections to gather the metadata from its repository and make its publications available to a broader academic audience. As a Data Provider, InTech makes metadata for published book chapters and journal articles available via its interface at the base URL: http://www.intechopen.com/oai-pmh.html [13]

2.2 National Studies: Data Providers

2.2.1 Bhandarkar Oriental Research Institute (BORI)
The collection contains around 350 palm leaf and around 150 birch bark manuscripts, e.g. Kashmir Manuscripts, Vishrambaghwada Collection of Manuscripts, Persian Manuscripts, Jaina Manuscripts, Rgveda Manuscripts, Bhagavata, etc. More of the collection is maintained by UNESCO. Scripts: Devanagari, Sharada, Telugu, Tamil, etc. Subjects include Vedic Samhitas, Vedangas, Vedanta, Yoga, etc. The details of digitisation are as follows: Descriptive Catalogues: 13000 manuscripts; Microfilmed Manuscripts: 13000 manuscripts. [14]

2.2.2 Rajasthan Oriental Research Institute (RORI)
The Institute's total collection, amounting to 1.23 lakh manuscripts, is deposited at the headquarters. The whole collection is enriched with manuscripts on a variety of subjects, representing various languages, scripts and miniature paintings.
In addition to unknown works in Sanskrit, Prakrit and Apabhramsha, and a comparatively large number of manuscripts in the vernacular language (Rajasthani) highlighting the cultural heritage of the region, the collection preserves works on different subjects like Ayurveda, Jyotisha, Tantra-Mantra, Shilpa Ved-Vaidik, etc. The Institute also maintains miniature paintings in different styles, on materials such as palm leaf and birch bark. There are plans to microfilm those manuscripts which are in a deteriorating or brittle condition, and those illustrated manuscripts which are considered to be the best specimens representing the different schools of miniature painting. [15]

2.2.3 Banaras Hindu University (IMS Library)
BHU took root in 1920 with the establishment of the Department of Ayurveda under the Faculty of Oriental Learning and Theology (1922-1927). The Institute of Medical Sciences Library was established in 1961 and shifted to its present premises on 14 February 1967. It is the only one of its kind in the country, holding collections related to the modern as well as the Ayurvedic system of medicine and the School of Nursing. The library is used not only by other departments of the University but also by the students and staff of other medical colleges of U.P., M.P. and Bihar. [16]

2.2.4 Gujarat Ayurved University (GAU): Central Library
The central library of the University is housed in a building called Juwansinhji Museum. The library has a collection of more than 33588 books on various subjects. It caters to the needs of the students and also supplies books to the various departmental libraries. There are more than 3556 Post Graduate and Ph.D. theses, which are used as reference material. The library has a large collection of handwritten manuscripts. Out of the total of 7400 manuscripts, a good number are on palm leaf or Bhojapatra. The library also subscribes to various national and international journals related to Ayurveda and other allied subjects.
[17]

2.2.5 National Institute of Ayurveda (NIA) Library
The Institute has a good library with publications on various subjects including Ayurveda, Naturopathy, Allopathy, Philosophy, Sanskrit, Science, etc. The total collection has now risen to 23,360. 112 journals and newspapers were subscribed and 1,676 annual volumes of journals were available for reference and research purposes. The number of readers was 14,993. The books are classified by catalogue code and an open access system is maintained. Rare and reference books are kept separately in the Research and Reference Cell for compiling indexes and bibliographies. The library has a collection of theses. Automation of library work is in progress. An Audio and Video Unit is also available in the Institute with one photocopier, TV, VCR, LCD projector, and audio and video cassettes and CDs on various topics of Ayurveda, modern subjects, medicinal plants, etc. [18]

3.0 A model of Ayurvedic Service Provider (Central Council for Research in Ayurvedic Sciences)
Fig. 5. Model of Ayurvedic Service Provider
The Central Council for Research in Ayurvedic Sciences (CCRAS) is an autonomous body of the Department of AYUSH (Ayurveda, Yoga & Naturopathy, Unani, Siddha and Homeopathy), Ministry of Health & Family Welfare, Government of India. It is an apex body in India for the formulation, co-ordination, development and promotion of research on scientific lines in the Ayurveda system of medicine and also in Sowa-Rigpa, commonly known as Tibetan or Amchi medicine. The library of CCRAS is moving towards automation and digitisation, and is in the process of harvesting metadata, developing institutional repositories and forming a protocol for open archives initiatives. This is a rudimentary proposal and structure for the development and provision of a service for Ayurvedic resources and repositories. This prototype model would be made exhaustive by linking over 100-150 Ayurvedic organisations, both nationally and internationally.
CCRAS, drawing on the resources collected from all 30 of its units, will provide metadata to interested clients. It would become the primary institution for harvesting records containing metadata from different repositories, and the prime organisation acting as service provider for all the other Ayurvedic organisations. Other institutions dealing with allied sciences, such as AIIMS, ICMR, NML and IARI, can also be included in this type of Open Archives Initiative. Advance Access metadata (articles published online ahead of print) can also be included in OAI-PMH feeds. Articles would be made available immediately after publication online.

4.0 Conclusion
From the point of view of the end-user, the perfect OAI implementation must be advertised. The OAI metadata harvesting protocol is a generic bulk metadata transport that has generated significant international interest as a tool for digital library interoperability. It utilizes other technologies when possible (HTTP, XML schemas, Dublin Core), and defines its own features when necessary. The protocol has been developed by the Open Archives Initiative, thus setting interoperability standards in order to ease and promote the broader and more efficient dissemination of content within the information seeker community. The OAI-PMH focuses only on metadata, not full text, and is always a front end to an existing DL. It is expected that it will yield greater flexibility and interoperability in distributed searching. In the case of CCRAS, it provides the opportunity to fill existing gaps in data and allows third parties (users) to meet their needs. Metadata can be harvested at any time, as per the requirements. Access to full text (where the user is entitled) covers HTML full text and extracts in addition to the current provision of PDF and abstracts, which increases the number of readers.
All the published information would be more widely accessible, with resulting benefits for scholars, researchers, students, libraries, universities and other research and academic institutions, especially in Ayurveda. Through this means of exposing metadata, citation indexes, Ayurvedic search engines, scholarly databases, and Ayurvedic literature collections would also be enabled to gather the metadata from our repository and make our publications available to a broader research and academic audience.

References
1. Hirwade, Mangala and Bherwani, Mohini T. 2009. Facilitating Searches in Multiple Bibliographical Databases: Metadata Harvesting Service Providers. Liber Quarterly 19(2) : 140-165.
2. Castelli, Donatella. 2003. Open Archive Solutions to Traditional Archive/Library Cooperation. Liber Quarterly 13(1) : 290-298.
3. Sharma, Shruti and Gupta, J.P. 2010. A Novel Architecture of Agent based Crawling for OAI Resources. International Journal on Computer Science and Engineering 2(4) : 1190-1195.
4. Castelli, Donatella. 2003. Open Archive Solutions to Traditional Archive/Library Cooperation. Liber Quarterly 13(1) : 290-298.
5. Hirwade, Mangala and Bherwani, Mohini T. 2009. Facilitating Searches in Multiple Bibliographical Databases: Metadata Harvesting Service Providers. Liber Quarterly 19(2) : 140-165.
6. http://www.oaforum.org/tutorial/english/page2.htm (accessed on 19 July 2013)
7. Lagoze, Carl and Sompel, Herbert Van de. 2001. The Open Archives Initiative: Building a Low-Barrier Interoperability Framework. [available at http://www.openarchives.org/documents/jcdl2001-oai.pdf] (accessed on 05 Aug 2013)
8. http://www.oaforum.org/tutorial/english/page1.htm (accessed on 19 July 2013)
9. http://www.oaforum.org/tutorial/english/page3.htm (accessed on 2 Aug 2013)
10. Hunter, Philip and Guy, Marieke. 2004. Metadata for harvesting: the Open Archives Initiative, and how to find things on the Web. The Electronic Library 22(2) : 168-174.
11. http://www.oaforum.org/tutorial/english/page2.htm (accessed on 19 July 2013)
12. http://www.biomedcentral.com/ (accessed on 15 July 2013)
13. http://www.intechopen.com/oai-pmh.html (accessed on 2 Aug 2013)
14. http://bori.ac.in/manuscript_department.html (accessed on 15 July 2013)
15. http://www.rori.nic.in/main.htm (accessed on 15 July 2013)
16. http://www.imsbhu.nic.in/units/imslibrary.htm (accessed on 18 July 2013)
17. http://www.ayurveduniversity.edu.in/unigauca.php#central (accessed on 18 July 2013)
18. http://nia.nic.in/?ref=12&id=33 (accessed on 20 July 2013)
