Subject Metadata Development for Digital Resources in Latvia

Aiva Stūrmane, Elita Eglīte, Mārīte Jankevica-Balode
National Library of Latvia

Abstract: In 2000 the National Library of Latvia (NLL) decided to use the Library of Congress Subject Headings (LCSH). At present the NLL Subject Headings Database in Latvian, based on LCSH, holds approximately 34 000 subject headings and is used for subject cataloging of textual resources, including articles from serials in the Database of National Bibliography Analytic, and of digital objects. For subject cataloging of digital resources the NLL uses an approach similar to FAST (Faceted Application of Subject Terminology): subject headings with simpler syntax, with the subject string divided into separate access points. One of the milestones in the development of subject cataloging of digital resources in Latvia was the project “In Search of Lost Latvia” (2010), which involved several memory institutions and individuals and allowed users to add their own keywords and comments. Using LCSH for subject indexing of these objects was a challenge because very specific terms of narrow scope had to be found. As our work progressed, we became convinced that the comprehensive and detailed thematic coverage of LCSH makes it applicable to the description of digital objects as well.

1. Introduction

At a time when the amount of information is rapidly increasing, fast and reliable search and retrieval methods are of decisive importance. Today Internet technologies provide global access to heterogeneous information resources: databases, e-books and serial publications, and digital online resources which, besides text, also contain photographs, pictures, video and audio materials, etc. Library information resources are of this kind, and libraries today have to deal with the problem of providing good quality access to this diversity of resources. Access quality depends on the knowledge organization system chosen by the library.
Knowledge organization systems serve as a bridge between users' information needs and the information resources in particular collections. They should be organized so that a user with no prior knowledge of the library's knowledge organization system can identify the object of interest and, as in an Internet search, retrieve the information with a natural language query. Crossing this bridge, the user passes imperceptibly from the prevailing uncontrolled keyword language of the Internet into the controlled language formed by the library.

Until the introduction of electronic catalogs, the Universal Decimal Classification (UDC) was dominant in the libraries of Latvia. Except for a few branch libraries that had subject catalogs, Latvia had no experience in developing a verbal indexing language. In the first electronic catalogs, developed in the 1990s, subject retrieval was likewise provided only by UDC notations. This did not match readers' search habits, which were shaped by the search options of Internet portals. Therefore, in 2000, the National Library of Latvia (NLL) decided to use the Library of Congress Subject Headings (LCSH) for subject search alongside the UDC notations. Our choice was based on the understanding that LCSH is a subject heading language highly regarded worldwide, mainly because it is universal in content, also functions as a thesaurus, and is elaborated in great detail. An important role was also played by the fact that, after regaining independence, several countries of Eastern Europe, especially Lithuania (Lithuanian and Latvian are cognate languages), had chosen it as their national subject heading language.

2. Development of the National Library of Latvia Subject Headings

LCSH headings were translated in parallel with the current cataloging process: the heading for a work being cataloged was retrieved from the LCSH and translated into Latvian.
With every new case the subject cataloging methodology improved. Gradually, over a ten-year period, a subject heading system was developed: the National Library of Latvia Subject Headings (NLLSH), controlled by the Subject Headings Authority Database. At present the NLL Subject Headings Database holds approximately 34 000 subjects, mainly authority records for topical headings and geographic names. The Subject Cataloging Manual, the free-floating subdivisions index and their updates have been translated and published.

Because of terminological differences between American English and Latvian, and because of cultural and historical diversity, we call our translation of LCSH headings an adaptation: the main attention is paid to semantic equivalence of terminology rather than to word-for-word translation. In the authority records, headings are formulated in Latvian in conformity with natural Latvian usage and with the terminology of the relevant field. Equivalence relationships (references) are developed and homonymy is eliminated according to the peculiarities of the Latvian language. The NLL fully follows the structural principles of the LCSH system: hierarchical relationships, associative relationships and syntax. The original Library of Congress subject headings are included as See From references, thus standardizing the translation and supporting international data retrieval.

When creating subject records, most effort goes into establishing the correspondence between American English and Latvian terminology. Work with the free-floating subdivisions index in both languages made the language differences even more apparent, because a subdivision term that is supposed to fit a certain subject category semantically quite often did not fit it in Latvian.

3.
Application of the NLLSH

In the NLL electronic catalog, the NLLSH are used for subject cataloging of monographs (including doctoral theses and their summaries), periodicals, printed music, sound recordings, audiovisual materials and maps. They are likewise used for subject cataloging in the National Bibliographic Databases. Application of the NLLSH to harvested resources, as well as to small prints and graphic works (postcards, ex libris, engravings, posters), has begun. These resources are held in the special collections, and their descriptions are in various databases developed by the NLL. The Library strives to provide uniform subject access to all of them with the NLLSH.

Subject authority records are developed centrally at the NLL Cataloging Department in collaboration with specialists from the special collections and other structural units, as well as with library staff of the Riga Technical University and the University of Latvia, who use the NLLSH for subject cataloging of their collections. The specificity of these collections brings new experience, as not only natural language differences but also differences in special terminology must be overcome.

Since 2008, subject headings have been added to the records of the National Bibliographic Analytical Database (articles from serials issued in Latvia: newspapers, magazines, bulletins, compilations, etc.). Since the Library of Congress subject system is mainly intended for subject cataloging of books, in the initial period we had to verify whether LCSH can be used for topical retrieval of articles, because articles and books differ greatly in their layout and degree of detail. The specificity of the topics dealt with in articles from serials raised doubts about whether corresponding subjects could be found in the LCSH.
In the course of the work we found that the search was almost always successful, except for individual cases in which the appropriate term had to be added to a broader heading as a See From reference, for example, Senkapi (Tombs, LCSH) with a See From reference from Ugunskapi (fire-pit graves). The reason is the wide topical coverage and the high degree of detail of the Library of Congress subject system. We also found that terminological peculiarities and cultural and historical diversity cause the most difficulty irrespective of the type of resource, for example, Cultural animators (LCSH) and Kultūras darbinieki (cultural employees/workers). Since adopting the LCSH, we have gained some experience in resolving such differences. Although situations vary, we always provide an access point for the specific term in Latvian so as not to lose the semantic power of subject headings, while retaining the structure of LCSH authority records with the original heading, for example:

Zinātne (science)
See From: Eksaktās zinātnes (hard sciences, sciences); Science (LCSH heading)

4. The digitalization in Latvia

The beginning of digitalization in Latvia brought new challenges for the development of the NLLSH and for the provision of subject access in Latvia.

Picture 1. Home page of the National Library of Latvia. Digital Library

In the NLL the digitalization process began in 1998 with the digitalization of Latvian newspapers. In the period from 2000 to 2006 several digital collections were created, including posters, postcards, ex libris, engravings and maps; unfortunately, standardized metadata, subjects among them, were not attached. In 2006 the project “Latvian National Digital Library “Letonika”” was launched. The project included uniform principles of digital object processing based on the Dublin Core metadata standard and The European Library (TEL) guidelines; at present we are working on adopting ESE (Europeana Semantic Elements) and EDM (Europeana Data Model).
A Digitizer’s Manual, including a methodology for creating metadata for objects and collections, was developed. A digital collection “Jāzeps Vītols”, dedicated to Jāzeps Vītols, one of the most famous Latvian composers, the founder of the Latvian National Opera, was created as a pilot project; it included texts, letters, printed music, pictures, audio recordings and video materials. The pilot project served as a practical master class in mastering complicated digitalization processes. A wide variety of digitalization issues were addressed within the project, from the development of templates for subject metadata description to copyright matters. One of the general tasks of the project was also to develop a model of collaboration among various information keepers: libraries, archives and museums.

5. Subject access to digital resources

Thus the provision of uniform subject search became a pressing issue. The “Dublin Core Metadata Element Set” standard recommends completing the metadata element Subject from a controlled vocabulary; therefore it was decided to use the NLLSH in accordance with the NLL subject cataloging method. In making the decision we were aware that our subject headings are a pre-coordinated indexing language with complicated syntax, which may make searching inconvenient outside the electronic catalog environment. Examining the various options for subject cataloging of electronic documents, we also studied the FAST (Faceted Application of Subject Terminology) project: an LCSH-based subject vocabulary developed for subject cataloging of electronic resources and organized as a system of separate facets with simpler syntax.
It was decided to apply, as far as possible, the same principles to the subject cataloging of digital library materials: subjects are assigned to documents according to the NLLSH subject cataloging methodology, but the subject string is divided into separate facets, which are recorded as separate subjects (access points).

Topical subject headings are entered in the metadata field Subject, for example, Krogi (Taverns (Inns)), Brīvdabas estrādes (Outdoor concert facilities), Pastorāti (Parsonages), Ūdenstorņi (Water towers), Lapenes (Gazebos), Rijas (Threshing barns), Folklora--Izpildījums (Folklore--Performance), Latvija--Robežas--Lietuva (Latvia--Boundaries--Lithuania), retaining the standard subject strings [topic]--[topic] and, for historical topics, [place]--[topic]--[time]. Form/genre headings are added as separate access points in the Subject field, e.g. Postcards, Photographs. The geographic subject is entered in a separate metadata field, Coverage Spatial, for example, Rīga (Latvija). Since chronological subjects are not used as separate access points (MARC field 648) in the NLL electronic catalog, the metadata field Coverage Temporal is not completed; temporal coverage is included in the topical subject heading as a chronological subdivision, for example, Latvija--Vēsture--Cīņas par Latvijas neatkarību, 1918-1920 (Latvia--History--War of independence, 1918-1920).

Consequently, subjects are added to metadata records in the form of the value established in the NLL Authority database. The methodology for subject cataloging of individual types of digital objects (maps, photos, etc.) is being developed in accordance with the NLL subject cataloging methodology. Currently, when creating metadata records, metadata developers add values from the NLL Authority database to the element “subject”; the metadata editor verifies them and adds the corresponding subjects in English.
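
The field assignment described above can be illustrated with a minimal Python sketch. It is not the NLL's software: the `DcRecord` class, the `split_heading` and `build_record` helpers and their parameters are hypothetical names introduced here, and the sketch assumes that subdivisions are separated by a double hyphen or a long dash, as in the examples above. Topical strings are kept whole in Subject, form/genre terms become separate Subject access points, places go to Coverage Spatial, and Coverage Temporal stays empty.

```python
import re
from dataclasses import dataclass, field
from typing import List, Sequence

# Assumed subdivision separators: a double hyphen or a long dash (U+2014).
# A single hyphen is deliberately not a separator, so "1918-1920" stays intact.
_SEPARATOR = re.compile(r"\s*(?:\u2014|--)\s*")


def split_heading(heading: str) -> List[str]:
    """Split a pre-coordinated subject string into its components."""
    return [part for part in _SEPARATOR.split(heading.strip()) if part]


@dataclass
class DcRecord:
    """Hypothetical container for the three metadata fields discussed above."""
    subject: List[str] = field(default_factory=list)
    coverage_spatial: List[str] = field(default_factory=list)
    coverage_temporal: List[str] = field(default_factory=list)  # not completed


def build_record(topical: Sequence[str],
                 form_genre: Sequence[str] = (),
                 places: Sequence[str] = ()) -> DcRecord:
    """Distribute the headings assigned by a cataloger over the metadata fields."""
    record = DcRecord()
    for heading in topical:
        # Keep the whole string as one access point, with a uniform separator;
        # a chronological subdivision therefore stays inside Subject.
        record.subject.append("--".join(split_heading(heading)))
    record.subject.extend(form_genre)       # separate access points in Subject
    record.coverage_spatial.extend(places)  # geographic subject
    return record


if __name__ == "__main__":
    example = build_record(
        topical=["Latvija--Vēsture--Cīņas par Latvijas neatkarību, 1918-1920"],
        form_genre=["Postcards"],
        places=["Rīga (Latvija)"],
    )
    print(example)
```

In practice the values would be taken from the authority database rather than typed in; the sketch only shows which field each kind of heading ends up in.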
If the subject heading needed for a digital object is missing from the subject authority database, the staff of the NLL Cataloging Department create an authority record. Authority records are also created, as required, for names of institutions.

For the time being, metadata records in the digital library are not linked to the NLL Authority database; therefore synonym control and browsing assistance are not yet provided (users have no access to broader and narrower terms). To settle the issue, we are considering transforming all NLL authority data (subjects included) into Library Linked Data; this would allow any digital subject management system to use the data without the development of complicated data synchronization and/or connection models. It would also enable wider use of the data and their enrichment with links to other Linked Data resources.

6. The digital collection “In Search of Lost Latvia”

Picture 2. Logo of the digital collection “In Search of Lost Latvia” on the home page of the National Library of Latvia

The project “In Search of Lost Latvia”, launched in 2010, is progressing successfully. It contains unique information about lost and altered cultural and historical values of Latvia. Over time the cultural landscape of Latvia has changed significantly. A number of natural, historical, art and architectural objects, as well as economic and public buildings, have not survived to the present day. The Latvian cultural heritage suffered heavily during both World Wars and during the Soviet and German occupations. Many significant objects have perished due to economic activity, and others remain endangered today.
“In Search of Lost Latvia” contains digitalized copies of old drawings, postcards and photos from the end of the 19th century to the present day, as well as descriptions of architectural and art monuments, views of towns and settlements, public buildings, economic buildings, infrastructure objects, accommodations, agricultural buildings and objects of nature. The NLL implements the project together with the memory institutions of Latvia (libraries, museums, cultural heritage protection and conservation organizations) as well as private individuals. Participation in EDLocal, a project of the European Digital Library eContentplus program, ensures that information from “In Search of Lost Latvia” is included in Europeana, the Internet portal of the European Digital Library.

Picture 3. Object from the digital collection “In Search of Lost Latvia” in the European Digital Library internet portal Europeana

The project “In Search of Lost Latvia” is implemented using the open source web framework Django. The process of adding subject metadata to digital objects has been significantly improved; however, a link to the NLL Authority database has not yet been provided.

“In Search of Lost Latvia” is designed as an interactive project that invites users to participate and to improve the content. Users can add their own photos and basic metadata, which the NLL metadata editor then edits and supplements. In addition to the controlled subjects added by Library specialists, users can add their own uncontrolled keywords (tags), which provide additional search options. Keywords are especially important while synonym control is not yet available. The options for public subject supplementing and tagging were introduced quite recently; users are not yet familiar with them, so they are little used.
We expect the situation to improve after the scheduled publicity measures are implemented. On the website, users may add coordinates (indicating the exact location of an object) and comments to photos. The project manager regularly reads the comments and replies to users. Although the comment option was introduced only a year ago, people are quite active in commenting. Colleagues from museums and archives and history researchers submit comments, and the information received is often used to supplement metadata records.

Picture 4. Object of the digital collection “In Search of Lost Latvia” with attached map and comments.

7. NLLSH in the collection “In Search of Lost Latvia”

In relation to this project, and bearing in mind the various options for providing subject access to digital pictures, the NLL specialists acquainted themselves with different theories of visual information indexing used worldwide; however, it was decided to use the NLLSH for subject cataloging of digital pictures in order to provide uniform subject access to all Library resources. Using the NLLSH for subject cataloging of digital pictures was a challenge, because the very narrow subjects that had to be assigned to digital objects, for example, Bērnu rati (Baby carriages), Rijas (Threshing barns), Lapenes (Gazebos), Grāvji (Ditches), Kamanas (Sleighs), were not yet included in it. We were not sure whether we would find them in the LCSH. However, developing subject after subject, we found that the search is almost always successful and that developing authority records for subject cataloging of digital objects is feasible. Digital objects often depict work processes, activities or events that have adequate subjects in the LCSH, for example, Pēcpusdienas tēja (Afternoon teas). In other cases a string [topic]--[topic] has to be constructed, for example, Siens--Novākšana (Hay--Harvesting), Kartupeļi--Novākšana (Potatoes--Harvesting).
If the specific term needed for subject cataloging of a digital object is not found in the LCSH, the semantically closest LCSH subject is chosen, and when the authority record is created the specific term is included as a See From reference, for example, Brīvprātīgo darbs (Voluntarism; Volunteer work) with a See From reference from Talkas (joint work), and Deju vakari (Dance parties) with a See From reference from Zaļumballes (open-air dances).

Picture 5. The National Library of Latvia. Subject authority database record

In relation to the content of the project “In Search of Lost Latvia”, a number of authority records were created for the names of lost churches, castles, manors, museums, theaters, etc., to provide controlled subject retrieval.

8. NLLSH in the future

Since 2011, the NLL has been adding subjects in Latvian and in English from the NLLSH to the metadata records of harvested Latvian web pages. The methodology for subject cataloging of these resources is being developed. However, the Dublin Core metadata form integrated in the Web Curator Tool offers Subject as a single text field in which all values have to be entered separated by a semicolon (“;”); at present there is no link to the NLL Authority database. Another solution for saving such pages and adding metadata will have to be found.

At the beginning of 2012, the NLL Uniform resource aggregator is being developed and tested; it will provide joint search and, consequently, joint subject access using the NLLSH across all NLL databases and digitalized resources.

The Library has also begun developing the concept of the Latvian National Digital Collection. Its purpose is to ensure the development of a long-term, systematically supported national digital collection of all memory institutions of Latvia.
In this respect, issues related to the provision of subject access to the Latvian National Digital Collection and the opportunities for collaboration among Latvian memory institutions in providing subject access will have to be addressed.

In 2011, a study, “National identity in the digital environment”, was carried out to evaluate the digital cultural heritage collections of memory institutions. An important finding of the study was that “Subject matter” is the most popular search criterion and that users prefer choosing from the options offered over formulating their own. This demonstrates the important role of subjects in the identification of digital objects and the need to pay attention to subject quality.

9. Conclusion

We have become convinced that the LCSH is a guarantor of quality. The adaptation of LCSH in Latvian allows good quality subject cataloging of textual resources, articles from serials among them, and of non-textual resources, that is, digital objects. Regardless of the changes required by the specificity of a resource, it is feasible to provide uniform retrieval across different collections and databases, provided that the core, the subject heading, is retained unaltered. As the area of application of the NLLSH grows, the quality of the NLLSH, that is, its degree of detail and subject coverage, improves as well; this in turn provides new opportunities for using the NLLSH in the future.

References

Authoritative Database of National Library of Latvia. http://lira.lanet.lv/F/R6BKVCND2BH4VX9Q5GQAJL6INY9ALU5AK376BBHKF1FPAFMH38-30403?func=find-b-0&local_base=lnc10 (consulted 15 April 2012).

Digitizer's Handbook v 3.3. http://www.lnb.lv/en/digital-library/forpartners/digitzers-handbook.pdf (consulted 15 April 2012).

National Library of Latvia. Digital Library. http://www.lnb.lv/en/digital-library (consulted 15 April 2012).

Stūrmane, A., Eglīte, E. (2009). Sadarbības iespējas autoritatīvo datu izmantošanā.
[Authority records: potential for cooperation]. Latvijas Nacionālās bibliotēkas zinātniskie raksti, 125-144. Retrieved April 15, 2012, from http://www.lnb.lv/lv/parlnb/zinatiskie-raksti/LNB-ZR-2009-Sturmane-Eglite.pdf

Zarins, U. (2011). Latvia: Latvian National Digital Library "Letonica". Uncommon Culture, Vol. 2, no. 1(3): From Closed Doors to Open Gates. Retrieved April 15, 2012, from http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/UC/article/view/3628/3001

About the authors

Elita Eglīte has been chief librarian in the Cataloging Department of the National Library of Latvia and head of the Indexing group since 2005. She revises, edits and develops subject authority records, trains other catalogers and develops national subject cataloging policy. Contact: elita.eglite@lnb.lv

Aiva Stūrmane has been head of the Cataloging Department of the National Library of Latvia since 1999. Contact: aiva.sturmane@lnb.lv

Mārīte Jankevica-Balode has worked as chief metadata editor for the Digital Library "Letonica" since December 2006. Contact: marite.jankevica@lnb.lv