Subject Metadata Development for Digital Resources in Latvia

Aiva Stūrmane, Elita Eglīte, Mārīte Jankevica-Balode
National Library of Latvia
Abstract: The National Library of Latvia (NLL) decided in 2000 to use the
Library of Congress Subject Headings (LCSH). At present the NLL Subject
Headings Database in Latvian, based on LCSH, holds approximately 34 000
subject headings and is used for subject cataloging of textual resources,
including articles from serials in the National Bibliographic Analytical
Database, and of digital objects. For subject cataloging of digital
resources the NLL uses a system similar to FAST (Faceted Application of
Subject Terminology): subject headings with simpler syntax, with the
subject string divided into separate access points. One of the milestones
in the development of subject cataloging of digital resources in Latvia
was the project “In Search of Lost Latvia” (2010), which involved several
memory institutions and individuals and allowed users to add their own
keywords and comments. Using LCSH for subject indexing of these objects
was a challenge because they required very specific terms of narrow scope.
As our work progressed, we became convinced that LCSH offers comprehensive
and detailed thematic coverage and is applicable to the description of
digital objects as well.
1. Introduction
At a time when the amount of information is rapidly increasing, fast and
reliable search and retrieval methods are of decisive importance. Today
Internet technologies provide global access to heterogeneous information
resources: databases, e-books and serial publications, and digital online
resources which, besides text, also contain photographs, pictures, video
and audio materials, etc. Library information resources are of this kind,
and today libraries have to deal with the problem of providing
good-quality access to this diversity of resources. Access quality depends
on the knowledge organization system the library chooses. Knowledge
organization systems serve as a bridge between user information needs and
the information resources in particular collections. They should be
organized so that a user with no prior knowledge of the library's
knowledge organization system can identify the object of interest and, as
in an Internet search, retrieve the information with a natural-language
query. Crossing this bridge, the user passes imperceptibly from the
uncontrolled keyword language prevailing on the Internet into the
controlled language formed by the library.
Until the introduction of electronic catalogs, the Universal Decimal
Classification (UDC) was dominant in the libraries of Latvia. Except for a
few branch libraries, which had subject catalogs, Latvia had no experience
in developing a verbal indexing language. In the first electronic
catalogs, developed in the 1990s, subject retrieval was likewise provided
only by UDC notations. This did not match readers' search habits, which
were influenced by the search options of Internet portals. Therefore, in
2000, the National Library of Latvia (NLL) decided to use the Library of
Congress Subject Headings (LCSH) for subject search alongside the UDC
notations.
Our choice was based on the understanding that the LCSH is a subject
heading language highly regarded worldwide, mainly because, being
universal in content, it also functions as a thesaurus and is
distinguished by a high degree of detail. An important factor was that,
after regaining independence, several countries of Eastern Europe,
notably Lithuania (Lithuanian and Latvian are cognate languages), had
chosen it as their national subject heading language.
2. Development of the National Library of Latvia Subject Headings
Translation of LCSH headings was done in parallel with the current
cataloging process: the heading for a work being cataloged was retrieved
from the LCSH and translated into Latvian. With every new case the subject
cataloging methodology improved. Gradually, over a ten-year period, a
subject heading system was developed: the National Library of Latvia
Subject Headings (NLLSH), controlled by the Subject Headings Authority
Database. At present the NLL Subject Headings Database holds about 34 000
subject authority records, mainly for topical subjects and geographic
names. The Subject Cataloging Manual, the index of free-floating
subdivisions and their updates have been translated and published.
Because of terminological differences between American English and
Latvian, and because of cultural and historical differences, we call the
translation of LCSH headings an adaptation: the main attention is paid to
semantic equivalence of terminology rather than to word-for-word
translation. In the authority records, headings are formulated in Latvian
in accordance with natural Latvian usage and the terminology of the field
concerned. Equivalence relationships (references) are developed and
homonymy is eliminated according to the peculiarities of the Latvian
language. The NLL fully follows the structural principles of the LCSH
system: hierarchical relationships, associative relationships and syntax.
The original Library of Congress subject headings are included as See from
references, thus standardizing the translation and supporting
international data retrieval. When creating subject records, most effort
goes into establishing the correspondence between American English and
Latvian terminology. Work on the free-floating subdivisions index in both
languages made the language differences even more apparent, because a
subdivision term that is meant to fit a certain subject category
semantically quite often did not do so in Latvian.
3. Application of the NLLSH
In the NLL, the NLLSH are used in the library's electronic catalog for
subject cataloging of monographs (including doctoral theses and their
summaries), periodicals, printed music, sound recordings, audiovisual
materials and maps. They are likewise used for subject cataloging in the
National Bibliographic Databases. Application of the NLLSH to harvested
resources, as well as to small prints and graphic works (postcards,
ex libris, engravings, posters), has begun.
These resources are held in the special collections, and their
descriptions in various databases developed by the NLL. The Library
strives to provide uniform subject access to all of them with the NLLSH.
Subject authority records are developed at the NLL Cataloging Department
in a centralized manner in collaboration with specialists from the special
collections and other structural units, as well as the Riga Technical
University and the University of Latvia library staff, who use the NLLSH in
subject cataloging of their collections. The specific nature of these
collections brings new experience, since differences in specialized
terminology, and not only in natural language, must be overcome.
Since 2008, subject headings have been added to the records of the
National Bibliographic Analytical Database (articles from serials issued
in Latvia: newspapers, magazines, bulletins, compilations, etc.). Since
the Library of Congress subject system is mainly intended for subject
cataloging of books, in the initial period we had to verify whether LCSH
could be used for topical retrieval of articles, since articles and books
differ greatly in layout and degree of detail. The specificity of the
topics dealt with in articles from serials raised doubts about whether
corresponding subjects could be found in the LCSH. In the course of the
work we found that the search results were almost always positive, except
for individual cases in which the appropriate term had to be added to a
broader term as a See from reference, for example, Senkapi (LCSH: Tombs),
See from Ugunskapi (fire-pit graves). The reason is the wide topical
coverage and the high degree of detail of the Library of Congress subject
system. We found that terminological peculiarities and cultural and
historical differences cause the most difficulty irrespective of the type
of resource, for example, Cultural animators (LCSH) and Kultūras
darbinieki (cultural employees/workers).
Since adopting the LCSH, we have gained some experience in resolving such
differences. Although situations vary, we always provide an access point
for the specific term in Latvian so as not to lose the semantic power of
the subject headings, while retaining the structure of LCSH authority
records with the original heading, for example,
Zinātne (science)
See from Eksaktās zinātnes (hard sciences, sciences)
Science (LCSH heading)
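The role of such references in retrieval can be illustrated with a minimal Python sketch. It is not the NLL Authority database format: the class AuthorityRecord, its field names and the function build_index are hypothetical, and the two records simply reuse the examples given above.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorityRecord:
    """One subject authority record (hypothetical, simplified structure)."""
    heading: str        # preferred heading in Latvian
    lcsh_heading: str   # original LCSH heading, kept as a See from reference
    see_from: list[str] = field(default_factory=list)  # other non-preferred terms

    def entry_terms(self) -> list[str]:
        """All terms that should lead a searcher to this record."""
        return [self.heading, self.lcsh_heading, *self.see_from]


def build_index(records: list[AuthorityRecord]) -> dict[str, str]:
    """Map every entry term, case-insensitively, to its preferred heading."""
    index: dict[str, str] = {}
    for record in records:
        for term in record.entry_terms():
            index[term.casefold()] = record.heading
    return index


records = [
    AuthorityRecord("Zinātne", "Science", ["Eksaktās zinātnes"]),
    AuthorityRecord("Senkapi", "Tombs", ["Ugunskapi"]),
]
index = build_index(records)

print(index["ugunskapi"])  # Senkapi
print(index["science"])    # Zinātne
```

Under these assumptions a query in either language, or by a narrower Latvian term, resolves to the same controlled heading.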
4. Digitization in Latvia
The beginning of digitization in Latvia brought new challenges for the
development of the NLLSH and the provision of subject access in Latvia.
Picture 1. Home page of the National Library of Latvia. Digital Library
In the NLL the digitization process began in 1998 with the digitization of
Latvian newspapers. Between 2000 and 2006 several digital collections were
created, comprising posters, postcards, ex libris, engravings and maps;
unfortunately, standardized metadata, subjects among them, were not
attached.
In 2006 the Project “Latvian National Digital Library “Letonica”” was
launched. The Project introduced uniform principles of digital object
processing based on the Dublin Core metadata standard and The European
Library (TEL) guidelines; at present we are working on adopting EDM
(Europeana Data Model) and ESE (Europeana Semantic Elements). A
Digitizer's Manual, including a methodology for creating metadata for
objects and collections, was developed.
A digital collection “Jāzeps Vītols”, dedicated to Jāzeps Vītols, one of
the most famous Latvian composers and the founder of the Latvian National
Opera, was created as a pilot project; it included texts, letters, printed
music, pictures, audio recordings and video materials. The pilot project
served as a kind of master class in mastering complicated digitization
processes. A wide variety of digitization-related issues were addressed
within the Project, from the development of templates for subject metadata
description to copyright matters. One of the general tasks of the Project
was also to develop a model of collaboration among the various holders of
information: libraries, archives and museums.
5. Subject access to digital resources
Thus the provision of uniform subject search became a pressing issue.
According to the “Dublin Core Metadata Element Set” standard, the metadata
element Subject is completed using controlled vocabularies; therefore a
decision was taken to use the NLLSH in compliance with the NLL subject
cataloging method. When arriving at the decision we realized that our
subject headings are a pre-coordinate indexing language with complicated
syntax, which may hamper searching outside the electronic catalog
environment. Examining the various options for subject cataloging of
electronic documents, we also studied the FAST (Faceted Application of
Subject Terminology) Project, an LCSH-based subject vocabulary developed
for subject cataloging of electronic resources and organized as a system
of separate facets with simpler syntax. It was decided to apply, as far as
possible, the same principles in the subject cataloging of digital library
materials, namely, to assign subjects to documents according to the NLLSH
subject cataloging methodology, but to divide the subject string into
separate facets and record them as separate subjects (access points).
Topical subject headings are entered in the metadata field Subject, for
example, Krogi (Taverns (Inns)), Brīvdabas estrādes (Outdoor concert
facilities), Pastorāti (Parsonages), Ūdenstorņi (Water towers), Lapenes
(Gazebos), Rijas (Threshing barns), Folklora--Izpildījums
(Folklore--Performance), Latvija--Robežas--Lietuva
(Latvia--Boundaries--Lithuania), retaining the standard subject strings
[topic]--[topic] and [place]--[topic]--[time] for historical topics.
Form/genre headings are added as separate access points in the Subject
field, e.g. Postcards, Photographs. The geographic subject is entered in a
separate metadata field, Coverage Spatial, for example, Rīga (Latvija).
Since chronological subjects are not used as separate access points (MARC
648 field) in the NLL electronic catalog, the metadata field Coverage
Temporal is not completed; temporal coverage is included in the topical
subject heading as a chronological subdivision, for example:
Latvija--Vēsture--Cīņas par Latvijas neatkarību, 1918-1920
(Latvia--History--War of Independence, 1918-1920).
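The distribution of headings across the metadata fields described above might be sketched as follows. This is an illustration only, not the NLL's actual software: the function build_subject_metadata and the dictionary keys "subject" and "coverage.spatial" are hypothetical names, and the headings are assumed to have been assigned by a cataloger beforehand.

```python
def build_subject_metadata(topical, form_genre, places):
    """Arrange already-assigned headings into Dublin Core style fields.

    topical    -- topical headings, each one a separate access point; strings
                  such as [place]--[topic]--[time] are kept whole
    form_genre -- form/genre headings, also stored in the Subject field
    places     -- geographic subjects, stored in Coverage Spatial
    """
    subjects = []
    for heading in [*topical, *form_genre]:
        if heading not in subjects:  # keep each access point once, in order
            subjects.append(heading)
    # No temporal field is produced: time stays inside the topical heading
    # as a chronological subdivision.
    return {"subject": subjects, "coverage.spatial": list(places)}


record = build_subject_metadata(
    topical=["Krogi", "Latvija--Vēsture--Cīņas par Latvijas neatkarību, 1918-1920"],
    form_genre=["Postcards"],
    places=["Rīga (Latvija)"],
)
for name, values in record.items():
    print(name, "=", values)
```

Keeping the historical string intact while listing the other headings individually mirrors the compromise described in the text between pre-coordinated strings and separate access points.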
Consequently, subjects are added to metadata records in the form
established in the NLL Authority database. The methodology for subject
cataloging of individual digital object types (maps, photos, etc.) is
being developed in compliance with the NLL subject cataloging methodology.
Currently, when creating metadata records, metadata developers add values
from the NLL Authority database to the element “subject”; the metadata
editor verifies them and adds the appropriate subjects in English. If the
subject heading needed for a digital object is missing from the subject
authority database, the NLL Cataloging Department staff creates an
authority record. The required authority records are also created for
names of institutions. For the time being, metadata records in the digital
library are not linked to the NLL Authority database; therefore synonym
control and browsing assistance are not provided yet (users have no access
to broader and narrower terms). To resolve the issue, we are considering
transforming all NLL authority data (subjects included) into Library
Linked Data; this would allow any digital subject management system to use
the data without the need to develop complicated data synchronization
and/or connection models. It would also enable wider use of the data and
their enrichment with links to other Linked Data resources.
6. The digital collection “In Search of Lost Latvia”
Picture 2. Logo of the digital collection “In Search of Lost Latvia” on the
home page of the National Library of Latvia
The Project “In Search of Lost Latvia”, launched in 2010, is progressing
successfully. The Project contains unique information about lost and
altered cultural and historical assets of Latvia. Over time the cultural
landscape of Latvia has changed significantly. Many natural, historical,
art and architectural objects, as well as economic and public buildings,
have not survived to the present day. The Latvian cultural heritage
suffered heavily during both World Wars and the periods of Soviet and
German occupation. Many significant objects have perished because of
economic activities, and such objects are still endangered today. “In
Search of Lost Latvia” contains digitized copies of old drawings,
postcards and photos from the end of the 19th century to the present day;
descriptions of architectural and art monuments; views of towns and
settlements; public buildings; economic buildings; objects of
infrastructure; accommodations; agricultural buildings and objects of
nature.
The NLL implements the Project “In Search of Lost Latvia” together with
the memory institutions of Latvia (libraries, museums, cultural heritage
protection and conservation organizations) as well as private individuals.
Participation in EDLocal, a project of the European Digital Library
eContentplus program, ensures that information from “In Search of Lost
Latvia” is included in Europeana, the internet portal of the European
Digital Library.
Picture 3. Object from the digital collection “In Search of Lost Latvia” in
the European Digital Library internet portal Europeana
The Project “In Search of Lost Latvia” is implemented using the
open-source software Django. The adding of subject metadata to digital
objects has been significantly improved; however, the link to the NLL
Authority database has not been provided yet.
The Project “In Search of Lost Latvia” is designed as an interactive
project; it is intended to involve users and let them improve the content.
Users can add their own photos and basic metadata, which the NLL metadata
editor afterwards edits and supplements. In addition to the controlled
subjects added by the Library specialists, users can add their own
uncontrolled keywords (tags), which provide additional search options.
Adding keywords is particularly important while synonym control is not yet
provided. The options for public subject supplementing and tagging were
introduced quite recently; users are not yet familiar with them, so they
are little used. We expect the situation to improve after the scheduled
publicity measures are implemented. Users may also add coordinates
(indicating the exact location of the object) and comments to photos on
the website. The Project manager regularly reads the comments and replies
to users. Although the comment option was introduced only a year ago,
people are quite active in commenting. Colleagues from museums and
archives, as well as history researchers, submit comments; the information
received is often used to supplement metadata records.
Picture 4. Object of the digital collection “In Search of Lost Latvia” with
attached map and comments.
7. NLLSH in the collection “In Search of Lost Latvia”
In connection with this Project, and bearing in mind the various options
for providing subject access to digital pictures, the NLL specialists
familiarized themselves with different theories of visual information
indexing used worldwide; however, a decision was made to use the NLLSH in
subject cataloging of digital pictures in order to provide uniform subject
access to all Library resources.
Using the NLLSH in subject cataloging of digital pictures was a challenge,
because the very narrow subjects that had to be assigned to digital
objects, for example, Bērnu rati (Baby carriages), Rijas (Threshing
barns), Lapenes (Gazebos), Grāvji (Ditches), Kamanas (Sleighs), were not
yet included in it. We were not sure whether we would find them in the
LCSH. However, developing one subject after another, we found that the
search was almost always successful and that the development of authority
records for subject cataloging of digital objects is feasible.
Digital objects often depict work processes, activities or events that
have adequate subjects in the LCSH, for example, Pēcpusdienas tēja
(Afternoon teas). In other cases a [topic]--[topic] string has to be
built, for example, Siens--Novākšana (Hay--Harvesting),
Kartupeļi--Novākšana (Potatoes--Harvesting). If a specific term for
subject cataloging of a digital object is not found in the LCSH, the
semantically closest LCSH subject is chosen; when the authority record is
created, the specific term is included as a See from reference, for
example, Brīvprātīgo darbs (Voluntarism; Volunteer work), See from Talkas
(joint work); Deju vakari (Dance parties), See from Zaļumballes (open-air
dances).
Picture 5. The National Library of Latvia. Subject authority database record
In connection with the content of the Project “In Search of Lost Latvia”,
a number of authority records for the names of lost churches, castles,
manors, museums, theaters, etc. were created to provide controlled subject
retrieval.
8. NLLSH in the future
Since 2011, the NLL has been adding subjects in Latvian and in English
from the NLLSH to the metadata records of harvested Latvian web pages. The
methodology for subject cataloging of these resources is being developed.
However, the DC metadata form integrated in the Web Curator Tool offers
Subject as a single text field in which all values must be entered
separated by “;” (semicolon); at present no link is provided to the NLL
Authority database. Another solution for saving such pages and adding
metadata has to be found.
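The constraint of a single semicolon-separated Subject field can be illustrated with a short Python sketch. It does not use or reproduce any Web Curator Tool interface; the function names join_subjects and split_subjects are hypothetical, and the space after each semicolon is an assumption made for readability.

```python
def join_subjects(subjects):
    """Pack several subject headings into one semicolon-separated string."""
    cleaned = [s.strip() for s in subjects if s.strip()]
    for heading in cleaned:
        if ";" in heading:
            # A semicolon inside a heading would be read as a separator.
            raise ValueError(f"heading contains ';': {heading!r}")
    return "; ".join(cleaned)


def split_subjects(field_value):
    """Recover the individual headings from a semicolon-separated string."""
    return [part.strip() for part in field_value.split(";") if part.strip()]


headings = ["Rijas", "Threshing barns", "Latvija--Robežas--Lietuva"]
packed = join_subjects(headings)
print(packed)                  # Rijas; Threshing barns; Latvija--Robežas--Lietuva
print(split_subjects(packed))
```

Because the values are stored as plain text, nothing ties them to authority records, which is the limitation noted above.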
At the beginning of 2012, development and testing of the NLL Uniform
resource aggregator is under way; it will provide joint search and,
consequently, joint subject access using the NLLSH across all NLL
databases and digitized resources. The Library has also begun developing
the concept of the Latvian National Digital Collection. The purpose of
this concept is to ensure the development of a long-term, systematically
supported national digital collection of all memory institutions of
Latvia. In this respect, issues related to the provision of subject access
to the Latvian National Digital Collection, and the opportunities for
collaboration among the various Latvian memory institutions in providing
subject access, will have to be addressed.
In 2011 a study, “National identity in the digital environment”, was
carried out to evaluate the digital cultural heritage collections of
memory institutions. An important finding of the study was that “Subject
matter” is the most popular search criterion; users prefer choosing from
the options offered rather than formulating their own terms. This
demonstrates the important role of subjects in the identification of
digital objects and the need to pay attention to subject quality.
9. Conclusion
We have become convinced that the LCSH is a guarantor of quality. The
adaptation of LCSH in Latvian allows good-quality subject cataloging of
textual resources, among them articles from serials, and of non-textual
resources, namely digital objects. Regardless of the changes required by
the specific nature of the resources, it is feasible to provide uniform
retrieval from different collections and databases, provided that the
core, the subject heading, is retained unaltered. As the application area
of the NLLSH widens, the quality of the NLLSH (its degree of detail and
subject coverage) improves too; this in turn provides new opportunities
for using the NLLSH in the future.
References
Authoritative Database of National Library of Latvia.
http://lira.lanet.lv/F/R6BKVCND2BH4VX9Q5GQAJL6INY9ALU5AK376BBHKF1FPAFMH38-30403?func=find-b-0&local_base=lnc10
(consulted 15 April 2012).
Digitizer's Handbook v 3.3. http://www.lnb.lv/en/digital-library/forpartners/digitzers-handbook.pdf. (consulted 15 April 2012).
National Library of Latvia. Digital Library. http://www.lnb.lv/en/digital-library
(consulted 15 April 2012).
Stūrmane, A., Eglīte, E. (2009). Sadarbības iespējas autoritatīvo datu
izmantošanā [Authority records: potential for cooperation]. Latvijas
Nacionālās bibliotēkas zinātniskie raksti, 125-144. Retrieved April 15,
2012, from http://www.lnb.lv/lv/parlnb/zinatiskie-raksti/LNB-ZR-2009-Sturmane-Eglite.pdf
Zarins, U. (2011). Latvia: Latvian National Digital Library "Letonica". Uncommon
Culture, Vol 2, no. 1(3) : From Closed Doors to Open Gates. Retrieved April 15, 2012, from
http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/UC/article/view/3628/3001
About the authors
Elita Eglīte is the chief librarian in the Cataloging Department of the
National Library of Latvia and has been head of the Indexing group since
2005. She revises, edits and develops subject authority records, trains
other cataloguers and develops national subject cataloguing policy.
Contact: Email: elita.eglite@lnb.lv
Aiva Stūrmane has been the head of the Cataloging Department of the
National Library of Latvia since 1999. Contact: Email: aiva.sturmane@lnb.lv
Mārīte Jankevica-Balode has worked as chief metadata editor for the
Digital Library "Letonica" since December 2006.
Contact: Email: marite.jankevica@lnb.lv