Finding stuff and using stuff
Gordon Dunsire
• What is metadata?
• What does it look like?
• What is it used for?
• How does it work?
• Where will it all end?
• “Data about data”
• Information about information
• Information about an information resource
• Useful information about a resource
• Useful information about specific aspects of a resource
• Whatever, there’s a lot of it about
http://www.slainte.org.uk/files/pdf/cilips/foisa04.pdf
Freedom of Information (Scotland) Act 2002: a guide for the information professional
“http” = how to get the document (protocol)
“www.slainte.org.uk” = where to find the document in cyberspace (domain)
“files/pdf/cilips” = where the document is stored (path)
“foisa04” = the name of the document (file name)
“pdf” = the type of document (file type)
“:”, “/”, “.” = standard punctuation separating each piece of information (element)
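The breakdown above can be reproduced mechanically. The following is a minimal sketch using Python's standard library; the dictionary labels ("protocol", "domain", and so on) simply echo the wording used above and are not official URL terminology.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

url = "http://www.slainte.org.uk/files/pdf/cilips/foisa04.pdf"

parts = urlsplit(url)                # splits on the ":" and "//" punctuation
path = PurePosixPath(parts.path)     # splits the rest on "/" and "."

# Labels follow the slide wording; they are illustrative names only.
elements = {
    "protocol": parts.scheme,                     # how to get the document
    "domain": parts.netloc,                       # where to find it
    "path": str(path.parent).lstrip("/"),         # where it is stored
    "file name": path.stem,                       # name of the document
    "file type": path.suffix.lstrip("."),         # type of document
}

for label, value in elements.items():
    print(f"{label}: {value}")
```

The point is that the punctuation carries no content of its own; it only tells software where one element stops and the next begins.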
The adventures of Sherlock Holmes / by A. Conan Doyle ; illustrations by Sidney Paget. - London : G. Newnes, 1895.
“The adventures of Sherlock Holmes” = title of the book
“by A. Conan Doyle; illustrations by Sidney Paget” = who is responsible for the creative content of the book
“London” = place of publication, “G. Newnes” = name of publisher
“1895” = date of publication
“/”, “.”, “-”, “:” = standard punctuation separating each element
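Because the punctuation is standardised, a catalogue description like this can also be split into elements by a program. The sketch below is an assumption-laden illustration: the regular expression and the group names are invented for this one record shape and do not cover the full range of real catalogue descriptions.

```python
import re

record = (
    "The adventures of Sherlock Holmes / by A. Conan Doyle ; "
    "illustrations by Sidney Paget. - London : G. Newnes, 1895."
)

# Hypothetical pattern: handles only "title / responsibility. - place : publisher, date."
PATTERN = re.compile(
    r"^(?P<title>.+?) / "           # " / " ends the title
    r"(?P<responsibility>.+?)\. - " # ". - " ends the responsibility statement
    r"(?P<place>.+?) : "            # " : " separates place from publisher
    r"(?P<publisher>.+?), "         # ", " separates publisher from date
    r"(?P<date>\d{4})\.$"           # four-digit year, then the final full stop
)

match = PATTERN.match(record)
elements = match.groupdict() if match else {}

for label, value in elements.items():
    print(f"{label}: {value}")
```

Note that the full stop in "A. Conan Doyle" does not confuse the pattern, because only the full sequence ". - " is treated as a separator.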
Date    |Title               |Date|Sup|Price|Number
10/02/65|Physics is fun      |1964|THI|  7/6| 20156
10/02/65|Physics is fun      |1964|THI|  7/6| 20157
10/02/65|Berkeley physics v.1|1964|FAR|3/9/6| 20158
10/02/65|Berkeley physics v.2|1964|FAR|2/7/0| 20159
10/02/65|Berkeley physics v.3|1964|FAR|2/7/6| 20160
10/02/65|Berkeley physics v.4|1964|FAR|3/9/6| 20161
10/02/65|Berkeley physics v.5|1964|FAR|3/9/6| 20162
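A register laid out like this is already structured metadata: each "|" marks an element boundary, so it can be read by a program. The sketch below uses Python's csv module on the rows shown above. Rows are kept as positional lists rather than dictionaries because the header uses "Date" twice; counting by the "Sup" column is just one hypothetical use.

```python
import csv
import io
from collections import Counter

register = """\
Date |Title |Date|Sup|Price|Number
10/02/65|Physics is fun |1964|THI| 7/6| 20156
10/02/65|Physics is fun |1964|THI| 7/6| 20157
10/02/65|Berkeley physics v.1 |1964|FAR|3/9/6| 20158
10/02/65|Berkeley physics v.2 |1964|FAR|2/7/0| 20159
10/02/65|Berkeley physics v.3 |1964|FAR|2/7/6| 20160
10/02/65|Berkeley physics v.4 |1964|FAR|3/9/6| 20161
10/02/65|Berkeley physics v.5 |1964|FAR|3/9/6| 20162
"""

reader = csv.reader(io.StringIO(register), delimiter="|")
header = [cell.strip() for cell in next(reader)]        # first line holds the labels
rows = [[cell.strip() for cell in row] for row in reader]

# Column 3 is "Sup"; count how many items carry each code.
items_per_sup = Counter(row[3] for row in rows)

print(header)
print(items_per_sup)
```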
• Information retrieval (finding stuff)
– Searching
• Lists of metadata elements (title, authors, publisher, etc.)
• Words in (digital) metadata (title, notes, etc.)
– Identifying
• Descriptive metadata (title, notes, edition, date, etc.)
– Finding
• Item metadata (shelfmark, barcode, etc.)
• Stock management (managing stuff)
– Acquisition
• Date, cost, supplier, etc.
– Storage
• Collection, shelfmark
– Circulation
• Barcode
– Preservation
• Format (serial, a-v, digital, etc.), date (age), etc.
• Automated processing (using stuff)
– Information retrieval
• OPACs
– Access to digital resources
• Getting via Web browser, file transfer, etc.
• Displaying using browser plug-ins, etc.
– Multiple metadata records in multiple electronic locations with different metadata formats
• A metadata record is (usually) significantly smaller than the stuff it describes
– Catalogue card vs book
– Metadata is a precis or abstract of those aspects of the data deemed useful for retrieval, management, processing, etc.
– Abbreviations and codes are often used
– Some exceptions include small manuscripts with a long history …
• Different types of information resource require different metadata elements
– Some elements are common; e.g. title, date
– Publication pattern and frequency are specific to serial resources
– URLs don’t apply to printed books
– Local preservation metadata is not required for remote digital resources
– Etc.
• Many resources are composed of other resources, so metadata can be applied at different levels of “granularity”
– In library catalogues, journals usually have metadata about the journal as a whole, and not about individual articles
• Articles have metadata in abstract and indexing services
– Some libraries catalogue multi-media kits as a whole; others catalogue each component
• A benefit of metadata is consistency and coherence in using and processing resources
– Resources themselves come with the widest variation in “intrinsic” metadata
• Forms of title, etc.; layout; completeness; etc.
– Metadata can be created consistently and structured coherently to improve effectiveness and efficiency in its use
• Similarities and differences easier to spot
• Ensuring consistent metadata is not simple
– Common and format-specific elements as well as creative reaction to “the norm”
• “Ceci n’est pas une pipe”
– Natural variation in naming and describing things
• J. Smith, John Smith, John Smith (Labour), etc.
• Requires standards and guidance
• Coherent set of elements organised (structured and labelled) in a consistent way – a schema (loosely)
– “Title” or “Caption”? Include the subtitle or use a “Subtitle” element? Always include a title?
• Guidance on identifying and interpreting elements in the resource
– Title on spine, cover or title-page?
• Guidance on standardising content
– Include “The” at the start of the title?
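The question about a leading "The" matters because it changes where a title files in an alphabetical list. As a rough sketch of one possible rule (the function name filing_key and the short list of English articles are assumptions made for illustration, not taken from any cataloguing standard), the title can be stored as it appears and the article skipped only when sorting:

```python
# Hypothetical rule: ignore a leading English article when filing.
LEADING_ARTICLES = ("the ", "a ", "an ")


def filing_key(title: str) -> str:
    """Return a lower-cased sort key with any leading article removed."""
    lowered = title.strip().lower()
    for article in LEADING_ARTICLES:
        if lowered.startswith(article):
            return lowered[len(article):]
    return lowered


titles = [
    "The adventures of Sherlock Holmes",
    "Physics is fun",
    "Berkeley physics v.1",
]

# The stored titles are unchanged; only the sort order is affected.
sorted_titles = sorted(titles, key=filing_key)
print(sorted_titles)
```

Whatever rule is chosen, the benefit comes from everyone applying the same one.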
• Achieving consistency benefits local users of metadata (efficient, effective)
• Self-propelled users become non-local, so there are benefits in achieving consistency between libraries
• And metadata creation is complex (expensive), so there is value in sharing records
• So national and international standards have been used since the first modern library catalogues (100+ years)
• With significant evolution from the 1960s
– Computers; “machine-readable cataloguing”
• And again from the 1990s
– Internet/Web; “common information environment” including archives and museums
• MARC21 (21st century machine-readable cataloguing)
– 40 years old; covers wide range of library stuff in depth
• Difficult to use - requires professional training
• DC (Dublin Core) – Ohio, that is
– 10 years old; covers wider range of stuff (archives, museums) at much less depth
• Easier to use by a wider range of people
• DC/MARC structures can interoperate via element mappings
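An element mapping (often called a crosswalk) can be pictured as a lookup table from one schema's labels to another's. The sketch below is deliberately simplified: the mapping table is an illustrative assumption and not an official crosswalk, the record is a plain dictionary rather than real MARC, and the function name to_dublin_core is hypothetical.

```python
# Illustrative, simplified mapping from (MARC tag, subfield) to a DC element.
# This is an assumption for demonstration, not an official crosswalk.
MARC_TO_DC = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
}

# Toy record keyed by (tag, subfield); real MARC records are more complex.
marc_record = {
    ("245", "a"): "The adventures of Sherlock Holmes",
    ("100", "a"): "A. Conan Doyle",
    ("260", "a"): "London",
    ("260", "b"): "G. Newnes",
    ("260", "c"): "1895",
}


def to_dublin_core(record):
    """Copy each mapped value into a list under its DC element name."""
    dc = {}
    for field, value in record.items():
        element = MARC_TO_DC.get(field)
        if element is not None:      # unmapped fields are dropped
            dc.setdefault(element, []).append(value)
    return dc


dc_record = to_dublin_core(marc_record)
print(dc_record)
```

In this toy mapping the place of publication has no target element and is lost, which illustrates the trade-off noted above: DC covers a wider range of stuff at much less depth.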
• AACR (Anglo-American Cataloguing Rules)
– Older than MARC; covers wide range of library stuff in depth
• Complements MARC; requires professional training
– Undergoing radical development as RDA (Resource Description and Access)
• Becoming suitable for DC and other formats
• Content interoperability
• Many formats in use
• Wide variation in coverage and content
• No longer created exclusively by trained professionals
– Wider “interpretation” of the rules (if any)
• Needs to be joined-up so it can be used effectively at a global (non-local) level
– Interoperability!
• Caters to a wider range of users
• Public/life-long learners/local business; staff/students; teachers/learners/researchers; archives/libraries/museums
• Covers a wider range of resources
• Originals/digitised copies; complex websites/blogs/wikis; archives/libraries/museums
• Is created by a wider range of people
• Acquisitions/cataloguing/serials; webpage writers/online reviewers/wikis/folksonomists
• Metadata is useful information about specific aspects of a resource
• Specific aspects are structured and labelled as metadata elements
• Different types of resource have different sets of elements, with a common core set
• Non-local use is increasingly important
• Standards are evolving to improve usefulness
My card
Dunsire, Gordon
Me / My parents. - Kirkcaldy : The parents, 1951.
g.dunsire@strath.ac.uk