First steps in metadata
Ann Chapman
Policy and Advice team, UKOLN

UKOLN is supported by:

www.ukoln.ac.uk A centre of expertise in digital information management www.bath.ac.uk

What is metadata?
• Structured data about something
• Encountered every day
  – bus & rail timetables
  – phone directories
  – Internet shopping sites (e.g. Amazon)
  – ingredient lists on food items
  – calendars (public holidays, religious festivals)
  – event programmes (e.g. seminar, workshop)

More about metadata
• Structured data about resources
  – Library catalogues
  – Abstracting and indexing services
  – Archival finding aids
  – Museum documentation
  – Collection description
  – Community information
• Carriers
  – Formats (e.g. MARC)
  – Markup languages (e.g. HTML, SGML, XML)

Markup languages
• SGML = Standard Generalised Markup Language - controls document formatting for publication
• XML = Extensible Markup Language - “next generation” SGML
• HTML = HyperText Markup Language - SGML application, controls display of web pages
All use tags (usually paired) to structure text into elements, e.g. headings, paragraphs, lists, etc.
<title> </title> <p> </p> <li> </li>

Overview
• MARC
• ONIX
• Dublin Core & application profiles
• RSLP Collection Description
• MARC 21 Community Information
• Other metadata types

MARC Formats
• MAchine Readable Catalogue records
  – Library of Congress, 1960s
  – Now widespread use in many countries
  – Catalogue once, use record many times
  – Holdings can be attached
  – 1960s: books, serials, maps, music scores
  – 2006: any physical or digital resource

MARC - structure
• Structured format and carrier
• Numeric and alpha tags
• Fixed fields
  – Leader, 001-008, 010-099
• Variable fields
  – 100, 110, 111, 245, 260, etc.

MARC - elements
• 1XX Main entry
• 2XX Title, Statement of Responsibility, edition, publication
• 3XX Physical description
• 4XX Series information
• 5XX Notes
• 6XX Subject access
• 7XX Added entries (alternative titles, multiple authors, etc.)
• 8XX Added entries for series
• 9XX References and local use fields

MARC 21 record
021 $a 0761952926
082 $s 338.9 $c 21
100 $a Nederveen Pieterse, Jan P.
245 $a Development theory: $b deconstruction.
260 $a London: $b Sage, $c 2001
300 $a xii, 195p.
    $c 25cm $e cased
440 $a Theory, culture and society
650 $a Economic development

ONIX Formats
• Primary use
  – Publishers to Internet booksellers
  – Rich product information
• 3 Formats for product information metadata
  – Books, Serials, Licensing Terms
• ONIX for Books in use:
  – First version 1999
  – Current version release 2.0 (2001)
• Carrier
  – XML
• Elements
  – XML reference name and tag

ONIX - elements
• Message header
• Product record
  – identifiers, author, title, edition, language, subject, audience, descriptions, publisher, dates
  – territorial rights, dimensions, suppliers, availability, promotions
• Main series and sub-series records

ONIX for Books - record
<ISBN> 0123456789 </ISBN>
<DistinctiveTitle> Alice in Wonderland </DistinctiveTitle>
<Contributor>
  <ContributorRole> Author </ContributorRole>
  <PersonNameInverted> Carroll, Lewis </PersonNameInverted>
</Contributor>
<Publisher> Collins </Publisher>
<PublicationDate> 2000 </PublicationDate>

Dublin Core - structure
• Simple resource discovery
• DCMES – Dublin Core Metadata Element Set
• HTML the most common ‘carrier’
• Comprises 15 elements with
  – Element qualifiers
  – Element encoding schemes
  – Optional/mandatory elements
• Application profiles

Dublin Core - elements
Title, Creator, Subject, Description, Publisher,
Contributor, Date, Resource Type, Format, Resource identifier,
Source, Language, Relation, Coverage, Rights

Dublin Core - record
<title> Alice in Wonderland </title>
<creator> Lewis Carroll </creator>
<subject><LCSH> Fiction </LCSH></subject>
<publisher> Project Gutenberg </publisher>
<date> 2000 </date>
<format> ASCII file via FTP </format>
<identifier> http://promo.net/pg/… </identifier>

RSLP Collection Description
• Schema developed May 2000 for RSLP programme
• MS Access database for RSLP – summer 2001
• Web-based implementations: Revealweb, Cornucopia, Backstage, PADDI, MASC25, SCONE, Cecilia, RASCAL
• Based on same model: SCONE
• General attributes
• Subject
• Dates
• Associated agents
• External relationships

Coll. Desc. - elements
General: title, identifier, description, strength, physical characteristics, language, type, access control, accrual status, legal status, custodial history, note, location
Subject: concept, object, name, place, time
Dates: accumulation, contents
Agents: creator, owner
Relationships: sub & super-collections, catalogues and descriptions, associated collections and publications

Coll. Desc. - record
Title: Pitman Collection
Strength: Shorthand – national significance
Phys.Char.: printed texts and manuscripts
Lang: English, Spanish, Esperanto, ….
Access: Written request to the Librarian, University of Bath
Accrual: passive, deposit
Location: The Library, University of Bath, Bath
Subject: shorthand, Sir Isaac Pitman, phonetic alphabets
Owner: Pitman Publishing Co.
Catalogue: University of Bath Library OPAC

MARC 21 Community Information
• Same principles as MARC 21 Bibliographic
• Leader
  – Individual / organization / program / event / other
• Fixed fields
  – 001-008, 010-099 fixed fields
  – 007 disability facilities
  – 008 special aspects
• Variable fields

M 21 Comm. Inf. – elements
1XX Name
2XX Title and Address
3XX Physical description
4XX Series (for events)
5XX Notes
6XX Subject access
7XX Added entries
8XX Other variable fields

M 21 Comm. Inf. – record
110 $a CILIP
245 $a CILIP HQ
247 $a LA HQ $f 19?? – 2002
270 $a Ridgmount St, London WC1E 7AE $k 020 7255 0505 $m info@cilip.org.uk $r 9am to 6pm
311 $a Ewart Room $d seats 50 $g £100 per day
312 $a Overhead projector $f £10 per day
581 $a Library + Information Update
856 $a http://www.cilip.org.uk

Other metadata formats
• IEEE LOM – learning object metadata
• EAD – Encoded Archival Description
• Theatre Information Group DTD – performance data

Metadata – fit for purpose
• MARC 21 Bibliographic – libraries
• ONIX – book trade and libraries
• Dublin Core – Internet
• EAD – archives
• Collection description – archives, libraries, museums
• M21 Community Information – primarily libraries

Contact details
Ann Chapman
a.d.chapman@ukoln.ac.uk
UKOLN
University of Bath, Bath BA2 7AY
www.ukoln.ac.uk
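
Worked example (illustrative only)
The MARC sample records in these slides are written as a numeric tag followed by "$"-coded subfields. The Python sketch below shows one way such display lines might be split into tags and subfields, and how a few of them could be carried over to Dublin Core element names. It assumes the simplified display form used on the slides, not the ISO 2709 exchange format. The function names and the small CROSSWALK table are hypothetical and were chosen for this example; they are not an official MARC to Dublin Core mapping.

```python
import re

# Matches one "$x value" subfield in the slide-style display form.
SUBFIELD = re.compile(r"\$(\w)\s+([^$]*)")


def parse_field(line):
    """Split one display line into (tag, [(subfield code, value), ...])."""
    tag, _, rest = line.strip().partition(" ")
    subfields = [(code, value.strip()) for code, value in SUBFIELD.findall(rest)]
    return tag, subfields


def parse_record(text):
    """Parse a multi-line display record, skipping blank lines."""
    return [parse_field(line) for line in text.splitlines() if line.strip()]


# Hypothetical crosswalk for this sketch only: (tag, subfield) -> DC element.
CROSSWALK = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("650", "a"): "subject",
}


def to_dublin_core(record):
    """Collect mapped subfield values under Dublin Core element names."""
    dc = {}
    for tag, subfields in record:
        for code, value in subfields:
            element = CROSSWALK.get((tag, code))
            if element:
                # Drop trailing ISBD-style punctuation such as ":" or ",".
                dc.setdefault(element, []).append(value.rstrip(" :,"))
    return dc


# Fields copied from the MARC sample record in the slides.
record = parse_record("""
100 $a Nederveen Pieterse, Jan P.
245 $a Development theory: $b deconstruction.
260 $a London: $b Sage, $c 2001
650 $a Economic development
""")

dc = to_dublin_core(record)
for element, values in dc.items():
    print(element, "=", "; ".join(values))
```

Each Dublin Core element holds a list because the elements are repeatable, for example when a record has several 650 subject fields.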