THE UNIVERSITY OF TEXAS AT AUSTIN
School of Information

Metadata
Danielle Cunniff Plumer
INF 385T: Knowledge Management Systems
Prof. Don Turnbull
February 11, 2003
Revised

What is Metadata?
• “[structured] data about data”
• Irrelevant factoid: A company called Metadata®, in attempting to protect its trademark, has threatened legal action against businesses for using the term generically. See http://www.metadata.com/

Metadata definition
• Meta data, n. [from L. meta-; Gr. meta- and L. data] Data that characterizes other data in a reflexive way, e.g., data about data. Analogous to words about words. In data processing, it is definitional data that provides information about or documentation of other data managed within an application or environment. For example, meta data would document data about DATA ELEMENTS or ATTRIBUTES, (name, size, data type, etc) and data about RECORDS or DATA STRUCTURES (length, fields/columns, etc) and data about DATA (where it is located, how it is associated, ownership, etc.). Meta data may include descriptive information about the context, quality and condition, or characteristics of the data.
  – Retrieved February 11, 2003, from http://www.metadata.com/word.htm

Dublin Core
• “The Dublin Core Metadata Initiative (DCMI) is an organization dedicated to fostering the widespread adoption of interoperable metadata standards and promoting the development of specialized metadata vocabularies for describing resources to enable more intelligent resource discovery systems.”
  – Retrieved February 11, 2003, from http://www.dublincore.org/resources/faq/

DCMI Elements
• Originally 15 elements
• Currently, the full set includes 46 elements
  – http://dublincore.org/usage/terms/dc/current-elements/

Basic DCMI Elements
• title
• creator
• subject
• description
• publisher
• contributor
• date
• type
• format
• identifier
• source
• language
• relation
• coverage
• rights

Questions to ponder
• HTML <meta> tags are currently not used by search engines. Will this change? How/why?
• Dornfest and Brickley talk about “implicit metadata” on p. 193. What are the implications of this?
• Problems of ambiguous metadata (p. 194): how does XML resolve this (namespaces)?

Dublin Core Generators
• Nordic DC metadata creator (including URN generator) http://www.lub.lu.se/cgi-bin/nmdc.pl
• For more Dublin Core tools, see http://dublincore.org/tools/

XML
• eXtensible Markup Language
• XML is a metalanguage; that is, it allows the creation of domain-specific markup languages based on a structured syntax
• XML says nothing about meaning

RDF
• Resource Description Framework
• “The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web.
It is particularly intended for representing metadata about Web resources”
  – Retrieved February 11, 2003, from http://www.w3.org/TR/rdf-primer/

RDF Illustrated
[Figure: A simple RDF statement.]
  – Source: Manola, Frank, & Miller, Eric. (2003). RDF Primer. Retrieved February 11, 2003, from http://www.w3.org/TR/rdf-primer/

Sample RDF Metadata

<rdf:RDF xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:doc="http://www.w3.org/2000/10/swap/pim/doc#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.w3.org/2001/02pd/rec54#">
  <WD rdf:about="http://www.w3.org/TR/2003/WD-rdf-primer-20030123/">
    <rdf:type rdf:resource="http://www.w3.org/2001/02pd/rec54#LastCall"/>
    <dc:date>2003-01-23</dc:date>
    <dc:title>RDF Primer</dc:title>
    <lastCallFeedBackDue>2003-02-21</lastCallFeedBackDue>
    <cites><ActivityStatement rdf:about="http://www.w3.org/2001/sw/Activity"/></cites>
    <doc:versionOf rdf:resource="http://www.w3.org/TR/rdf-primer/"/>
    <doc:obsoletes rdf:resource="http://www.w3.org/TR/2002/WD-rdf-primer-20021111/"/>
    <editor rdf:parseType="Resource">
      <contact:fullName>Frank Manola</contact:fullName>
      <contact:mailbox rdf:resource="mailto:fmanola@mitre.org"/>
    </editor>
    <editor rdf:parseType="Resource">
      <contact:fullName>Eric Miller</contact:fullName>
      <contact:mailbox rdf:resource="mailto:em@w3.org"/>
    </editor>
  </WD>
</rdf:RDF>

  – RDF Primer Metadata. Retrieved February 21, 2003, from http://www.w3.org/TR/rdf-primer/metadata.rdf

Questions to ponder
• Dornfest and Brickley claim that “Unique identifiers create markets” (p. 201). Their examples:
  – Collaborative filtering
  – E-commerce
  – Discovery
• Are they right? Or is this more visionary hyperbole?
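
Reading the sample metadata (sketch)
The Sample RDF Metadata slide depends on XML namespaces to keep dc:title distinct from any other vocabulary's "title". As a minimal sketch, an abbreviated copy of that record can be read with Python's standard library. This treats the file as plain XML rather than using a real RDF parser, and the function name describe and the prefix "rec" (standing in for the default namespace) are illustrative choices, not part of the original record.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the RDF Primer metadata record from the slides.
RDF_XML = """\
<rdf:RDF xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.w3.org/2001/02pd/rec54#">
  <WD rdf:about="http://www.w3.org/TR/2003/WD-rdf-primer-20030123/">
    <dc:date>2003-01-23</dc:date>
    <dc:title>RDF Primer</dc:title>
    <editor rdf:parseType="Resource">
      <contact:fullName>Frank Manola</contact:fullName>
    </editor>
    <editor rdf:parseType="Resource">
      <contact:fullName>Eric Miller</contact:fullName>
    </editor>
  </WD>
</rdf:RDF>
"""

# Prefix-to-URI map used for lookups. "rec" is our own label for the
# document's default namespace; the URIs themselves come from the record.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "contact": "http://www.w3.org/2000/10/swap/pim/contact#",
    "rec": "http://www.w3.org/2001/02pd/rec54#",
}


def describe(rdf_xml):
    """Pull a few Dublin Core and contact fields out of the record."""
    root = ET.fromstring(rdf_xml)
    wd = root.find("rec:WD", NS)  # the Working Draft being described
    return {
        # Attributes are addressed by their fully qualified name.
        "about": wd.get("{%s}about" % NS["rdf"]),
        "title": wd.findtext("dc:title", namespaces=NS),
        "date": wd.findtext("dc:date", namespaces=NS),
        "editors": [
            editor.findtext("contact:fullName", namespaces=NS)
            for editor in wd.findall("rec:editor", NS)
        ],
    }


print(describe(RDF_XML))
```

Because every lookup names a namespace URI, a title element from some other vocabulary would not be mistaken for dc:title, which is the disambiguation the earlier namespace question points to.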

The Semantic Web
• Elements:
  – XML
  – RDF
  – Ontologies
  – Agents
  – Digital signatures

Ontology
• “an ontology is a document or file that formally defines the relations among terms. The most typical kind of ontology for the Web has a taxonomy and a set of inference rules”
  – Source: Berners-Lee, T., Hendler, J. and Lassila, O. (2001). The Semantic Web. Scientific American. Retrieved February 11, 2003, from http://www.scientificamerican.com/print_version.cfm?articleID=00048144-10D2-1C7084A9809EC588EF21

Semantics
• SYLLABICATION: se·man·tics
• PRONUNCIATION: sĭ-măn′tĭks
• NOUN: (used with a sing. or pl. verb)
  1. Linguistics The study or science of meaning in language.
  2. Linguistics The study of relationships between signs and symbols and what they represent. Also called semasiology.
  3. The meaning or the interpretation of a word, sentence, or other language form: We're basically agreed; let's not quibble over semantics.
  – The American Heritage Dictionary. (4th ed., 2000). Retrieved February 11, 2003, from http://www.bartleby.com/61/83/S0248300.html

Questions to Ponder
• What does “Semantic” mean in the context of the “Semantic Web?”
• Should there be one central ontology or multiple ontologies?
• What are the privacy and security implications of agents?
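
Taxonomy plus inference rule (sketch)
The Ontology slide describes the typical Web ontology as a taxonomy together with a set of inference rules. A toy sketch in Python shows how the two parts combine. The terms in TAXONOMY and the function names are invented for illustration, and the single hand-written rule stands in for what a real ontology language would express declaratively.

```python
# Hypothetical taxonomy: each term maps to its broader term ("is-a").
TAXONOMY = {
    "WorkingDraft": "TechnicalReport",
    "TechnicalReport": "Document",
    "Document": "Resource",
}


def broader_terms(term, taxonomy=TAXONOMY):
    """Apply one inference rule: if A is-a B and B is-a C, then A is-a C.

    Returns every broader term reachable from `term`, nearest first.
    """
    found = []
    while term in taxonomy:
        term = taxonomy[term]
        if term in found:  # stop if the taxonomy accidentally loops
            break
        found.append(term)
    return found


def is_a(term, candidate, taxonomy=TAXONOMY):
    """True if `term` is `candidate` or falls under it in the taxonomy."""
    return term == candidate or candidate in broader_terms(term, taxonomy)


print(broader_terms("WorkingDraft"))
print(is_a("WorkingDraft", "Document"))
```

The taxonomy only states direct relations; the fact that a WorkingDraft is a Document is never written down and is derived by the rule, which is the kind of conclusion an agent would rely on.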