Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation.

A Five Minute Intro to XML
Roger L. Costello
The MITRE Corporation

You have data … How should you structure it?
Here's some data about a Canon camera:
    Canon-Sure-Shot-Z155
    37-155mm zoom
    4.8-11.7
    $318 USD

The XML approach is to "wrap" each data item in start/end tags
Canon.xml:
    <Camera>
      <name>Canon-Sure-Shot-Z155</name>
      <f-stop>4.8-11.7</f-stop>
      <focal-length>37-155mm zoom</focal-length>
      <cost>$318 USD</cost>
    </Camera>

XML Terminology
    <name>Canon-Sure-Shot-Z155</name>
Start tag: <name>
End tag: </name>
Data: Canon-Sure-Shot-Z155
Element: the start tag, the data, and the end tag together

What's the date?
    <date>05-10-99</date>
The XML tags are generally very useful in describing properties of the data. However, in this case more is needed. That is, more data about the data (metadata) is needed.

XML Attributes
    <date format="dd-mm-yy">05-10-99</date>
Attribute: format="dd-mm-yy"
Things to note:
1. Attributes are "bundled" in with the start tag.
2. Attributes take this form: attribute-name="attribute-value"

Common Design Pattern: IDentify classes
Instead of this:
    <Camera>
      <name>Canon-Sure-Shot-Z155</name>
      <f-stop>4.8-11.7</f-stop>
      <focal-length>37-155mm zoom</focal-length>
      <cost>$318 USD</cost>
    </Camera>
Design like this:
    <Camera ID="Canon-Sure-Shot-Z155">
      <f-stop>4.8-11.7</f-stop>
      <focal-length>37-155mm zoom</focal-length>
      <cost>$318 USD</cost>
    </Camera>

The General Design Pattern
    <Class ID="instance">
      <property1>data value</property1>
      <property2>data value</property2>
      ...
      <property-n>data value</property-n>
    </Class>
Things to note:
1.
Names of Classes by convention begin with uppercase.
2. Names of properties by convention begin with lowercase.

Example
Hunts.xml:
    <CameraStore ID="Hunts-Camera-Store">
      <location>Malden, MA</location>
      <inventory>
        <Camera ID="Canon-Sure-Shot-Z155">
          <f-stop>4.8-11.7</f-stop>
          <focal-length>37-155mm zoom</focal-length>
          <cost>$318 USD</cost>
        </Camera>
        <Camera ID="Olympus-mju-II">
          <f-stop>2.8</f-stop>
          <focal-length>35mm</focal-length>
          <cost>$178 USD</cost>
        </Camera>
        ...
      </inventory>
    </CameraStore>

With OWL you can define the Camera Class (which is used in the XML document)
    <owl:Class rdf:ID="Camera">
    </owl:Class>
• "Qualify" the element (Class) to indicate that it is provided by OWL.
• "Qualify" the attribute (ID) to indicate that it is provided by RDF.
Legend: elements and attributes provided by OWL; elements and attributes provided by RDF.

Why use XML?
• It is a universally accepted standard way of structuring data (syntax).
• It is a W3C recommendation (W3C = World Wide Web Consortium).
• The marketplace supports it with a lot of free/inexpensive tools.
• The alternative to using XML is to define your own proprietary data syntax, and then build your own proprietary tools to support the proprietary syntax (not a very appealing idea).

A Quick Introduction to OWL
Web Ontology Language
Roger L. Costello
David B. Jacobs
The MITRE Corporation
(The creation of this tutorial was sponsored by DARPA)

What is OWL?
Answer: OWL is a set of XML elements and attributes, with standardized meaning, that are used to define terms and their relationships.
OWL extends RDF Schema:
[Diagram: OWL elements and attributes (i.e., OWL Vocabulary): Class, equivalentProperty, sameIndividualAs, ... RDF Schema: subClassOf, resource, ID, ...]

Example of using OWL to define two terms and their relationship
Example: Define the terms "Camera" and "SLR". State that SLRs are a type of Camera.
Here's how these two terms (classes) and their relationship are defined using the OWL vocabulary:
    <owl:Class rdf:ID="Camera"/>
    <owl:Class rdf:ID="SLR">
      <rdfs:subClassOf rdf:resource="#Camera"/>
    </owl:Class>

Quick Intro Contents
• In this quick intro we present an example to demonstrate one of the utilities of OWL:
  – The example shows how OWL can be used to bridge terminology differences and thus enhance interoperability.

Example: Bridging the Terminology Gap using OWL
• A key problem in achieving interoperability is recognizing that two pieces of data are talking about the same thing, even though different terminology is being used.
• The following slides present an example to show how OWL may be used to bridge the "terminology gap".

Interested in Purchasing a Camera
• Scenario:
  – I am interested in purchasing a camera with a 75-300mm zoom lens size, that has an aperture of 4.5-5.6, and a shutter speed that ranges from 1/500 sec. to 1.0 sec.
  – I launch my personal "Web Bot" which crawls the Web looking for Web sites that can fulfill my request.
  – Assume that there exists an OWL Camera Ontology, which the Web Bot can "consult" upon its travels across the Web.

Is this document relevant?
The Web Bot finds this document at a Web site: Is it relevant?
(Note: SLR = Single Lens Reflex)
    <PhotographyStore rdf:ID="Hunts"
                      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <store-location>Malden, MA</store-location>
      <phone>617-555-1234</phone>
      <catalog rdf:parseType="Collection">
        <SLR rdf:ID="Olympus-OM-10" xmlns="http://www.camera.org#">
          <lens>
            <Lens>
              <focal-length>75-300mm zoom</focal-length>
              <f-stop>4.5-5.6</f-stop>
            </Lens>
          </lens>
          <body>
            <Body>
              <shutter-speed rdf:parseType="Resource">
                <min>0.002</min>
                <max>1.0</max>
                <units>seconds</units>
              </shutter-speed>
            </Body>
          </body>
          <cost rdf:parseType="Resource">
            <rdf:value>325</rdf:value>
            <currency>USD</currency>
          </cost>
        </SLR>
      </catalog>
    </PhotographyStore>

A Match?
    <PhotographyStore rdf:ID="Hunts" xmlns:rdf="&rdf;#">
      <store-location>Malden, MA</store-location>
      <phone>617-555-1234</phone>
      <catalog rdf:parseType="Collection">
        <SLR rdf:ID="Olympus-OM-10" xmlns="http://www.camera.org#">
          <lens>
            <Lens>
              <focal-length>75-300mm zoom</focal-length>
              <f-stop>4.5-5.6</f-stop>
            </Lens>
          </lens>
          <body>
            <Body>
              <shutter-speed rdf:parseType="Resource">
                <min>0.002</min>
                <max>1.0</max>
                <units>seconds</units>
              </shutter-speed>
            </Body>
          </body>
          <cost rdf:parseType="Resource">
            <rdf:value>325</rdf:value>
            <currency>USD</currency>
          </cost>
        </SLR>
      </catalog>
    </PhotographyStore>
Match? I am interested in purchasing a camera with a 75-300mm zoom lens size, that has an aperture of 4.5-5.6, and a shutter speed that ranges from 1/500 sec. to 1.0 sec.
To determine if there is a match, these questions must be answered:
1. What's the relationship between "SLR" and "Camera"?
2. What's the relationship between "focal-length" and "size"?
3. What's the relationship between "f-stop" and "aperture"?

Relationship between SLR and Camera?
The Web Bot "consults" the OWL Camera Ontology.
This OWL statement tells the Web Bot that an SLR is a type of Camera:
    <owl:Class rdf:ID="SLR">
      <rdfs:subClassOf rdf:resource="#Camera"/>
    </owl:Class>
[Diagram: the Web Bot reads Hunts.xml and consults Camera.owl]
Hunts.xml:
    <PhotographyStore rdf:ID="Hunts">
      <SLR> … </SLR>
    </PhotographyStore>
Web Bot asks: "Relationship between Camera and SLR?"
Camera.owl:
    <owl:Class rdf:ID="SLR">
      <rdfs:subClassOf rdf:resource="#Camera"/>
    </owl:Class>
Camera.owl answers: "SLR is a type of Camera."

Relationship between focal-length and lens size?
This OWL statement tells the Web Bot that focal-length is equivalent to lens size:
    <owl:DatatypeProperty rdf:ID="focal-length">
      <owl:equivalentProperty rdf:resource="#size"/>
      <rdfs:domain rdf:resource="#Lens"/>
      <rdfs:range rdf:resource="&xsd;#string"/>
    </owl:DatatypeProperty>
"focal-length is synonymous with (lens) size. focal-length is to be used within a Lens. focal-length has a value that is a string."

Relationship between f-stop and aperture?
This OWL statement tells the Web Bot that f-stop is equivalent to aperture:
    <owl:DatatypeProperty rdf:ID="f-stop">
      <owl:equivalentProperty rdf:resource="#aperture"/>
      <rdfs:domain rdf:resource="#Lens"/>
      <rdfs:range rdf:resource="&xsd;#string"/>
    </owl:DatatypeProperty>
The Web Bot now recognizes that the XML document it found at the Web site
- is talking about Cameras, and it
- does show the lens size, and it
- does show the aperture for the camera, and
- the values for lens size, aperture, and shutter speed are met.
Thus, the Web Bot recognizes that the XML document is a match!

Semantic Definitions Separate from Application!
Hunts.xml:
    <SLR rdf:ID="Olympus-OM-10" xmlns="http://www.camera.org#">
      <lens>
        <Lens>
          <focal-length>75-300mm zoom</focal-length>
          <f-stop>4.5-5.6</f-stop>
        </Lens>
      </lens>
      <body>
        <Body>
          <shutter-speed rdf:parseType="Resource">
            <min>0.002</min>
            <max>1.0</max>
            <units>seconds</units>
          </shutter-speed>
        </Body>
      </body>
      <cost rdf:parseType="Resource">
        <rdf:value>325</rdf:value>
        <currency>USD</currency>
      </cost>
    </SLR>
The Web Bot (application) asks the Semantic Definitions:
- "Relationship between Camera and SLR?" Answer: "SLR is a type of Camera."
- "Relationship between aperture and f-stop?" Answer: "f-stop is synonymous with aperture."
- "Relationship between size and focal-length?" Answer: "focal-length is synonymous with size."
Semantic Definitions (Camera.owl):
    <owl:Class rdf:ID="SLR">
      <rdfs:subClassOf rdf:resource="#Camera"/>
    </owl:Class>
    <owl:DatatypeProperty rdf:ID="focal-length">
      <owl:equivalentProperty rdf:resource="#size"/>
      <rdfs:domain rdf:resource="#Lens"/>
      <rdfs:range rdf:resource="&xsd;#string"/>
    </owl:DatatypeProperty>
    <owl:DatatypeProperty rdf:ID="f-stop">
      <owl:equivalentProperty rdf:resource="#aperture"/>
      <rdfs:domain rdf:resource="#Lens"/>
      <rdfs:range rdf:resource="&xsd;#string"/>
    </owl:DatatypeProperty>
See the article "Why use OWL?" for a discussion of why it is good practice to separate the semantic definitions from the application.

Summary: Interoperability despite terminology differences!
• The example demonstrated how a Web Bot application was able to dynamically process an XML document from a Web site, even though the XML document used terminology different from that used to express the request. This interoperability was achieved by using the OWL Camera Ontology!
• This example also demonstrated the architectural design principle of cleanly separating the application code (e.g., Web Bot) from the semantic definitions (e.g., Camera.owl).
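
The "consult the ontology" step in the example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Web Bot from the tutorial. It treats Camera.owl as plain RDF/XML, reads just the rdfs:subClassOf and owl:equivalentProperty statements shown on the slides, and ignores everything else a real OWL reasoner would handle. The function names (load_ontology, is_a, familiar_name) are hypothetical, and the entity reference &xsd; is left out so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# A cut-down stand-in for Camera.owl, holding only the statements from the slides.
CAMERA_OWL = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="Camera"/>
  <owl:Class rdf:ID="SLR">
    <rdfs:subClassOf rdf:resource="#Camera"/>
  </owl:Class>
  <owl:DatatypeProperty rdf:ID="focal-length">
    <owl:equivalentProperty rdf:resource="#size"/>
  </owl:DatatypeProperty>
  <owl:DatatypeProperty rdf:ID="f-stop">
    <owl:equivalentProperty rdf:resource="#aperture"/>
  </owl:DatatypeProperty>
</rdf:RDF>
"""


def load_ontology(owl_text):
    """Return (superclass_of, equivalent_to) lookup tables (hypothetical helper)."""
    root = ET.fromstring(owl_text)
    superclass_of, equivalent_to = {}, {}
    for cls in root.iter(f"{{{OWL}}}Class"):
        name = cls.get(f"{{{RDF}}}ID")
        for sup in cls.findall(f"{{{RDFS}}}subClassOf"):
            superclass_of[name] = sup.get(f"{{{RDF}}}resource").lstrip("#")
    for prop in root.iter(f"{{{OWL}}}DatatypeProperty"):
        name = prop.get(f"{{{RDF}}}ID")
        for eq in prop.findall(f"{{{OWL}}}equivalentProperty"):
            equivalent_to[name] = eq.get(f"{{{RDF}}}resource").lstrip("#")
    return superclass_of, equivalent_to


def is_a(term, target, superclass_of):
    """True if term is target or a (transitive) subclass of target."""
    seen = set()
    while term is not None and term not in seen:
        if term == target:
            return True
        seen.add(term)
        term = superclass_of.get(term)
    return False


def familiar_name(term, equivalent_to):
    """Translate an unfamiliar property name into its declared equivalent, if any."""
    return equivalent_to.get(term, term)


superclass_of, equivalent_to = load_ontology(CAMERA_OWL)
print(is_a("SLR", "Camera", superclass_of))          # the SLR/Camera question
print(familiar_name("focal-length", equivalent_to))  # the focal-length/size question
print(familiar_name("f-stop", equivalent_to))        # the f-stop/aperture question
```

The application code knows only the terms Camera, size, and aperture. Everything about SLR, focal-length, and f-stop comes from the ontology text, which is the separation of semantic definitions from application that the summary describes.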

Demo of interoperability in a heterogeneous data environment
[Diagram: the Camera Application reads Hunts.xml and asks Camera.owl "What do you know about SLR?"; the answer is "SLR is a type of Camera."]

Demo - searching for Camera, lens size, aperture info
• The Camera Application is searching for documents that meet this desire:
  – I am interested in purchasing a Camera with a 75-300mm zoom lens size, that has an aperture of 4.5-5.6, and a shutter speed that ranges from 1/500 sec. to 1.0 sec.
• The Camera Application understands the terms (i.e., elements) Camera, lens size, and aperture.
• If a document uses terms that it does not understand, then the Camera Application "consults" the Camera Ontology.

Hunts.xml - uses unfamiliar terminology
    <PhotographyStore>
      <catalog rdf:parseType="Collection">
        <SLR rdf:ID="Olympus-OM-10" xmlns="http://www.xfront.com/owl/ontologies/camera/#">
          <lens>
            <Lens>
              <focal-length>75-300mm zoom</focal-length>
              <f-stop>4.5-5.6</f-stop>
            </Lens>
          </lens>
          <body>
            <Body>
              <shutter-speed rdf:parseType="Resource">
                <min>0.002</min>
                <max>1.0</max>
                <units>seconds</units>
              </shutter-speed>
            </Body>
          </body>
        </SLR>
      </catalog>
    </PhotographyStore>
Need to consult the Camera Ontology!

QuikPhoto.xml - uses familiar terminology
    <Camera>
      <lens>
        <Lens>
          <size>75-300mm zoom</size>
          <aperture>4.5-5.6</aperture>
        </Lens>
      </lens>
      <body>
        <Body>
          <shutter-speed rdf:parseType="Resource">
            <min>0.002</min>
            <max>1.0</max>
            <units>seconds</units>
          </shutter-speed>
        </Body>
      </body>
    </Camera>
No need to consult the Camera Ontology.

Lesson Learned
• The Camera Application is able to process documents that use unfamiliar terminology.
Interoperates!
[Diagram labels:]
Community A uses terms SLR, f-stop, focal-length
Community B uses terms Camera, aperture, lens size
OWL Camera Ontology

Who's Using Ontologies?
• Real estate investment agencies are using Ontologies to exchange data with regulatory agencies (Data Consortium - Real Estate Data Standards).
• Reuters Health is using Ontologies to describe the content of articles and sort them into various news feeds (using SNOMED Ontology).
• Electric utilities describe their networks using Ontologies for exchange purposes (CIM/XML).
• SUN has a large knowledge management initiative called swoRDfish that uses Ontologies.

Related Articles
"Why use OWL?" by Roger L. Costello
    http://www.xfront.com/owl/motivation/sld001.htm
"Why use OWL?" by Adam Pease
    http://www.xfront.com/why-use-owl.html
"Using OWL to Avoid Syntactic Rigor Mortis" by Roger L. Costello
    http://www.xfront.com/avoiding-syntactic-rigor-mortis.html

The OWL Camera Ontology is Online!
Here is the URL to a pictorial view of the Camera Ontology:
    http://www.xfront.com/owl/ontologies/camera/sld001.htm
Here is the URL to the camera.owl document:
    http://www.xfront.com/owl/ontologies/camera/camera.owl
Here are the URLs to 7 physical expressions (instance documents):
    http://www.xfront.com/owl/ontologies/camera/Query1.xml
    http://www.xfront.com/owl/ontologies/camera/Hunts.xml
    http://www.xfront.com/owl/ontologies/camera/Query2.xml
    http://www.xfront.com/owl/ontologies/camera/Hunts2.xml
    http://www.xfront.com/owl/ontologies/camera/RJs.xml
    http://www.xfront.com/owl/ontologies/camera/OlympusOutletStore.xml
    http://www.xfront.com/owl/ontologies/camera/OlympusCorp.xml

The Robber and the Speeder
Roger L. Costello
David B. Jacobs
The MITRE Corporation
(The creation of this tutorial was sponsored by DARPA)

An OWL Ontology can be used to answer questions that are implicit in your data
    <GunLicense>
      <registeredGun>
        <Gun>
          <serial>ABCD</serial>
        </Gun>
      </registeredGun>
      <holder>
        <Person>
          <driversLicenseNumber>ZXYZXY</driversLicenseNumber>
        </Person>
      </holder>
    </GunLicense>
Questions:
1. How many guns can have this serial number?
2. How many people can have this driver's license number?
3. Can this gun be registered in other gun licenses?
4. How many guns/people are registered in a gun license?

The OWL Gun License Ontology answers the questions!
    <GunLicense>
      <registeredGun>
        <Gun>
          <serial>ABCD</serial>
        </Gun>
      </registeredGun>
      <holder>
        <Person>
          <driversLicenseNumber>ZXYZXY</driversLicenseNumber>
        </Person>
      </holder>
    </GunLicense>
Answers:
1. Only one gun can have this serial number.
2. Only one person can have this driver's license number.
3. A gun can be registered in only one gun license.
4. A gun license registers one gun to one person.

The Robber and the Speeder
• On the next few slides is an example that shows how an OWL Ontology provides the necessary information to link a robber and a speeder.
• Thanks to Ian Davis for this example!

Robber drops gun while fleeing!
First of all a robbery takes place. The robber drops his gun while fleeing. This report is filed by the investigating officers:
    <RobberyEvent>
      <date>...</date>
      <description>...</description>
      <evidence>
        <Gun>
          <serial>ABCD</serial>
        </Gun>
      </evidence>
      <robber>
        <Person /> <!-- an unknown person -->
      </robber>
    </RobberyEvent>

Speeder stopped
Subsequently a car is pulled over for speeding.
The traffic officer files this report electronically while issuing a ticket:
    <SpeedingOffence>
      <date>...</date>
      <description>...</description>
      <speeder>
        <Person>
          <name>Fred Blogs</name>
          <driversLicenseNumber>ZXYZXY</driversLicenseNumber>
        </Person>
      </speeder>
    </SpeedingOffence>

The speeder owns a gun with the same serial number as the robbery gun!
At police headquarters (HQ), a computer analyzes each report as it is filed. The computer uses the driver's license information to look up any other records it has about Fred Blogs (the speeder) and discovers this gun license:
    <GunLicense>
      <registeredGun>
        <Gun>
          <serial>ABCD</serial>
        </Gun>
      </registeredGun>
      <holder>
        <Person>
          <driversLicenseNumber>ZXYZXY</driversLicenseNumber>
        </Person>
      </holder>
    </GunLicense>

Case Solved?
• Not yet! These questions must be answered before the speeder can be arrested as the robbery suspect:
  1. Can multiple guns have the same serial number?
     • If so, then just because Fred Blogs owns a gun with the same serial number as the robbery gun does not mean it was his gun that was used in the robbery.
  2. Can multiple people have the same driver's license number?
     • If so, then the gun license information may be for someone else.
  3. Can a gun be registered in multiple gun licenses?
     • If so, then the other gun licenses may show the holder of the gun to be someone other than Fred Blogs.
  4. Can a gun license have multiple holders of a registered gun?
     • If so, then there may be another gun license document (not available at the police HQ) which shows the same registered gun but with a different holder.
• The OWL Gun License Ontology provides the information needed to answer these questions!

Can multiple guns have the same serial number?
This OWL statement tells the computer at police HQ that each gun is uniquely identified by its serial number:
    <owl:InverseFunctionalProperty rdf:ID="serial">
      <rdfs:domain rdf:resource="#Gun"/>
      <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
    </owl:InverseFunctionalProperty>
1. Only one gun can have this serial number.
    <Gun>
      <serial>ABCD</serial>
    </Gun>

Can multiple people have the same driver's license number?
The following OWL statement tells the computer that a driver's license number is unique to a Person:
    <owl:InverseFunctionalProperty rdf:ID="driversLicenseNumber">
      <rdfs:domain rdf:resource="#Person"/>
      <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
    </owl:InverseFunctionalProperty>
2. Only one person can have this driver's license number.
    <Person>
      <driversLicenseNumber>ZXYZXY</driversLicenseNumber>
    </Person>

Can a gun be registered in multiple gun licenses?
The next OWL statement tells the computer that the registeredGun property uniquely identifies a GunLicense, i.e., each gun is associated with only a single GunLicense:
    <owl:InverseFunctionalProperty rdf:ID="registeredGun">
      <rdfs:domain rdf:resource="#GunLicense"/>
      <rdfs:range rdf:resource="#Gun"/>
    </owl:InverseFunctionalProperty>
3. A gun can be registered in only one gun license.
    <GunLicense>
      <registeredGun>
        <Gun>
          <serial>ABCD</serial>
        </Gun>
      </registeredGun>
      ...
    </GunLicense>

Can a gun license have multiple holders of a registered gun?
The police computer uses the following OWL statement to determine that the gun on the license is the same gun used in the robbery. This final statement seals the speeder's fate. It tells the computer that each GunLicense applies to only one gun and one person.
So, there is no doubt that the speeder is the person who owns the gun:
    <owl:Class rdf:ID="GunLicense">
      <owl:intersectionOf rdf:parseType="Collection">
        <owl:Restriction>
          <owl:onProperty rdf:resource="#registeredGun"/>
          <owl:cardinality>1</owl:cardinality>
        </owl:Restriction>
        <owl:Restriction>
          <owl:onProperty rdf:resource="#holder"/>
          <owl:cardinality>1</owl:cardinality>
        </owl:Restriction>
      </owl:intersectionOf>
    </owl:Class>
4. A gun license registers one gun to one person.
    <GunLicense>
      <registeredGun> ...
      <holder> ...
    </GunLicense>

Summary
• An OWL Ontology provides additional information about your data.
  – Example: The Gun License Ontology provided the data needed for the police computer to link the Robber and the Speeder!
• OWL is intended to be used when processing Web documents. Thus, OWL enables an ad-hoc exploitation of Web documents, i.e., the Semantic Web!
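
The inference that links the robber and the speeder can be sketched in Python as follows. This is a simplified illustration, not the police computer's actual method or a real OWL reasoner. The three reports are hand-translated into (subject, property, value) triples, the node names (gun1, person2, and so on) are hypothetical stand-ins for the anonymous <Gun> and <Person> elements, and the only rule applied is the inverse-functional one: two subjects that share a value for an inverse functional property are the same individual. The cardinality restriction on GunLicense is assumed rather than checked.

```python
# Hand-made triples for the robbery report, the speeding report, and the gun license.
TRIPLES = [
    ("robbery", "evidence", "gun1"),
    ("gun1", "serial", "ABCD"),
    ("robbery", "robber", "person1"),  # the unknown person
    ("offence", "speeder", "person2"),
    ("person2", "name", "Fred Blogs"),
    ("person2", "driversLicenseNumber", "ZXYZXY"),
    ("license", "registeredGun", "gun2"),
    ("gun2", "serial", "ABCD"),
    ("license", "holder", "person3"),
    ("person3", "driversLicenseNumber", "ZXYZXY"),
]

# The properties the Gun License Ontology declares as owl:InverseFunctionalProperty.
INVERSE_FUNCTIONAL = {"serial", "driversLicenseNumber", "registeredGun"}


def merge_identical(triples, inverse_functional):
    """Return find(x), which maps each node to a representative of its identity group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    changed = True
    while changed:  # repeat until no further merges are found
        changed = False
        seen = {}
        for s, p, o in triples:
            if p not in inverse_functional:
                continue
            key = (p, find(o))
            s_root = find(s)
            other = find(seen.setdefault(key, s_root))
            if other != s_root:
                parent[s_root] = other  # same value, so same individual
                changed = True
    return find


def holder_names_for_evidence(event, triples, find):
    """Names of the holders of licenses that register a gun left at the event."""
    guns = {find(o) for s, p, o in triples if s == event and p == "evidence"}
    licenses = {s for s, p, o in triples if p == "registeredGun" and find(o) in guns}
    holders = {find(o) for s, p, o in triples if p == "holder" and s in licenses}
    return {o for s, p, o in triples if p == "name" and find(s) in holders}


find = merge_identical(TRIPLES, INVERSE_FUNCTIONAL)
print(holder_names_for_evidence("robbery", TRIPLES, find))
```

Without the ontology's inverse-functional declarations, gun1 and gun2 (and person2 and person3) would stay distinct, and the lookup would return nothing. This is the sense in which the ontology supplies information that is only implicit in the data.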