CS6999 SWT Lecture 1
Introduction to the Semantic Web
Bruce Spencer, NRC-IIT Fredericton
Sept 12, 2002

CS 6999 SW Semantic Web Techniques 12-Sep-02

National Research Council
• Research institutes and facilities across Canada
• 17 research institutes
• 4 innovation centres
• 3,500 employees; 1,000 guest workers
• National science facilities
• S&T information for industry and the scientific community
• CISTI: Canada Institute for Scientific and Technical Information
• Network of technology advisors supporting SMEs
• IRAP: Industrial Research Assistance Program

Institute for Information Technology
• There are two aspects to IIT
  – A mature research organization of ~80 people in Ottawa
  – New labs being developed in four cities in New Brunswick and Nova Scotia, involving ~60 new people
• The whole organization is evolving to accommodate our new distributed nature

NRC’s plans for New Brunswick
• What?
  – NRC is building an e-business research team in New Brunswick
  – E-business includes e-learning, e-government and e-health: using information and communication technology to help us educate, govern and take care of ourselves, and to create wealth
• New Brunswick and Canadian companies already have strengths in all three areas
• NB’s communications infrastructure and interested telco
• Bilingual workforce

NRC’s plans for New Brunswick
• NRC will act locally, and think nationally and globally
  – Will work with the New Brunswick community to develop clusters in e-business
  – This is also NRC’s national lab in e-business
  – NRC will build international links
• Where?
  – Main group (40 staff) in Fredericton, at UNBF
  – Satellite in Saint John (6 staff), at E-Comm Centre, UNBSJ
  – Satellite in Moncton (6 staff), at U.
de Moncton

Bruce
• MMath 83, BNR 83-86, Waterloo PhD 86-90, UNB prof 90-01, NRC 01-now
• Automated reasoning
  – data structures in theorem proving
  – eliminate redundant searching
  – smallest proofs
  – deductive databases
• Java in curriculum since 1997

Overview and Course Mindmap
• Increasing demand for formalized knowledge on the Web: AI’s chance!
• XML- & RDF-based markup languages provide a 'universal' storage/interchange format for such distributed knowledge representation
• Course introduces knowledge markup & resource semantics: we show how to marry AI representations (e.g., logics and frames) with XML & RDF [incl. RDF Schema]
[Mindmap figure; node labels: XML, Namespaces, CSS, DTDs, XSLT, Stylesheets, Transformations, Queries, WebXQL, XQuery, XML-QL, Rules, HornML, RuleML, Agents, DAML, Ontobroker, SHOE, Frames, RDF[S], TopicMaps, Acquisition, Protégé]

The Semantic Web Activity of the W3C
“The Semantic Web is a vision: the idea of having data on the Web defined and linked in a way that it can be used by machines not just for display purposes, but for
• automation,
• integration and
• reuse of data across various applications.”
(http://www.w3.org/2001/sw/Activity)

What your computer sees in HTML
<b>Joe’s Computer Store</b> <br> 365 Yearly Drive
Presentation information

What your computer sees in XML
<location>
  <name>Joe’s Computer Store</name>
  <address>365 Yearly Drive</address>
</location>
Content description (ambiguous)

What a computer could understand
<mail:address xmlns:mail="http://www.canadapost.ca">
  <mail:name>Joe’s Computer Store</mail:name>
  <mail:street>365 Yearly Drive</mail:street>
</mail:address>
• www.canadapost.ca could define address, name, street, …
• Search engines could then identify mail addresses
• Consider shopbots being able to find
  – price, quantity, feature, model
    number, supplier, serial number, acquisition date
• Assumes that namespaces will be used consistently

Semantic Web
• Semantics = meaning
• Good idea: Dictionary
  – Create a dictionary of terms
  – Put it on the web
  – Mark up web pages so that terms are linked to these dictionary entries
  – This allows more precise matching
• Better idea: Thesaurus
  – has hierarchies of terms
  – shades of meaning
• Best idea: Ontology
  – hierarchy of terms and logic conditions

Semantic Web
• An agent-enabled resource
• “information in machine-readable form, creating a revolution in new applications, environments and B2B commerce”
• W3C Activity launched Feb 9, 2001
• DAML: DARPA Agent Markup Language
  – US Gov funding to define languages, tools
  – 16 project teams
• OIL is Ontology Inference Layer
  – DAML+OIL is joint DARPA-EU
• Knowledge Representation is a natural choice

• SmokedSalmon is the intersection of Smoked and Salmon
• Gravalax is the intersection of Cured and Salmon, but not Smoked
• Lox is Smoked, Cured Salmon
[Diagram labels: Smoked Salmon, Lox, Gravalax]

The Semantic Web is about having the Internet use common sense.
• A search for keywords Salmon and Cured should return pages that mention Gravalax, even if they don’t mention Salmon and Cured
• A search for Salmon and Smoked will return smoked salmon; it should also return Lox, but not Gravalax
[Diagram labels: Smoked Salmon, Lox, Gravalax]

Tim Berners-Lee’s Semantic Web

RDF: Resource Description Framework
• Beginning of Knowledge Representation influence on Web
• Akin to Frames, Entity/Relationship diagrams, or Object/Attribute/Value triples

RDF Example
<rdf:ProductSpecs about="http://www.lemoncomputers.ca/model_2300">
  <specs:colour>yellow</specs:colour>
  <specs:size>medium</specs:size>
</rdf:ProductSpecs>
[Graph: (model_2300, size, medium), (model_2300, colour, yellow)]

RDF Class Hierarchy
• All lemon laptops get packed in cardboard boxes
• Allows one to customize existing taxonomies
  – Example: palmtop computers still get packed in boxes
[Graph labels: is_a, lemon_palmtop_20000, model_2300, size: medium, colour: yellow]

Web Ontology Language (OWL): W3C
• Previously known as DAML+OIL
  – US: DARPA Agent Markup Language
  – EU: Ontology Inference Layer (or Ontology Interchange Language)
• Composed of a hierarchy with additional conditions
• Based on Description Logic, limited expressiveness
  – Reasoning procedures are well-behaved
  – Just enough power

Identifying Resources
• URL/URI
  – Uniform resource locator / identifier
  – Information sources, goods and services
  – financial instruments: money, options, investments, stocks, etc.
• “Where do you want to go today?”
  becomes “What do you want to find?”

Ontology
• Branch of philosophy dealing with the theory of being
• Tarski’s assumption:
  – individuals, relationships and functions
• “A common vocabulary and agreed-upon meanings to describe a subject domain”
  – What real-world objects do my tags refer to?
  – How are these objects related?
• Communication requires shared terms
  – others can join in

Ontology Layer
• Widens interoperability and interconversion
  – knowledge representation
• More meta-information
  – Which attributes are transitive, symmetric
  – Which relations between individuals are 1-1, 1-many, many-many
• Communities exist
  – DL, OIL, SHOE (Hendler)
  – New W3C working group

Transitive, Subrole example
• One wants to ask about modes of transportation from Sydney to Fredericton
• “connected by Acadian Lines bus” is a role in a Nova Scotia taxonomy
• “connected by SMT bus” is a role from New Brunswick
• Both are subroles of “connected”
• “connected” is transitive
• Note that ontologies can be combined at runtime

Combining Rich Ontologies
• Only these facts are explicit
  – in separate ontologies
• “Connected by bus”
  – is superset
  – is symmetric and transitive
• Route from Sydney to Fredericton is inferred
[Diagram: cities Sydney, Truro, Amherst, Sussex, Fredericton; edges labelled “Connected by Acadian Lines” and “Connected by SMT Lines”]

Logic Layer
• Clausal logic encoded in XML
  – RuleML, IBM CommonRules
• Special cases of first-order logic
  – Horn Clauses for if-then type reasoning and integrity constraints
  – Standard inference rules based on Resolution
• Various implementations: SQL, KIF, SLD (Prolog), XSB
• J-DREW reasoning tools in Java
• Modus operandi: build tractable reasoning systems
  – trade away expressiveness, gain efficiency

Logic Architecture Example
• Contracting parties integrate e-businesses via rules
[Diagram labels: Seller E-Storefront, Buyer’s ShopBot, Business Rules (OPS5), Business Rules (Prolog), Contract Rules Interchange]

Negotiation via rules
usualPrice:
  price(per-unit, ?PO, $60) ←
    purchaseOrder(?PO, supplierCo, ?AnyBuyer) ∧
    shippingDate(?PO, ?D) ∧ (?D 24April2001).
volumeDiscountPrice:
  price(per-unit, ?PO, $55) ←
    purchaseOrder(?PO, supplierCo, ?AnyBuyer) ∧
    quantityOrdered(?PO, ?Q) ∧ (?Q 1000) ∧
    shippingDate(?PO, ?D) ∧ (?D 24April2001).
overrides(volumeDiscountPrice, usualPrice).

Hot Research Topics:
• Tools to create ontologies
  – Ontolingua
  – Protégé-2000 (Stanford)
  – OilEd
  – …
• Tools to learn ontologies from a large corpus such as corporate data
• Merging / aligning two different ontologies from different sources on the same topic
• Searching cum reasoning tools
  – SHOE

Eventual Goal of these Efforts
• Agents locate goods, services
  – use ontologies
  – unambiguous business rules
  – expressive language
  – but reasoning tractable
  – combine from various sources
• Gives rise to need of trust, privacy and security
  – e.g. semantic web project to determine eligibility of patients for a clinical trial
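
Worked sketch: ontology-aware search (salmon slides)
The salmon slides claim that a search for Salmon and Cured should find pages that only mention Gravalax. A minimal Python sketch of that idea follows. The term definitions come from the slides; the page URLs, the `PAGES` table and the function names `expand` and `search` are hypothetical, and "but not Smoked" is modelled only by leaving Smoked out of the definition, which is a closed-world shortcut rather than description-logic negation.

```python
# Term definitions from the slides, as sets of implied properties.
# ASSUMPTION: "but not Smoked" is modelled by simply leaving Smoked out
# (a closed-world shortcut, not description-logic negation).
ONTOLOGY = {
    "SmokedSalmon": {"Smoked", "Salmon"},
    "Gravalax": {"Cured", "Salmon"},
    "Lox": {"Smoked", "Cured", "Salmon"},
}

# Hypothetical pages: URL -> words that appear on the page.
PAGES = {
    "a.example/gravalax-recipe": {"Gravalax", "dill"},
    "b.example/lox-bagel": {"Lox", "bagel"},
    "c.example/smoked-salmon": {"Smoked", "Salmon"},
    "d.example/trout": {"Smoked", "Trout"},
}


def expand(keywords):
    """Return the ontology terms whose definition implies every keyword."""
    wanted = set(keywords)
    return {term for term, implied in ONTOLOGY.items() if wanted <= implied}


def search(keywords, pages=PAGES):
    """Match a page if it contains all keywords literally, or mentions
    any ontology term that implies all of them."""
    wanted = set(keywords)
    terms = expand(keywords)
    return sorted(url for url, words in pages.items()
                  if wanted <= words or terms & words)


print(search(["Salmon", "Cured"]))
print(search(["Salmon", "Smoked"]))
```

With these hypothetical pages, the first query returns the Gravalax and Lox pages although neither contains the keywords, and the second returns the Lox and smoked-salmon pages but not the Gravalax page.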
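
Worked sketch: inferring the Sydney to Fredericton route (Combining Rich Ontologies slide)
The route inference can be sketched in a few lines of Python. This is an illustration only, not how J-DREW or any description-logic reasoner does it. The subrole, symmetric and transitive declarations come from the slides; the exact edge layout (Sydney, Truro, Amherst by Acadian Lines; Amherst, Sussex, Fredericton by SMT) is an assumption read off the diagram labels, and the names `NOVA_SCOTIA`, `NEW_BRUNSWICK`, `merge_subroles` and `route` are hypothetical.

```python
from collections import deque

# Explicit facts, one set per ontology.
# ASSUMPTION: this edge layout is a guess at the slide's diagram.
NOVA_SCOTIA = {("Sydney", "Truro"), ("Truro", "Amherst")}           # connected by Acadian Lines
NEW_BRUNSWICK = {("Amherst", "Sussex"), ("Sussex", "Fredericton")}  # connected by SMT Lines


def merge_subroles(*role_extensions):
    """Both roles are subroles of 'connected by bus', so every pair is also
    a pair of the superrole. Symmetry adds each pair in both directions."""
    graph = {}
    for pairs in role_extensions:
        for a, b in pairs:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def route(graph, start, goal):
    """Transitivity: follow chains of 'connected' pairs breadth-first.
    Returns the cities on one shortest chain, or None if none exists."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        city = queue.popleft()
        if city == goal:
            path = []
            while city is not None:
                path.append(city)
                city = parents[city]
            return path[::-1]
        for nxt in sorted(graph.get(city, ())):
            if nxt not in parents:
                parents[nxt] = city
                queue.append(nxt)
    return None


combined = merge_subroles(NOVA_SCOTIA, NEW_BRUNSWICK)
print(route(combined, "Sydney", "Fredericton"))
print(route(merge_subroles(NOVA_SCOTIA), "Sydney", "Fredericton"))
```

Neither ontology alone yields a route (the second call returns None); the route appears only once the two sets of facts are merged under the shared superrole, which is the point of combining ontologies at runtime.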
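
Worked sketch: prioritized rules (Negotiation via rules slide)
The `overrides` fact on the negotiation slide says which rule wins when two rules conclude different prices. A minimal Python sketch of that conflict handling follows; it is not the IBM CommonRules or RuleML API. The rule labels and prices come from the slide. The comparison operators did not survive on the slide, so the threshold test `quantity >= 1000` is an assumption, the shipping-date conditions are left out, and `Rule`, `RULES`, `OVERRIDES` and `per_unit_price` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    label: str
    price: int                        # per-unit price concluded by the rule
    applies: Callable[[dict], bool]   # rule body, tested against an order


# ASSUMPTION: the slide's comparison operators were lost; a discount at
# 1000 units or more is a guess. Shipping-date tests are omitted.
RULES = [
    Rule("usualPrice", 60, lambda order: True),
    Rule("volumeDiscountPrice", 55, lambda order: order["quantity"] >= 1000),
]
OVERRIDES = {("volumeDiscountPrice", "usualPrice")}  # (winner, loser)


def per_unit_price(order):
    """Fire every applicable rule, then discard any rule that is
    overridden by another applicable rule."""
    fired = [r for r in RULES if r.applies(order)]
    labels = {r.label for r in fired}
    winners = [r for r in fired
               if not any((w, r.label) in OVERRIDES for w in labels)]
    prices = {r.price for r in winners}
    if len(prices) != 1:
        raise ValueError(f"unresolved conflict among {sorted(labels)}")
    return prices.pop()


print(per_unit_price({"quantity": 200}))
print(per_unit_price({"quantity": 5000}))
```

Under the assumed threshold, a 200-unit order gets the usual price of 60 and a 5000-unit order gets 55, because both rules fire and the override removes `usualPrice`.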