An Introduction to Semantic Web Portal

Ching-Long Yeh
Department of Computer Science and Engineering
Tatung University
chingyeh@cse.ttu.edu.tw
http://www.cse.ttu.edu.tw/chingyeh

Outline
• A brief description of the Semantic Web Portal
• Architectural basis of the Semantic Web
• KA2: an ontology-based web portal
• An architecture of RDF triple data store
• Conclusions

Semantic Web Portal 2

Introduction to Semantic Web
• Facilities to put machine-understandable data on the Web are becoming a high priority for many communities.
• The Web can reach its full potential only if it becomes a place where data can be shared and processed by automated tools as well as by people.
• For the Web to scale, tomorrow's programs must be able to share and process data even when these programs have been designed totally independently.
• The Semantic Web is a vision: the idea of having data on the Web defined and linked in such a way that machines can use it not just for display purposes, but for automation, integration, and reuse of data across various applications.
• See “W3C Semantic Web Activity,” by Marja-Riitta Koivunen, for a fuller description.

The Semantic Web Layered Architecture
[Figure: layered architecture diagram. Layer labels, top to bottom: Proof; Logic; Rules; Ontology; RDF Schema; RDF M&S; XML Schema; XML, Namespaces; URI, Unicode; plus a "Sig" label.]
Source: Tim Berners-Lee, “Axioms, Architecture and Aspirations,” W3C all-working group plenary meeting, 28 February 2001 (http://www.w3.org/2001/Talks/0228-tbl/slide5-0.html)

AI’s Chance
• Increasing demand for formalized knowledge on the Web: AI’s chance!
• XML- and RDF-based markup languages provide a 'universal' storage/interchange format for such Web-distributed knowledge representation.
[Figure: diagram with the labels CSS, DTDs, XSLT, DAML, Stylesheets, Agents, Ontobroker, HornML, RuleML, Rules, Transformations, XML, XQL, Queries, XQuery, XML-QL, SHOE, Frames, RDF[S], Acquisition, TopicMaps, Protégé]

The Big Picture of SW

Web Portals
• A web portal is a web site that provides information content on a common topic.
  – General portals, e.g., Yahoo, Excite, Netscape, Lycos, CNET, MSN, and AOL.com
  – Specialized portals, e.g., gardeners.com, semanticweb.org
• Making valuable information easy to find
  – directory service, search facility, news, e-mail, community forum

Ontology-Based Web Portals
• An ontology represents
  – the common knowledge and interests shared within a community
• Portal tasks that an ontology can support
  – Accessing a portal
    • Conceptual search and navigation
    • Inference capabilities
  – Providing information
    • Methods and tools accounting for the diversity of information sources

Introduction to XML-Based Ontology
[Figure: the Semantic Web layered architecture, with the labels Proof, Logic, Rules, Ontology, Sig, RDF Schema, RDF M&S, XML Schema, XML, URI, Namespaces, Unicode]

What is XML?
• Extensible Markup Language
• A Syntax for Documents
• A Meta-Markup Language
• A Structural and Semantic Language, not a Formatting Language
• Not just for Web pages

Namespace Bindings
• Prefixes are bound to namespace URIs by attaching an xmlns:prefix attribute to the prefixed element (prefix:name_1, ..., prefix:name_n) or to one of its ancestors.
• The value of the xmlns:prefix attribute is a URI, which may or (unlike for DTDs!) may not point to a description of the namespace’s syntax.
• An element can use bindings for multiple namespaces via the attributes xmlns:prefix_1, ..., xmlns:prefix_m.

XML Document Instance and DTD

Document instance:

  <mail:address xmlns:mail="http://www.deutschepost.de/"
                xmlns:tele="http://www.telekom.de/">
    <mail:name>Xaver M. Linde</mail:name>
    <mail:street>Wikingerufer 7</mail:street>
    <mail:town>10555 Berlin</mail:town>
    <mail:bill>12.50</mail:bill>
    <tele:phone>030/1234567</tele:phone>
    <tele:phone>030/1234568</tele:phone>
    <tele:fax>030/1234569</tele:fax>
    <tele:bill>76.20</tele:bill>
  </mail:address>

DTD:

  <!ELEMENT address (name, street, town, bill, phone+, fax+, bill)>
  <!ELEMENT name    (#PCDATA)>
  <!ELEMENT street  (#PCDATA)>
  <!ELEMENT town    (#PCDATA)>
  <!ELEMENT bill    (#PCDATA)>
  <!ELEMENT phone   (#PCDATA)>
  <!ELEMENT fax     (#PCDATA)>

XML Schema
  <!ELEMENT ARTIST (#PCDATA)>
• An XML syntax that is an alternative and/or supplement to DTDs
• Data typing of element and attribute content

RDF M&S
• RDF (Resource Description Framework)
  – Beyond machine-readable to machine-understandable
• RDF consists of two parts
  – RDF Model (a set of triples)
  – RDF Syntax (different XML serialization syntaxes)
• RDF Schema for the definition of vocabularies (simple ontologies) for RDF (and in RDF)

RDF Data Model
• Resources
  – A resource is a thing you talk about (can reference)
  – Resources have URIs
  – RDF definitions are themselves resources (linkage, see requirement 1)
• Properties
  – Slots; define relationships to other resources or atomic values
• Statements
  – “Resource has Property with Value”
  – (Values can be resources or atomic XML data)
• Similar to frame systems

A Simple Example
• Statement
  – “Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila”
• Structure
  – Resource (subject): http://www.w3.org/Home/Lassila
  – Property (predicate): http://www.schema.org/#Creator
  – Value (object): "Ora Lassila"
• Directed graph
  http://www.w3.org/Home/Lassila --s:Creator--> Ora Lassila

Another Example
• To add properties to Creator, point through an intermediate Resource.
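The “Resource has Property with Value” pattern maps naturally onto (subject, predicate, object) tuples, and an intermediate resource is simply a value that is itself the subject of further statements. The following is a minimal Python sketch of that idea, not part of the RDF specification: the Triple type and the describe helper are hypothetical names chosen for illustration, and the Name and Email values attached to the intermediate person resource follow this example's graph.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """One RDF statement: resource (subject), property (predicate), value (object)."""
    subject: str
    predicate: str
    obj: str


PAGE = "http://www.w3.org/Home/Lassila"
PERSON = "Person://fi/654645635"  # intermediate resource standing for the creator

triples = [
    Triple(PAGE, "s:Creator", PERSON),          # value is another resource
    Triple(PERSON, "Name", "Ora Lassila"),      # value is a literal
    Triple(PERSON, "Email", "lassila@w3.org"),  # value is a literal
]


def describe(subject: str, store: List[Triple]) -> Dict[str, str]:
    """Collect the property/value pairs stated about one subject."""
    return {t.predicate: t.obj for t in store if t.subject == subject}


# Follow the s:Creator arc to the intermediate resource, then read its properties.
creator = describe(PAGE, triples)["s:Creator"]
creator_properties = describe(creator, triples)
print(creator_properties)
```

Because the creator is a resource rather than a string, further properties can be added to it later without touching the statement about the page.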
  http://www.w3.org/Home/Lassila --s:Creator--> Person://fi/654645635
  Person://fi/654645635 --Name--> Ora Lassila
  Person://fi/654645635 --Email--> lassila@w3.org

Example: Bag
• The students in course 6.001 are Amy, Tim, John, Mary, and Sue.
  /courses/6.001 --students--> bagid1
  bagid1 --rdf:type--> rdf:Bag
  bagid1 --rdf:_1--> /Students/Amy
  bagid1 --rdf:_2--> /Students/Tim
  bagid1 --rdf:_3--> /Students/John
  bagid1 --rdf:_4--> /Students/Mary
  bagid1 --rdf:_5--> /Students/Sue

Example: Alternative
• The source code for X11 may be found at ftp.x.org, ftp.cs.purdue.edu, or ftp.eu.net.
  http://x.org/package/X11 --source--> altid
  altid --rdf:type--> rdf:Alt
  altid --rdf:_1--> ftp.x.org
  altid --rdf:_2--> ftp.cs.purdue.edu
  altid --rdf:_3--> ftp.eu.net

RDF Syntax I
• The data model does not enforce a particular syntax
• The specification suggests many different syntaxes based on XML
• General form:

  <rdf:RDF>
    <!-- Starts an RDF description; "about" names the subject (OID) -->
    <rdf:Description about="http://www.w3.org/Home/Lassila">
      <!-- Property with a literal value -->
      <s:Creator>Ora Lassila</s:Creator>
      <!-- Property whose value is a resource (possibly another RDF description) -->
      <s:createdWith rdf:resource="http://www.w3c.org/amaya"/>
    </rdf:Description>
  </rdf:RDF>

Resulting Graph
  http://www.w3.org/Home/Lassila --s:Creator--> Ora Lassila
  http://www.w3.org/Home/Lassila --s:createdWith--> http://www.w3c.org/amaya

RDF Syntax II: Syntactic Varieties

  <!-- Element name s:Homepage: typing information -->
  <!-- rdf:about: subject (OID); s:Creator attribute: in-element property -->
  <s:Homepage rdf:about="http://www.w3.org/Home/Lassila"
              s:Creator="Ora Lassila">
    <!-- Property -->
    <s:createdWith>
      <s:HTMLEditor rdf:about="http://www.w3c.org/amaya"/>
    </s:createdWith>
  </s:Homepage>

  http://www.w3.org/Home/Lassila --rdf:type--> s:Homepage
  http://www.w3.org/Home/Lassila --s:Creator--> Ora Lassila
  http://www.w3.org/Home/Lassila --s:createdWith--> http://www.w3c.org/amaya
  http://www.w3c.org/amaya --rdf:type--> HTMLEditor

RDF Schema (RDFS)
• RDF just defines the data model
• Need for definition of vocabularies for the data model: an Ontology Language!
• The RDF Schema mechanism provides a basic type system for use in RDF models.
• The RDF Schema specification language is less expressive, but much simpler to implement, than full predicate calculus languages such as CycL and KIF.

Most Important Modeling Primitives
• Core Classes
  – Root class: rdfs:Resource
  – Metaclass: rdfs:Class
  – Literals: rdfs:Literal
• rdfs:subClassOf property
• Inherited from RDF: properties (slots)
• rdfs:domain & rdfs:range
• rdfs:label, rdfs:comment, etc.
• Inherited from RDF: InstanceOf (rdf:type)

DAML+OIL: an Ontology Language
• Ontology language DAML+OIL: result of a joint (European + US-American) committee
• Extension of RDF Schema
  – Class expressions (intersection, union, complement)
  – XML Schema datatypes
  – Enumerations
  – Property restrictions
    • Cardinality constraints
    • Value restrictions

Example: Intersection & Synonyms

  <daml:Class rdf:ID="TallMan">
    <daml:intersectionOf rdf:parseType="daml:collection">
      <daml:Class rdf:about="#TallThing"/>
      <daml:Class rdf:about="#Man"/>
    </daml:intersectionOf>
  </daml:Class>

  <daml:Class rdf:ID="HumanBeing">
    <daml:sameClassAs rdf:resource="#Person"/>
  </daml:Class>

Example: Disjoint & Complement

  <daml:Disjoint rdf:parseType="daml:collection">
    <daml:Class rdf:about="#Car"/>
    <daml:Class rdf:about="#Person"/>
    <daml:Class rdf:about="#Plant"/>
  </daml:Disjoint>

Disjoint is not strictly necessary, since it can be expressed via pairwise subClassOf of complementOf, as for Car and Person:

  <daml:Class rdf:ID="Car">
    <rdfs:comment>no car is a person</rdfs:comment>
    <rdfs:subClassOf>
      <daml:Class>
        <daml:complementOf rdf:resource="#Person"/>
      </daml:Class>
    </rdfs:subClassOf>
  </daml:Class>

Example: Properties (Transitive, Inverse, subProperty, UniqueProperty, range, Datatypes)

  <daml:TransitiveProperty rdf:ID="hasAncestor"/>

  <daml:ObjectProperty rdf:ID="hasChild">
    <daml:inverseOf rdf:resource="#hasParent"/>
  </daml:ObjectProperty>

  <daml:UniqueProperty rdf:ID="hasMother">
    <rdfs:subPropertyOf rdf:resource="#hasParent"/>
    <rdfs:range rdf:resource="#Female"/>
  </daml:UniqueProperty>

  <daml:DatatypeProperty rdf:ID="age">
    <rdf:type rdf:resource="http://www.daml.org/2001/03/daml+oil#UniqueProperty"/>
    <rdfs:range rdf:resource="http://www.w3.org/.../XMLSchema#nonNegativeInteger"/>
  </daml:DatatypeProperty>

Using User-defined Datatypes (based on XML Schema)

  <xsd:simpleType name="over17">
    <!--over17 is an XMLS datatype based on decimal-->
    <!--with the added restriction that values must be >=18-->
    <xsd:restriction base="xsd:decimal">
      <xsd:minInclusive value="18"/>
    </xsd:restriction>
  </xsd:simpleType>

  <daml:Class rdf:ID="Adult">
    <daml:intersectionOf rdf:parseType="daml:collection">
      <daml:Class rdf:about="#Person"/>
      <daml:Restriction>
        <daml:onProperty rdf:resource="#age"/>
        <daml:hasClass rdf:resource="somefile#over17"/>
      </daml:Restriction>
    </daml:intersectionOf>
  </daml:Class>

Instances (Individuals)

  <daml:Class rdf:ID="Person">
    . . .
  </daml:Class>

  <Person rdf:ID="Adam">
    <rdfs:label>Adam</rdfs:label>
    <rdfs:comment>Adam is a person.</rdfs:comment>
    <age><xsd:integer rdf:value="13"/></age>
    <shoesize><xsd:decimal rdf:value="9.5"/></shoesize>
  </Person>

Web Services
[Figure: Web services roles]
  Service provider (services, service descriptions) --Publish (WSDL, UDDI)--> Service registry (service descriptions)
  Service requester --Find (WSDL, UDDI)--> Service registry
  Service requester --Bind--> Service provider

DAML-S
• Users and software agents should be able to discover, invoke, compose, and monitor Web resources offering particular services and having particular properties.
• As part of the DARPA Agent Markup Language program, we have begun to develop an ontology of services, called DAML-S.

Top Level of the Service Ontology
[Figure: top level of the service ontology]
  Resource --provides--> Service
  Service --presents--> ServiceProfile (what it does)
  Service --described by--> ServiceModel (how it works)
  Service --supports--> ServiceGrounding (how to access it)

Process Modeling Ontology

Concept of ebXML Business Process Specification Schema

Business Process Specification in XML

KA2: An Ontology-Based Community Web Portal

KA2
• Knowledge Annotation Initiative of the Knowledge Acquisition Community
• The basic scenario
  – WWW documents of the KA community were annotated according to the schema of an ontology.
  – The annotations enable intelligent access to these documents and allow implicit knowledge to be inferred from the explicitly stated facts and the rules of the ontology.

The KA2 Ontology

Person-ontology
  Class hierarchy:
    Person
      Employee
        Academic-Staff
          Lecturer
          Researcher
        Administrative-Staff
          Secretary
          Technical-Staff
      Student
        Phd-Student
  Relations:
    Address, Affiliation, Cooperates-With, Editor-Of, Email, First-Name, Has-Publication, Head-Of-Group, Head-Of-Project, Last-Name, Member-Of-Organization, Member-Of-Program-Committee, Member-Of-Research-Group, Middle-Initial, Organizer-Of-Chair-Of, Person-Name, Photo, Research-Interest, Secretary-Of, Studies-At, Supervises, Supervisor, Works-At-Project

Publication-ontology
  Class hierarchy:
    On-Line-Publication
    Publication
      Article
        Article-In-Book
        Conference-Paper
        Journal-Article
        Technical-Report
        Workshop-Paper
      Book
      Journal
        IEEE-Expert
        IJHCS
        Special-Issue
  Relations:
    Abstract, Book-Editor, Conference-Proceedings-Title, Contains-Article-In-Book, Contains-Article-In-Journal, Describes-Project, First-Page, Has-Author, Has-Publisher, In-Book, In-Conference, In-Journal, In-Organization, In-Workshop, Journal-Editor, Journal-Number, Journal-Publisher, Journal-Year, Last-Page, On-Line-Version, …

Accessing the Community Web Portal
• Query capability
  – Through the F-Logic mechanism
• Navigating capability
  – As the easy-to-use front-end of the query mechanism

Query Capabilities

Find all publications of the researcher “Steffen Staab”:

  FORALL Pub <- EXISTS ResID
      ResID:Researcher[NAME->>"Steffen Staab"; PUBLICATION->>Pub].

Two researchers cooperate if a Researcher X works at a Project Proj, a Researcher Y works at the same Project Proj, and X is a different person from Y:

  FORALL X, Y, Proj
      X:Researcher[COOPERATESWITH->>Y:Researcher] <-
          X:Researcher[WORKSATPROJECT->>Proj:Project] AND
          Y:Researcher[WORKSATPROJECT->>Proj:Project] AND
          NOT equal(X, Y).

Which researchers are cooperating with other researchers?

  FORALL ResID1, ResID2 <- ResID1:Researcher[COOPERATESWITH->>ResID2].

Navigating and Querying the Portal
• Querying the portal using F-Logic is too inconvenient.
• Navigating and querying the web
  – A hypertext link may contain a query that is dynamically evaluated when the link is clicked (See P. 45)
  – Queries can be generated by using the hyperbolic view interface (See P. 46)
  – Queries may be personalized for different users and offered to the user in a selection list.
  – An expert mode: querying the portal by typing an F-Logic statement

Providing Information
• Integrating various syntactic and semantic formats based on the common ontology
• Three different modes of information provision are supported
  – Metadata-based information
  – Wrapper-based information
  – Fact-based information

Metadata-based Information
[Figure: Information source -> Annotation tool (OntoPad, Annotea, …) -> Metadata (in RDF, XML)]

Wrapper-based Information
• Annotating information sources by hand is time-consuming.
[Figure: semi-structured information sources (e.g., HTML) and structured information sources (e.g., RDB) -> Wrapper program (OntoWrapper) -> Metadata]

Fact-Based Information
[Figure: Fact Editor (Ontoedit, OilED, Protégé 2000) is used to create new facts and to add, modify, and delete facts in the Knowledge warehouse]

Development of Web Portals

The System Architecture

An Architecture of RDF Triple Data Store
[Figure: architecture diagram with the components User interface; API's (in Prolog and SQL); Triples in Prolog and RDB tables; RDF parser and generator; Crawler; RDF documents; Editing and Annotation]

RDF Parser and Generator
• Using DCG in Prolog as the development tool
• Formal grammar in RDF M&S Spec
  http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/#grammar
[Figure: User interface; Formal grammar rules in DCG; Prolog’s working storage; Prolog’s inference engine]

Grammar productions from the RDF M&S specification:

  [6.1] RDF         ::= ['<rdf:RDF>'] obj* ['</rdf:RDF>']
  [6.2] obj         ::= description | container
  [6.3] description ::= '<rdf:Description' idAboutAttr? bagIdAttr? propAttr* '/>'
                      | '<rdf:Description' idAboutAttr? bagIdAttr? propAttr* '>'
                            propertyElt* '</rdf:Description>'
                      | typedNode
  [6.4] container   ::= sequence | bag | alternative

Corresponding DCG rules in Prolog:

  % [6.1] Optional XML heading, optional rdf:RDF start tag, obj*, optional end tag.
  rdf(Objs) -->
      (   ['<?'],
          ( [nm/'xml'] ; [nm/'XML'] ; [nm/'xmls'] ; [nm/'XMLS'] ),
          ( [nm/'version', '=', qs/_] ; [] ),
          ( [nm/'encoding', '=', qs/_] ; [] ),
          ['?>'], { wl('XML heading') }
      ;   []
      ),
      (   fullSTG('RDF', NS)
      ;   halfSTG('RDF', NS), ['>']
      ;   []
      ),
      objStar(Objs),
      ( fullETG('RDF', NS) ; [] ).

  % obj*: zero or more objects, collected into a list.
  objStar(Out) -->
      obj(O), objStar(R), { Out = [O|R] }
    ; [], { Out = [] }.

  % [6.2] An object is a container or a description.
  obj(Obj) -->
      container(Obj)
    ; description(_), { getAllTriples(Obj) }.

  % Collect every statement/3 fact currently in Prolog's working storage.
  getAllTriples(Obj) :-
      findall(statement(A,B,C), statement(A,B,C), Obj).

  % [6.3] Empty-element form, start/end-tag form, or typed node.
  description(_) -->
      halfSTG('Description', NS),
      ( idAboutAttr(IdAboutAttr) ; [] ),
      ( bagIdAttr(BagIdAttr) ; [] ),
      propAttrStar(IdAboutAttr),
      (   ['/>']
      ;   ['>'], propertyEltStar(IdAboutAttr), fullETG('Description', NS)
      ),
      { reificationOfStatements(IdAboutAttr, BagIdAttr) }
    ; typedNode.
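
The DCG rules above walk an rdf:Description element and collect statement(Subject, Predicate, Object) facts. For readers less familiar with Prolog, the same idea can be sketched in Python with the standard-library XML parser. This is a simplified stand-in and not a port of the parser described in these slides: it handles only literal-valued and rdf:resource-valued property elements (no containers, typed nodes, or reification). The function name description_triples, the sample document, and the namespace URI bound to the s: prefix are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Sample input; the URI bound to the "s" prefix is an assumed placeholder.
SAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:s="http://description.org/schema/">
  <rdf:Description about="http://www.w3.org/Home/Lassila">
    <s:Creator>Ora Lassila</s:Creator>
    <s:createdWith rdf:resource="http://www.w3c.org/amaya"/>
  </rdf:Description>
</rdf:RDF>
"""


def description_triples(xml_text):
    """Return (subject, predicate, object) tuples for each rdf:Description."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.iter("{%s}Description" % RDF_NS):
        # Accept both the namespaced rdf:about and the bare 1999-style "about".
        subject = desc.get("{%s}about" % RDF_NS) or desc.get("about")
        for prop in desc:
            # ElementTree names look like "{namespace}local"; keep the local part.
            predicate = prop.tag.split("}", 1)[-1]
            resource = prop.get("{%s}resource" % RDF_NS)
            if resource is not None:
                obj = ("resource", resource)
            else:
                obj = ("string", (prop.text or "").strip())
            triples.append((subject, predicate, obj))
    return triples


triples = description_triples(SAMPLE)
for triple in triples:
    print(triple)
```

The ("string", …) and ("resource", …) tags loosely mirror the string/… and resource/… terms in the Prolog output format; a fuller treatment would also keep the predicate's namespace.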

RDF input:

  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:dc="http://purl.org/metadata/dublin_core#">
    <rdf:Description about="http://www.foo.com/cool.html">
      <dc:Creator>
        <rdf:Seq ID="CreatorsAlphabeticalBySurname">
          <rdf:li>Mary Andrew</rdf:li>
          <rdf:li>Jacky Crystal</rdf:li>
        </rdf:Seq>
      </dc:Creator>
      <dc:Identifier>
        <rdf:Bag ID="MirroredSites">
          <rdf:li rdf:resource="http://www.foo.com.au/cool.html"/>
          <rdf:li rdf:resource="http://www.foo.com.it/cool.html"/>
        </rdf:Bag>
      </dc:Identifier>
      <dc:Title>
        <rdf:Alt>
          <rdf:li xml:lang="en">The Coolest Web Page</rdf:li>
          <rdf:li xml:lang="it">Il Pagio di Web Fuba</rdf:li>
        </rdf:Alt>
      </dc:Title>
    </rdf:Description>
  </rdf:RDF>

Prolog output:

  [statement(about/http://www.foo.com/cool.html, Title,
             alt(_87882, [string/The Coolest Web Page,
                          string/Il Pagio di Web Fuba])),
   statement(about/http://www.foo.com/cool.html, Identifier,
             bag(id/MirroredSites, [resource/http://www.foo.com.au/cool.html,
                                    resource/http://www.foo.com.it/cool.html])),
   statement(about/http://www.foo.com/cool.html, Creator,
             seq(id/CreatorsAlphabeticalBySurname, [string/Mary Andrew,
                                                    string/Jacky Crystal]))]

RDF Statements in RDB Tables

  RDF_NameSpace
    Id   NsName
    1    http://www.w3.org/2000/01/rdf#
    2    http://purl.org/metadata/dublin_core#
    …    …

  RDF_Resource
    Id   ResourceName
    1    http://www.dlib.org
    2    http://www.foo.com
    …    …

  RDF_Predicate
    Id   NS   PredicateName
    1    2    Creator
    2    2    Language
    …    …    …

  RDF_Literal
    Id   LiteralName
    1    Jacky Crystal
    2    English
    …    …

  RDF_Statement
    Id   Resource   Predicate   Literal
    1    1          1           1
    2    1          2           2
    …    …          …           …

User Interface: Conceptual Search
• A natural language interface is the ideal choice, but good results are very difficult to obtain.
• A form-based interface is easy to implement, but is too restricted.
• Hierarchical directory browsing is perhaps a good idea if the domain in question is well organized.
• We also propose a frame-based query interface.
  – General form: target_concept[(attribute1[:value1], …)]
  – Example: teacher(office:6th_floor)

User Interface: Natural Language Generation
[Figure: NLG pipeline with the labels Knowledge Warehouse (RDF), Content extraction, Content planning, Text planning, NLG, Multi-sentential text]

Future Work
• Building an ontology for the Digital Archive Services
  – Consisting of resource and process parts
• Declarative approach to semantic web services
  – Using a workflow engine to interpret the process descriptions
• Intelligent Q&A for digital archives

Summary
• Semantic Web portals
  – Machine-understandable information
  – RDF store
  – Accessing information
    • Navigation and query
  – Providing information
    • Annotation, wrapper, fact editing
  – Enabling automatic processing by software agents
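
The RDF store named in the summary keeps statements in relational tables as well as in Prolog. A minimal sketch of that table layout, using Python's built-in sqlite3 module, might look as follows. The table names, column names, and sample rows follow the RDF_Resource, RDF_Predicate, RDF_Literal, and RDF_Statement tables from the slides; the choice of SQLite, the column types, the foreign-key declarations, and the helper statements_about are assumptions made for illustration and are not stated in the source. The namespace table is omitted to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table and column names follow the slides; the types are assumed.
conn.executescript("""
CREATE TABLE RDF_Resource  (Id INTEGER PRIMARY KEY, ResourceName TEXT);
CREATE TABLE RDF_Predicate (Id INTEGER PRIMARY KEY, NS INTEGER, PredicateName TEXT);
CREATE TABLE RDF_Literal   (Id INTEGER PRIMARY KEY, LiteralName TEXT);
CREATE TABLE RDF_Statement (
    Id        INTEGER PRIMARY KEY,
    Resource  INTEGER REFERENCES RDF_Resource(Id),
    Predicate INTEGER REFERENCES RDF_Predicate(Id),
    Literal   INTEGER REFERENCES RDF_Literal(Id)
);
""")

# Sample rows taken from the tables shown in the slides.
conn.executemany("INSERT INTO RDF_Resource VALUES (?, ?)",
                 [(1, "http://www.dlib.org"), (2, "http://www.foo.com")])
conn.executemany("INSERT INTO RDF_Predicate VALUES (?, ?, ?)",
                 [(1, 2, "Creator"), (2, 2, "Language")])
conn.executemany("INSERT INTO RDF_Literal VALUES (?, ?)",
                 [(1, "Jacky Crystal"), (2, "English")])
conn.executemany("INSERT INTO RDF_Statement VALUES (?, ?, ?, ?)",
                 [(1, 1, 1, 1), (2, 1, 2, 2)])


def statements_about(resource_name):
    """Join the id tables back into readable (resource, predicate, literal) rows."""
    cursor = conn.execute(
        """
        SELECT r.ResourceName, p.PredicateName, l.LiteralName
        FROM RDF_Statement AS s
        JOIN RDF_Resource  AS r ON r.Id = s.Resource
        JOIN RDF_Predicate AS p ON p.Id = s.Predicate
        JOIN RDF_Literal   AS l ON l.Id = s.Literal
        WHERE r.ResourceName = ?
        ORDER BY s.Id
        """,
        (resource_name,),
    )
    return cursor.fetchall()


result = statements_about("http://www.dlib.org")
print(result)
```

Storing each name once and referring to it by id keeps the statement table compact, at the cost of a join whenever a triple has to be shown in readable form.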