Semantic Web Technologies
• Web site syllabus still developing
  - http://www.ischool.utexas.edu/~i385t-sw
• Readings Discussion
• Discussion: What isn't the Semantic Web?
• Class work: Using feed reader applications and blog posting demonstrations
• Research Presentation Topics

Semantic Technologies Stack

Semantic Web elements
• XML
  - Structured markup languages
    • RDF
    • DAML + OIL
    • XHTML
  - Uniform Resource Identifiers
    • URLs of course
    • Structured, parsable addressing
      - http://www.shadows.com/tags/semantic_web
      - http://www.flickr.com/photos/tags/austin
      - http://www.amazon.com/exec/obidos/externalsearch/103-39923787183068?keyword=ajax&tag=donturnbullweb&mode=books

Structure is (still) the gateway
• Web Services
  - The URI describes the functional parameters
  - The system does the REST
  - The client is a smart interpreter of the results
• Web services have a grammar
  - Defined by standards
  - Initiated by the URI
    • The request
  - Implemented by the system
    • The supplied
• Logic, Classification & Ontologies all provide additional functionality & structure
• Never underestimate the power of plain text
  - Machine readable w/o extra work
  - Human understandable (for lightweight semantics)

Documents are the Structure
• XML: markup language for encoding semantics
• Everyone understands XML
  - Especially browsers & Web crawlers
  - Or thinks they do, which still expands adoption

<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
  <CD>
    <TITLE>Hide your heart</TITLE>
    <ARTIST>Bonnie Tyler</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>CBS Records</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1988</YEAR>
  </CD>
  …

XML: Lingua Franca for SWT
• “XML may become the primary syntax for all enterprise data” (pp. 27-28)
  - Application independent
  - Standard syntax for metadata
  - Standard structure for documents & data
  - It’s already in use
• It isn’t about the CPU, it’s about being open
• Structured
  documents use logic for semantic descriptions
  - And it’s not all about metadata
• If it’s not easily readable, you get a legend
  - Schemas, DTDs, …

The XML Philosophy
• XML provides the syntax guidelines for markup
• Common structural elements are specific to each genre of use
• Markup is based on elements
  - A container with start and end tags
  - Elements can have sub-elements
• Roots & trees
  - Roots define the structure
  - Trees are the hierarchy within
  - Inheritance defines the relationships
• Like HTML, but stricter with the structure (XHTML)
  - Validated XML (or XHTML) means it is usable, not correct
• XML Schemas are the specific rules for validation

XML Schemas
• A “definition language” to constrain semantic vocabulary & hierarchical structure
• Borrowed from database schemas, which define the data types, fields & tables in a DBMS
• Most are not complex
  - But validation is key to making Semantics useful
• Schemas by another name:
  - Document Type Definition (DTD)
  - RELAX NG
  - Schematron (XPath)

XML Schema Specifics
• An XML Schema defines:
  - elements that can appear in a document
  - attributes that can appear in a document
  - which elements are child elements
  - the order of child elements
  - the number of child elements
  - whether an element is empty or can include text
  - data types for elements & attributes
  - default and fixed values for elements & attributes

XML Namespaces
• Namespaces define the markup globals
  - Building blocks: metadata & local <xsd:integer>
  - Calls from others
  - <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.utexas.edu/markup">
• What you commonly see:
  - <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">

Schemas & Instances

Document Object Model
• Part of the machine executable rules of the markup language & schema
• Controls behavior in Web browsers too
• DOM Level 3 supports Semantics
• We’ll see more about the DOM in later weeks
  - Web 2.0, AJAX & REST rely on it heavily

Resource Description Framework
• What’s not a Resource?
  - That’s good & bad
• “RDF captures meta data about the ‘externals’ of a document, like the author, the creation date, and type” (p. 85)
  - Non-text & discrete objects (images, music, bookmarks)
  - A triple defining anything
    • Subject
    • Predicate
    • Object

RDF Grammar
• Describing the author of a document
• http://www.utexas.edu/index.html has an author whose value is Don Turnbull
• The RDF terms for the various parts of the statement are:
  - the subject is the URL http://www.utexas.edu/index.html
  - the predicate is the word author
  - the object is the phrase “Don Turnbull”
• Describing knowledge is subtle; metadata definition is not always easy.

RDF Barriers
• People don’t use reification well or at all (provenance metadata)
  - Inheritance is tricky & the logic must be parsed
• Containers are very flexible
  - Bags allow any order
  - Sequences can be more complex than alphabetical
  - Alternates depend on the instance
• Syntax is varied
• Examples are “simple”, but still not completely utilized
  - Dublin Core
  - RSS
• Tools will help as will industry use
  - Podcasts (Media RSS)
• More on this and RDF Schemas themselves later

XPath
• Control syntax for all manner of XML interaction & addressing
• Allows for finding, parsing & manipulating data in a document
  - See XSLT
• Examples:
  - / selects the document root (which is always the parent of the document element)
  - child::para selects the para element children of the context node

XQuery & XForms
• A structured query language for XML
  - Allows for building virtual documents from parts of other documents
  - Understands the rules of schemas, markup & metadata to perform application-level functions on data
  - Tool support is growing, including DBMS vendors
  - Works with XForms to provide RDBMS access to URI addressable data

More Semantic Standards
• XLink
  - Conditional link syntax far beyond anchors & addressing
• XPointer
  - Allows for building (& including) aggregated, distributed applications & interfaces
• XInclude -
  Provides “make file” syntax for building master documents or constructing complex Semantic inheritance & interaction
• XML Base
  - Syntax for resolving & recommending relevant URIs
• Style Sheets
  - XSL
  - XSLT
  - XSL-FO

Feed Readers & blog posting
• How do you use Semantic Web technologies?
  - Browsing
  - Retrieval
  - Sharing
• Readers
• Blogging is easy

What isn’t the Semantic Web?
• “bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users” (Berners-Lee, 2001)
• What do you think now?
• How promising can SWT be?
  - As everyday systems
• Is it a new way to solve problems?
  - Or
• A new set of capabilities & solutions?

Topic Selection
• Choose a topic (and corresponding week) to overview
• Topic Presentations should include:
  - Overview of the technology
  - Provide examples of the technology in use
  - Show how to build using the technology (examples)
  - A list of citations and readings that you drew from and for extended reference
    • Do not rely on Wikipedia & blogs as your only sources
    • Academic journal & conference papers
    • Books (development or conceptual design)
• How can these Semantic Web technologies help coordinate, discover, organize information and knowledge?
• Your own point of view about the practicality & promise of these tools & procedures

Current list of Topics
• RDF Metadata (e.g. Dublin Core, MediaRSS)
• Ontology building (applications)
• REST, XMLHttpRequest & AJAX
• Greasemonkey
• Javascript: Introduction
• Javascript: Advanced
• TagClouds
• GIS, Maps & Mapping
• Mashups
• XSLT
• WordNet
• Semantic Commerce
• Trust

Next Week
• Readings & Discussion
• Blogging & Tagging (ongoing)
• Finalize topics & presentation dates
• Suggestions for speakers
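
The path expressions on the XPath slide can be tried against the CATALOG fragment from the “Documents are the Structure” slide. What follows is a minimal sketch, not part of the original slides, assuming Python's standard xml.etree.ElementTree module, which implements only a subset of XPath. It uses just the two CD records shown on the slide and adds a closing </CATALOG> tag so the fragment parses. The helper names (titles, titles_by_artist, cheaper_than) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Only the two CD records shown on the slide. The closing </CATALOG> tag is an
# addition made here so the fragment is well-formed; the slide elides the rest.
CATALOG_XML = """
<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
  <CD>
    <TITLE>Hide your heart</TITLE>
    <ARTIST>Bonnie Tyler</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>CBS Records</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1988</YEAR>
  </CD>
</CATALOG>
"""


def titles(root):
    """Text of every TITLE child of every CD child of the root element."""
    return [t.text for t in root.findall("./CD/TITLE")]


def titles_by_artist(root, artist):
    """Titles of CDs whose ARTIST child has exactly the given text.

    The artist string is pasted into the path, so this sketch assumes it
    contains no single quotes.
    """
    return [t.text for t in root.findall(f"./CD[ARTIST='{artist}']/TITLE")]


def cheaper_than(root, limit):
    """Titles of CDs priced below limit.

    ElementTree's path subset has no numeric comparison, so the price test
    is done in Python rather than inside the path expression.
    """
    return [
        cd.findtext("TITLE")
        for cd in root.findall("CD")
        if float(cd.findtext("PRICE")) < limit
    ]


root = ET.fromstring(CATALOG_XML)  # root is the CATALOG element
print(titles(root))
print(titles_by_artist(root, "Bonnie Tyler"))
print(cheaper_than(root, 10.0))
```

In this subset, "./CD/TITLE" plays the role of the child:: steps on the XPath slide: each step moves from the context node to its named children. A full XPath engine could express the price filter as a predicate inside the path instead of in host-language code.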