02.sw-introduction.ppt

Semantic Web
Towards a Web of Knowledge
Introduction
Spring 2006
Computer Engineering Department
Sharif University of Technology
Outline
• Introduction
– From Original Web to Semantic Web
• Building Blocks
– URI
– XML
– RDF
– Ontology
– Agent
• History of SW
• SW Architecture
• Motivation for the SW
• Knowledge Representation
Semantic web - Computer Engineering Dept. - Spring 2006
The Original Web
• Web of relationships amongst named objects
Information Management: A Proposal, Tim Berners-Lee, CERN, March 1989, May 1990,
http://www.w3.org/History/1989/proposal.html
Current Web: the Syntactic Web
The Syntactic Web is…
• A place where computers do the presentation (easy)
and people do the linking and interpreting (hard).
• Why not get computers to do more of the hard work?
• “The bane of my existence is doing things that
I know the computer could do for me.”
—Dan Connolly, “The XML Revolution”
Hard Work using the Syntactic Web…
Find images
Rev. Alan M. Gates, Associate Rector of the
Church of the Holy Spirit, Lake Forest, Illinois
Impossible using the Syntactic Web…
• Complex queries involving background knowledge
– Find information about “animals that use sonar but
are not either bats or dolphins”
• Locating information in data repositories
– Travel enquiries
– Prices of goods and services
– Results of human genome experiments
• Finding and using “web services”
• Delegating complex tasks to web “agents”
– Book me a holiday next weekend somewhere
warm, not too far away, and where they speak
French or English
What is the Problem?
• Consider a typical web page:
• Markup consists of:
– Rendering information (e.g., font size and colour)
– Hyperlinks to related content
• Semantic content is accessible to humans but not (easily) to computers…
What information can we see…
WWW2002
The Eleventh International World Wide Web Conference
Sheraton Waikiki Hotel
Honolulu, Hawaii, USA
7-11 May 2002
1 location 5 days learn interact
Registered participants coming from
Australia, Canada, Chile, Denmark, France, Germany, Ghana, Hong
Kong, India, Ireland, Italy, Japan, Malta, New Zealand, the
Netherlands, Norway, Singapore, Switzerland, the United Kingdom,
the United States, Vietnam, Zaire
Register now
On the 7th May Honolulu will provide the backdrop of the Eleventh
International World Wide Web Conference. This prestigious event
…
Speakers confirmed
Tim Berners-Lee
Tim is the well known inventor of the Web, …
Ian Foster
Ian is the pioneer of the Grid, the next generation internet …
What information can a machine see…
Solution: XML markup with “meaningful” tags?
<name>
</name>
<location>
</location>
<date></date>
<slogan></slogan>
<participants>
</participants>
<introduction>
</introduction>
<speaker></speaker>
<bio></bio>…
Still the Machine only sees…
<>
</>
<>
</>
<></>
<></>
<>




</
>
<>



</>
<></>
<></>
<></>
<></> 13
Need to Add “Semantics”
• External agreement on meaning of annotations
– E.g., Dublin Core for annotation of library/bibliographic information
• Agree on the meaning of a set of annotation tags
– Problems with this approach
• Inflexible
• Limited number of things can be expressed
• Use Ontologies to specify meaning of annotations
– Ontologies provide a vocabulary of terms
– New terms can be formed by combining existing ones
• “Conceptual Lego”
– Meaning (semantics) of such terms is formally specified
– Can also specify relationships between terms in multiple
ontologies
The Building Blocks
1. URI
2. XML
3. RDF
4. Ontologies
5. Agents
URIs
• URI = Uniform Resource Identifier
• "The generic set of all names/addresses that are
short strings that refer to resources"
• URLs (Uniform Resource Locators) are a particular
type of URI, used for resources that can be
accessed on the WWW.
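As a small illustration of the URI/URL distinction, the sketch below uses Python's standard `urllib.parse` to read the scheme of a few identifiers. The `mailto:` and `urn:` examples are illustrative assumptions; only the `http` one (the proposal cited earlier in this deck) points at something retrievable on the Web.

```python
from urllib.parse import urlparse

# Illustrative identifiers. Only the first appears in the slides;
# the other two are assumptions chosen to show non-URL URIs.
identifiers = [
    "http://www.w3.org/History/1989/proposal.html",  # a URL: locatable on the Web
    "mailto:someone@example.org",                     # a URI naming a mailbox
    "urn:isbn:0451450523",                            # a URI naming a book
]

def scheme_of(uri: str) -> str:
    """Return the scheme part of a URI (the text before the first ':')."""
    return urlparse(uri).scheme

for uri in identifiers:
    print(scheme_of(uri), "->", uri)
```

All three strings are valid URIs, but only the one with the `http` scheme is also a URL in the sense used above.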
XML
“XML allows users to add arbitrary structure
to their documents but says nothing about
what the structures mean”
RDF
o Meaning encoded in sets of ‘triples’:
entities have properties which have
values
o Entities, properties and values all have
distinct URIs
Ganji --studentOF--> CE
CE --departmentOF--> Sharif
CE --hasHomePage--> http://ce.sharif.edu
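A minimal sketch of the triple idea in plain Python, based on one reading of the diagram labels above as three statements. The `http://example.org/` namespace is a hypothetical placeholder for the entity and property URIs; only `http://ce.sharif.edu` comes from the slide.

```python
# Hypothetical namespace for illustration; not defined in the slides.
EX = "http://example.org/"

# Each triple is (subject, property, value), identified by URIs.
triples = {
    (EX + "Ganji", EX + "studentOF", EX + "CE"),
    (EX + "CE", EX + "departmentOF", EX + "Sharif"),
    (EX + "CE", EX + "hasHomePage", "http://ce.sharif.edu"),
}

def match(s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# Everything stated about the CE department:
for triple in match(s=EX + "CE"):
    print(triple)
```

A real system would more likely use an RDF library and a standard serialization; the set of tuples here only shows the shape of the data.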
Example
We are looking for a
woman who works
for one of my clients
and whose son studies at
the University of
Rovaniemi
Ontologies
o Database A and Database B may use different
fields to contain ‘zip code’ (e.g. zip code and
postal code)
o Ontologies sort this out
o Ontology = ‘a document or file that formally
defines the relations among terms’
o Ontologies for the web normally have
o A taxonomy
o A set of inference rules
o e.g. “If an animal eats a food, and this food is expensive, then
that animal is not suitable for poor people.”
Agents
o Programs which
o collect Web content from various sources
o process the information
o exchange results with other agents
o effectiveness increases exponentially with the
number of available agents
Ambient Intelligence
(Pervasive Computing)
“In the next step, the Semantic Web will break out
of the virtual realm and extend into our physical
world. URIs can point to anything, including
physical entities, which means we can use the
RDF language to describe devices such as cell
phones and TVs.”
Berners-Lee, T, Hendler, J & Lassila, O ‘The semantic web’, Scientific
American, May 2001
History of the Semantic Web
• Web was “invented” by Tim Berners-Lee (amongst
others), a physicist working at CERN
• TBL’s original vision of the Web was much more
ambitious than the reality of the existing (syntactic)
Web:
“... a goal of the Web was that, if the interaction between person and
hypertext could be so intuitive that the machine-readable information
space gave an accurate representation of the state of people's
thoughts, interactions, and work patterns, then machine analysis
could become a very powerful management tool, seeing patterns in
our work and facilitating our working together through the typical
problems which beset the management of large organizations.”
• TBL (and others) have since been working towards
realising this vision, which has become known as
the Semantic Web
– E.g., article in May 2001 issue of Scientific American…
Scientific American, May 2001:
• Realising the complete “vision” is too hard for now (probably)
• But we can make a start by adding semantic annotation to web
resources
Semantic Web Architecture
(Figure: architecture diagram with a “You are here” marker)
Web of Trust
An example scenario (1)
(Originally from the Berners-Lee et al. Scientific American paper)
• The entertainment system was belting out the Beatles’ “We
Can Work It Out” when the phone rang
• When Pete answered, his phone turned the sound down by
sending a message to all the other local devices that had a
volume control
• His sister, Lucy, was on the line from the doctor’s office:
“Mom needs to see a specialist and then has to have a
series of physical therapy sessions. Biweekly or something.
I’m going to have my agent set up the appointments.”
• Pete immediately agreed to share the work
An example scenario (2)
• At the doctor’s office, Lucy instructed her semantic web
agent through her handheld web browser
• The agent promptly retrieved information about Mom’s
prescribed treatment from the doctor’s agent, looked up
several lists of providers, and checked for the ones in plan
for Mom’s insurance within a 20-mile radius of her home
and with a rating of excellent or very good on trusted rating
services
• It then began trying to find a match between available
appointment times (supplied by the agents of individual
providers through their web sites) and Pete’s and Lucy’s
busy schedules
An example scenario (3)
• In a few minutes the agent presented them with
a plan
• Pete didn’t like it – University hospital was all the
way across the town from Mom’s place, and he’d
be driving back in the middle of rush hour
• He set his own agent to redo the search with
stricter preferences about location and time
• Lucy’s agent, having complete trust in Pete’s
agent in the context of the present task,
automatically assisted by supplying access
certificates and shortcuts to the data it had
already sorted through
An example scenario (4)
• Almost instantly the new plan was presented
• A much closer clinic and earlier times – but there were
two warning notes
• First, Pete would have to reschedule a couple of his less
important appointments. He checked what they were –
not a problem
• The other was something about the insurance company’s list
failing to include this provider under physical therapists:
“Service type and insurance plan status securely verified
by other means,” the agent reassured him.
An example scenario (5)
• Lucy registered her assent at about the same
moment Pete was muttering, “Spare me the
details,” and it was all set.
• (Of course, Pete couldn’t resist the details and
later that night had his agent explain how it had
found that provider even though it wasn’t on the
proper list.)
(Figure labels: Insurance Co.; Rating; Mom; Physician’s Agent; required treatment; Provider sites; in-plan?; close-by?; Specialist?; Schedule appointment; Driving schedule; Lucy’s Agent; Pete’s Agent)
Expressed Meaning (1)
• Pete and Lucy could use their agents to carry out
all these tasks thanks not to the World Wide Web of
today but rather the Semantic Web that it will
evolve into tomorrow
• Most of the Web’s content today is designed for
humans to read, not for computer programs to
manipulate meaningfully
• Computers can adeptly parse Web pages for layout
and routine processing – here a header, there a link
to another page – but in general computers have
no reliable way to process the semantics:
– This is the home page of the Hartman and Strauss
Physio Clinic
– This link goes to Dr. Hartman’s curriculum vitae
Expressed Meaning (2)
• The semantic web will bring structure to the meaningful
content of web pages
– Creating an environment where software agents roaming from
page to page can readily carry out sophisticated tasks for users
• Such an agent coming to the clinic’s web page will know
not just that
– The page has keywords such as “treatment, medicine, physical,
therapy” (as might be encoded today) but also that
– Dr. Hartman works at this clinic on Mondays, Wednesdays and
Fridays and that
– The script takes a yyyy-mm-dd format and returns appointment
times
Expressed Meaning (3)
• And it will “know” all this without needing AI on the
scale of 2001’s Hal or Star Wars’s C-3PO.
• Instead these semantics were encoded into the Web
page when the clinic’s office manager (who never
took computer courses) massaged it into shape
using off-the-shelf software for writing
– Semantic Web pages along with
– Resources listed on the Physical Therapy Association’s
site
Expressed Meaning (4)
• The semantic Web is not a separate Web but an extension
of the current one, in which information is given well-defined
meaning, better enabling computers and people to
work in cooperation
• The first steps in weaving the semantic web into the
structure of the existing Web are already under way
• In the near future, these developments will usher in
significant new functionality as machines become much
better able to process and “understand” the data that they
merely display at present.
Expressed Meaning (5)
• The essential property of the World Wide Web is its
universality
• The power of a hypertext link is that “anything can link to
anything”
• One important observation is the difference between
– Information produced primarily for human consumption and that
– Produced mainly for machines
• At one end of the scale we have everything from the
five-second TV commercial to poetry
• At the other end we have databases, programs and
sensor output
Expressed Meaning (6)
• To date, the Web has developed most rapidly as a
medium of documents for
– People rather than for
– Data and information that can be processed
automatically
• The Semantic Web aims to make up for this
• Like the internet, the Semantic Web will be as
decentralized as possible
Expressed Meaning (7)
• Such Web-like systems generate a lot of
excitement at every level
– From major corporation to individual user, and
– Provide benefits that are hard or impossible to predict in
advance
• Decentralization requires compromises: the Web
had to throw away the ideal of total consistency of
all of its interconnections, ushering in the infamous
message “Error 404: Not Found” but allowing
unchecked exponential growth
Knowledge Representation (1)
• For the semantic web to function, computers must
have access to
– Structured collections of information and
– Sets of inference rules
• They can use these to conduct automated
reasoning
• AI researchers have studied such systems since
long before the Web was developed
Knowledge Representation (2)
• Knowledge representation is currently in a state
comparable to that of hypertext before the advent
of the Web
– It is clearly a good idea, and some very nice
demonstrations exist
– But it has not yet changed the world
• It contains the seeds of important applications, but
to realize its full potential it must be linked into
a single global system:
– The Semantic Web
Knowledge Representation (3)
• Traditional knowledge-representation systems have
typically been centralized
• This requires everyone to share exactly the same
definition of common concepts such as “parent” or
“vehicle”
• But central control is stifling, and increasing the
size and the scope of such a system rapidly
becomes unmanageable
Knowledge Representation (4)
• Knowledge representation systems usually
carefully limit the questions that can be asked so
that the computer can answer reliably – or answer
at all
• The problem is reminiscent of Gödel’s theorem
from mathematics:
– Any system that is complex enough to be useful also
encompasses unanswerable questions
– Much like sophisticated versions of the basic paradox
“This sentence is false”
Knowledge Representation (5)
• To avoid such problems, traditional knowledge
representation systems generally have their own
set of rules for making inferences about their data
• For example, a genealogy system, acting on
databases of family trees, might include the rule “a
wife of an uncle is an aunt”
• Even if the data could be transferred from one
system to another, the rules, existing in a
completely different form, usually could not
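The genealogy rule can be sketched as a tiny forward-inference step over facts. This is only an illustration: the people and the property names `wifeOf`, `uncleOf` and `auntOf` are hypothetical, and real systems encode such rules in their own rule languages.

```python
# Facts as (subject, property, value) triples; names are illustrative only.
facts = {
    ("Sam", "uncleOf", "Alex"),
    ("Mary", "wifeOf", "Sam"),
}

def infer_aunts(known):
    """Apply the rule: X wifeOf Y and Y uncleOf Z  =>  X auntOf Z."""
    derived = set()
    for (x, p1, y) in known:
        if p1 != "wifeOf":
            continue
        for (y2, p2, z) in known:
            if p2 == "uncleOf" and y2 == y:
                derived.add((x, "auntOf", z))
    return derived

print(infer_aunts(facts))
```

Note that the facts are plain data that another system could import, while the rule is buried in a Python function. That is the portability gap this slide describes.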
Knowledge Representation (6)
• Semantic Web researchers, in contrast, accept that
paradoxes and unanswerable questions are a price that
must be paid to achieve versatility
• They make the language for the rules as expressive as
needed to allow the web to reason as widely as desired
• This philosophy is similar to that of the conventional Web:
– Early in the Web’s development detractors pointed out that it could
never be a well-organized library
– Without a central database and tree structure, one would never be
sure of finding everything
– They were right
Knowledge Representation (7)
• The expressive power of the Web made vast amounts of
information available
• Search engines now produce remarkably complete indices
of a lot of the material out there
– Which would have seemed quite impractical a decade ago
• The challenge of the Semantic Web, therefore, is to
provide a language that expresses both data and rules for
reasoning about the data and that allows rules from any
existing knowledge representation system to be exported
onto the Web
Knowledge Representation (8)
• Adding logic to the Web means using rules to
– Make inferences,
– Choose courses of action and
– Answer questions
• These are the tasks before the Semantic Web
community at the moment
• A mixture of mathematical and engineering
decisions complicates this task
Knowledge Representation (9)
• The logic must be powerful enough
– To describe complex properties of objects
– But not so powerful that agents can be tricked by being
asked to consider a paradox
• Fortunately, a large majority of the information we
want to express is along the lines of
– “a hex-head bolt is a type of machine bolt”
– This is readily expressed in existing languages with a little
extra vocabulary
Knowledge Representation (10)
• Two important technologies for developing the Semantic
Web are already in place
– eXtensible Markup Language (XML) and
– The Resource Description Framework (RDF)
• XML lets everyone create their own tags
– Hidden labels that annotate Web pages or sections of text on a
page
– Scripts, or programs, can make use of these tags in sophisticated ways,
but the script writer has to know what the page writer uses each
tag for
– XML allows users to add arbitrary structure to their documents but
says nothing about what the structures mean
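The point about script writers needing to know each tag's purpose can be sketched with Python's standard `xml.etree.ElementTree`. Both documents and their tag names are hypothetical: they carry the same information for a human reader, but a script written against one tag vocabulary finds nothing in the other.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that carry the same information
# under different, author-chosen tag names.
doc_a = "<person><name>Lucy</name><zip>60045</zip></person>"
doc_b = "<person><name>Pete</name><postal_code>60045</postal_code></person>"

def read_zip(xml_text: str):
    """Return the text of the <zip> element, or None if there is none."""
    element = ET.fromstring(xml_text).find("zip")
    return None if element is None else element.text

print(read_zip(doc_a))  # the script was written with this tag in mind
print(read_zip(doc_b))  # same meaning for a human, but no <zip> tag
```

Nothing in XML itself tells the script that `zip` and `postal_code` mean the same thing. That agreement has to come from somewhere else.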
Knowledge Representation (11)
• Meaning is expressed by RDF
– Which encodes it in sets of triples
– Each triple being rather like the subject, verb and object of an
elementary sentence
• These triples can be written using XML tags
• In RDF, a document makes assertions that
– Particular things (people, Web pages or whatever) have
– Properties (such as “is a sister of”, “is the author of”) with
– Certain values (another person, another Web page)
Knowledge Representation (12)
• This structure turns out to be a natural way to describe the
vast majority of the data processed by machines
• Subject and object are each identified by a Uniform
Resource Identifier (URI)
– Just as used in a link on a Web page
– (URLs, Uniform Resource Locators, are the most common type of
URI)
– The verbs are also identified by URIs
– This allows anyone to define
• A new concept
• A new verb
– Just by defining a URI for it somewhere on the Web
Knowledge Representation (13)
• Human language thrives when using the same term to mean
somewhat different things, but automation does not
• Imagine that a clown messenger service is hired to deliver
balloons to customers on their birthdays
• Unfortunately, the service transfers the addresses from the
customer’s database to its own database
• It does not know that the “addresses” in the customer’s
database are where bills are sent and that many of them
are post office boxes
Knowledge Representation (14)
• The hired clowns end up entertaining a number of
postal workers
– Not necessarily a bad thing but certainly not the
intended effect
• Using a different URI for each specific concept
solves that problem
• An address that is a mailing address can be
distinguished from one that is a street address,
and both can be distinguished from an address
that is a speech
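A minimal sketch of how distinct URIs keep the two senses of "address" apart. The URIs under `http://example.org/terms#` and the customer records are invented for illustration.

```python
# Hypothetical URIs for two different senses of "address".
MAILING = "http://example.org/terms#mailingAddress"
STREET = "http://example.org/terms#streetAddress"

# Customer records as triples; the data is invented for illustration.
records = [
    ("customer:1", MAILING, "PO Box 42"),
    ("customer:1", STREET, "12 Elm Street"),
    ("customer:2", MAILING, "PO Box 7"),
]

def delivery_addresses(triples):
    """Keep only addresses a messenger can visit in person."""
    return {s: o for (s, p, o) in triples if p == STREET}

print(delivery_addresses(records))
```

Because the property is a URI rather than the bare word "address", the messenger service can select street addresses and never sees the post office boxes.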
Knowledge Representation (15)
• The triples of RDF form Webs of information about
related things
• RDF uses URIs to encode this information in a
document
• URIs ensure that concepts are not just words in a
document but are tied to a unique definition that
everyone can find on the Web
Knowledge Representation (16)
• For example, imagine that we have access to a
variety of databases with information about
people, including their addresses
• If we want to find people living in a specified zip
code, we need to know which fields in each
database represent
– Names and which represent
– Zip codes
• RDF can specify that “(field 5 in database A) (is a
field of type) (zip code),” using URIs rather than
phrases for each term
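The zip-code example can be sketched as follows. The concept URI, the database names, the field identifiers and the sample rows are all hypothetical; the point is only that once each field is declared to be of type "zip code", one query can run across differently structured databases.

```python
# Hypothetical URI for the shared concept "zip code".
ZIP_CODE = "http://example.org/terms#zipCode"

# Statements of the form (field) (is a field of type) (concept).
# Field identifiers and sample rows are made up for illustration.
field_types = {
    ("dbA", "field5"): ZIP_CODE,
    ("dbB", "postal_code"): ZIP_CODE,
}

databases = {
    "dbA": [{"field1": "Lucy", "field5": "60045"}],
    "dbB": [
        {"full_name": "Pete", "postal_code": "60045"},
        {"full_name": "Ann", "postal_code": "10001"},
    ],
}

def rows_with_zip(zip_code):
    """Find rows in every database whose zip-code field equals zip_code."""
    hits = []
    for (db, field), concept in field_types.items():
        if concept != ZIP_CODE:
            continue
        hits += [(db, row) for row in databases[db] if row.get(field) == zip_code]
    return hits

print(len(rows_with_zip("60045")))
```

The query code never mentions `field5` or `postal_code`; it relies only on the declared type of each field.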
Knowledge Representation (17)
• RDF has a simple data structure which can be used to
express a vast amount of knowledge on the web
• However, it lacks the power to express higher levels of
knowledge:
– For example, it is not possible to build a hierarchy of concepts in
RDF
• RDF-S (RDF Schema) adds the ability to build
hierarchies to RDF
• The layer above RDF-S is OWL, which extends
RDF and RDF-S with, amongst other things, the ability to
place restrictions on concepts participating in relations
• In this course we will talk about XML, RDF, RDF-S and
OWL
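To make the idea of a concept hierarchy concrete, here is a small sketch of the kind of reasoning a subclass relation (such as RDF Schema's `rdfs:subClassOf`) enables. The class names are illustrative assumptions, with the bolt example borrowed from an earlier slide; a real application would use an RDF toolkit rather than a hand-written loop.

```python
# (subclass, superclass) pairs; the class names are illustrative,
# with the bolt example borrowed from an earlier slide.
sub_class_of = {
    ("HexHeadBolt", "MachineBolt"),
    ("MachineBolt", "Bolt"),
    ("Bolt", "Fastener"),
}

def superclasses(cls):
    """Return every class reachable from cls by following subclass links."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for (sub, sup) in sub_class_of:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found

print(sorted(superclasses("HexHeadBolt")))
```

Only three links are stated, yet the program can conclude that a hex-head bolt is also a bolt and a fastener, because the subclass relation is transitive.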
References
• Semantic web official home page:
– www.w3.org/2001/sw/
• Previous term course page:
– http://ce.sharif.edu/courses/83-84/2/ce926/
• T. Berners-Lee, J. Hendler and O. Lassila, “The Semantic Web”,
Scientific American, May 2001
• Manchester University
• Montreal University