City University of Hong Kong
Department of Computer Science
BSCCS/BSCS Final Year Project Report 2004-2005
(04CS019)
Semantic web - RDF to HTML tool
(Volume 1 of 1)
Student Name: Lam Wai Tung
Student No.: 50316414
Programme Code: BSCCS(FT)
Supervisor: Dr. CHUN, H W Andy
1st Reader: Dr. Liu, W Y
2nd Reader: Dr. Ip, Horace
For Official Use Only
Semantic Web – RDF to HTML tool
Lam Wai Tung Ivan
Department of Computer Science
City University of Hong Kong
Supervisor: Dr Chun H W Andy
April 7, 2005
Abstract
The Resource Description Framework (RDF) helps to describe content such as HTML
pages, graphics, audio files and other documents, and it helps machines to interpret
the Semantic Web. However, an RDF document is not a good form for presentation:
it is hard for a person to read. This project therefore proposes a tool that renders
RDF content as semantically linked HTML pages for human readers.
The rendering uses HTML templates whose tags define the page layout, the semantic
linkage and the logical rules, all based on the RDF data model. The tool, named
RDF2HTML, is designed and implemented in Java and uses a specially designed
template structure to generate semantically linked HTML pages from an RDF data
model. A plug-in structure is introduced to make the tool extensible and flexible.
As a case application, the RDF2HTML tool provides an effective means of generating
a web photo album.
Acknowledgments
First of all, I would like to give special thanks to the Department of Computer
Science. Without its support, I would not have had such a precious opportunity to
experience a real research project.
Then, I would like to extend very special thanks to my supervisor, Dr. CHUN,
Hon Wai Andy, for giving me valuable advice and for his continuous support
throughout the process.
Lastly, special thanks go to my two readers, Dr. Liu, Wen Yin and Dr. Ip, H S Horace,
for their help and assistance.
Contents
1. Introduction
1.1 Background
1.2 Motivation – The Problem of RDF
1.3 Project Objectives
1.4 Scope of project
2. Background Research
2.1 Semantic Web
2.2 Extensible Markup Language
2.3 Resource Description Framework
2.4 RDF Schema
2.5 Extensible Stylesheet Language Transformations (XSLT)
2.6 Streaming Transformations for XML (STX)
3. Related Works
3.1 BrownSauce
3.1.1 BrownSauce limitations
3.2 Spectacle
3.2.1 Spectacle limitations
3.3 Mirador
3.3.1 Mirador demonstration limitations
3.4 SWeHG
3.4.1 SWeHG limitations
4. The Solution – RDF2HTML Tool
4.1 Design the transformation of RDF to HTML
4.1.1 Design issue – The Conceptual views
4.1.2 Design issue – Handling RDF document
4.1.3 Design issue – Defining the page context
4.1.4 Design issue – System flow
4.1.5 Design issue – Template Structure
4.2 Template Language
4.3 The design of the system
4.3.1 Component design
4.3.1.1 RDF Model Loader
4.3.1.1.1 Jena RDF Model
4.3.1.2 HTML Template Parser
4.3.1.2.1 HTMLParser
4.3.1.3 Query Executor
4.3.1.4 Template Processor
4.3.1.5 Operation Tag Group
4.3.2 System Architecture
4.4 Case Application
5. Conclusion
5.1 Benefits Obtained
5.2 Limitations
5.3 Further Work
Appendix A Supported Query Language
Appendix B Interim Report
Appendix C Case Application Templates
Bibliography
Chapter 1
Introduction
1.1. Background
The Semantic Web is currently a prominent topic in web technology. It provides a
common framework that allows data to be shared and reused across application,
enterprise, and community boundaries. To share and reuse data in this way,
metadata is an essential aid for describing resources, such as web pages,
documents, photos, and real-world objects, in a machine-understandable format. In
other words, the Semantic Web is a simple extension of the current web that uses
binary relationships to capture the meaning between links and data. [9]
Most of the computer industry has agreed on and adopted XML standards to give
data a syntactic structure. However, XML can describe the same data in many
different ways; as a result, it is too open and arbitrary to support the kind of
widespread, ad hoc data integration envisaged for the Semantic Web. To tackle this
issue, a standard, the Resource Description Framework (RDF), was introduced. RDF
plays an important role in the Semantic Web because it defines a graph model that
provides a consistent and standardized way of describing and querying internet
resources, from text pages and graphics to audio files and video clips. It not only
offers semantic interoperability, but also provides the base layer for building a
Semantic Web. [10]
1.2. Motivation - The Problem of RDF
Currently, RDF/XML is the most common way to store RDF data, but XML is not a
good representation for humans to read. RDF itself does not need to be human
readable, since it is intended for machine interpretation. However, the main
objective of the web is to present content, including semantic content, in a
human-readable way. Therefore, a better visual representation is essential to the
success of the Semantic Web.
Hypertext is the most common way to present content over the Internet today,
so HTML is the best choice for displaying RDF data.
In fact, this can already be done by a semantic portal¹, which provides dynamic HTML
pages. However, a portal imposes some limitations on the content provider. Firstly,
it limits the choice of ontology, since only content that uses the ontologies defined
in the portal can be published. Secondly, publication is controlled by the portal
owner, so content providers who do not own a portal application cannot publish
their content as easily as they could publish static web pages. Thirdly, dynamic
content is harder to find through search engines than static content.
¹ e.g. http://ubp.learninglab.uni-hannover.de/EducaNext/ubp/home
More and more content will be available in RDF format in the near future. Hence,
being able to publish the Semantic Web easily in human-readable form is key to its
success. To this end, a tool, RDF2HTML, which transforms an RDF model into static
HTML pages, is proposed.
1.3. Project Objectives
This project aims to achieve the following:
• To become familiar with Semantic Web concepts.
• To learn Extensible Markup Language (XML), Extensible Stylesheet
  Language Transformations (XSLT), Streaming Transformations for XML
  (STX), Resource Description Framework (RDF), and RDF Schema (RDFS).
• To implement a tool, RDF2HTML, that transforms RDF to HTML, in Java.
• To provide an easy-to-use template structure.
1.4. Scope of project work
The scope of work includes the following:
• Study the language specifications of XML, XSLT, STX, RDF and RDFS.
• Study and evaluate the existing solutions for the transformation of RDF to
  HTML.
• Create RDF documents for testing.
• Design the template structure.
• Design and implement a tool to transform RDF to HTML in Java with
  Jena (a Java RDF API).
• Evaluate a small, semantically linked site of static HTML pages generated
  by the RDF2HTML tool.
Chapter 2
Background Research
2.1. Semantic Web
Figure 2.2.1.1. Architecture of Semantic Web²
² http://www.w3.org/2002/Talks/04-sweb/slide12-3.html
As Figure 2.2.1.1 shows, the Semantic Web is based on a layered architecture. Each
layer is built on top of the layer below it and provides better capabilities for
representing knowledge. So far, little work has been done on the top three
layers, since the ontology layer has only just become a W3C Recommendation. The
Uniform Resource Identifier (URI) and Unicode layer has been well developed for
a long time. Moreover, the XML and Namespaces layer is widely adopted in web
development at present. Last but not least, both RDF and RDF Schema are W3C
Recommendations.
This project focuses on representing the RDF layer in human-readable form.
2.2. Extensible Markup Language (XML)
XML is a simple and flexible markup language derived from SGML. It was originally
designed for large-scale electronic publishing and now plays an important role in
the exchange of a wide variety of data on the Web and elsewhere. [11]
XML can carry such a wide variety of data because its structure is arbitrary. That
is, XML specifies neither semantics nor a tag set; instead, it provides a facility for
defining tags and the structural relationships between them. Consequently, the
semantics of an XML document are defined by the applications that use it. Because
of this arbitrary structure, XML is used by the upper layers of the Semantic Web
architecture to define their own semantics.
2.3. Resource Description Framework (RDF)
RDF is a framework that provides a consistent and standardized way to describe
and query internet resources, from text pages and graphics to audio files and
video clips. It expresses the relations between objects, gives syntactic
interoperability, and provides the base layer for building a Semantic Web. [10]
To describe the relations between objects consistently, RDF defines a model for
representing resources, properties and property values. The RDF data model is
independent of the representation method; in other words, it is syntax-neutral.
Thus, RDF data can be represented in many formats, such as RDF/XML, N3 and
N-Triples.
Consider the following example:
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cd="http://www.recshop.fake/cd">
  <rdf:Description
      rdf:about="http://www.recshop.fake/cd/Hide your heart">
    <cd:price>9.90</cd:price>
  </rdf:Description>
</rdf:RDF>
From the RDF/XML document above, the data model can be represented as triples,
as a directed graph and as a sentence, as shown below:
Subject (Resource):    http://www.recshop.fake/cd/Hide your heart
Predicate (Property):  http://www.recshop.fake/cdprice
Object (Value):        "9.90"
Table 2.1. Triples of the Data Model
The RDF data model can also be represented as a directed graph (see Figure 2.2.3.1).
In the figure, the subject is represented by a node, the property by a directed arc,
and the property value by a rectangle. The node and the arc are labelled with
Uniform Resource Identifiers (URIs)³: http://www.recshop.fake/cd/Hide your heart
and http://www.recshop.fake/cdprice. In RDF, URIs are used to identify all objects,
giving each a unique and universal means of identification.
³ http://www.w3.org/Addressing/
Figure 2.2.3.1. Graph of the data model
The sentence: "The CD Hide your heart costs $9.90"
RDF gives the statements carried in web documents unambiguous semantics, which
can hardly be achieved with pure XML. Hence, it becomes the base layer for
building a Semantic Web.
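The triple view in Table 2.1 can be made concrete with a few lines of plain Java. The following is only a minimal sketch, not the representation used by Jena or by the RDF2HTML tool: the class name TripleSketch, the nested Triple type and the helper valuesOf are assumptions introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only: a plain-Java view of the triple in Table 2.1. */
final class TripleSketch {

    /** One RDF statement: subject (resource), predicate (property), object (value). */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Returns every object value stated for the given subject and predicate. */
    static List<String> valuesOf(List<Triple> model, String subject, String predicate) {
        List<String> values = new ArrayList<>();
        for (Triple t : model) {
            if (t.subject.equals(subject) && t.predicate.equals(predicate)) {
                values.add(t.object);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        List<Triple> model = new ArrayList<>();
        model.add(new Triple(
                "http://www.recshop.fake/cd/Hide your heart",
                "http://www.recshop.fake/cdprice",
                "9.90"));
        // Ask the model for the price of the CD, much as a query would.
        System.out.println(valuesOf(model,
                "http://www.recshop.fake/cd/Hide your heart",
                "http://www.recshop.fake/cdprice"));
    }
}
```

Because the model is just a set of such statements, the same data could equally have been loaded from N3 or N-Triples; only the loader would differ.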
2.4. RDF Schema (RDFS)
RDF needs a way to define application-specific classes and properties. These must
be defined using extensions to RDF, and one such extension is RDF Schema (RDFS).
RDFS does not provide the actual application-specific classes and properties.
Instead, it provides the framework for describing them.
Classes in RDF Schema are similar to classes in object-oriented programming
languages. This allows resources to be defined as instances of classes and as
subclasses of classes. [16]
2.5. Extensible Stylesheet Language Transformations (XSLT)
XSLT is the most important part of the XSL standards and is a W3C
Recommendation. It is the part of XSL used to transform an XML document
into another XML document, or into another type of document that a browser
recognizes, such as HTML or XHTML. Normally XSLT does this by transforming each
XML element into an (X)HTML element.
XSLT can also add new elements to the output file or remove elements. It can
rearrange and sort elements, test and make decisions about which elements to
display, and a lot more.
A common way to describe the transformation process is to say that XSLT
transforms an XML source tree into an XML result tree.
During the transformation, XSLT uses XPath to identify the parts of the source
document that match one or more predefined templates. When a match is found,
XSLT transforms the matching part of the source document into the result
document. Parts of the source document that do not match any template end up
unmodified in the result document. [12]
In fact, XSLT can be used to transform an RDF/XML document into HTML, and some
style sheets on the web already do so⁴. However, that kind of transformation suits
a small, simple RDF/XML document better than a whole RDF model. For example, it
is difficult to apply rules during the transformation, such as displaying the books
that are related to Classification but not to Data Mining. Hence, this project is
proposed. Although XSLT is less suitable for transforming an RDF model with rules,
it is still very useful for transforming simple RDF data.
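The simple case mentioned above can be tried with the XSLT processor that ships with the JDK (javax.xml.transform). The sketch below is an assumption-laden illustration rather than any style sheet cited in this report: the style sheet, the class name XsltSketch and the method transform are invented for the example, and the input is the CD document from section 2.3.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Illustrative sketch only: a hand-written style sheet for the CD example. */
final class XsltSketch {

    static final String RDF_XML =
          "<?xml version=\"1.0\"?>"
        + "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\""
        + " xmlns:cd=\"http://www.recshop.fake/cd\">"
        + "<rdf:Description rdf:about=\"http://www.recshop.fake/cd/Hide your heart\">"
        + "<cd:price>9.90</cd:price>"
        + "</rdf:Description>"
        + "</rdf:RDF>";

    /** Hypothetical style sheet: one table row per rdf:Description element. */
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\""
        + " xmlns:cd=\"http://www.recshop.fake/cd\""
        + " exclude-result-prefixes=\"rdf cd\">"
        + "<xsl:output method=\"html\" indent=\"no\"/>"
        + "<xsl:template match=\"/rdf:RDF\">"
        + "<table><xsl:apply-templates select=\"rdf:Description\"/></table>"
        + "</xsl:template>"
        + "<xsl:template match=\"rdf:Description\">"
        + "<tr><td><xsl:value-of select=\"@rdf:about\"/></td>"
        + "<td><xsl:value-of select=\"cd:price\"/></td></tr>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the style sheet to the XML text and returns the serialized result. */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(RDF_XML, STYLESHEET));
    }
}
```

The style sheet matches on the XML serialization (rdf:Description elements), not on the RDF graph, so a different but equivalent RDF/XML serialization of the same model would need a different style sheet. This is one reason why the approach does not scale to a whole RDF model with rules.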
2.6. Streaming Transformations for XML (STX)
STX is similar to XSLT; it is a one-pass transformation language for XML
documents. It is intended as a high-speed, low-memory alternative to XSLT. Since
it does not require the construction of an in-memory tree, it is suitable for
resource-constrained scenarios. [13]
⁴ http://rssxpress.ukoln.ac.uk/view.cgi?rss_url=http://journals.iucr.org/a/rss10.xml
STX provides a streaming analog of XSLT by adopting some familiar concepts from
XSLT (e.g., matching based on templates and an XPath 1.0-like expression
language, STXPath) but using SAX as the underlying interface to the XML
document. SAX is the event-oriented sibling of the DOM API; it provides a
sequential view of an XML document through a stream of events. [14] That is why
STX is faster and consumes less memory than XSLT.
The STX transformation is achieved by associating various events with templates. A
template pattern is matched against events and their context. The best matching
template is then instantiated to create a part of the result stream. A template is
always instantiated with respect to the current context, a set of additional
information maintained during the transformation. In constructing the result stream,
events from the source stream can be filtered and arbitrary events can be added.
Events can also be reordered using a working storage. [15]
The main difference between STX and XSLT is therefore the underlying interface to
the XML document: XSLT works on an in-memory tree such as the DOM, while STX
uses SAX. As a result, STX is faster and uses less memory than XSLT. However, this
does not mean that STX is better than XSLT. Because of the streaming character of
STX, only the current node and its ancestors are accessible, whereas XSLT allows
random access to all data in the document. One sentence describes the difference
clearly: “XSLT is like a book, while STX is like reading.”
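The restriction to the current node and its ancestors follows directly from the SAX event model. The sketch below uses the SAX parser included in the JDK, not an STX processor; the class name StreamingSketch and the method elementPaths are assumptions made for illustration. At each start-element event the only context available is the stack of open ancestors, which the handler has to maintain itself.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Illustrative sketch only: what a streaming (SAX) consumer can see. */
final class StreamingSketch {

    /** Records, for each start-element event, the path of ancestors plus the current node. */
    static List<String> elementPaths(String xml) {
        final List<String> paths = new ArrayList<>();
        final Deque<String> open = new ArrayDeque<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                open.addLast(qName);                       // current node joins the stack
                paths.add("/" + String.join("/", open));   // ancestors + current node only
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                open.removeLast();                         // closed nodes are forgotten
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return paths;
    }

    public static void main(String[] args) {
        System.out.println(elementPaths("<a><b/><c><d/></c></a>"));
    }
}
```

When the event for d arrives, the element b has already been discarded; a tree-based processor would still be able to navigate back to it.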
Chapter 3
Related Works
3.1. BrownSauce
BrownSauce⁵ is a generic RDF browser. It was written by Damian Steer whilst
employed at HP Labs Bristol. It is free software, released under a BSD-style licence.
⁵ http://brownsauce.sourceforge.net
BrownSauce breaks the problem into two parts: coarse-graining (breaking the data
down into usable chunks, like "information about person X") and aggregation
(making those chunks from multiple sources). The first part is done, and users can
browse more than one source by using rdfs:seeAlso references. Aggregation is
currently being worked on.[17]
Figure 3.1. A screenshot of BrownSauce
3.1.1. BrownSauce limitations
Even though BrownSauce is a nice generic RDF browser, it does not allow the user
to customize the HTML presentation, except for styling it with CSS. In
addition, it provides no way for the user to control which data are shown.
Although BrownSauce is not a complete solution to the problem of presenting
RDF, it is a nice and easy-to-use RDF browser for general use.
3.2. Spectacle
Aduna Spectacle⁶ provides a powerful way of finding information on Aduna
Metadata servers⁷. It combines the effectiveness and flexibility of full-text search
with the ease-of-use of faceted navigation: the ability to find information based on
properties such as its location, file type, modification dates, author, etc.
Spectacle uses Guided Exploration technology to guide the user through large
information environments by continuously offering contextual hints for further
exploration and by preventing "dead ends": all links you see in a Spectacle
navigation structure are guaranteed to lead to information, so the user is never let
down by "zero hits".[18]
⁶ http://aduna.biz/products/spectacle/
⁷ http://aduna.biz/products/metadataserver/
Figure 3.2. A screenshot of Spectacle
Figure 3.3. The architecture of Spectacle
3.2.1. Spectacle limitations
Spectacle does a good job of transforming RDF to HTML and provides a good
representation of the metadata. Nevertheless, as the architecture shows (see
Figure 3.3), a Metadata Server is involved, which means that the user has to
run and manage the metadata server in order to present metadata. In addition,
the RDF to HTML transformation in Spectacle is based on APIs, so the user needs
to write programs that use those APIs.
For these reasons, Spectacle is targeted at enterprise users rather than
ordinary users.
3.3. Mirador
The full name of Mirador⁸ is Multimedia Information Retrieval Aided by
Descriptions of Online Resources. The project aims to investigate how users can be
better supported in searching for resources on the WWW by exploiting the
metadata associated with a resource.
The initial Mirador demonstration⁹ includes an attempt to use XSLT with RDF. It
shows that transforming RDF/XML to HTML with XSLT is possible. The following
example is taken from that demonstration: Figure 3.4 shows the RDF example,
Figure 3.5 shows part of the style sheet used to transform RDF/XML to HTML, and
Figure 3.6 shows the result of the transformation. [19]
⁸ http://www.cee.hw.ac.uk/~mirador/
⁹ http://www.cee.hw.ac.uk/~mirador/demos.html
Figure 3.4. Simple RDF example
Figure 3.5. Partial style sheet for displaying RDF
Description of: http://www.dlib.org
Title:        D-Lib Program - Research in Digital Libraries
Description:  The D-Lib program supports the community of people with research
              interests in digital libraries and electronic publishing.
Publisher:    Corporation For National Research Initiatives
Date:         1995-01-07
Subject:      Research; statistical methods
              Education, research, related topics
Type:         World Wide Web Home Page
Format:       text/html
Language:     en
Figure 3.6. Example metadata table result transform using XSLT
3.3.1. Mirador demonstration limitations
This example exposes some limitations of using XSLT to transform RDF to HTML
directly. Firstly, an XSLT transformation of RDF/XML produces only one HTML
page at a time. Therefore, if a page is needed for each resource in the RDF
model, the only way is to divide the model into a set of RDF/XML documents,
each containing a single resource. However, dividing the model in this way
introduces another limitation: the relationships between RDF resources are
lost. For instance, the fact that two resources have the same creator cannot be
shown with this approach. Secondly, a direct XSLT transformation works only on
RDF/XML, whereas, as mentioned in section 2.3 - Resource Description Framework
(RDF), RDF can be represented in many formats, such as RDF/XML, N3 and
N-Triples.
Apart from these limitations, usability is also a problem: Figure 3.5 shows how
complex an XSLT style sheet can be. In conclusion, although XSLT is powerful, it
is also highly complex.
3.4. SWeHG
The full name of SWeHG¹⁰ is Semantic Web HTML Generator. Its goal is the same as
that of this project: to provide a "poor man's" publication tool for the Semantic
Web. It aims to generate a semantically linked and conceptually indexed site of
static HTML pages from an RDF(S) repository.
Figure 3.7 shows the internal architecture of SWeHG. As the paper describing
SWeHG explains, it separates the transformation of RDF to HTML into two levels.
One is the HTML level: a layout template, which can be designed by a layout
designer, specifies the layout of the rendered HTML pages. Programming skills are
not needed at this level. The other is the RDF level: the semantic linkage between
the pages is determined using logical predicates. These predicates define the
semantics of the tags used at the HTML level, and an application programmer
provides their definitions. [20]
¹⁰ http://www.cs.helsinki.fi/group/seco/swehg/
Figure 3.7. The Internal architecture of SWeHG
3.4.1. SWeHG limitations
SWeHG does a good job of transforming RDF to HTML. It provides a good
conceptual view and structure for the transformation. Since the web site
structure it proposes is simple and reflects the key role of classification and
indexing in view-based searching, the same structure is used in the RDF2HTML
tool.
However, the complex architecture of SWeHG also makes it complex to use. For
instance, the user needs to know XSLT, or to have a layout designer who does, in
order to design the layout of the generated HTML. Although XSLT is a powerful
tool for transforming XML documents, it is also a complex language to use. In
addition, an application programmer must provide the definitions of the logical
predicates, which means that the user needs to write a program whenever a new
logical predicate is needed. In short, SWeHG can achieve the transformation of
RDF to HTML, but the user needs knowledge of logic and Prolog and programming
skills to define the semantic tags and relations, as well as XSLT knowledge to
design the layout of the HTML pages. These requirements reduce the usability of
the tool.
Chapter 4
The Solution – RDF2HTML Tool
4.1. Design the transformation of RDF to HTML
The RDF2HTML tool has several design objectives. Firstly, to define a web site
structure for the transformation of RDF documents to HTML pages. Secondly, to
make the tool independent of the RDF format. Thirdly, to provide an easy way to
define the page context. Lastly, to make the template highly flexible and easy to
learn and use. The following design issues address these objectives.
4.1.1. Design issue – The Conceptual views
The conceptual views suggested by SWeHG for the transformation of RDF to HTML
were studied and are adopted here [20]. To facilitate the transformation, the
RDF data model is considered at several conceptual levels. Firstly, the data
level: the actual data or resource in an RDF statement. For instance, in "The CD
Hide your heart costs $9.90", the CD Hide your heart is the data. Secondly, the
metadata level: the property of the resource in an RDF statement. In the same
sentence, "costs" is the metadata. Thirdly, the ontology level: the vocabulary
used at the metadata level, as defined by an RDF Schema. For instance, the
schema may state that the artist is an instance of the class “Human”. Fourthly,
the logic rule level: the semantic relations between the resources in the data
model. For instance, a binary relation between two CDs may be that they have
the same artist and the same company but a different cost.
Figure 4.1. Transforming RDF data model to HTML pages
Figure 4.1 shows the transformation of RDF to HTML. The RDF data model is
on the left. On the right, the Home page has links to various Index pages, which
classify the underlying Resource pages (RPages); the RPages are related to each
other by semantic links.
The Home page is the entrance page to the RDF model. It is typically an HTML
page that contains frames to show the Index pages and Resource pages.
Each Resource page displays one resource, such as an ontological concept (e.g., a
class) or an individual resource with its properties and values. Every resource
that is intended to be shown to the end user has an RPage of its own.
Index pages classify RPages in a conceptual, hierarchical way.
To keep the tool flexible, the above structure is only a suggestion; users can
apply their own structural design.
4.1.2. Design issue – Handling RDF document
As mentioned in section 2.3, RDF defines a model for representing resources,
properties and property values, and this model is independent of the
representation method; in other words, it is syntax-neutral. Hence, RDF
data can be represented in RDF/XML, N3, N-Triples, etc.
Because the RDF model is syntax-neutral, the RDF2HTML tool should be able to
handle RDF documents in different formats. Therefore, an abstraction of the RDF
model is needed. The Jena RDF API Model is used for this abstraction in the
initial version of the RDF2HTML tool. However, this is not the best solution,
since the existing program must be changed in order to support a new RDF
format. Therefore, the ultimate solution, a plug-in structure, is introduced.
Details of the plug-in structure can be found in section 4.1.1.
As for the ontology level, i.e. the RDF Schema, the Jena RDF API also supports
loading an RDF Schema into the RDF Model.
4.1.3. Design issue – Defining the page context
In the RDF2HTML tool, the page context is defined by a query. A query language
is used for this purpose because it is robust, easy to learn and easy to reuse.
The current version supports three query languages: RDQL, supported by the Jena
RDF API, and RQL and SeRQL, both supported by the Sesame RDF API.
4.1.4. Design issue – System flow
The approach used by the RDF2HTML tool to transform RDF documents to HTML
pages is simple. Firstly, all the RDF documents are loaded into the system and
treated as an RDF model. Secondly, queries included in the HTML template define
the context of the HTML pages. Then the template processor takes the RDF model
and the HTML template and generates the corresponding HTML pages. This is a
brief description of the system flow of the RDF2HTML tool.
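The three steps of the flow can be sketched in plain Java as follows. This is a minimal illustration under stated assumptions, not the actual RDF2HTML implementation: the model is a hard-coded list of triples instead of a Jena Model, the "query" is a simple predicate filter instead of RDQL, RQL or SeRQL, and the names FlowSketch, loadModel, query and process, as well as the sample photo data, are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only: the three steps of the system flow, with stand-in data. */
final class FlowSketch {

    /** Step 1 stand-in: the loaded "model" is a list of {subject, predicate, object} triples. */
    static List<String[]> loadModel() {
        List<String[]> model = new ArrayList<>();
        model.add(new String[] {"photo1", "title", "Harbour"});
        model.add(new String[] {"photo1", "creator", "A. Author"});
        model.add(new String[] {"photo2", "title", "Peak"});
        return model;
    }

    /** Step 2 stand-in: a "query" that selects the object of every triple with the predicate. */
    static List<String> query(List<String[]> model, String predicate) {
        List<String> values = new ArrayList<>();
        for (String[] triple : model) {
            if (triple[1].equals(predicate)) {
                values.add(triple[2]);
            }
        }
        return values;
    }

    /** Step 3 stand-in: the "template processor" repeats one row fragment per query result. */
    static String process(String rowTemplate, List<String> values) {
        StringBuilder html = new StringBuilder("<ul>");
        for (String value : values) {
            html.append(rowTemplate.replace("+title+", value));
        }
        return html.append("</ul>").toString();
    }

    public static void main(String[] args) {
        List<String> titles = query(loadModel(), "title");
        System.out.println(process("<li>+title+</li>", titles));
    }
}
```

Keeping the three steps as separate functions mirrors the separation described above: the loader and the query engine can be exchanged without touching the template.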
The HTML template is clearly the heart of the RDF2HTML tool, as it defines the
page context and layout. However, writing a template does not require
programming skills. The skills needed are writing queries and using the
template tags; details are covered in section 4.2.
4.1.5. Design issue – Template Structure
The need for a template
Before designing the template, the need for a template is considered. In
general, a template separates the data from the display layout; different data
are then filled into the template to produce data-specific documents. In the
RDF2HTML tool specifically, one purpose of the template is to specify the HTML
layout used to generate a set of HTML pages from an RDF model. The other
purpose is to define and display the semantic relations between RDF resources.
With the need for a template established, the features of the data, which in
this project is the RDF model, are considered next.
RDF features
As mentioned in section 2.3, RDF is a framework that provides a consistent,
standardized way of describing and querying internet resources. RDF is metadata
that describes resources; hence, most property values can be treated as a simple
string or as a URI pointing to another resource. Given this feature of RDF, the
template is designed to use a simple string replacement method that replaces
each variable in the template with the corresponding property value in the RDF
model.
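A minimal sketch of such a string replacement is given below. It assumes the +name+ variable notation that appears in the template examples of section 4.2; the class name VariableSketch, the method fill and the choice to leave unknown variables untouched are assumptions made for illustration and are not taken from the RDF2HTML source code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only: replacing +name+ variables with property values. */
final class VariableSketch {

    /** A variable is a run of word characters enclosed by "+" signs, e.g. +identifier+. */
    private static final Pattern VARIABLE = Pattern.compile("\\+(\\w+)\\+");

    /** Replaces every known +name+ in the text with its bound value. */
    static String fill(String text, Map<String, String> bindings) {
        Matcher matcher = VARIABLE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = bindings.get(matcher.group(1));
            // Assumption: unknown variables are left as written rather than dropped.
            String replacement = (value != null) ? value : matcher.group();
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();
        row.put("identifier", "42");
        row.put("title", "Harbour");
        System.out.println(fill("resource\\resource_+identifier+.html", row));
        System.out.println(fill("<a href='resource_+identifier+.html'>+title+</a>", row));
    }
}
```

Because property values are plain strings or URIs, no type conversion is needed; one binding map per query result row is enough to fill a page.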
Template types
During the design of the template, flexibility and ease of learning and use were
the main concerns. In the initial design (see Figure 4.2), templates are
classified into four types. Firstly, the page that does not need to query the RDF
model but still needs some operations provided by the RDF2HTML tool, e.g. string
replacement. Secondly, the single page generation template, which queries the
RDF model but generates only one HTML page. Thirdly, the single query multi-page
generation template; an example of this type is the resource page template,
which generates a set of HTML pages such that each page describes the
corresponding resource. The query it uses selects all the resources in the RDF
model that are to be presented. Lastly, the group by property multi-page
generation template, which generates Index pages that link to the resources
fulfilling the group-by condition, for instance, the resources that have the
same creator.
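The group-by condition of the last template type can be pictured as a grouping of resources by the value of one property, with one Index page per group. The sketch below is only an illustration of that idea: the triples, the class name GroupBySketch and the method groupByProperty are hypothetical and do not come from the RDF2HTML implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch only: grouping resources by the value of one property. */
final class GroupBySketch {

    /**
     * Maps each value of the given predicate to the subjects that carry it.
     * Each triple is a {subject, predicate, object} array; a TreeMap keeps
     * the groups in a stable, sorted order.
     */
    static Map<String, List<String>> groupByProperty(List<String[]> triples, String predicate) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String[] triple : triples) {
            if (triple[1].equals(predicate)) {
                groups.computeIfAbsent(triple[2], key -> new ArrayList<>()).add(triple[0]);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String[]> triples = new ArrayList<>();
        triples.add(new String[] {"photo1", "creator", "A"});
        triples.add(new String[] {"photo2", "creator", "B"});
        triples.add(new String[] {"photo3", "creator", "A"});
        triples.add(new String[] {"photo1", "title", "Harbour"});
        // One Index page would be generated per key, linking to the listed resources.
        System.out.println(groupByProperty(triples, "creator"));
    }
}
```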
Figure 4.2. Template type classification (initial version)
After the initial version had been implemented, the template structure and
functionality were reviewed. As a result, the need for a new template type was
discovered (see Figure 4.3). The new type, the generic multi-query, multi-page
generation template, is a generic template type that can replace both the old
single query multi-page generation template and the group by property
multi-page generation template.
HTML template
    Page NO need query RDF model
    Page need query RDF model
        Single page generation template
        Generic multi-query, multi-page generation template
Figure 4.3. Template type classification (reviewed version)
4.2. Template Language
The HTML template defines the page properties, the layout and the page context.
Together, these parameters drive the generation of the HTML pages.
The following is an example of an HTML template that generates an index page
with links to all resource pages.
1:  <rdf2html page=1 filename=index\all_index.html />
2:  <html>
3:  <p><h3>
4:  All photos
5:  </h3></p>
6:  <table>
7:  <rdf2html condition='select *
8:      from {resource} photo:identifier {identifier},
9:           {resource} photo:title {title}
10:     using namespace
11:          photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
12:     orderBy=title distinct=identifier>
13: <tr><td>
14: <rdf2html type=link
15:     linkText=title
16:     href='$urlprefix$/resource/resource_+identifier+.html'
17:     addons='target=resource'/>
18: </td></tr>
19: </rdf2html>
20: </table>
21: </html>
Figure 4.4. HTML template example that generates an index page linking to all resource pages
Page Directive
Figure 4.4 is an example of a single page generation template. Line 1 is the
page directive of this HTML template; this tag defines the template properties.
Several attributes can be set. The first is the page attribute, which indicates
how many pages the template generates. Its options are “1” and “*”: “1” means
the template generates only one page, while “*” means it generates multiple
pages. The second is the filename attribute, which defines the filename of the
generated HTML page(s). The filename attribute must be a valid filename on the
operating system and is relative to the output directory option set in the
system configuration. If the page option is “1”, the filename attribute gives
the exact file name of the generated HTML page. For example,
“index\all_index.html” indicates that the generated HTML page will be located
in the “<output directory>\index\” folder with the file name all_index.html.
When the page option is “*” (see Figure 4.5), the filename attribute becomes a
pattern containing variables. Rather than using an automatic sequence number as
the file name, it is suggested to use a unique property of the resource to form
the file name. In Figure 4.5 the value of filename is
“resource\resource_+identifier+.html”; the variable identifier enclosed by “+”
signs is a variable defined by the query in the condition attribute. The
condition attribute follows the description in the next section, Conditional
Tag. In short, it is the query that selects the resources for the template, and
when it is defined in the page directive, each record in the result set
corresponds to a new HTML page.
In addition, for a template that does not need to query the RDF model, the page
directive can be omitted.
Figure 4.5 Page directive example of multi-page template
Conditional Tag
In Figure 4.4, lines 7 to 12 form the opening tag of the conditional tag and
line 19 is the closing tag. All tags between the opening tag and the closing
tag are treated as repeating tags.
Figure 4.6. The opening tag of the example shown in Figure 4.4
First, consider the opening tag. Its purpose is to select the properties that
need to be presented in the repeating tags. Hence it has an attribute named
condition, which takes the query that defines the data presented in the
repeating tags. As mentioned in section 4.1.3, RDF2HTML currently supports
RDQL, RQL and SeRQL. In this example, SeRQL is used. The query selects all
resources that have both identifier and title properties. The variable resource
represents the resource URI, while identifier and title represent the values of
the identifier and title properties respectively. Some example uses of SeRQL,
RQL and RDQL can be found in Appendix A. The orderBy and distinct attributes
provide sorting and duplicate removal on the result set, since not all
supported query languages support ordering and only SeRQL supports distinct.
The orderBy attribute sorts in ascending order by default; if descending order
is desired, the attribute ascending needs to be set to false.
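Since orderBy and distinct are applied to the result set rather than inside the
query, their effect can be illustrated with a short Java sketch. This is only an
illustration under stated assumptions, not the tool's actual code: the class
name ResultSetPostProcessor, its method names, and the representation of a
record as a Map from variable name to string value are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: names and record representation are assumptions.
class ResultSetPostProcessor {

    // distinct=<variable>: keep only the first record for each value of the variable.
    static List<Map<String, String>> distinctBy(List<Map<String, String>> records, String variable) {
        Set<String> seen = new HashSet<>();
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> record : records) {
            if (seen.add(record.get(variable))) {
                out.add(record);
            }
        }
        return out;
    }

    // orderBy=<variable>: sort by the string value of the variable,
    // ascending unless ascending is false. Missing values sort first.
    static List<Map<String, String>> orderBy(List<Map<String, String>> records,
                                             String variable, boolean ascending) {
        List<Map<String, String>> out = new ArrayList<>(records);
        Comparator<Map<String, String>> byValue = Comparator.comparing(
                (Map<String, String> r) -> r.get(variable),
                Comparator.nullsFirst(Comparator.<String>naturalOrder()));
        out.sort(ascending ? byValue : byValue.reversed());
        return out;
    }
}
```

Doing this after query execution is what lets the same two attributes behave
identically for every supported query language.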
Having explained the simple conditional tag, the next concern is the nesting
ability of the conditional tag. In Figure 4.7, lines 21 to 27 are the opening
tag of a conditional tag that selects the resources with the same subject and
coverage as the resource selected for this page. In lines 24 and 25,
“+subject+” and “+coverage+” establish the same-subject-and-coverage
relationship by inserting the subject and coverage values of the page resource
into the query. This mechanism is the heart of relation definition in the
RDF2HTML tool and is what makes it flexible. In addition, the nesting depth has
no theoretical limit; that is, another conditional tag can be defined between
the opening tag (lines 21 to 27) and the closing tag (line 31).
1:<rdf2html page=* filename=resource\resource_+identifier+.html
2:     condition='select *
3:     from {resource} photo:identifier {identifier},
4:          [{resource} photo:description {description}],
5:          [{resource} photo:format {format}],
6:          [{resource} photo:subject {subject}],
7:          [{resource} photo:coverage {coverage}],
8:          [{resource} photo:title {title}],
9:          [{resource} photo:date {date}],
10:         [{resource} photo:creator {creator}]
11:    using namespace
12:         photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>' />
13:<html>
14:<h1>
15:<rdf2html type=text property=title />
16:</h1>
17:<table>
18:<tr><td>
19:Photos that have the same subject and coverage
20:</td></tr>
21:<rdf2html condition='select *
22:    from {resource} photo:identifier {identifier},
23:         {resource} photo:title {title},
24:         {resource} photo:subject {"+subject+"},
25:         {resource} photo:coverage {"+coverage+"}
26:    using namespace
27:         photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>' />
28:<tr><td>
29:<rdf2html type=generic template='<a href="$urlprefix$/resource/resource_+identifier+.html" >+title+</a>' />
30:</td></tr>
31:</rdf2html>
32:</table>
33:</html>
Figure 4.7. Example of a generic multi-query, multi-page generation template
Operation Tag
In Figure 4.8, lines 14 to 17 are a link operation tag, which is used to
generate a link in the HTML page. The link operation tag serves as an example
of a simple operation tag. All operation tags are single tags (i.e. the tag is
closed by />), and the type attribute is an essential attribute of every
operation tag. It is essential because operation tags are loaded into the
system dynamically. This dynamic loading makes it possible and easy to extend
the system by adding new operation tags, and for users to define their own.
Since the structure and attribute names of operation tags are not restricted,
every operation tag can have its own semantics.
14:<rdf2html type=link
15:     linkText=title
16:     href='$urlprefix$/resource/resource_+identifier+.html'
17:     addons='target=resource'/>
Figure 4.8. The operation tag of the example shown in Figure 4.4
For the link operation tag, the href attribute corresponds to the href
attribute of the <a> tag in HTML, and the linkText attribute corresponds to the
displayed text of a link in the HTML page, i.e. <a>Link Text</a>. The addons
attribute is a string that is appended at the end of the link tag. Consider the
value of the href attribute, “$urlprefix$/resource/resource_+identifier+.html”:
urlprefix is an attribute defined in the system configuration file, and
identifier is the variable defined in the query in the opening tag. The example
shows that an attribute defined in the system configuration file is used by
enclosing it in “$” signs, while a query variable to be substituted is enclosed
in “+” signs. It is suggested that all operation tags use this replacement
method. As an example, the resulting HTML tag of this link tag may look like
Figure 4.9, assuming urlprefix is
http://iiivan.no-ip.com/photo/travel/new_zealand/ and both title and identifier
are “img_1738”.
Figure 4.9. An example result of the link operation tag in Figure 4.8.
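The “$” and “+” replacement convention described above can be sketched in Java
as follows. This is a minimal illustration, assuming that configuration options
and query variables are both available as string maps; the class name
PlaceholderExpander and the method expand are hypothetical and are not taken
from the RDF2HTML source.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the "$config$" / "+variable+" substitution convention.
class PlaceholderExpander {

    // Group 1 captures a configuration name, group 2 a query variable name.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$(\\w+)\\$|\\+(\\w+)\\+");

    static String expand(String template, Map<String, String> config, Map<String, String> record) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String value = (m.group(1) != null)
                    ? config.get(m.group(1))
                    : record.get(m.group(2));
            // Unknown names are left untouched so that mistakes stay visible in the output.
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```

Replacing both kinds of placeholder in a single pass means a substituted value
is never scanned again, so a property value that happens to contain “+” or “$”
characters is not expanded a second time.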
Other than the link tag, RDF2HTML currently provides the image tag, text tag,
generic tag and if tag. It may seem strange that image, link and text tags are
provided when the generic tag can replace them; the reason is that they belong
to the initial design and the generic tag was introduced later. As the generic
tag can replace the image, link and text tags, the details of the image and
text tags are not discussed here. The following discusses the semantics of the
generic tag and the if tag.
Figures 4.10 and 4.11 demonstrate the semantics of the generic tag. Figure 4.10
shows a simple generic tag: the template attribute defines the string to
generate and the positions where query variables and user-defined variables are
substituted. The substitution mechanism is the same as described before.
Figure 4.10. A simple example of generic tag
Figure 4.11 shows that the generic tag can also work as a conditional tag, in
which case the repeating content is defined in the template attribute. However,
a generic tag used as a conditional tag does not have the nesting ability of
the conditional tag, as the generic tag is a single tag only.
Figure 4.11. An example of generic tag with condition attributes
The if tag is similar to the generic tag; the only difference is that the if
tag provides a conditional choice of template (see Figure 4.12). The property
attribute is a query variable, the op attribute defines the comparison
operation (possible options are “=”, “<”, “>” and “like”), the value attribute
is the comparison value, the then attribute is the template chosen if the
comparison returns true, and the else attribute is the template for a false
result. Like the generic tag, the if tag also has an optional condition
attribute; when it is present, the comparison applies to the variables of the
condition query defined in the if tag rather than to those defined outside it.
Currently, the comparison applies only to strings.
Figure 4.12. An example of a if tag
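The choice made by the if tag can be sketched in Java as below. This is a
hedged illustration only: the class IfTagEvaluator and its methods are
hypothetical, and because the report does not define the exact meaning of
“like”, the sketch assumes a simple substring match.

```java
// Hypothetical sketch of the if tag's string comparison and template choice.
class IfTagEvaluator {

    // Compares two strings using one of the operators listed for the op attribute.
    static boolean compare(String actual, String op, String expected) {
        switch (op) {
            case "=":
                return actual.equals(expected);
            case "<":
                return actual.compareTo(expected) < 0;   // lexicographic, not numeric
            case ">":
                return actual.compareTo(expected) > 0;   // lexicographic, not numeric
            case "like":
                return actual.contains(expected);        // assumption: substring match
            default:
                throw new IllegalArgumentException("Unsupported op: " + op);
        }
    }

    // Returns the "then" template when the comparison holds, otherwise the "else" template.
    static String choose(String actual, String op, String expected,
                         String thenTemplate, String elseTemplate) {
        return compare(actual, op, expected) ? thenTemplate : elseTemplate;
    }
}
```

Because only strings are compared, a value such as “10” is ordered before “9”;
template designers who compare numbers or dates need values whose string order
matches the intended order.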
4.3. The design of the system
In this section, the design of the system and workflow will be explained. The
RDF2HTML tool aims to generate HTML pages from HTML templates defined by
the user.
4.3.1. Component design
[Figure 4.13 diagram labels: RDF2HTML Tool; RDF Model Loader (RDF/XML Jena
Default Model Loader, RDF/XML Jena Ontology Model Loader); Query Executor (Jena
RDQL query engine, Sesame SeRQL/RQL query engine); Template Processor;
Operation Tags; HTML Template Parser]
Figure 4.13. Component view of RDF2HTML Tool
The system consists of five types of component (see Figure 4.13): the RDF model
loader, the HTML template parser, the query executor, the template processor
and a group of operation tags. The model loader is responsible for loading the
RDF model from RDF documents. The HTML template parser parses the template into
a list of nodes. The query executor executes the queries defined in the
template. The template processor generates the corresponding HTML pages based
on the templates and the queries defined in them. The operation tag group
handles the tags used in templates. To make the system extensible and flexible,
a plug-in architecture was designed and implemented.
4.3.1.1. RDF Model Loader
The RDF Model Loader is responsible for loading the RDF model from RDF
documents. In the implementation, the Factory pattern is used to make the
system more extensible and flexible in future. In the current version, only the
Jena RDF model is loaded for querying. However, extending the system to load a
different RDF model using another API, or even an RDF database, is easy: it
only requires writing a new plug-in. The details of writing a new plug-in are
discussed in section 4.1.1.
4.3.1.1.1. Jena RDF Model
Jena RDF Model is the RDF model provided by the Jena RDF API11. With the
Jena RDF API, the RDF model can be loaded easily for querying in the template
translation process. During the translation process, the RDQL query
language provided by the Jena RDF API can act as a logical rule to select
data from the RDF model; details can be found in Appendix A.
11
http://jena.sourceforge.net/index.html
4.3.1.2. HTML Template Parser
The HTML Template Parser in the system uses the open source HTML
Parser12. An existing open source parser was chosen rather than writing a new
one mainly because it provides the functions the system needs, it is fast and
it is free to use. Using HTMLParser reduces the implementation workload and
avoids redesigning and re-implementing an existing well-designed library.
4.3.1.2.1. HTMLParser
HTMLParser is a super-fast real-time parser for real-world HTML. Using
HTMLParser a list of nodes is generated from the template file and this list
will be used in the translation process.
4.3.1.3. Query Executor
The Query Executor is responsible for executing the queries (i.e. the logical
rules) defined in the template file against the RDF model loaded by the Model
Loader. To make it extensible and flexible, the Factory pattern and the plug-in
architecture are used, just as in the Model Loader.
12
http://htmlparser.sourceforge.net/
4.3.1.4. Template Processor
The Template Processor is responsible for generating HTML pages from the
HTML template files. The HTML template files define which data/properties
should be presented and how they are presented. The template processor then
processes them and generates the conforming HTML pages.
4.3.1.5. Operation Tag Group
The Operation Tag Group is the set of tags that can be used in the template
files. These tags are responsible for filling the query results into the
template. In the current version, the link tag, image tag, text tag, generic
tag and if tag are provided. Some example uses of these tags can be found in
section 4.2, Template Language.
4.3.2. System Architecture
Figure 4.14. Architecture view of RDF2HTML Tool
Data Loading
In Figure 4.14, RDF documents and HTML templates are loaded into the
RDF2HTML tool by the RDF Model Loader and the HTML Template Parser. The RDF
Model Loader then passes an Abstracted Data Source, and the HTML Template
Parser a Template Node List, to the Template Processor. Internally, the
Abstracted Data Source is a HashMap with strings as keys and objects as values;
it is used to abstract the data source in the plug-in structure. The Template
Node List is a list holding all nodes of the template.
Generation Process
The process of transforming RDF documents into a set of HTML pages is
defined by the Algorithm 4.1. The inputs are a set of HTML templates and a set
of RDF documents. The output is a set of HTML pages conforming to the
templates.
The process goes through the template list one by one. For each template, the
template is first parsed into a template node list for the later conversion.
The page option then selects the case. If page is “1”, the single page convert
process takes place. If page is “*”, the page query, i.e. the query defined in
the page directive of the current template, is executed and each record is used
in the convert process; in this multi-page generation process, one HTML file is
generated after the convert process for each record. The last is the default
case, which is mainly for templates that do not need to query the RDF model:
the convert process takes only the template node list for conversion and
generates a single HTML page for each template.
Algorithm 4.1. Main process of Template Processor for RDF to HTML transformation
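The case switching described for the generation process can be sketched in Java
as follows. This is a simplified illustration, not Algorithm 4.1 itself: the
class TemplateProcessorSketch, the Template holder and the runQuery function
are hypothetical, the convert process is reduced to plain “+variable+”
substitution, and nested conditional and operation tags are not modelled.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, simplified sketch of the page-option switch in the generation process.
class TemplateProcessorSketch {

    // Assumed in-memory form of a parsed template.
    static class Template {
        final String pageOption;  // "1", "*", or null when the page directive is omitted
        final String filename;    // may contain +variable+ placeholders when pageOption is "*"
        final String pageQuery;   // query from the page directive, if any
        final String body;        // template content after the page directive

        Template(String pageOption, String filename, String pageQuery, String body) {
            this.pageOption = pageOption;
            this.filename = filename;
            this.pageQuery = pageQuery;
            this.body = body;
        }
    }

    // Returns generated pages as filename -> content, in generation order.
    static Map<String, String> generate(Template t,
                                        Function<String, List<Map<String, String>>> runQuery) {
        Map<String, String> pages = new LinkedHashMap<>();
        if ("*".equals(t.pageOption)) {
            // Multi-page case: one HTML page per record of the page query.
            for (Map<String, String> record : runQuery.apply(t.pageQuery)) {
                pages.put(fill(t.filename, record), fill(t.body, record));
            }
        } else {
            // page="1" and the default case both produce a single page.
            pages.put(t.filename, t.body);
        }
        return pages;
    }

    // Stand-in for the convert process: replaces +name+ with the record value.
    static String fill(String text, Map<String, String> record) {
        String out = text;
        for (Map.Entry<String, String> e : record.entrySet()) {
            out = out.replace("+" + e.getKey() + "+", e.getValue());
        }
        return out;
    }
}
```

In this sketch the uniqueness of the generated filenames depends entirely on
the variable used in the filename pattern, which is why a unique property such
as identifier is the suggested choice.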
Plug-in structure
The initial design of the RDF2HTML tool did not include a plug-in structure,
because only RDQL was planned to be used. After further research on the
internet, two more powerful query languages, SeRQL and RQL, were discovered and
adopted. This required refactoring and introduced the need for extensibility in
the tool.
Before the refactoring, related solutions for extending software were searched
for on the internet. Finally, the structure of the eclipse 13 project was used
as a reference to design the plug-in structure of the RDF2HTML tool.
Figure 4.15 shows the plug-in architecture of the RDF2HTML tool. There are two
extension points: one in the RDF Model Loader and the other in the Query
Executor. The linkage between the RDF Model Loader and the Query Executor is
the Abstracted Data Source. As mentioned before, the Abstracted Data Source is
internally a HashMap with strings as keys and objects as values. In this
plug-in architecture, each plug-in must define a unique plug-in id. This id
acts as the key in the Abstracted Data Source, and it is also the key for
plug-in selection in the RDF Model Loader and the Query Executor.
Figure 4.15. Plug-in Architecture view of RDF2HTML tool
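The role of the Abstracted Data Source as the link between the two extension
points can be sketched in Java as below. The report only states that it is a
HashMap with string keys and object values, keyed by plug-in id; the class name
AbstractedDataSourceSketch and the put and get methods are assumptions made for
illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: plug-in id -> model object produced by that plug-in's loader.
class AbstractedDataSourceSketch {

    private final Map<String, Object> models = new HashMap<>();

    // Called on the model loader side: store the loaded model under the plug-in id.
    void put(String pluginId, Object model) {
        models.put(pluginId, model);
    }

    // Called on the query executor side: fetch the model for the same plug-in id.
    <T> T get(String pluginId, Class<T> expectedType) {
        Object model = models.get(pluginId);
        if (model == null) {
            throw new IllegalStateException("No model loaded for plug-in: " + pluginId);
        }
        return expectedType.cast(model);
    }
}
```

Storing values as plain objects keeps the core independent of any particular
RDF API; only the plug-in that stored a model needs to know its concrete type.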
13
http://www.eclipse.org/
In order to write a plug-in, the plug-in must first define its unique id, for
instance serql, rql or rdql. After deciding on a plug-in id, one interface,
IModelLoader, needs to be implemented and two classes, AbstractQueryEngine and
AbstractQueryResultList, need to be extended. In AbstractQueryEngine, the
abstract method exec(String queryStr, String[] distinct), which returns an
AbstractQueryResultList, needs to be implemented. Other than implementing the
interface and extending the classes, there is one requirement on the package
name: all implemented classes should be in the package
“rdf2html.plugin.”+plug-in id. For instance, “rdf2html.plugin.serql” is the
package name of the serql plug-in.
Currently, the RDF2HTML tool has three plug-ins: serql, rql and rdql.
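The plug-in requirements above can be sketched in Java as follows. Only the
names AbstractQueryEngine and AbstractQueryResultList, the exec signature and
the package prefix come from the text; the stand-in class bodies, the size
method, the PluginConventions helper and the EchoQueryEngine example are
assumptions for illustration. To keep the sketch in one file, the classes are
not placed in a real rdf2html.plugin package.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the tool's result list class; size() is an assumed member.
abstract class AbstractQueryResultList {
    abstract int size();
}

// Stand-in for the tool's query engine class; only exec is taken from the text.
abstract class AbstractQueryEngine {
    abstract AbstractQueryResultList exec(String queryStr, String[] distinct);
}

// Hypothetical helper expressing the package naming rule.
class PluginConventions {
    static final String PACKAGE_PREFIX = "rdf2html.plugin.";

    static String packageNameFor(String pluginId) {
        return PACKAGE_PREFIX + pluginId;
    }
}

// Toy engine that returns the query string itself as its only result row.
class EchoQueryEngine extends AbstractQueryEngine {
    @Override
    AbstractQueryResultList exec(String queryStr, String[] distinct) {
        final List<String> rows = new ArrayList<>();
        rows.add(queryStr);
        return new AbstractQueryResultList() {
            @Override
            int size() {
                return rows.size();
            }
        };
    }
}
```

A fixed package prefix plus the plug-in id gives the core a predictable place
to look for the classes of a plug-in, which fits the dynamic selection by id
described above.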
4.4. Case Application
In order to test and evaluate the usability of the RDF2HTML tool, a simple web
photo album has been generated. A set of 62 travel photos was selected for this
test.
Step 1
First, prepare the RDF documents used in this test case. There are two types of
resources: one is the photo, and the other is the creator of the photo.
Figures 4.16 and 4.17 show examples of RDF documents describing a photo and a
creator respectively.
Figure 4.16 A RDF document describes a photo.
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:admin="http://webns.net/mvcb/">
<foaf:PersonalProfileDocument rdf:about="http://iiivan.no-ip.com/profile/">
<foaf:maker rdf:nodeID="me"/>
<foaf:primaryTopic rdf:nodeID="me"/>
<admin:generatorAgent rdf:resource="http://www.ldodds.com/foaf/foaf-a-matic"/>
<admin:errorReportsTo rdf:resource="mailto:leigh@ldodds.com"/>
</foaf:PersonalProfileDocument>
<foaf:Person rdf:nodeID="me">
<foaf:name>Wai-tung Lam</foaf:name>
<foaf:title>Mr</foaf:title>
<foaf:givenname>Wai-tung</foaf:givenname>
<foaf:family_name>Lam</foaf:family_name>
<foaf:nick>Ivan</foaf:nick>
<foaf:mbox_sha1sum>343f2d774144c519f33423dbac18c85856c61b29</foaf:mbox_sha1sum>
<foaf:homepage rdf:resource="http://iiivan.no-ip.com/profile/"/></foaf:Person>
</rdf:RDF>
Figure 4.17 A RDF document describes a creator
Step 2
After preparing the RDF documents, the next step is to design the layout and
file structure of the generated web album. According to the suggested web site
structure mentioned in section 0, the layout will look like Figure 4.18, and
the file structure will look like Figure 4.19.
Figure 4.18. Layout of case application
Figure 4.19. File structures of case application
Step 3
After the layout and file structure have been designed, the next step is to
define the templates; defining the templates is the same as defining the web
site contents. There are twelve templates in total in this test case, and the
query language used is SeRQL. All the templates can be found in Appendix C.
According to the classification in section 4.1.5, there are two templates that
do not need to query the RDF model, i.e. index.r2h and heading.r2h; five single
page generation templates, i.e. all_index.r2h, coverage_index.r2h,
creator_index.r2h, date_index.r2h and subject_index.r2h; and five generic
multi-query, multi-page generation templates, i.e. groupby_coverage.r2h,
groupby_creator.r2h, groupby_date.r2h, groupby_subject.r2h and
resource_template.r2h.
The index.r2h and heading.r2h templates are responsible for the upper part and
the lower part of the frame set respectively; thus they do not need to query
the RDF model, and only string replacement is needed.
It is suggested to define resource_template.r2h before considering the other
templates that query the RDF model, since the links to the resource HTML pages
depend on the filename definition in resource_template.r2h.
resource_template.r2h generates a resource HTML page for each resource. Thus, a
query that selects all resources and the desired properties is defined (see
Figure 4.20). In this case application, photos that have the same subject and
coverage are linked in the resource HTML page (see Figure 4.21 for the query).
Because links to a resource HTML page depend on its filename, the properties
chosen to form the file name should not contain special characters that are not
allowed in a file name.
<rdf2html page=* filename=resource\resource_+identifier+.html
condition='select *
from {resource} photo:identifier {identifier},
[{resource} photo:description {description}],
[{resource} photo:format {format}],
[{resource} photo:subject {subject}],
[{resource} photo:coverage {coverage}],
[{resource} photo:title {title}],
[{resource} photo:date {date}],
[{resource} photo:creator {creator}],
[{resource} photo:creator {} rdf:type {foaf:PersonalProfileDocument};
foaf:maker {maker} foaf:name {name}]
using namespace
photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>,
foaf = <http://xmlns.com/foaf/0.1/>' />
Figure 4.20. Query selects all resources in resource_template
Figure 4.21. Opening tag for the relation with same subject and coverage
The all_index.r2h template generates an index file that links to all resource
HTML pages. Hence a query to select all resources is constructed (see Figure 4.22).
Figure 4.22. Query to select all resources and properties for link formation
The templates groupby_coverage.r2h to groupby_subject.r2h generate index files
that link to the resource HTML pages having the same value of the selected
property. For instance, groupby_coverage.r2h selects all possible values of the
coverage property. Each value is then inserted into another query that selects
all resources with the same property value, and links to those resource HTML
pages are generated (see Figure 4.23).
Figure 4.23. Opening tag of groupby_coverage.r2h
The templates coverage_index.r2h to subject_index.r2h generate an index file
that links to the index files indexing the resource HTML pages with the same
value of the selected property. For instance, coverage_index.r2h selects all
possible values of the coverage property (see Figure 4.24) and uses each value
to form a link.
Figure 4.24. Opening tag of coverage_index.r2h
Step 4
After defining all the templates, the next step is to set up the system
configuration file. Three options need to be set: rdfSourceDir,
templateSourceDir and outputDir. The option names describe their purpose. After
setting up the system configuration file, run the RDF2HTML tool; the generated
HTML pages can then be found in the output directory. The generated web site
will look like Figure 4.18. This case application shows that a web album can be
generated from a set of RDF documents with the RDF2HTML tool in only four
steps.
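As a hedged illustration of Step 4, the Java sketch below reads the three
options and checks that they are present. The report does not specify the
format of the system configuration file, so the key=value properties format
used here is an assumption, as are the class name SystemConfigSketch and the
method parse; only the three option names come from the text.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: assumes a key=value configuration format.
class SystemConfigSketch {

    static final String[] REQUIRED = {"rdfSourceDir", "templateSourceDir", "outputDir"};

    // Parses configuration text and verifies that every required option is set.
    static Properties parse(String text) {
        Properties config = new Properties();
        try {
            config.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String key : REQUIRED) {
            if (config.getProperty(key) == null) {
                throw new IllegalArgumentException("Missing option: " + key);
            }
        }
        return config;
    }
}
```

For example, a configuration text with the lines rdfSourceDir=rdf,
templateSourceDir=templates and outputDir=out would pass the check, while a
text that omits outputDir would be rejected before any page is generated.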
Chapter 5
Conclusion
5.1. Benefits Obtained
The RDF2HTML tool implemented in this project provides an extensible, flexible
and easy-to-use method to publish Semantic Web content, i.e. RDF resources, as
a set of HTML pages. To use this tool, users do not need programming skills;
they only require knowledge of a query language and of the template tags.
Advanced users with programming skills can even write their own tags. Those who
want to extend the tool to support other query languages can do so easily
thanks to the plug-in structure.
More generally, the RDF2HTML tool is not restricted to the transformation of
RDF to HTML. Most of the transformation options are customizable, from the
input file extension and the default output file extension to data loading and
query execution. For instance, the tool can convert an RDF model to XML files
with the use of the generic operation tag. As another example, with a database
as the data source, the tool can serve as a simple report-generating tool.
5.2. Limitations
After a critical review of the current version of RDF2HTML, two limitations
were discovered. Firstly, some statistical data, such as the number of unlinked
pages, the number of empty link groups and the number of links in each group,
cannot be collected. The reason is that the design of RDF2HTML is very generic
and the need for statistical data was not considered during the design.
Secondly, the relationship linkage between conditional tags does not support
properties with multiple values, because the query languages cannot select
multiple values as a single record. Although this is a limitation for linkage,
displaying the multiple values of a resource property in a resource template is
still possible: another query can select all the values of the property, linked
to the resource by inserting the resource URI into the query.
5.3. Further Work
In order to evaluate the usability of the RDF2HTML tool in practice, more
testing is needed. Although the RDF2HTML tool is easy, extensible and flexible
for its creator to use, more user experiments and the opinions of template
designers should be included in future. Besides, a statistical analyzer can be
implemented to log statistical data.
To sum up, there is no doubt that RDF can describe resources effectively. Once
resources are described, publishing RDF on the internet becomes the problem,
and this problem can be solved easily by the RDF2HTML tool implemented in this
project. The next problem is how to make RDF popular and widely used. This is
mainly affected by the ease of RDF creation and the standardization of RDF
vocabularies (i.e. ontologies) in different domains. As mentioned in chapter 2,
the upper levels of semantic web technology depend on the lower levels, which
include the RDF level and the Ontology level. Hence, wide use of RDF on the web
and standardized RDF vocabularies are the keys to the success of the semantic
web.
Appendix A
Supported Query Language
SeRQL & RQL
SeRQL is provided by the Sesame API. Sesame14 is an open source RDF database
with support for RDF Schema inferencing and querying. Originally, it was developed
by Aduna (then known as Aidministrator) as a research prototype for the EU
research project On-To-Knowledge. Now, it is further developed and maintained by
Aduna in cooperation with NLnet Foundation, developers from OntoText, and a
number of volunteer developers who contribute ideas, bug reports and fixes.
Sesame has been designed with flexibility in mind. It can be deployed on top of a
variety of storage systems (relational databases, in-memory, file systems, keyword
indexers, etc.), and offers a wide range of tools to developers to leverage the power
of RDF and RDF Schema, such as a flexible access API, which supports both local
and remote (through HTTP, SOAP or RMI) access, and several query languages, of
which SeRQL is the most powerful one.
14
http://www.openrdf.org/
SeRQL Query example: http://www.openrdf.org/sesame/serql/serql-examples.html
SeRQL Query Specification: http://www.openrdf.org/doc/users/ch06.html
RQL Tutorial: http://www.openrdf.org/doc/rql-tutorial.html
RDQL
RDQL is a query language for RDF in Jena models. The idea is to provide a
data-oriented query model so that there is a more declarative approach to complement
the fine-grained, procedural Jena API.
RDQL is an implementation of the SquishQL RDF query language, which itself is
derived from rdfDB. This class of query languages regards RDF as triple data,
without schema or ontology information unless explicitly included in the RDF source.
RDF provides a graph with directed edges - the nodes are resources or
literals. RDQL provides a way of specifying a graph pattern that is matched against
the graph to yield a set of matches. It returns a list of bindings - each binding is a set
of name-value pairs for the values of the variables. All variables are bound (there is
no disjunction in the query).
RDQL Programmer’s Introduction:
http://jena.sourceforge.net/tutorial/RDQL/index.html
Appendix B
Interim report
City University of Hong Kong
Department of Computer Science
BSCS Final Year Project 2003-2004
Interim Report
BSCCS(FT)
Semantic web - RDF to HTML tool
(Volume
1
of
1 )
Student Name
: Lam Wai Tung Ivan
Student No.
: 50316414
Programme Code : BSCCS(FT)
Supervisor
: Dr. Andy CHUN, H W
Date
:
For Official Use Only
CS 4512 Project
Semantic Web – RDF to HTML tool
Lam Wai-tung Ivan
BSCCS 50316414
Abstract
The Resource Description Framework (RDF) is used to describe content, such as
HTML pages, graphics, audio files and other documents, for machines to
interpret on the Semantic Web. However, an RDF document is not easy for humans
to interpret; hence a tool that renders RDF content into semantically linked
HTML pages for better human reading is proposed. The rendering method uses HTML
templates with tags to define the page layout, semantic linkage and logical
rules based on the RDF data model. A tool is implemented in Java using this
method to generate a semantically linked site of HTML pages from the RDF data
model. As a case application, a web photo album is generated.
Table of Content
0. Project Title
0. Introduction
1.2. The problem
1.3. Project Objectives
1.4. Scope of project work
0. Background Research
2.1. Semantic Web
2.2. Extensible Markup Language (XML)
2.3. Resource Description Framework
2.4. RDF Schema
2.5. Extensible Stylesheet Language Transformations (XSLT)
2.6. Streaming Transformations for XML (STX)
0. The solution
4.1. Transforming RDF to HTML
0. The design of the system
Reference
2. Project Title
Semantic web - RDF to HTML tool
3. Introduction
Nowadays, the Semantic Web is a hot topic in web technology. It provides a
common framework that allows data to be shared and reused across application,
enterprise, and community boundaries. In order to achieve sharing and reuse of
data, the web needs metadata that describe resources, such as web pages,
documents, photos, and real world objects, in a machine-understandable format.
In other words, it makes a simple extension to the current web by using binary
relationships to capture the meaning between links and data. [9]
The computer industry has agreed on and uses XML standards to give a syntactic
structure for describing data. However, XML can be used in many different ways
to describe the same data, which makes it too open and arbitrary to support the
type of widespread and ad hoc data integration envisaged for the Semantic Web.
Therefore a standard syntax, the Resource Description Framework (RDF), was
introduced. In the Semantic Web, RDF plays an important role since it defines a
graph model that provides a consistent, standardised way of describing and
querying internet resources, from text pages and graphics to audio files and
video clips. It gives semantic interoperability, and provides the base layer
for building a Semantic Web. [10]
3.1. The problem
“RDF is for machines to read, not for humans”
Currently, RDF/XML is the most common way to store RDF data, but XML is not easy
for humans to read. Since RDF is intended for machine interpretation, it does not
have to be human readable. However, the main objective of the web is to present
content in a human-readable way. Therefore, a better visual representation is
essential to the success of the Semantic Web. Hypertext is now the most common way
to present content over the Internet, so HTML is the obvious choice for displaying
RDF data.
In fact, this can be done by a semantic portal (e.g.
http://ubp.learninglab.uni-hannover.de/EducaNext/ubp/home), which provides dynamic
HTML pages. However, a portal imposes some limitations on content providers.
Firstly, it limits the ontologies that can be used, since only content that follows
the ontologies defined by the portal can be published. Secondly, publication is
controlled by the portal owner, so content providers who do not own a portal
application cannot publish content as easily as they can publish static web pages.
Thirdly, dynamic content is harder for search engines to index than static content.
In the near future, more content will be in RDF format. Hence, the ability to
publish Semantic Web content easily in human-readable form is a key to its success.
To achieve this, a tool that transforms an RDF model into static HTML pages is
proposed.
4. Project Objectives
This project aims to achieve the following:
• Become familiar with the Semantic Web concept.
• Learn Extensible Markup Language (XML), Extensible Stylesheet Language
Transformations (XSLT), Streaming Transformations for XML (STX), Resource
Description Framework (RDF), and RDF Schema (RDFS).
• Implement a tool in Java that transforms RDF to HTML.
• Study the usage of RDF and the new challenges it faces.
5. Scope of project work
The scope of work includes the following:
• Study publications about the Semantic Web concept.
• Study the language specifications of XML, XSLT, STX, RDF and RDFS.
• Create RDF documents for testing.
• Implement a tool in Java that transforms RDF to HTML, using Jena (a Java RDF
API).
• Generate a small, semantically linked site of static HTML pages with the tool.
• Identify the usage of RDF and its upcoming challenges.
6. Background Research
6.1. Semantic Web
Figure 2. Architecture of the Semantic Web
(source: http://www.w3.org/2002/Talks/04-sweb/slide12-3.html)
As Figure 2 shows, the Semantic Web has a layered architecture. Each layer is built
on the layer below it and provides better capabilities for representing knowledge.
Up to now, little work has been done on the top three layers, since the ontology
layer has only just become a W3C Recommendation. The Uniform Resource Identifier
(URI) and Unicode layer has been well developed for a long time. The XML and
Namespaces layer is now widely adopted in web development. The RDF and RDF Schema
layer is recommended by W3C.
This project focuses on presenting the RDF layer in human-readable form.
6.2. Extensible Markup Language (XML)
XML is a simple, very flexible markup language derived from SGML. It was
originally designed for large-scale electronic publishing and now plays an important
role in the exchange of a wide variety of data on the Web and elsewhere. [11]
XML can carry such a wide variety of data because its structure is arbitrary. That
is, XML specifies neither semantics nor a tag set; instead, it provides a facility
for defining tags and the structural relationships between them. Consequently, the
semantics of an XML document are defined by the applications that use it. Because of
this arbitrary structure, the upper layers of the Semantic Web architecture use XML
to define their own semantics.
6.3. Extensible Stylesheet Language Transformations (XSLT)
XSLT is the most important part of the XSL Standards and it is a W3C
Recommendation. It is the part of XSL that is used to transform an XML document
into another XML document, or another type of document that is recognized by a
browser, like HTML and XHTML. Normally XSLT does this by transforming each
XML element into an (X)HTML element. [12]
XSLT can also add new elements into the output file, or remove elements. It can
rearrange and sort elements, and test and make decisions about which elements to
display, and a lot more. [12]
A common way to describe the transformation process is to say that XSLT
transforms an XML source tree into an XML result tree. [12]
In the transformation process, XSLT uses XPath to define the parts of the source
document that match one or more predefined templates. When a match is found,
XSLT transforms the matching part of the source document into the result
document. The parts of the source document that do not match a template end up
unmodified in the result document. [12]
In fact, XSLT can be used to transform an RDF/XML document to HTML, and some style
sheets on the web already do so (e.g.
http://rssxpress.ukoln.ac.uk/view.cgi?rss_url=http://journals.iucr.org/a/rss10.xml).
However, that kind of transformation suits a small and simple RDF/XML document
rather than a whole RDF model. For example, it is difficult to apply a rule such as
"display books related to Classification but not Data Mining" when transforming an
RDF/XML document. Hence, this project is proposed. Although XSLT is not suitable
for transforming RDF/XML documents with rules, it is still very useful for
transforming simple RDF data.
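As a rough illustration of the simple case that XSLT handles well, the following
Java sketch applies a small stylesheet to a one-resource RDF/XML document through
the standard javax.xml.transform API. The class name XsltSketch, the sample data and
the stylesheet are assumptions made for this illustration and are not part of the
RDF2HTML tool.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, used only for this illustration.
final class XsltSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Illustrative RDF/XML input: one resource with a single cd:price property.
    static final String RDF_XML =
        "<rdf:RDF xmlns:rdf='" + RDF_NS + "' xmlns:cd='http://www.recshop.fake/cd'>"
      + "<rdf:Description rdf:about='http://www.recshop.fake/cd/Hide your heart'>"
      + "<cd:price>9.90</cd:price>"
      + "</rdf:Description></rdf:RDF>";

    // Illustrative stylesheet: each rdf:Description becomes one <li> element.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:rdf='" + RDF_NS + "' xmlns:cd='http://www.recshop.fake/cd'"
      + " exclude-result-prefixes='rdf cd'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/rdf:RDF'>"
      + "<ul><xsl:apply-templates select='rdf:Description'/></ul>"
      + "</xsl:template>"
      + "<xsl:template match='rdf:Description'>"
      + "<li><xsl:value-of select='@rdf:about'/>: <xsl:value-of select='cd:price'/></li>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML text and returns the serialised result. */
    static String transform(String xml, String xslt) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(RDF_XML, STYLESHEET));
    }
}
```

The stylesheet matches on the XML serialisation rather than on the RDF graph, so
the same data written in a different but equivalent RDF/XML form would need a
different stylesheet. This is one reason why a rule that spans several resources is
hard to express in this way.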
6.4. Streaming Transformations for XML (STX)
STX is very similar to XSLT; it is a one-pass transformation language for XML
documents. It is intended as a high-speed, low memory consumption alternative to
XSLT. Since it does not require the construction of an in-memory tree, it is suitable
for use in resource constrained scenarios. [13]
STX provides a streaming analogue of XSLT by adopting some of the now familiar
concepts from XSLT (e.g., template-based matching and an XPath 1.0-like
expression language, STXPath) but using SAX as the underlying interface to the
XML document. SAX is the event-oriented sibling of the DOM API; it provides
a sequential view of an XML document through a stream of events. [14] This is why
STX is faster and uses less memory than XSLT.
The STX transformation is achieved by associating events with templates. A
template pattern is matched against events and their context. The best matching
template is then instantiated to create a part of the result stream. A template is
always instantiated with respect to the current context, a set of additional
information maintained during the transformation. In constructing the result stream,
events from the source stream can be filtered and arbitrary events can be added.
Events can also be reordered using a working storage. [15]
The main difference between STX and XSLT is the underlying interface to the XML
document: XSLT works on an in-memory tree of the whole document, as the DOM API
does, while STX uses SAX. Hence, STX is faster and consumes less memory than XSLT.
However, this does not mean that STX is better than XSLT. Because of the streaming
character of STX, only the current node and its ancestors are accessible, whereas
XSLT allows random access to all data in the document. One sentence describes the
difference clearly: “XSLT is like a book, while STX is like reading.”
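The streaming restriction can be seen in a plain SAX handler. The sketch below is
not STX itself; it only uses the standard Java SAX API to show that, at each event,
the handler knows the current element and the ancestors that are still open, and
nothing else. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, used only for this illustration.
final class StreamingSketch {

    /** Returns the path of every element, built only from the open ancestors. */
    static List<String> elementPaths(String xml) {
        final List<String> paths = new ArrayList<>();
        final Deque<String> ancestors = new ArrayDeque<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // The only context available is the chain of open elements.
                ancestors.addLast(qName);
                paths.add("/" + String.join("/", ancestors));
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // Once an element is closed, the handler keeps no trace of it.
                ancestors.removeLast();
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return paths;
    }

    public static void main(String[] args) {
        System.out.println(elementPaths("<a><b/><c><d/></c></a>"));
    }
}
```

When the handler reaches the element d, the sibling b has already been discarded. A
tree-based processor could still navigate back to it, which is the random access
that the comparison above refers to.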
6.5. Resource Description Framework (RDF)
“Resources have Properties with Values”
RDF is a framework that provides a consistent, standardised way of describing and
querying internet resources, from text pages and graphics to audio files and video
clips. It expresses relations between objects, gives syntactic interoperability, and
provides the base layer for building a Semantic Web. [10]
To describe the relations between objects consistently, RDF defines a model for
representing resources, properties and property values. The RDF data model is
independent of any representation method; in other words, it is syntax-neutral. RDF
data can therefore be serialised in many formats, such as RDF/XML, N3 and N-Triples.
Consider the following example:
<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:cd="http://www.recshop.fake/cd">
  <rdf:Description
    rdf:about="http://www.recshop.fake/cd/Hide your heart">
    <cd:price>9.90</cd:price>
  </rdf:Description>
</rdf:RDF>
From the RDF/XML document above, the data model can be represented as a triple, a
directed graph and a sentence, as shown below:
Subject (Resource):   http://www.recshop.fake/cd/Hide your heart
Predicate (Property): http://www.recshop.fake/cdprice
Object (Value):       "9.90"
Table 2. Triples of the Data Model
An RDF data model can be represented as a directed graph, see Figure 3. In the
figure, the subject is represented by a node, the property by a directed arc and
the property value by a rectangle. The node and the arc are labelled with Uniform
Resource Identifiers (URIs, see footnote 18):
http://www.recshop.fake/cd/Hide your heart and http://www.recshop.fake/cdprice. RDF
uses URIs to name all objects, giving them a unique and universal means of
identification.
Figure 3. Graph of the data model
The sentence: "The CD Hide your heart costs $9.90"
RDF allows statements with unambiguous semantics to be carried in web documents,
which is not possible using pure XML. Hence, it becomes the base layer for building
the Semantic Web.
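The mapping from the RDF/XML example to its triple can be sketched in Java with the
standard DOM parser. This is only a minimal sketch: it assumes the simple form used
in the example (an rdf:Description with an rdf:about attribute and literal-valued
child elements) and is not a general RDF/XML parser. The tool itself relies on Jena
for parsing; the class names TripleSketch and Triple are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for this illustration.
final class TripleSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** One RDF statement: subject, predicate and literal object. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public String toString() {
            return subject + " | " + predicate + " | \"" + object + "\"";
        }
    }

    /** Extracts triples from rdf:Description elements with literal properties. */
    static List<Triple> triples(String rdfXml) {
        List<Triple> out = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(rdfXml)));
            NodeList descriptions = doc.getElementsByTagNameNS(RDF_NS, "Description");
            for (int i = 0; i < descriptions.getLength(); i++) {
                Element description = (Element) descriptions.item(i);
                String subject = description.getAttributeNS(RDF_NS, "about");
                for (Node n = description.getFirstChild(); n != null; n = n.getNextSibling()) {
                    if (n.getNodeType() == Node.ELEMENT_NODE) {
                        // The predicate URI is the namespace URI followed by the local name.
                        String predicate = n.getNamespaceURI() + n.getLocalName();
                        out.add(new Triple(subject, predicate, n.getTextContent().trim()));
                    }
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return out;
    }
}
```

Applied to the CD example, the predicate is formed by joining the namespace
http://www.recshop.fake/cd with the local name price, which gives the property URI
shown in Table 2.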
6.6. RDF Schema (RDFS)
RDF needs a way to define application-specific classes and properties.
Application-specific classes and properties must be defined using extensions to
RDF. One such extension is RDF Schema. [16]
RDF Schema does not provide actual application-specific classes and properties.
Instead RDF Schema provides the framework to describe application-specific classes
and properties. [16]
Classes in RDF Schema are much like classes in object oriented programming
languages. This allows resources to be defined as instances of classes, and subclasses
of classes. [16]
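The analogy with object-oriented classes can be made concrete with a small Java
sketch. It stores statements that play the role of rdf:type and rdfs:subClassOf and
answers whether a resource is an instance of a class, either directly or through a
superclass. As a simplification, each class is assumed to have at most one
superclass, although RDF Schema allows several. All names in the sketch, including
the class SchemaSketch, are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, used only for this illustration.
final class SchemaSketch {
    // child class -> parent class, in the role of rdfs:subClassOf statements
    private final Map<String, String> subClassOf = new HashMap<>();
    // resource -> class, in the role of rdf:type statements
    private final Map<String, String> typeOf = new HashMap<>();

    void addSubClass(String child, String parent) {
        subClassOf.put(child, parent);
    }

    void addInstance(String resource, String cls) {
        typeOf.put(resource, cls);
    }

    /** True if the resource has the given class or one of its subclasses as type. */
    boolean isInstanceOf(String resource, String cls) {
        String current = typeOf.get(resource);
        int steps = 0;
        // The step counter stops the walk if the subclass statements form a cycle.
        while (current != null && steps++ <= subClassOf.size()) {
            if (current.equals(cls)) {
                return true;
            }
            current = subClassOf.get(current);
        }
        return false;
    }

    public static void main(String[] args) {
        SchemaSketch schema = new SchemaSketch();
        schema.addSubClass("Singer", "Human");
        schema.addInstance("Andy Lau", "Singer");
        System.out.println(schema.isInstanceOf("Andy Lau", "Human"));
    }
}
```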
18 http://www.w3.org/Addressing/
7. The solution
7.1. Transforming RDF to HTML
To facilitate the transformation, the RDF data model is viewed at four conceptual
levels. Firstly, the data level: the actual data or resource in an RDF statement.
For instance, in "The CD Hide your heart costs $9.90", the CD Hide your heart is the
data. Secondly, the metadata level: the property of the resource in an RDF
statement. In the same statement, "costs" is the metadata. Thirdly, the ontology
level: the vocabulary used at the metadata level, which is defined by an RDF
Schema. For instance, the schema indicates that the artist is an instance of the
class “Human”. Fourthly, the logic rule level: semantic relations between the
resources in the data model. For instance, a binary relation between two CDs may be
that they have the same artist and the same company but different costs.
Figure 4. Transforming RDF data model to HTML pages
Figure 4 shows the transformation of RDF to HTML. The RDF data model is on the
left. On the right, the Home page has links to various Index pages, which classify
the underlying Resource pages (RPages) that are related to each other by semantic
links.
The Home page defines the entrance page to the RDF model. It is typically an HTML
page that contains frames to show Index pages and Resource pages.
Resource pages each display one resource, such as an ontological concept (e.g., a
class) or a piece of data with its properties and values. Each resource that is
intended to be shown to the end-user has an RPage of its own.
Index pages classify RPages in a conceptual hierarchy.
7.2. The design of the system
Figure 5. System overview
Currently, the system is designed to parse the RDF document into a Jena RDF
model. The HTML template processor then retrieves the corresponding resources
from the model, as directed by the HTML template, and fills them into the template
to generate HTML pages.
In the HTML templates, tags are planned for expressing basic rules, such as
"select all CD resources that belong to the artist Andy Lau".
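The filling step can be sketched as plain string substitution. The case application
templates in this report write configuration variables as $name$ (for example
$urlprefix$) and query result values as +name+ (for example +identifier+). The Java
sketch below assumes that both are replaced by simple lookup; the class
TemplateFillSketch and its methods are hypothetical and do not reproduce the actual
template processor.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only for this illustration.
final class TemplateFillSketch {
    // $name$ refers to a configuration variable such as urlprefix.
    private static final Pattern CONFIG_VAR = Pattern.compile("\\$(\\w+)\\$");
    // +name+ refers to a value bound by one row of a query result.
    private static final Pattern BINDING = Pattern.compile("\\+(\\w+)\\+");

    /** Replaces configuration variables first, then the query result bindings. */
    static String fill(String template, Map<String, String> config,
                       Map<String, String> row) {
        return substitute(BINDING, substitute(CONFIG_VAR, template, config), row);
    }

    private static String substitute(Pattern pattern, String text,
                                     Map<String, String> values) {
        Matcher m = pattern.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            // Unknown names are left untouched so that mistakes stay visible.
            m.appendReplacement(sb,
                Matcher.quoteReplacement(value != null ? value : m.group()));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("urlprefix", "file:///f:/temp/output");
        Map<String, String> row = new LinkedHashMap<>();
        row.put("identifier", "p01");
        row.put("title", "Peak");
        System.out.println(fill(
            "<a href=\"$urlprefix$/resource/resource_+identifier+.html\">+title+</a>",
            config, row));
    }
}
```

In this sketch one row of a query result yields one filled copy of the template, so
a query that returns several rows would call fill once per row.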
Reference
Cited references in the report:
[9] Berners-Lee, Tim. (2002). The Semantic Web - A Simple Extension to the
Current Web. [Online]. W3C. Available:
http://www.w3.org/2002/Talks/04-sweb/slide6-1.html [2004, Sep]
[10] Introduction to Semantic Web Technologies: Standard Syntax – RDF.
[Online]. HP Labs. Available: http://www.hpl.hp.com/semweb/swtechnology.htm# Standard%20Syntax%20-%20RDF
[11] Extensible Markup Language (XML) – Introduction. [Online]. W3C.
Available: http://www.w3.org/XML/ [2004, Sep]
[12] Introduction to XSLT. [Online]. W3 Schools. Available:
http://www.w3schools.com/xsl/xsl_intro.asp [2004, Oct]
[13] Streaming Transformations for XML (STX). [Online]. Available:
http://stx.sourceforge.net/ [2004, Oct]
[14] Becker, Oliver., Brown, Oliver. and Cimprich, Petr. (2003, 26 February).
An Introduction to Streaming Transformations for XML. [Online].
O’Reilly xml.com. Available:
http://www.xml.com/pub/a/2003/02/26/stx.html [2004, Oct]
[15] Becker, Oliver. et al. (2004, 1 July). STX transformation language
specification. [Online]. Available: http://stx.sourceforge.net/documents/spec-stx20040701.html [2004, Oct]
[16] RDF Schema. [Online]. W3 Schools. Available:
http://www.w3schools.com/rdf/rdf_schema.asp [2004, Sep]
Reference books:
Hjelm, Johan. (2001). Creating the semantic Web with RDF: professional
developer’s guide. USA: John Wiley & Sons, Inc.
Powers, Shelley. (2003). Practical RDF. USA: O’Reilly
Reference web resources:
Joost. [Online]. Joost. Available: http://joost.sourceforge.net/ [2004, Nov]
Beckett, Dave., ed. (2004, 10 February). RDF/XML Syntax Specification
(Revised). [Online]. W3C. Available:
http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/ [2004, Sep]
Appendix C
Case Application Templates
all_index.r2h
<rdf2html page=1 filename=index\all_index.html />
<html>
<p><h3>
All photos
</h3></p>
<table>
<rdf2html condition='select *
from {resource} photo:identifier {identifier},
{resource} photo:title {title}
using namespace
photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
orderBy=title distinct=identifier>
<tr><td>
<rdf2html type=link
linkText=title
href='$urlprefix$/resource/resource_+identifier+.html'
addons='target=resource'/>
</td></tr>
</rdf2html>
</table>
</html>
coverage_index.r2h
<rdf2html page=1 filename=index\coverage_index.html />
<html>
<p><h3>
Coverages index
</h3></p>
<table>
<rdf2html condition='select *
from {resource} photo:coverage {coverage}
using namespace
photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
orderBy=coverage distinct=coverage>
<tr><td>
<rdf2html type=link linkText=coverage
href='$urlprefix$/index/index_+coverage+.html' />
</td></tr>
</rdf2html>
</table>
</html>
creator_index.r2h
<rdf2html page=1 filename=index\creator_index.html />
<html>
<p><h3>
Creators index
</h3></p>
<table>
<rdf2html condition='select *
from {resource} photo:creator {} rdf:type
{foaf:PersonalProfileDocument};
foaf:maker {maker} foaf:name {name}
using namespace
photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>,
foaf = <http://xmlns.com/foaf/0.1/>'
orderBy=creator distinct=creator>
<tr><td>
<rdf2html type=link linkText=name href='$urlprefix$/index/index_+name+.html' />
</td></tr>
</rdf2html>
</table>
</html>
date_index.r2h
<rdf2html page=1 filename=index\date_index.html />
<html>
<p><h3>
Dates index
</h3></p>
<table>
<rdf2html condition='select *
from {resource} photo:date {date}
using namespace
photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
orderBy=date distinct=date>
<tr><td>
<rdf2html type=link linkText=date href='$urlprefix$/index/index_+date+.html' />
</td></tr>
</rdf2html>
</table>
</html>
groupby_coverage.r2h
<rdf2html page=* condition='select distinct coverage
from {resource} photo:coverage {coverage}
using namespace
photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
filename=index\index_+coverage+.html/>
<html>
<p><h3>
Photos taken at <rdf2html type=text property=coverage />
</h3></p>
<table>
<rdf2html type=generic
condition='select *
from {resource} photo:coverage {"+coverage+"},
{resource} photo:identifier {identifier},
{resource} photo:title {title}
using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
orderBy=title ascending=true
template='<tr><td><a href="$urlprefix$/resource/resource_+identifier+.html"
target=resource >+title+</a></td>
<td>+title+</td></tr>'>
</table><br>
<rdf2html type=generic template='<a
href="$urlprefix$/index/coverage_index.html">coverage index</a>' />
</html>
groupby_creator.r2h
<rdf2html page=* condition='select distinct name
from {resource} photo:creator {} rdf:type
{foaf:PersonalProfileDocument};
foaf:maker {maker} foaf:name {name}
using namespace
photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>,
foaf = <http://xmlns.com/foaf/0.1/>'
filename=index\index_+name+.html/>
<html>
<p><h3>
Photos taken by <rdf2html type=text property=name />
</h3></p>
<table>
<rdf2html type=generic
condition='select *
from {resource} photo:creator {} rdf:type
{foaf:PersonalProfileDocument};
foaf:maker {maker} foaf:name {"+name+"},
{resource} photo:identifier {identifier},
{resource} photo:title {title}
using namespace
photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>,
foaf = <http://xmlns.com/foaf/0.1/>'
orderby=title ascending=true
template='<tr><td><a href="$urlprefix$/resource/resource_+identifier+.html"
target=resource >+title+</a></td>
<td>+title+</td></tr>'>
</table>
<br>
<rdf2html type=generic template='<a
href="$urlprefix$/index/creator_index.html">creator index</a>' />
</html>
groupby_date.r2h
<rdf2html page=* condition='select distinct date
from {resource} photo:date {date}
using namespace
photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
filename=index\index_+date+.html/>
<html>
<p><h3>
Photos taken on <rdf2html type=text property=date />
</h3></p>
<table>
<rdf2html type=generic
condition='select *
from {resource} photo:date {"+date+"},
{resource} photo:identifier {identifier},
{resource} photo:title {title}
using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
orderBy=title ascending=true
template='<tr><td><a href="$urlprefix$/resource/resource_+identifier+.html"
target=resource >+title+</a></td>
<td>+title+</td></tr>'>
</table>
<br>
<rdf2html type=generic template='<a href="$urlprefix$/index/date_index.html">date
index</a>' />
</html>
groupby_subject.r2h
<rdf2html page=* condition='select distinct subject
from {resource} photo:subject {subject}
using namespace
photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
filename=index\index_+subject+.html/>
<html>
<p><h3>
<rdf2html type=text property=subject /> photos
</h3></p>
<table>
<rdf2html type=generic
condition='select *
from {resource} photo:subject {"+subject+"},
{resource} photo:identifier {identifier},
{resource} photo:title {title}
using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
orderby=title ascending=true
template='<tr><td><a href="$urlprefix$/resource/resource_+identifier+.html"
target=resource >+title+</a></td>
<td>+title+</td></tr>'>
</table>
<br>
<rdf2html type=generic template='<a
href="$urlprefix$/index/subject_index.html">subject index</a>' />
</html>
heading.r2h
<html>
<table>
<tr><td colspan='3'>
<h2>A photo album generated by rdf2html ...</h2>
</td></tr>
<tr>
<td><rdf2html type=generic template='<a
href="$urlprefix$/index/coverage_index.html" target="groupby" >' /> coverage
link</a></td>
<td><rdf2html type=generic template='<a
href="$urlprefix$/index/creator_index.html" target="groupby" >' /> creator
link</a></td>
<td><rdf2html type=generic template='<a href="$urlprefix$/index/date_index.html"
target="groupby" >' /> date link</a></td>
<td><rdf2html type=generic template='<a
href="$urlprefix$/index/subject_index.html" target="groupby" >' /> subject
link</a></td>
</tr>
</table>
</html>
index.r2h
<html>
<frameset rows="20%,80%">
<rdf2html type=generic template='<frame name=heading
src="$urlprefix$/heading.html">' />
<frameset cols="15%,15%,70%">
<rdf2html type=generic template='<frame name=all
src="$urlprefix$/index/all_index.html">' />
<rdf2html type=generic template='<frame name=groupby
src="$urlprefix$/index/coverage_index.html">' />
<frame name=resource src= >
</frameset>
</frameset>
</html>
resource_template.r2h
<rdf2html page=* filename=resource\resource_+identifier+.html
condition='select *
from {resource} photo:identifier {identifier},
[{resource} photo:description {description}],
[{resource} photo:format {format}],
[{resource} photo:subject {subject}],
[{resource} photo:coverage {coverage}],
[{resource} photo:title {title}],
[{resource} photo:date {date}],
[{resource} photo:creator {creator}],
[{resource} photo:creator {} rdf:type
{foaf:PersonalProfileDocument};
foaf:maker {maker} foaf:name {name}]
using namespace
photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>,
foaf = <http://xmlns.com/foaf/0.1/>' />
<html>
<h1>
<rdf2html type=text property=title />
</h1>
<rdf2html type=if property=subject op=like value=Panorama
then='<img src=+resource+ width=1280 height=480>'
else='<img src=+resource+ width=640 height=480>' />
<table>
<tr>
<td>Creator:</td><td><rdf2html type=generic template='<a
href="+creator+">+name+</a>' /></td>
</tr><tr>
<td>Description:</td><td><rdf2html type=text property=description /></td>
</tr><tr>
<td>Coverage:</td><td><rdf2html type=text property=coverage /></td>
</tr><tr>
<td>Date:</td><td><rdf2html type=text property=date /></td>
</tr><tr>
<td>Subject:</td><td><rdf2html type=text property=subject /></td>
</tr><tr>
<td>Format:</td><td><rdf2html type=text property=format /></td>
</tr><tr>
<td>
<rdf2html type=generic template='<a href="$urlprefix$/rdf/+identifier+.rdf" >
<img src="$urlprefix$/img/rdfdownload.png" border="0"
alt="rdf download"></a>' />
</td>
</tr>
</table>
<table>
<tr><td>
Photos that have the same subject and coverage
</td></tr>
<rdf2html condition='select *
from {resource} photo:identifier {identifier},
{resource} photo:title {title},
{resource} photo:subject {"+subject+"},
{resource} photo:coverage {"+coverage+"}
where not resource like "*+resource+*"
using namespace
photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
orderBy=title >
<tr><td>
<rdf2html type=generic template='<a
href="$urlprefix$/resource/resource_+identifier+.html" >+title+</a>' />
</td></tr>
</rdf2html>
</table>
</html>
subject_index.r2h
<rdf2html page=1 filename=index\subject_index.html />
<html>
<p><h3>
Subjects index
</h3></p>
<table>
<rdf2html condition='select *
from {resource} photo:subject {subject}
using namespace
photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
orderBy=subject distinct=subject>
<tr><td>
<rdf2html type=link linkText=subject href='$urlprefix$/index/index_+subject+.html'
/>
</td></tr>
<rdf2html type=generic template='<tr><td>one of the photo on +subject+ is
+resource+</td></tr>' />
</rdf2html>
</table>
</html>
SystemConfig.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<!--the configuration file for rdf2html-->
<properties>
<!--the file extension of rdf file in regular expression-->
<entry key="rdfExt">.*\.[rR][dD][fF]</entry>
<!--the file extension of template file in regular expression-->
<entry key="templateExt">.*\.[rR]2[hH]</entry>
<!--the default file extension of output file (only for template without page option)-->
<entry key="defaultOutputExt">.html</entry>
<!--the operation tag to match. a / will be added at the front for closing tag-->
<entry key="matchingTag">rdf2html</entry>
<!--the supported query type are rdql, rql and serql currently-->
<entry key="supportQueryType">rdql,rql,serql</entry>
<!--the default query type-->
<entry key="defaultQueryType">serql</entry>
<!--the directory holding the rdf files-->
<entry key="rdfSourceDir">F:\My Documents\FYP\soft\work\rdf2html\resources\rdf\</entry>
<!--the directory holding the template files-->
<entry key="templateSourceDir">F:\My Documents\FYP\soft\work\rdf2html\resources\r2h\</entry>
<!--the output html directory-->
<entry key="outputDir">F:\TEMP\output\</entry>
<!--the option to overwrite existing file. default is true-->
<entry key="overwrite">true</entry>
<!--the following variables are the user defined variables for simple string
replacement use.-->
<!--usage example: -->
<!--<entry key="urlprefix">file:///f:/temp/output</entry>-->
<!--$urlprefix$ will be replace to file:///f:/temp/output in the template-->
<!--the url prefix u can define urself-->
<entry key="urlprefix">file:///f:/temp/output</entry>
</properties>
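Because SystemConfig.xml follows the Java properties DTD, a file of this shape can
be read with java.util.Properties.loadFromXML. The sketch below assumes that this is
how such a file could be loaded and shows how the rdfExt regular expression would
select RDF files; whether the tool does exactly this is not stated in the report.
The class ConfigSketch, its methods and the shortened sample are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper class, used only for this illustration.
final class ConfigSketch {
    // A shortened sample with two of the entries from SystemConfig.xml.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
      + "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">"
      + "<properties>"
      + "<entry key=\"rdfExt\">.*\\.[rR][dD][fF]</entry>"
      + "<entry key=\"urlprefix\">file:///f:/temp/output</entry>"
      + "</properties>";

    /** Parses configuration text in the Java XML properties format. */
    static Properties load(String xml) {
        Properties config = new Properties();
        try {
            config.loadFromXML(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new IllegalStateException("invalid configuration", e);
        }
        return config;
    }

    /** True if the whole file name matches the rdfExt regular expression. */
    static boolean isRdfFile(Properties config, String fileName) {
        // The fallback pattern is an assumption for this sketch only.
        return fileName.matches(config.getProperty("rdfExt", ".*\\.rdf"));
    }

    public static void main(String[] args) {
        Properties config = load(SAMPLE);
        System.out.println(isRdfFile(config, "photo1.RDF"));
        System.out.println(config.getProperty("urlprefix"));
    }
}
```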
Bibliography
Cited references in the report:
[9] Berners-Lee, Tim. (2002). The Semantic Web - A Simple Extension to the
Current Web. [Online]. W3C. Available:
http://www.w3.org/2002/Talks/04-sweb/slide6-1.html [2004, Sep]
[10] Introduction to Semantic Web Technologies: Standard Syntax – RDF.
[Online]. HP Labs. Available: http://www.hpl.hp.com/semweb/swtechnology.htm# Standard%20Syntax%20-%20RDF
[11] Extensible Markup Language (XML) – Introduction. [Online]. W3C.
Available: http://www.w3.org/XML/ [2004, Sep]
[12] Introduction to XSLT. [Online]. W3 Schools. Available:
http://www.w3schools.com/xsl/xsl_intro.asp [2004, Oct]
[13] Streaming Transformations for XML (STX). [Online]. Available:
http://stx.sourceforge.net/ [2004, Oct]
[14] Becker, Oliver., Brown, Oliver. and Cimprich, Petr. (2003, 26 February).
An Introduction to Streaming Transformations for XML. [Online].
O’Reilly xml.com. Available:
http://www.xml.com/pub/a/2003/02/26/stx.html [2004, Oct]
[15] Becker, Oliver. et al. (2004, 1 July). STX transformation language
specification. [Online]. Available: http://stx.sourceforge.net/documents/spec-stx20040701.html [2004, Oct]
[16] RDF Schema. [Online]. W3 Schools. Available:
http://www.w3schools.com/rdf/rdf_schema.asp [2004, Sep]
[17] BrownSauce RDF Browser. [Online]. BrownSauce RDF Browser.
Available: http://brownsauce.sourceforge.net [2004, Oct]
[18] Spectacle. [Online]. Aduna. Available: http://aduna.biz/products/spectacle/
[2005, March]
[19] Cawsey, Alison. (2002, 5 July). Presenting tailored resource descriptions:
Will XSLT do the job?. [Online]. Department of Computing and Electrical
Engineering, Heriot-Watt University. Available:
http://www.cee.hw.ac.uk/~alison/www9/paper.html [2004, Sep]
[20] Valo, Arttu., Hyvönen, Eero., Viljanen, Kim. and Holi, Markus. (2004, 6
October). Publishing Semantic Web Content as Semantically Linked
HTML Pages. [Online]. Department of Computer Science, University of
Helsinki. Available:
http://www.cs.helsinki.fi/u/eahyvone/publications/xmlfinland2003/swehg_article_xmlfi2003.pdf
[2004, Oct. 6].
Reference books:
Hjelm, Johan. (2001). Creating the semantic Web with RDF: professional
developer’s guide. USA: John Wiley & Sons, Inc.
Powers, Shelley. (2003). Practical RDF. USA: O’Reilly
Reference web resources:
Joost. [Online]. Joost. Available: http://joost.sourceforge.net/ [2004, Nov]
Beckett, Dave., ed. (2004, 10 February). RDF/XML Syntax Specification
(Revised). [Online]. W3C. Available:
http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/ [2004, Sep]
Brickley, Dan. and Guha, R. V., eds. (2004, 10 February). RDF Vocabulary
Description Language 1.0: RDF Schema. [Online]. W3C. Available:
http://www.w3.org/TR/rdf-schema/ [2004, Sep]
Berners-Lee, Tim. (1998, 17 September). What the Semantic Web can
represent. [Online]. W3C. Available:
http://www.w3.org/DesignIssues/RDFnot.html [2004, Sep]
Bolour, Azad. (2003, 3 July). Notes on the Eclipse Plug-in Architecture.
[Online]. Eclipse.org. Available: http://www.eclipse.org/articles/Article-Plugin-architecture/plugin_architecture.html [2005, Jan]