City University of Hong Kong
Department of Computer Science

BSCCS/BSCS Final Year Project Report 2004-2005

(04CS019)
Semantic web - RDF to HTML tool
(Volume 1 of 1)

Student Name: Lam Wai Tung
Student No.: 50316414
Programme Code: BSCCS(FT)
Supervisor: Dr. CHUN, H W Andy
1st Reader: Dr. Liu, W Y
2nd Reader: Dr. Ip, Horace

Semantic Web – RDF to HTML tool

Lam Wai Tung Ivan
Department of Computer Science
City University of Hong Kong
Supervisor: Dr Chun H W Andy
April 7, 2005

Abstract

The Resource Description Framework (RDF) describes content such as HTML pages, graphics, audio files and other documents, and helps machines to interpret the Semantic Web. However, an RDF document is not a good form for presentation, because it is not easy for people to read. Hence, a tool that renders RDF content as semantically linked HTML pages for human readers is proposed. The rendering uses HTML templates whose tags define the page layout, the semantic linkage and the logical rules, all based on the RDF data model. The tool, named RDF2HTML, is designed and implemented in Java, and uses a specially designed template structure to generate semantically linked HTML pages from an RDF data model. To make the tool extensible and flexible, a plug-in structure is introduced. As a case application, the RDF2HTML tool provides an effective means of generating a web photo album.

Acknowledgments

First of all, I would like to give special thanks to the Department of Computer Science. Without its support, I would not have had such a precious opportunity to experience a real research project. I would also like to extend very special thanks to my supervisor, Dr. CHUN, Hon Wai Andy, for his valuable advice and his continuous support throughout the process. Lastly, special thanks go to the two readers, Dr. Liu, Wen Yin and Dr. Ip, H S Horace, for their help and assistance.
Contents

1. Introduction
   1.1 Background
   1.2 Motivation – The Problem of RDF
   1.3 Project Objectives
   1.4 Scope of project
2. Background Research
   2.1 Semantic Web
   2.2 Extensible Markup Language
   2.3 Resource Description Framework
   2.4 RDF Schema
   2.5 Extensible Stylesheet Language Transformations (XSLT)
   2.6 Streaming Transformations for XML (STX)
3. Related Works
   3.1 BrownSauce
       3.1.1 BrownSauce limitations
   3.2 Spectacle
       3.2.1 Spectacle limitations
   3.3 Mirador
       3.3.1 Mirador demonstration limitations
   3.4 SWeHG
       3.4.1 SWeHG limitations
4. The Solution – RDF2HTML Tool
   4.1 Design the transformation of RDF to HTML
       4.1.1 Design issue – The Conceptual views
       4.1.2 Design issue – Handling RDF document
       4.1.3 Design issue – Defining the page context
       4.1.4 Design issue – System flow
       4.1.5 Design issue – Template Structure
   4.2 Template Language
   4.3 The design of the system
       4.3.1 Component design
             4.3.1.1 RDF Model Loader
                     4.3.1.1.1 Jena RDF Model
             4.3.1.2 HTML Template Parser
                     4.3.1.2.1 HTMLParser
             4.3.1.3 Query Executor
             4.3.1.4 Template Processor
             4.3.1.5 Operation Tag Group
       4.3.2 System Architecture
   4.4 Case Application
5. Conclusion
   5.1 Benefits Obtained
   5.2 Limitations
   5.3 Further Work
Appendix A Supported Query Language
Appendix B Interim Report
Appendix C Case Application Templates
Bibliography

Chapter 1
Introduction

1.1. Background

Nowadays, the Semantic Web is a hot topic in web technology. It provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. To share and reuse data, metadata is an essential ancillary for describing resources, such as web pages, documents, photos, and real-world objects, in a machine-understandable format. In other words, the Semantic Web is a simple extension of the current web that uses binary relationships to capture the meaning of links and data. [9]

Most of the computer industry has agreed on and uses XML standards to give data a syntactic structure. However, XML can describe the same data in many different ways; as a result, it is too open and arbitrary to support the type of widespread and ad hoc data integration envisaged for the Semantic Web. To tackle this issue, a standard, the Resource Description Framework (RDF), was introduced. In the Semantic Web, RDF plays an important role, since it defines a graph model that provides a consistent and standardized way of describing and querying internet resources, from text pages and graphics to audio files and video clips. It not only offers semantic interoperability, but also provides the base layer for building a Semantic Web. [10]

1.2. Motivation - The Problem of RDF

Currently, RDF/XML is the most common way to store RDF data, but XML is not a good representation for humans to read.
RDF does not need to be human readable, since it is intended for machine interpretation. However, the main objective of a web is to present semantic content in a human-readable way. Therefore, a better visual representation is essential to the success of the Semantic Web.

Hypertext is the most common way to present content over the Internet nowadays, so HTML is the best choice for displaying RDF data. In fact, this can be done by a semantic portal (e.g. http://ubp.learninglab.uni-hannover.de/EducaNext/ubp/home), which provides dynamic HTML pages. However, a portal places some limitations on the content provider. Firstly, it limits the use of ontologies, since only content of the ontology types defined by the portal can be published. Secondly, publication is controlled by the portal owner, so content providers who do not own a portal application cannot publish particular content as easily as they could publish static web pages. Thirdly, dynamic content is harder to find through search engines than static content.

More content will be in RDF format in the future. Hence, publishing the Semantic Web easily in human-readable form is the key to success. To this end, a tool, RDF2HTML, which transforms an RDF model into static HTML pages, is proposed.

1.3. Project Objectives

This project is meant to achieve the following:
- To get familiar with Semantic Web concepts.
- To learn Extensible Markup Language (XML), Extensible Stylesheet Language Transformations (XSLT), Streaming Transformations for XML (STX), Resource Description Framework (RDF), and RDF Schema (RDFS).
- To implement a tool, RDF2HTML, to transform RDF to HTML in Java.
- To provide an easy-to-use template structure.

1.4. Scope of project work

The scope of work includes the following:
- Study the language specifications of XML, XSLT, STX, RDF and RDFS.
- Study and evaluate the existing solutions for the transformation of RDF to HTML.
- Create RDF documents for testing.
- Design the template structure.
- Design and implement a tool to transform RDF to HTML in Java with Jena (a Java RDF API).
- Evaluate a small, semantically linked site of static HTML pages generated by the RDF2HTML tool.

Chapter 2
Background Research

2.1. Semantic Web

Figure 2.2.1.1. Architecture of Semantic Web (source: http://www.w3.org/2002/Talks/04-sweb/slide12-3.html)

As Figure 2.2.1.1 reveals, the Semantic Web is based on a layered architecture. Each layer is built on top of the layer below it and provides better capabilities for representing knowledge. Until now, little work has been done on the top three layers, since the ontology layer has only just become a W3C recommendation. The Universal Resource Identifier (URI) and Unicode layer has been well developed for a long time. Moreover, the XML and Namespaces layer is widely adopted in web development at present. Last but not least, both the RDF and RDF Schema layers are recommended by W3C. In this project, the focus is on representing the RDF layer in human-readable form.

2.2. Extensible Markup Language (XML)

XML is a simple and flexible markup language derived from SGML. It was originally designed for large-scale electronic publishing and is now very important in the exchange of a wide variety of data on the Web and elsewhere. [11]

XML can support the exchange of such a wide variety of data because of its arbitrary structure. That is, XML specifies neither semantics nor a tag set; instead, it provides a facility to define tags and the structural relationships between them. Consequently, the semantics of an XML document can be defined by the applications. Having such an arbitrary structure, XML is used by the upper layers of the Semantic Web architecture to define their own semantics.

2.3. Resource Description Framework (RDF)

RDF is a framework that provides a consistent and standardized way to describe and query internet resources, from text pages and graphics to audio files and video clips. It expresses the relations between objects, gives syntactic interoperability, and provides the base layer for building up a Semantic Web. [10]

To describe the relations between objects consistently, RDF defines a model for representing resources, properties and property values. The RDF data model is independent of the representation method; in other words, it is syntax-neutral. Thus, RDF data can be represented in many formats, such as RDF/XML, N3 and N-Triples. Consider the following example:

    <?xml version="1.0"?>
    <rdf:RDF
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:cd="http://www.recshop.fake/cd">
      <rdf:Description
        rdf:about="http://www.recshop.fake/cd/Hide your heart">
        <cd:price>9.90</cd:price>
      </rdf:Description>
    </rdf:RDF>

From the RDF/XML document above, the data model can be represented as triples, as a directed graph and as a sentence, as shown below:

    Subject (Resource)                          | Predicate (Property)            | Object (Value)
    http://www.recshop.fake/cd/Hide your heart  | http://www.recshop.fake/cdprice | "9.90"

Table 2.1. Triples of the Data Model

The RDF data model can be represented by a directed graph (see Figure 2.2.3.1). In the figure, the subject is represented by a node, the property by a directed arc and the property value by a rectangle. The node and the arc are labelled with Universal Resource Identifiers (URIs, see http://www.w3.org/Addressing/): http://www.recshop.fake/cd/Hide your heart and http://www.recshop.fake/cdprice. In RDF, URIs are used to identify all objects, giving them a unique and universal means of identification.

Figure 2.2.3.1. Graph of the data model

The sentence: "The CD Hide your heart costs $9.90"

RDF gives the statements carried in web documents unambiguous semantics, which can hardly be achieved with pure XML. Hence, it becomes the base layer for building a Semantic Web.

2.4. RDF Schema (RDFS)

RDF needs a way to define application-specific classes and properties, which must be defined using extensions to RDF. One such extension is RDFS. RDFS does not provide actual application-specific classes and properties; instead, it provides the framework to describe them. Classes in RDF Schema are similar to classes in object-oriented programming languages. This allows resources to be defined as instances of classes, and as subclasses of classes. [16]

2.5. Extensible Stylesheet Language Transformations (XSLT)

XSLT is the most essential part of the XSL standards and is a W3C Recommendation. It is the part of XSL used to transform an XML document into another XML document, or into another type of document that a browser recognizes, such as HTML or XHTML. Normally XSLT does this by transforming each XML element into an (X)HTML element.

XSLT can also add new elements to the output file or remove elements. It can rearrange and sort elements, test and decide which elements to display, and a lot more. A common way to describe the transformation process is to say that XSLT transforms an XML source tree into an XML result tree.

During the transformation, XSLT uses XPath to define the parts of the source document that match one or more predefined templates. On the one hand, when a match is found, XSLT transforms the matching part of the source document into the result document.
On the other hand, parts of the source document that do not match any template end up unmodified in the result document. [12]

In fact, XSLT can be used to transform an RDF/XML document to HTML, and some style sheets on the web do so (e.g. http://rssxpress.ukoln.ac.uk/view.cgi?rss_url=http://journals.iucr.org/a/rss10.xml). However, that kind of transformation is better suited to a small and simple RDF/XML document than to a whole RDF model. For example, it is difficult to transform an RDF/XML document under a rule such as "display the books that are related to Classification but not to Data Mining". Hence, this project is proposed. Although XSLT is less suitable for transforming an RDF model with rules, it is still very useful for transforming simple RDF data.

2.6. Streaming Transformations for XML (STX)

STX is similar to XSLT; it is a one-pass transformation language for XML documents. It is intended as a high-speed, low-memory alternative to XSLT. Since it does not require the construction of an in-memory tree, it is suitable for resource-constrained scenarios. [13]

STX provides a streaming analogue of XSLT by adopting some familiar concepts from XSLT (e.g. template-based matching and an XPath 1.0-like expression language, STXPath) but using SAX as the underlying interface to the XML document. SAX is the event-oriented sibling of the DOM API; it provides a sequential view of an XML document through a stream of events. [14] That is why STX is faster and consumes less memory than XSLT.

The STX transformation is achieved by associating various events with templates. A template pattern is matched against events and their context. The best matching template is then instantiated to create a part of the result stream.
A template is always instantiated with respect to the current context, a set of additional information maintained during the transformation. In constructing the result stream, events from the source stream can be filtered and arbitrary events can be added. Events can also be reordered using a working storage. [15]

The main difference between STX and XSLT is the underlying interface to the XML document: XSLT works on an in-memory tree of the document, as the DOM API does, while STX uses SAX. As a result, STX is faster and uses less memory than XSLT. However, this does not mean that STX is better than XSLT. Because of the streaming character of STX, only the current node and its ancestors are accessible, whereas XSLT allows random access to all data in the document. One sentence describes the difference: "XSLT like a book while STX like reading".

Chapter 3
Related Works

3.1. BrownSauce

BrownSauce (http://brownsauce.sourceforge.net) is a generic RDF browser. It was written by Damian Steer whilst employed at HP Labs Bristol. It is free software, released under a BSD-style licence.

BrownSauce breaks the problem into two parts: coarse-graining (breaking the data down into usable chunks, like "information about person X") and aggregation (making those chunks from multiple sources). The first part is done, and users can browse more than one source by using rdfs:seeAlso references. Aggregation is currently being worked on. [17]

Figure 0.1.1. A screenshot of BrownSauce

3.1.1. BrownSauce limitations

Even though BrownSauce is a nice generic RDF browser, it does not allow the user to customize the HTML presentation page, except for styling it with CSS. In addition, it provides no function for the user to control which data is shown.

Although BrownSauce is not the perfect solution to the representation problem of RDF, it is a nice, easy-to-use RDF browser for general use.

3.2. Spectacle

Aduna Spectacle (http://aduna.biz/products/spectacle/) provides a powerful way of finding information on Aduna Metadata servers (http://aduna.biz/products/metadataserver/). It combines the effectiveness and flexibility of full-text search with the ease of use of faceted navigation: the ability to find information based on properties such as its location, file type, modification dates, author, etc. Spectacle uses Guided Exploration technology to guide the user through large information environments by continuously offering contextual hints for further exploration and by preventing "dead ends": all links you see in a Spectacle navigation structure are guaranteed to lead to information, so the user is never let down by "zero hits". [18]

Figure 3.2. A screenshot of Spectacle

Figure 3.3. The architecture of Spectacle

3.2.1. Spectacle limitations

Spectacle does a good job of the RDF to HTML transformation and provides a good representation of the metadata. Nevertheless, as the architecture shows (see Figure 3.3), a Metadata Server is involved, which means that the user needs to operate a metadata server in order to present metadata. In addition, the RDF to HTML transformation in Spectacle is based on APIs, so the user needs to write programs that use the API. For these reasons, Spectacle is targeted at enterprise users rather than ordinary users.

3.3. Mirador

Mirador (http://www.cee.hw.ac.uk/~mirador/) stands for Multimedia Information Retrieval Aided by Descriptions of Online Resources. The project aims to investigate how the user can be better supported in searching for resources on the WWW by exploiting the metadata associated with a resource. The initial Mirador demonstration (http://www.cee.hw.ac.uk/~mirador/demos.html) includes an attempt at using XSLT with RDF. It shows that transforming RDF/XML to HTML by using XSLT is possible.

The following example is taken from the initial Mirador demonstration: Figure 3.4 shows the RDF example, Figure 3.5 is part of the style sheet used to transform RDF/XML to HTML, and Figure 3.6 shows the transformation result. [19]

Figure 3.4. Simple RDF example

Figure 3.5. Partial style sheet for displaying RDF

    Description of: http://www.dlib.org
    Title       | D-Lib Program - Research in Digital Libraries
    Description | The D-Lib program supports the community of people with research interests in digital libraries and electronic publishing.
    Publisher   | Corporation For National Research Initiatives
    Date        | 1995-01-07
    Subject     | Research; statistical methods
                | Education, research, related topics
    Type        | World Wide Web Home Page
    Format      | text/html
    Language    | en

Figure 3.6. Example metadata table result transform using XSLT

3.3.1. Mirador demonstration limitations

This example reveals some limitations of using XSLT to transform RDF to HTML directly. Firstly, an XSLT transformation of RDF/XML produces only one HTML page at a time. Therefore, if a page is needed for each resource in the RDF model, the only way is to divide the RDF model into a set of RDF/XML documents, each containing a single resource. However, dividing the RDF model in this way introduces another limitation: the relationships between RDF resources are lost. For instance, the fact that two resources have the same creator cannot be shown with this approach. Secondly, transforming RDF to HTML directly with XSLT works only on RDF/XML; however, as mentioned in section 2.3, RDF can be represented in many formats, such as RDF/XML, N3 and N-Triples.
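The first limitation can be illustrated with a minimal Java sketch that uses the JDK's standard javax.xml.transform API. The RDF/XML snippet and the style sheet are made up for illustration (loosely modelled on the D-Lib description in Figure 3.6, not copied from the Mirador demonstration), and the class name RdfXmlToHtmlSketch is hypothetical. Each call to transform() takes one source document and writes one result, so producing a page per resource in this way would mean one run per single-resource RDF/XML document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration: one RDF/XML document in, one HTML page out.
final class RdfXmlToHtmlSketch {

    // Made-up RDF/XML input describing a single resource.
    static final String RDF_XML =
        "<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'"
      + " xmlns:dc='http://purl.org/dc/elements/1.1/'>"
      + "<rdf:Description rdf:about='http://www.dlib.org'>"
      + "<dc:title>D-Lib Program - Research in Digital Libraries</dc:title>"
      + "</rdf:Description></rdf:RDF>";

    // Made-up XSLT 1.0 style sheet; it only understands this one
    // serialization (rdf:Description with property child elements).
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'"
      + " xmlns:dc='http://purl.org/dc/elements/1.1/'"
      + " exclude-result-prefixes='rdf dc'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/'>"
      + "<html><body>"
      + "<xsl:apply-templates select='rdf:RDF/rdf:Description'/>"
      + "</body></html>"
      + "</xsl:template>"
      + "<xsl:template match='rdf:Description'>"
      + "<h3>Description of: <xsl:value-of select='@rdf:about'/></h3>"
      + "<p>Title: <xsl:value-of select='dc:title'/></p>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the style sheet to the XML text and returns the single result document.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(RDF_XML, XSLT));
    }
}
```

The style sheet also hints at the second limitation: its match patterns are written against the RDF/XML element structure, so the same data serialized as N3 or N-Triples could not be fed to it at all.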
Apart from these limitations, usability is also a problem: Figure 3.5 shows how complex an XSLT style sheet can be. To conclude, although XSLT is powerful, its complexity is also high.

3.4. SWeHG

SWeHG (http://www.cs.helsinki.fi/group/seco/swehg/) stands for Semantic Web HTML Generator. Its goal is the same as that of this project: to provide a "poor man's" publication tool for the Semantic Web. Hence, it aims to generate a semantically linked and conceptually indexed site of static HTML pages from an RDF(S) repository. Figure 3.7 shows the internal architecture of SWeHG.

As explained in the SWeHG paper, it separates the transformation of RDF to HTML into two levels. One is the HTML level: a layout template, which can be designed by a layout designer, specifies the layout of the rendered HTML pages. Programming skills are not needed at this level. The other is the RDF level: the semantic linkage between the pages is determined using logical predicates. These predicates define the semantics of the tags used at the HTML level, and an application programmer provides their definitions. [20]

Figure 3.7. The Internal architecture of SWeHG

3.4.1. SWeHG limitations

SWeHG does a good job of transforming RDF to HTML. It provides a good conceptual view and structure for the transformation. Since the proposed web site structure is simple and meets the key role of classification and indexing in view-based searching, this web site structure is used in the RDF2HTML tool.

However, the complex architecture of SWeHG also makes it complex to use. For instance, the user needs to know XSLT, or needs a layout designer, in order to design the layout of the generated HTML. Although XSLT is a powerful tool for transforming XML documents, it is also a complex language to use.
In addition, an application programmer has to provide the definitions of the logical predicates. This means that the user needs to program whenever a new logical predicate is needed. SWeHG can certainly achieve the transformation of RDF to HTML; however, the user needs knowledge of logic and Prolog, and programming skills, to define the semantic tags and relations, as well as XSLT knowledge for the layout design of the HTML pages. These knowledge and skill requirements reduce the usability of the tool.

Chapter 4
The Solution – RDF2HTML Tool

4.1. Design the transformation of RDF to HTML

The RDF2HTML tool has several objectives. Firstly, to define a web site structure for the transformation of RDF documents to HTML pages. Secondly, to make the tool independent of the RDF format. Thirdly, to provide an easy way to define the page context. Lastly, to make the template highly flexible and easy to learn and use. The following design issues address these objectives.

4.1.1. Design issue – The Conceptual views

The conceptual views suggested by SWeHG for the transformation of RDF to HTML are adopted [20]. To facilitate the transformation, the RDF data model is considered at several conceptual levels. Firstly, the data level: the actual data or resource in an RDF statement. For instance, in "The CD Hide your heart costs $9.90", the CD Hide your heart is the data. Secondly, the metadata level: the property of the resource in an RDF statement. In the same statement, costs is the metadata. Thirdly, the ontology level: the vocabulary used at the metadata level, defined by an RDF Schema. For instance, the schema indicates that the artist is an instance of the class "Human". Fourthly, the logic rule level: semantic relations between the resources in the data model.
For instance, a binary relation between two CDs may be that they have the same artist and the same company but different costs.

Figure 4.1. Transforming RDF data model to HTML pages

Figure 4.1 shows the transformation of RDF to HTML. The RDF data model is on the left. On the right, the Home page has links to various Index pages, which classify the underlying Resource pages (RPages) that are related to each other by semantic links.

The Home page is the entrance page to the RDF model. It is typically an HTML page that contains frames showing the Index pages and Resource pages.

A Resource page displays one resource, such as an ontological concept (e.g. a class) or a piece of data, with its properties and values. Each resource that is intended to be shown to the end user has an RPage of its own.

Index pages classify the RPages in a conceptual, hierarchical way.

To keep the tool flexible, the above structure is only a suggestion; users can have their own structural design.

4.1.2. Design issue – Handling RDF document

As mentioned in section 2.3, RDF defines a model for representing resources, properties and property values, and this model is independent of the representation method; in other words, it is syntax-neutral. Hence, RDF data can be represented in RDF/XML, N3, N-Triples and so on.

Since the RDF model is syntax-neutral, the RDF2HTML tool should be able to handle RDF documents in different formats. Therefore, an abstraction of the RDF model is needed. The Jena RDF API Model is used for this abstraction in the initial version of the RDF2HTML tool. However, this is not the best solution, since the existing program has to be changed in order to support a new RDF format. Therefore, the ultimate solution, the plug-in structure, is introduced. Details of the plug-in structure can be found in section 4.1.1.

As for the ontology level, i.e. the RDF Schema, the Jena RDF API also supports loading an RDF Schema into the RDF Model.

4.1.3. Design issue – Defining the page context

In the RDF2HTML tool, the page context is defined by a query. A query language is used because it is robust, easy to learn and easy to reuse. The current version supports three query languages: RDQL, supported by the Jena RDF API, and RQL and SeRQL, both supported by the Sesame RDF API.

4.1.4. Design issue – System flow

The approach used to transform RDF documents to HTML pages in the RDF2HTML tool is simple. Firstly, all the RDF documents are loaded into the system as an RDF model. Secondly, queries are included in the HTML template to define the context of the HTML pages. Then the template processor takes the RDF model and the HTML template and generates the corresponding HTML pages. This is a brief description of the RDF2HTML system flow.

Clearly, the HTML template is the heart of the RDF2HTML tool, as it defines the page context and layout. However, writing a template does not require programming skills. The skills needed are writing queries and manipulating the tags; details are covered in section 4.2.

4.1.5. Design issue – Template Structure

The need of a template

Before the template was designed, the need for a template was considered. Generally, a template separates the data from the display layout; different data are then filled into the template to produce data-specific documents. Specifically, one need for a template in the RDF2HTML tool is to specify the HTML layout for generating a set of HTML pages from an RDF model. The other need is to define and display the semantic relations between RDF resources in the template.
Having considered the need for a template, the features of the data, which is the RDF model in this project, are considered next.

RDF features

As mentioned in section 2.3, RDF is a framework that provides a consistent, standardized way of describing and querying internet resources. RDF is metadata describing resources; hence, most property values can be treated as simple strings or as URIs of other resources. Given this feature, the template is designed to use a simple string replacement method, replacing each variable in the template with the corresponding property value in the RDF model.

Template types

During the design of the template, flexibility and ease of learning and use were the main concerns. In the initial design (see Figure 4.2), templates are classified into four types. Firstly, pages that do not need to query the RDF model but still need some operations provided by the RDF2HTML tool, e.g. string replacement. Secondly, the single page generation template, which queries the RDF model but generates only one HTML page. Thirdly, the single query multi-page generation template; one example of this type is the resource page template, which generates HTML pages such that each page describes the corresponding resource. The query used selects all resources in the RDF model that are to be presented. Lastly, the group by property multi-page generation template, which generates Index pages that link to the resources fulfilling the group-by condition, for instance the resources that have the same creator.

Figure 4.2. Template type classification (initial version)

After the initial version had been implemented, the template structure and functionality were reviewed. As a result, the need for a new template type was discovered (see Figure 4.3).
The new template type, the generic multi-query, multi-page generation template, is a generic template type that can replace the old single query multi-page generation template and the group by property multi-page generation template.

Figure 4.3. Template type classification (reviewed version)
[Box labels in the figure: HTML template; Page NO need query RDF model; Page need query RDF model; Single page generation template; Generic multi-query, multi-page generation template]

4.2. Template Language

The HTML template defines the page properties, the layout and the page context. All these parameters drive the generation of the HTML pages. The following is an example of an HTML template that generates an index page with links to all resource pages.

    1:<rdf2html page=1 filename=index\all_index.html />
    2:<html>
    3:<p><h3>
    4:All photos
    5:</h3></p>
    6:<table>
    7:<rdf2html condition='select *
    8: from {resource} photo:identifier {identifier},
    9: {resource} photo:title {title}
    10: using namespace
    11: photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
    12: orderBy=title distinct=identifier>
    13:<tr><td>
    14:<rdf2html type=link
    15: linkText=title
    16: href='$urlprefix$/resource/resource_+identifier+.html'
    17: addons='target=resource'/>
    18:</td></tr>
    19:</rdf2html>
    20:</table>
    21:</html>

Figure 4.4. HTML template example to generate index page to all resource page

Page Directive

Figure 4.4 is an example of a single page generation template. Line 1 is the page directive of this HTML template; this tag defines the template properties. Several attributes can be set. Firstly, the page attribute indicates how many pages the template generates. Its options are "1" and "*": "1" indicates that the template generates only one page, while "*" means that it generates multiple pages. Secondly, the filename attribute defines the file name of the generated HTML page(s).
The filename attribute must be a valid filename on the operating system and is relative to the output directory set in the system configuration. If the page option is “1”, the filename attribute gives the literal file name of the generated HTML page. For example, “index\all_index.html” means the generated HTML page is placed in the “<output directory>\index\” folder with the file name all_index.html.

If the page option is “*” (see Figure 4.5), the filename attribute becomes a pattern containing a variable. Rather than using an automatic sequence number as the file name, it is suggested that a unique property of the resource be used to form the file name. In Figure 4.5 the value of filename is “resource\resource_+identifier+.html”; the variable identifier, enclosed by “+” signs, is defined by the query in the condition attribute. The condition attribute follows the description in the next section, Conditional Tag. In short, it is the query that selects the resources for the template; when it is defined in the page directive, each record in the result set corresponds to a new HTML page. For a template that does not need to query the RDF model, the page directive can be omitted.

Figure 4.5. Page directive example of multi-page template

Conditional Tag

In Figure 4.4, lines 7 to 19 form a conditional tag: lines 7 to 12 are the opening tag and line 19 is the closing tag. All tags between the opening tag and the closing tag are treated as repeating tags.

Figure 4.6. The opening tag of the example shown in Figure 4.4

First, consider the opening tag. Its purpose is to select the properties to be presented in the repeating tags. Hence it has an attribute named condition, which takes the query defining the data presented in the repeating tags. As mentioned in section 4.1.3, RDF2HTML currently supports RDQL, RQL and SeRQL. In this example, SeRQL is used.
This query selects all resources that have both identifier and title properties. The variable resource represents the resource URI, while identifier and title represent the values of the identifier and title properties respectively. Examples of the use of SeRQL, RQL and RDQL can be found in Appendix A.

The orderBy and distinct attributes support sorting and de-duplication of the result set, since not all supported query languages provide an order-by function and only SeRQL supports a distinct function. By default orderBy sorts in ascending order; if descending order is desired, the attribute ascending needs to be set to false.

Having explained the simple conditional tag, the next concern is its nesting ability. In Figure 4.7, lines 21 to 27 are the opening tag of a conditional tag that selects the resources with the same subject and coverage as the resource selected for this page. In lines 24 and 25, “+subject+” and “+coverage+” establish this relationship by inserting the subject and coverage values of the page resource into the query. This mechanism is the heart of relation definition in the RDF2HTML tool and is what makes it flexible. In theory the nesting depth is unlimited: another conditional tag can be defined between the opening tag (lines 21 to 27) and the closing tag (line 31).
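The orderBy, ascending and distinct attributes described earlier in this section can be read as post-processing applied to a query result before the repeating tags are generated. The Java sketch below illustrates that reading only; it is not taken from the tool's source. The class name ResultPostProcessor, the representation of a record as a Map from variable name to string value, and the choice to keep the first record for each distinct value are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ResultPostProcessor {
    // Keep only the first record seen for each distinct value of the given variable
    // (assumed semantics of distinct=identifier).
    public static List<Map<String, String>> distinctBy(List<Map<String, String>> records,
                                                       String variable) {
        Set<String> seen = new HashSet<>();
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> record : records) {
            if (seen.add(record.get(variable))) {
                out.add(record);
            }
        }
        return out;
    }

    // Sort by the string value of the given variable; ascending mirrors the
    // ascending attribute, which defaults to true in the template language.
    public static List<Map<String, String>> orderBy(List<Map<String, String>> records,
                                                    String variable, boolean ascending) {
        Comparator<Map<String, String>> byValue = Comparator.comparing(
                (Map<String, String> record) -> record.get(variable),
                Comparator.nullsLast(Comparator.<String>naturalOrder()));
        List<Map<String, String>> out = new ArrayList<>(records);
        out.sort(ascending ? byValue : byValue.reversed());
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical records, as a query on identifier and title might return them.
        List<Map<String, String>> records = List.of(
                Map.of("identifier", "img_2", "title", "Lake"),
                Map.of("identifier", "img_1", "title", "Bridge"),
                Map.of("identifier", "img_2", "title", "Lake"));
        // Equivalent in spirit to orderBy=title distinct=identifier.
        System.out.println(orderBy(distinctBy(records, "identifier"), "title", true));
    }
}
```

Doing this outside the query engine is what would let the same attributes work for RDQL, RQL and SeRQL alike, at the cost of comparing values as plain strings.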
1:<rdf2html page=* filename=resource\resource_+identifier+.html
2:   condition='select *
3:    from {resource} photo:identifier {identifier},
4:         [{resource} photo:description {description}],
5:         [{resource} photo:format {format}],
6:         [{resource} photo:subject {subject}],
7:         [{resource} photo:coverage {coverage}],
8:         [{resource} photo:title {title}],
9:         [{resource} photo:date {date}],
10:        [{resource} photo:creator {creator}]
11:   using namespace
12:        photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>' />
13:<html>
14:<h1>
15:<rdf2html type=text property=title />
16:</h1>
17:<table>
18:<tr><td>
19:Photos that have the same subject and coverage
20:</td></tr>
21:<rdf2html condition='select *
22:    from {resource} photo:identifier {identifier},
23:         {resource} photo:title {title},
24:         {resource} photo:subject {"+subject+"},
25:         {resource} photo:coverage {"+coverage+"}
26:   using namespace
27:        photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'>
28:<tr><td>
29:<rdf2html type=generic template='<a href="$urlprefix$/resource/resource_+identifier+.html" >+title+</a>' />
30:</td></tr>
31:</rdf2html>
32:</table>
33:</html>

Figure 4.7. Example of a generic multi-query, multi-page generation template

Operation Tag

Consider lines 14 to 17 in Figure 4.8: this is a link operation tag, which generates a link in the HTML page. The link tag serves as an example of a simple operation tag. All operation tags are single tags (i.e. the tag is closed by />), and the type attribute is required on every operation tag. It is required because operation tags are loaded into the system dynamically. Thanks to this dynamic loading, it is easy to extend the system with new operation tags, including tags defined by the user. Returning to the link tag: because the structure and attribute names of operation tags are not restricted, every operation tag can have its own semantics.
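One way to picture why the type attribute is required is a lookup from the type value to a handler object. The Java sketch below is only an assumption about how such dispatch could be organised; the class TagRegistry, the interface OperationTag and the handler registered under "text" are hypothetical names, and the real tool may load its tag classes differently (for example by reflection on a class name derived from the type).

```java
import java.util.HashMap;
import java.util.Map;

public class TagRegistry {
    /** Hypothetical contract: turn a tag's attributes and one query record into output text. */
    public interface OperationTag {
        String render(Map<String, String> attributes, Map<String, String> record);
    }

    private final Map<String, OperationTag> handlers = new HashMap<>();

    // New tag types are added without touching the dispatch code below.
    public void register(String type, OperationTag handler) {
        handlers.put(type, handler);
    }

    // The type attribute is the only attribute the registry itself interprets;
    // every other attribute is left to the selected handler.
    public String render(Map<String, String> attributes, Map<String, String> record) {
        String type = attributes.get("type");
        if (type == null) {
            throw new IllegalArgumentException("operation tag without a type attribute");
        }
        OperationTag handler = handlers.get(type);
        if (handler == null) {
            throw new IllegalArgumentException("unknown operation tag type: " + type);
        }
        return handler.render(attributes, record);
    }

    public static void main(String[] args) {
        TagRegistry registry = new TagRegistry();
        // A text-like tag: output the value of the variable named by the property attribute.
        registry.register("text", (attrs, rec) -> {
            String name = attrs.get("property");
            return name == null ? "" : rec.getOrDefault(name, "");
        });
        System.out.println(registry.render(
                Map.of("type", "text", "property", "title"),
                Map.of("title", "img_1738")));   // prints img_1738
    }
}
```

Because each handler receives the raw attribute map, a design like this places no restriction on attribute names, which matches the statement that every operation tag can define its own semantics.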
14:<rdf2html type=link
15:   linkText=title
16:   href='$urlprefix$/resource/resource_+identifier+.html'
17:   addons='target=resource'/>

Figure 4.8. The operation tag of the example shown in Figure 4.4

In the link operation tag, the href attribute corresponds to the href attribute of the <a> tag in HTML, and the linkText attribute corresponds to the displayed text of the link, i.e. <a>Link Text</a>. The addons attribute is a string to be appended at the end of the link tag. Consider the value of the href attribute, “$urlprefix$/resource/resource_+identifier+.html”: urlprefix is an attribute defined in the system configuration file, and identifier is the variable defined by the query in the opening tag. As the example shows, an attribute defined in the system configuration file is used by enclosing it in “$” signs, while a query variable is substituted by enclosing it in “+” signs. It is suggested that all operation tags use this replacement method. As an example, the HTML generated by this link tag may look like Figure 4.9, assuming urlprefix is http://iiivan.noip.com/photo/travel/new_zealand/ and both title and identifier are “img_1738”.

Figure 4.9. An example result of the link operation tag in Figure 4.8.

Other than the link tag, RDF2HTML currently provides the image tag, text tag, generic tag and if tag. It may seem odd that the image, link and text tags are still provided; they come from the initial design, and the newer generic tag can replace them. For that reason, the image and text tags are not discussed in detail here. The following discusses the semantics of the generic tag and the if tag.

Figures 4.10 and 4.11 demonstrate the semantics of the generic tag. Figure 4.10 shows a simple generic tag: the template attribute defines the string to generate and the positions at which query variables and user-defined variables are substituted. The substitution mechanism is the same as described before.
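The two substitution rules described above (“$” for configuration attributes, “+” for query variables) can be sketched in Java as follows. This is an illustrative reconstruction rather than the tool's source: LinkTagSketch, substitute and renderLink are hypothetical names, the urlprefix value is a made-up example, and placing addons inside the opening <a> tag is an assumption about what “appended at the end of the link tag” means.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkTagSketch {
    private static final Pattern CONFIG = Pattern.compile("\\$(\\w+)\\$");   // $name$
    private static final Pattern VARIABLE = Pattern.compile("\\+(\\w+)\\+"); // +name+

    // Replace each delimited name with its value; unknown names are left untouched.
    static String substitute(String text, Pattern pattern, Map<String, String> values) {
        Matcher m = pattern.matcher(text);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = values.get(m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(value != null ? value : m.group()));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Build the <a> element for a link operation tag from its attributes,
    // the system configuration and one record of the enclosing query.
    public static String renderLink(Map<String, String> attrs, Map<String, String> config,
                                    Map<String, String> record) {
        String href = substitute(substitute(attrs.get("href"), CONFIG, config), VARIABLE, record);
        String text = record.getOrDefault(attrs.get("linkText"), "");
        String addons = attrs.getOrDefault("addons", "");
        return "<a href=\"" + href + "\"" + (addons.isEmpty() ? "" : " " + addons) + ">"
                + text + "</a>";
    }

    public static void main(String[] args) {
        Map<String, String> attrs = Map.of(
                "linkText", "title",
                "href", "$urlprefix$/resource/resource_+identifier+.html",
                "addons", "target=resource");
        Map<String, String> config = Map.of("urlprefix", "http://example.org/album"); // assumed value
        Map<String, String> record = Map.of("title", "img_1738", "identifier", "img_1738");
        System.out.println(renderLink(attrs, config, record));
        // <a href="http://example.org/album/resource/resource_img_1738.html" target=resource>img_1738</a>
    }
}
```

Values are inserted verbatim in this sketch, so a property value containing characters that are special in HTML or in a file name would need escaping before a scheme like this could be relied on.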
Figure 4.10. A simple example of generic tag

Figure 4.11 shows that a generic tag can also work as a conditional tag, in which case the repeating tags are defined in the template attribute. However, a generic tag used this way does not have the nesting ability of a conditional tag, because a generic tag is a single tag.

Figure 4.11. An example of generic tag with condition attributes

The if tag is similar to the generic tag; the only difference is that the if tag provides a conditional choice of template (see Figure 4.12). The property attribute is a query variable; the op attribute defines the operation (possible options: “=”, “<”, “>”, “like”); the value attribute is the comparison value; the then attribute is the template used if the comparison returns true; and the else attribute is the template used if it returns false. The if tag also has an optional condition attribute, like the generic tag; in that case the comparison applies to the variable from the condition query defined in the if tag rather than to one from outside it. Currently, the comparison applies only to strings.

Figure 4.12. An example of an if tag

4.3. The design of the system

This section explains the design of the system and its workflow. The RDF2HTML tool aims to generate HTML pages from HTML templates defined by the user.

4.3.1. Component design

Figure 4.13. Component view of RDF2HTML Tool. The figure shows the RDF Model Loader (with Jena default and Jena ontology model loaders for RDF/XML), the Query Executor (with the Jena RDQL query engine and the Sesame SeRQL/RQL query engine), the Template Processor, the Operation Tags and the HTML Template Parser.

The system consists of five types of component (see Figure 4.13): the RDF model loader, the HTML template parser, the query executor, the template processor and a group of operation tags. The model loader is responsible for loading the RDF model from an RDF document. The HTML template parser parses the template into a list of nodes.
The query executor executes the queries defined in the template. The template processor generates the corresponding HTML pages based on the templates and the queries defined in them. The operation tag group is responsible for the tags used in templates. To make the system extensible and flexible, a plug-in architecture was designed and implemented.

4.3.1.1. RDF Model Loader

The RDF Model Loader is responsible for loading the RDF model from an RDF document. The Factory pattern was used in the implementation to make the system more extensible and flexible in future. In the current version, only the Jena RDF model is loaded for querying. However, extending the system to load a different RDF model through another API, or even from an RDF database, is straightforward: it only requires writing a new plug-in. The details of writing a new plug-in are discussed in section 4.3.2.

4.3.1.1.1. Jena RDF Model

The Jena RDF Model is the RDF model provided by the Jena RDF API (see footnote 11). With the Jena RDF API, the RDF model can be loaded easily for querying during template translation. During translation, the RDQL query language provided by the Jena RDF API acts as a logical rule for selecting data from the RDF model; details can be found in Appendix A.

11 http://jena.sourceforge.net/index.html

4.3.1.2. HTML Template Parser

The HTML Template Parser in the system uses the open-source HTMLParser (see footnote 12). HTMLParser was chosen over writing a new parser mainly because it provides the functions the system needs, it is fast and it is free to use. Using it reduces the implementation workload and avoids redesigning and re-implementing an existing, well-designed library.

12 http://htmlparser.sourceforge.net/

4.3.1.2.1. HTMLParser

HTMLParser is a super-fast real-time parser for real-world HTML. Using HTMLParser, a list of nodes is generated from the template file, and this list is used in the translation process.

4.3.1.3. Query Executor

The Query Executor is responsible for executing the queries (i.e. the logical rules) defined in the template file against the RDF model loaded by the Model Loader. As with the Model Loader, the Factory pattern and the plug-in architecture are used to make it extensible and flexible.

4.3.1.4. Template Processor

The Template Processor is responsible for generating HTML pages from the HTML template files. An HTML template file defines which data and properties should be presented and how they are presented. The template processor then processes it and generates conforming HTML pages.

4.3.1.5. Operation Tag Group

The Operation Tag Group comprises the tags that can be used in template files. These tags are responsible for filling the query results into the template. The current version provides the link tag, image tag, text tag, generic tag and if tag. Examples of these tags can be found in section 4.2, Template Language.

4.3.2. System Architecture

Figure 4.14. Architecture view of RDF2HTML Tool

Data Loading

As Figure 4.14 shows, RDF documents and HTML templates are loaded into the RDF2HTML tool by the RDF Model Loader and the HTML Template Parser. These then pass an Abstracted Data Source and a Template Node List, respectively, to the Template Processor. Internally, the Abstracted Data Source is a HashMap with strings as keys and objects as values; it abstracts the data source in the plug-in structure. The template node list holds all nodes of the template.

Generation Process

The process of transforming RDF documents into a set of HTML pages is defined by Algorithm 4.1. The inputs are a set of HTML templates and a set of RDF documents.
The output is a set of HTML pages conforming to the templates. The process goes through the templates one by one. For each template, the template is first parsed into a template node list for later conversion. The page option then selects the case. If page is “1”, the single-page conversion process takes place. If page is “*”, the page query (i.e. the query defined in the page directive of the current template) is executed, and each record of its result is used in the conversion process; in this multi-page generation process, one HTML file is generated per record. The last case is the default, which is mainly for templates that do not need to query the RDF model: the conversion process takes only the template node list and generates a single HTML page per template.

Algorithm 4.1. Main process of Template Processor for RDF to HTML transformation

Plug-in structure

The initial design of the RDF2HTML tool did not include a plug-in structure, because only RDQL was planned to be used. Further research on the internet then turned up two more powerful query languages, SeRQL and RQL, which were adopted. This required refactoring and introduced the need for extensibility in the tool.

Before the refactoring, existing approaches to extending software were surveyed. In the end, the structure of the Eclipse project (see footnote 13) was used as the reference for the plug-in structure of the RDF2HTML tool. Figure 4.15 shows the plug-in architecture. The RDF2HTML tool has two extension points: one in the RDF Model Loader and the other in the Query Executor. The link between the RDF Model Loader and the Query Executor is the Abstracted Data Source.
As mentioned before, the Abstracted Data Source is internally a HashMap with strings as keys and objects as values. In this plug-in architecture, each plug-in must define a unique plug-in id, which acts as the key in the Abstracted Data Source and as the key for plug-in selection in the RDF Model Loader and the Query Executor.

Figure 4.15. Plug-in Architecture view of RDF2HTML tool

13 http://www.eclipse.org/

To write a plug-in, first define its unique id, for instance serql, rql or rdql. Once the plug-in id is decided, one interface, IModelLoader, needs to be implemented, and two classes, AbstractQueryEngine and AbstractQueryResultList, need to be extended. In AbstractQueryEngine, the abstract method exec(String queryStr, String[] distinct), which returns an AbstractQueryResultList, needs to be implemented. Besides implementing the interface and extending the classes, there is one requirement on the package name: all implemented classes must be in the package “rdf2html.plugin.” + plug-in id. For instance, “rdf2html.plugin.serql” is the package name of the serql plug-in. The RDF2HTML tool currently has three plug-ins: serql, rql and rdql.

4.4. Case Application

To test and evaluate the usability of the RDF2HTML tool, a simple web photo album was generated from a set of 62 travel photos.

Step 1

First, prepare the RDF documents used in this test case. There are two types of resource in this test case: the photo, and the creator of the photo. Figures 4.16 and 4.17 show examples of RDF documents describing a photo and a creator respectively.

Figure 4.16. An RDF document describing a photo
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:admin="http://webns.net/mvcb/">
  <foaf:PersonalProfileDocument rdf:about="http://iiivan.no-ip.com/profile/">
    <foaf:maker rdf:nodeID="me"/>
    <foaf:primaryTopic rdf:nodeID="me"/>
    <admin:generatorAgent rdf:resource="http://www.ldodds.com/foaf/foaf-a-matic"/>
    <admin:errorReportsTo rdf:resource="mailto:leigh@ldodds.com"/>
  </foaf:PersonalProfileDocument>
  <foaf:Person rdf:nodeID="me">
    <foaf:name>Wai-tung Lam</foaf:name>
    <foaf:title>Mr</foaf:title>
    <foaf:givenname>Wai-tung</foaf:givenname>
    <foaf:family_name>Lam</foaf:family_name>
    <foaf:nick>Ivan</foaf:nick>
    <foaf:mbox_sha1sum>343f2d774144c519f33423dbac18c85856c61b29</foaf:mbox_sha1sum>
    <foaf:homepage rdf:resource="http://iiivan.no-ip.com/profile/"/>
  </foaf:Person>
</rdf:RDF>

Figure 4.17. An RDF document describing a creator

Step 2

After preparing the RDF documents, the next step is to design the layout and file structure of the generated web album. Following the suggested web site structure mentioned in section 0, the layout looks like Figure 4.18 and the file structure looks like Figure 4.19.

Figure 4.18. Layout of case application

Figure 4.19. File structures of case application

Step 3

Once the layout and file structure have been designed, the next step is to define the templates; defining the templates amounts to defining the web site contents. There are twelve templates in total in this test case, and the query language used is SeRQL. All the templates can be found in Appendix C. According to the classification in section 4.1.5, two templates do not need to query the RDF model, i.e. index.r2h and heading.r2h; there are five single-page generation templates, i.e.
all_index.r2h, coverage_index.r2h, creator_index.r2h, date_index.r2h and subject_index.r2h; and there are five generic multi-query, multi-page generation templates, i.e. groupby_coverage.r2h, groupby_creator.r2h, groupby_date.r2h, groupby_subject.r2h and resource_template.r2h.

The index.r2h and heading.r2h templates are responsible for the upper part and the lower part of the frame set respectively; they do not need to query the RDF model, and only string replacement is needed. It is suggested to define resource_template.r2h before any other template that queries the RDF model, since links to the resource HTML pages depend on the filename defined in resource_template.r2h.

resource_template.r2h generates one resource HTML page per resource, so it defines a query that selects all resources together with the desired properties (see Figure 4.20). In this case application, photos that have the same subject and coverage are linked from the resource HTML page (see Figure 4.21 for the query). Because links to the resource HTML page depend on its filename, the properties chosen to form the file name should not contain special characters that are not allowed in a file name.

<rdf2html page=* filename=resource\resource_+identifier+.html
   condition='select *
    from {resource} photo:identifier {identifier},
         [{resource} photo:description {description}],
         [{resource} photo:format {format}],
         [{resource} photo:subject {subject}],
         [{resource} photo:coverage {coverage}],
         [{resource} photo:title {title}],
         [{resource} photo:date {date}],
         [{resource} photo:creator {creator}],
         [{resource} photo:creator {} rdf:type {foaf:PersonalProfileDocument};
                                      foaf:maker {maker} foaf:name {name}]
   using namespace
        photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>,
        foaf = <http://xmlns.com/foaf/0.1/>' />

Figure 4.20. Query that selects all resources in resource_template.r2h

Figure 4.21. Opening tag for the relation with same subject and coverage

all_index.r2h generates an index file that links to all resource HTML pages; hence a query that selects all resources is constructed (see Figure 4.22).

Figure 4.22. Query to select all resources and properties for link formation

The templates groupby_coverage.r2h to groupby_subject.r2h each generate index files linking to the resource HTML pages that share the same value of the selected property. For instance, groupby_coverage.r2h selects all possible values of the coverage property; each value is then inserted into a second query that selects all resources with that property value, and links to their resource HTML pages are generated (see Figure 4.23).

Figure 4.23. Opening tag of groupby_coverage.r2h

The templates coverage_index.r2h to subject_index.r2h each generate an index file that links to the index files grouping the resource HTML pages by the selected property value. For instance, coverage_index.r2h selects all possible values of the coverage property (see Figure 4.24) and uses each value to form a link.

Figure 4.24. Opening tag of coverage_index.r2h

Step 4

After all the templates are defined, the next step is to set up the system configuration file. Three options need to be set: rdfSourceDir, templateSourceDir and outputDir, each named for what it configures. After setting up the configuration file, run the RDF2HTML tool; the generated HTML pages can then be found in the output directory. The generated web site looks like Figure 4.18.

This case application shows that a web album can be generated from a set of RDF documents with the RDF2HTML tool in only four steps.

Chapter 5 Conclusion

5.1. Benefits Obtained

The RDF2HTML tool implemented in this project provides an extensible, flexible and easy-to-use method of publishing Semantic Web content (RDF resources) as a set of HTML pages.
Users do not need programming skills to use this tool; they only need knowledge of a query language and of how to use the template tags. Advanced users with programming skills can even write their own tags. Those who want to extend the tool to support other query languages can do so easily thanks to the plug-in structure.

More generally, the RDF2HTML tool is not restricted to transforming RDF to HTML. Most aspects of the transformation can be customized, from the input file extension and the default output file extension to data loading and query execution. For instance, the tool can convert an RDF model to XML files by using the generic operation tag. As another example, with a database as the data source the tool can serve as a simple report generator.

5.2. Limitations

A critical review of the current version of RDF2HTML revealed two limitations. First, statistical data such as the number of unlinked pages, the number of empty link groups and the number of links in each group cannot be collected. This is because the design of RDF2HTML is very generic, and the need for statistical data was not considered during the design. Second, the relationship linkage between conditional tags does not support properties with multiple values, because the query language cannot select multiple values as a single record. Displaying the multiple values of a property in a resource template is nevertheless possible: a separate query can select all values of the property, linked to the resource by inserting the resource URI into the query.

5.3. Further Work

To evaluate the usability of the RDF2HTML tool in practice, more testing is needed.
Although the RDF2HTML tool is easy, extensible and flexible for its creator to use, more user experiments and the opinions of template designers should be gathered in future. In addition, a statistical analyzer could be implemented to log statistical data.

To sum up, there is no doubt that RDF can describe resources effectively. Once resources have been described, publishing the RDF on the internet becomes the problem, and the RDF2HTML tool implemented in this project solves it in an easy way. The next problem is how to make RDF popular and widely used. This depends mainly on the ease of RDF creation and on the standardization of RDF vocabularies (i.e. ontologies) in different domains. As mentioned in chapter 2, the upper levels of Semantic Web technology depend on the lower levels, including the RDF level and the ontology level. Hence, for the Semantic Web to succeed, wide use of RDF on the web and standardized RDF vocabularies are the key.

Appendix A Supported Query Language

SeRQL & RQL

SeRQL is provided by the Sesame API. Sesame (see footnote 14) is an open source RDF database with support for RDF Schema inferencing and querying. It was originally developed by Aduna (then known as Aidministrator) as a research prototype for the EU research project On-To-Knowledge. It is now further developed and maintained by Aduna in cooperation with the NLnet Foundation, developers from OntoText, and a number of volunteer developers who contribute ideas, bug reports and fixes. Sesame has been designed with flexibility in mind.
It can be deployed on top of a variety of storage systems (relational databases, in-memory, file systems, keyword indexers, etc.), and offers a wide range of tools for developers to leverage the power of RDF and RDF Schema, such as a flexible access API, which supports both local and remote (through HTTP, SOAP or RMI) access, and several query languages, of which SeRQL is the most powerful one.

14 http://www.openrdf.org/

SeRQL Query example: http://www.openrdf.org/sesame/serql/serql-examples.html
SeRQL Query Specification: http://www.openrdf.org/doc/users/ch06.html
RQL Tutorial: http://www.openrdf.org/doc/rql-tutorial.html

RDQL

RDQL is a query language for RDF in Jena models. The idea is to provide a data-oriented query model so that there is a more declarative approach to complement the fine-grained, procedural Jena API. RDQL is an implementation of the SquishQL RDF query language, which itself is derived from rdfDB. This class of query languages regards RDF as triple data, without schema or ontology information unless explicitly included in the RDF source. RDF provides a graph with directed edges; the nodes are resources or literals. RDQL provides a way of specifying a graph pattern that is matched against the graph to yield a set of matches. It returns a list of bindings; each binding is a set of name-value pairs for the values of the variables. All variables are bound (there is no disjunction in the query).

RDQL Programmer’s Introduction: http://jena.sourceforge.net/tutorial/RDQL/index.html

Appendix B Interim report

City University of Hong Kong
Department of Computer Science
BSCS Final Year Project 2003-2004 Interim Report
BSCCS(FT)
Semantic web - RDF to HTML tool (Volume 1 of 1)
Student Name : Lam Wai Tung Ivan
Student No. : 50316414
Programme Code : BSCCS(FT)
Supervisor : Dr.
Andy CHUN, H W
Date :
For Official Use Only

CS 4512 Project Semantic Web – RDF to HTML tool Lam Wai-tung Ivan BSCCS 50316414

Abstract

The Resource Description Framework (RDF) is used to describe content, such as HTML pages, graphics, audio files and other documents, so that machines can interpret it on the Semantic Web. However, an RDF document is not easy for humans to interpret; hence a tool that renders RDF content as semantically linked HTML pages for better human reading is proposed. The rendering method uses HTML templates with tags to define the page layout, semantic linkage and logical rules based on the RDF data model. A tool implementing this method in Java generates a semantically linked site of HTML pages from the RDF data model. As a case application, a web photo album is generated.

Table of Content

0. Project Title
0. Introduction
1.2. The problem
1.3. Project Objectives
1.4. Scope of project work
0. Background Research
2.1. Semantic Web
2.2. Extensible Markup Language (XML)
2.5. Extensible Stylesheet Language Transformations (XSLT)
2.6. Streaming Transformations for XML (STX)
2.3. Resource Description Framework
2.4. RDF Schema
0. The solution
4.1. Transforming RDF to HTML
0. The design of the system
Reference

2. Project Title

Semantic web - RDF to HTML tool

3. Introduction

Nowadays, Semantic Web is a hot topic in web technology.
It provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. To achieve such sharing and reuse, the web needs metadata that describes resources, such as web pages, documents, photos and real-world objects, in a machine-understandable format. In other words, it makes a simple extension to the current web by using binary relationships to capture the meaning between links and data. [9]

The computer industry has agreed on and uses XML standards to give a syntactic structure for describing data. However, XML can be used in many different ways to describe the same data, which makes it too open and arbitrary to support the kind of widespread, ad hoc data integration envisaged for the Semantic Web. Therefore a standard syntax, the Resource Description Framework (RDF), was introduced. RDF plays an important role in the Semantic Web, since it defines a graph model that provides a consistent, standardised way of describing and querying internet resources, from text pages and graphics to audio files and video clips. It gives semantic interoperability and provides the base layer for building a Semantic Web. [10]

3.1. The problem

“RDF is for machines to read, not for humans”

Currently, RDF/XML is the most common way to store RDF data, but XML is not a good representation for humans to read. Since RDF is intended for machine interpretation, it need not be human-readable. However, the main objective of the web is to present semantic content in a human-readable way. Therefore, a better visual representation is essential to the success of the Semantic Web. Hypertext is currently the most common way to present content over the Internet, so HTML is the best choice for displaying RDF data. In fact, this can be done by a semantic portal (see footnote 15), which provides dynamic HTML pages. However, using one imposes some limitations on the content provider.
Firstly, it limits the choice of ontology, since only content that uses the ontologies defined by the portal can be published. Secondly, publication is controlled by the portal owner, so content providers who do not own a portal application cannot publish content as easily as they publish static web pages. Thirdly, dynamic content is harder for search engines to index than static content.

In the near future, more content will be available in RDF format. Hence, being able to publish Semantic Web content easily in human-readable form is a key to its success. To achieve this, a tool that transforms an RDF model into static HTML pages is proposed.

Footnote 15: e.g. http://ubp.learninglab.uni-hannover.de/EducaNext/ubp/home

4. Project Objectives

This project is meant to achieve the following:
- Become familiar with the Semantic Web concept.
- Learn Extensible Markup Language (XML), Extensible Stylesheet Language Transformations (XSLT), Streaming Transformations for XML (STX), Resource Description Framework (RDF), and RDF Schema (RDFS).
- Build a tool that transforms RDF to HTML in Java.
- Study the usage and new challenges of RDF.

5. Scope of project work

The scope of work includes the following:
- Study publications about the Semantic Web concept.
- Study the language specifications of XML, XSLT, STX, RDF and RDFS.
- Create RDF documents for testing.
- Implement a tool to transform RDF to HTML in Java with Jena (a Java RDF API).
- Generate a small, semantically linked site of static HTML pages with the tool.
- Identify the usage of RDF and its upcoming challenges.

6. Background Research

6.1. Semantic Web

Figure 2. Architecture of Semantic Web (footnote 16)

As Figure 2 shows, the Semantic Web has a layered architecture. Each layer is built on the layer below it and provides better capabilities for representing knowledge. Up to now, little work has been done on the top three layers, since the ontology layer has only just become a W3C recommendation.
The Uniform Resource Identifier (URI) and Unicode layer has long been well developed. The XML and Namespaces layer is now widely adopted in web development. The RDF and RDF Schema layer is a W3C recommendation. This project focuses on presenting the RDF layer in human-readable form.

Footnote 16: http://www.w3.org/2002/Talks/04-sweb/slide12-3.html

6.2. Extensible Markup Language (XML)

XML is a simple, very flexible markup language derived from SGML. It was originally designed for large-scale electronic publishing and is now very important in the exchange of a wide variety of data on the Web and elsewhere. [11]

XML can carry such a wide variety of data because of its arbitrary structure. That is, XML specifies neither semantics nor a tag set; instead it provides a facility for defining tags and the structural relationships between them. Consequently, the semantics of an XML document are defined by the applications that use it. Because of this arbitrary structure, the upper layers of the Semantic Web architecture use XML to define their own semantics.

6.3. Extensible Stylesheet Language Transformations (XSLT)

XSLT is the most important part of the XSL standards and is a W3C Recommendation. It is the part of XSL used to transform an XML document into another XML document, or into another type of document recognised by a browser, such as HTML or XHTML. Normally XSLT does this by transforming each XML element into an (X)HTML element. [12]

XSLT can also add new elements to the output file or remove elements. It can rearrange and sort elements, test and decide which elements to display, and a lot more. [12]

A common way to describe the transformation process is to say that XSLT transforms an XML source tree into an XML result tree. [12]

In the transformation process, XSLT uses XPath to define the parts of the source document that match one or more predefined templates.
When a match is found, XSLT transforms the matching part of the source document into the result document. The parts of the source document that do not match a template end up unmodified in the result document. [12]

In fact, XSLT can be used to transform an RDF/XML document to HTML, and some style sheets on the web already do so (footnote 17). However, that kind of transformation suits a small and simple RDF/XML document rather than a whole RDF model. For example, it is difficult to transform an RDF/XML document according to rules such as "display books related to Classification but not Data Mining". Hence, this project is proposed. Although XSLT is not suitable for transforming RDF/XML documents with rules, it is still very useful for transforming simple RDF data.

Footnote 17: http://rssxpress.ukoln.ac.uk/view.cgi?rss_url=http://journals.iucr.org/a/rss10.xml

6.4. Streaming Transformations for XML (STX)

STX is very similar to XSLT; it is a one-pass transformation language for XML documents. It is intended as a high-speed, low memory consumption alternative to XSLT. Since it does not require the construction of an in-memory tree, it is suitable for use in resource-constrained scenarios. [13]

STX provides a streaming analogue of XSLT by adopting some of the now familiar concepts from XSLT (e.g., matching based on templates and an XPath 1.0-like expression language, STXPath) but using SAX as the underlying interface to the XML document. SAX is the event-oriented sibling of the DOM API; it provides a sequential view of an XML document through a stream of events. [14] This is why STX is faster and uses less memory than XSLT.

An STX transformation is achieved by associating events with templates. A template pattern is matched against events and their context. The best matching template is then instantiated to create a part of the result stream.
A template is always instantiated with respect to the current context, a set of additional information maintained during the transformation. In constructing the result stream, events from the source stream can be filtered and arbitrary events can be added. Events can also be reordered using a working storage. [15]

The main difference between STX and XSLT is the underlying interface to the XML document. XSLT works on an in-memory tree of the whole document, as the DOM API does, while STX works on SAX events. Hence, STX is faster and consumes less memory than XSLT. However, this does not mean that STX is better than XSLT. Because of the streaming character of STX, only the current node and its ancestors are accessible, whereas XSLT allows random access to all data in the document. A sentence can describe the difference clearly: “XSLT like a book while STX like reading”

6.5. Resource Description Framework (RDF)

“Resources have Properties with Values”

RDF is a framework that provides a consistent, standardised way of describing and querying internet resources, from text pages and graphics to audio files and video clips. It expresses relations between objects, gives semantic interoperability, and provides the base layer for building a Semantic Web. [10]

To describe the relations between objects and achieve consistency, RDF defines a model for representing resources, properties and property values. The RDF data model is independent of the representation method; in other words, it is syntax-neutral. RDF data can therefore be represented in many formats, such as RDF/XML, N3 and N-Triples.
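As a minimal sketch of this syntax-neutrality, the Java snippet below holds one statement in memory and writes it out in two of the formats named above. The class `TripleFormats`, its `Triple` holder and the `example.org` URIs are hypothetical names chosen for illustration; they are not part of the tool described in this report, and in practice an RDF library such as Jena would do this work.

```java
public class TripleFormats {

    /** One RDF statement; the object is treated as a plain literal for simplicity. */
    public static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        public Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Writes the statement as one N-Triples style line: <subject> <predicate> "object" . */
    public static String toNTriples(Triple t) {
        // Simplification: only backslashes and double quotes are escaped here.
        String literal = t.object.replace("\\", "\\\\").replace("\"", "\\\"");
        return "<" + t.subject + "> <" + t.predicate + "> \"" + literal + "\" .";
    }

    /** Writes the same statement as a small RDF/XML document. */
    public static String toRdfXml(Triple t) {
        // Split the predicate URI into a namespace and a local name at the last '#' or '/'.
        int cut = Math.max(t.predicate.lastIndexOf('#'), t.predicate.lastIndexOf('/')) + 1;
        String ns = t.predicate.substring(0, cut);
        String local = t.predicate.substring(cut);
        return "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"\n"
             + "         xmlns:ex=\"" + xml(ns) + "\">\n"
             + "  <rdf:Description rdf:about=\"" + xml(t.subject) + "\">\n"
             + "    <ex:" + local + ">" + xml(t.object) + "</ex:" + local + ">\n"
             + "  </rdf:Description>\n"
             + "</rdf:RDF>";
    }

    /** Escapes the characters that are special in XML text and attribute values. */
    private static String xml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        // Hypothetical statement: resource cd/1 has the property price with value 9.90.
        Triple price = new Triple("http://example.org/cd/1",
                                  "http://example.org/terms#price", "9.90");
        System.out.println(toNTriples(price));
        System.out.println(toRdfXml(price));
    }
}
```

Both outputs carry the same subject, predicate and object; only the surface syntax differs, which is the sense in which the data model is independent of its representation.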
Consider the following example:

<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:cd="http://www.recshop.fake/cd">
  <rdf:Description
    rdf:about="http://www.recshop.fake/cd/Hide your heart">
    <cd:price>9.90</cd:price>
  </rdf:Description>
</rdf:RDF>

From the RDF/XML document above, the data model can be represented as triples, as a directed graph and as a sentence, as shown below:

Subject (Resource)                         | Predicate (Property)            | Object (Value)
http://www.recshop.fake/cd/Hide your heart | http://www.recshop.fake/cdprice | "9.90"

Table 2. Triples of the Data Model

The RDF data model can be represented using directed graphs, see Figure 3. In the figure, the subject is represented by a node, the property by a directed arc and the property value by a rectangle. In the node and on the arc, the Uniform Resource Identifiers (URI) (footnote 18) http://www.recshop.fake/cd/Hide your heart and http://www.recshop.fake/cdprice are used. In RDF, URIs are used to describe all objects, to give them a unique and universal means of identification.

Figure 3. Graph of the data model

The sentence: "The CD Hide your heart costs $9.90"

RDF allows statements with unambiguous semantics to be carried in web documents, which is not possible using pure XML. Hence, it becomes the base layer for building a Semantic Web.

6.6. RDF Schema (RDFS)

RDF needs a way to define application-specific classes and properties. Application-specific classes and properties must be defined using extensions to RDF. One such extension is RDF Schema. [16]

RDF Schema does not provide actual application-specific classes and properties. Instead, RDF Schema provides the framework to describe application-specific classes and properties. [16]

Classes in RDF Schema are much like classes in object-oriented programming languages. This allows resources to be defined as instances of classes, and as subclasses of classes. [16]

Footnote 18: http://www.w3.org/Addressing/

7.
The solution

7.1. Transforming RDF to HTML

To facilitate the transformation, the RDF data model is considered at several conceptual levels.

Firstly, the data level: the actual data or resource in an RDF statement. For instance, in "The CD Hide your heart costs $9.90", the CD Hide your heart is the data.

Secondly, the metadata level: the property of the resource in an RDF statement. For instance, in "The CD Hide your heart costs $9.90", costs is the metadata.

Thirdly, the ontology level: the vocabulary used at the metadata level, as defined by an RDF Schema. For instance, the schema indicates that the artist is an instance of the class “Human”.

Fourthly, the logic rule level: semantic relations between the resources in the data model. For instance, a binary relation between two CDs may be that they have the same artist and the same company but a different cost.

Figure 4. Transforming RDF data model to HTML pages

Figure 4 shows the transformation of RDF to HTML. The RDF data model is on the left. On the right, the Home page has links to various Index pages, which classify the underlying Resource pages (RPages); the RPages are related to each other by semantic links.

The Home page defines the entrance page to the RDF model. It is typically an HTML page that contains frames to show Index pages and Resource pages.

A Resource page displays one resource, such as an ontological concept (e.g., a class) or a piece of resource with its properties and values. Each resource that is intended to be shown to the end-user has an RPage of its own.

Index pages classify RPages in a conceptual, hierarchical way.

7.2. The design of the system

Figure 5. System overview

Currently, the system is designed to parse the RDF document into a Jena RDF model. The HTML template processor then retrieves the corresponding resources from the model, based on the HTML template, and fills them into the template to generate HTML pages. In the HTML templates, tags are planned to express some basic rules, such as "select all CD resources that belong to the artist Andy Lau".
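The fill step described above might look roughly like the following sketch. It is not the tool's actual implementation: the `+name+` and `$urlprefix$` placeholder forms are taken from the case application templates and SystemConfig.xml listed in Appendix C, but the class `TemplateFillSketch`, its `select` and `fill` methods, the in-memory rows standing in for a Jena query result, and the sample CD data are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TemplateFillSketch {

    /** Keeps only the rows whose value for the given variable equals the wanted value. */
    public static List<Map<String, String>> select(List<Map<String, String>> rows,
                                                   String variable, String wanted) {
        List<Map<String, String>> matches = new ArrayList<Map<String, String>>();
        for (Map<String, String> row : rows) {
            if (wanted.equals(row.get(variable))) {
                matches.add(row);
            }
        }
        return matches;
    }

    /** Replaces $key$ with user-defined variables and +name+ with values from one result row. */
    public static String fill(String template, Map<String, String> row,
                              Map<String, String> userVars) {
        String out = template;
        for (Map.Entry<String, String> v : userVars.entrySet()) {
            out = out.replace("$" + v.getKey() + "$", v.getValue());
        }
        for (Map.Entry<String, String> b : row.entrySet()) {
            out = out.replace("+" + b.getKey() + "+", b.getValue());
        }
        return out;
    }

    /** Builds one hypothetical result row (a set of variable bindings). */
    private static Map<String, String> cd(String id, String title, String artist) {
        Map<String, String> row = new LinkedHashMap<String, String>();
        row.put("identifier", id);
        row.put("title", title);
        row.put("artist", artist);
        return row;
    }

    public static void main(String[] args) {
        // Made-up rows standing in for the bindings a query over the RDF model would return.
        List<Map<String, String>> rows = new ArrayList<Map<String, String>>();
        rows.add(cd("cd1", "Album A", "Andy Lau"));
        rows.add(cd("cd2", "Album B", "Another Artist"));

        Map<String, String> userVars = new LinkedHashMap<String, String>();
        userVars.put("urlprefix", "file:///f:/temp/output");

        String template = "<tr><td><a href=\"$urlprefix$/resource/resource_+identifier+.html\">"
                        + "+title+</a></td></tr>";

        // The rule "select all CD resources that belong to the artist Andy Lau",
        // followed by one filled template row per matching resource.
        for (Map<String, String> row : select(rows, "artist", "Andy Lau")) {
            System.out.println(fill(template, row, userVars));
        }
    }
}
```

Keeping selection and filling as separate steps mirrors the split in the design between retrieving resources from the model and writing them into the template; plain string replacement is the simplest possible substitution and ignores the case where a value itself contains a placeholder.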
Reference

Cited Reference in the report:
[1] Tim, Berners-Lee. (2002). The Semantic Web - A Simple Extension to the Current Web. [Online]. W3C. Available: http://www.w3.org/2002/Talks/04-sweb/slide6-1.html [2004, Sep]
[2] Introduction to Semantic Web Technologies: Standard Syntax – RDF. [Online]. HP Labs. Available: http://www.hpl.hp.com/semweb/swtechnology.htm#Standard%20Syntax%20-%20RDF
[3] Extensible Markup Language (XML) – Introduction. [Online]. W3C. Available: http://www.w3.org/XML/ [2004, Sep]
[4] Introduction to XSLT. [Online]. W3 Schools. Available: http://www.w3schools.com/xsl/xsl_intro.asp [2004, Oct]
[5] Streaming Transformations for XML (STX). [Online]. Available: http://stx.sourceforge.net/ [2004, Oct]
[6] Becker, Oliver., Brown, Oliver. and Cimprich, Petr. (2003, 26 February). An Introduction to Streaming Transformations for XML. [Online]. O’Reilly xml.com Available: http://www.xml.com/pub/a/2003/02/26/stx.html [2004, Oct]
[7] Becker, Oliver. Et al. (2004, 1 July). STX transformation language specification. [Online]. http://stx.sourceforge.net/documents/spec-stx20040701.html [2004, Oct]
[8] RDF Schema. [Online]. W3 Schools. Available: http://www.w3schools.com/rdf/rdf_schema.asp [2004, Sep]

Reference books:
Hjelm, Johan. (2001). Creating the semantic Web with RDF: professional developer’s guide. USA: John Wiley & Sons, Inc.
Shelley Powers. (2003). Practical RDF. USA: O’Reilly

Reference web resources:
Joost. [Online]. Joost. Available: http://joost.sourceforge.net/ [2004, Nov]
Dave Beckett, ed. (2004, 2 February). RDF/XML Syntax Specification (Revised). [Online]. W3C.
Available: http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/ [2004, Sep]

Appendix C Case Application Templates

all_index.r2h

<rdf2html page=1 filename=index\all_index.html />
<html>
<p><h3> All photos </h3></p>
<table>
<rdf2html condition='select * from
    {resource} photo:identifier {identifier},
    {resource} photo:title {title}
    using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
    orderBy=title distinct=identifier>
  <tr><td>
    <rdf2html type=link linkText=title href='$urlprefix$/resource/resource_+identifier+.html' addons='target=resource'/>
  </td></tr>
</rdf2html>
</table>
</html>

coverage_index.r2h

<rdf2html page=1 filename=index\coverage_index.html />
<html>
<p><h3> Coverages index </h3></p>
<table>
<rdf2html condition='select * from
    {resource} photo:coverage {coverage}
    using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
    orderBy=coverage distinct=coverage>
  <tr><td>
    <rdf2html type=link linkText=coverage href='$urlprefix$/index/index_+coverage+.html' />
  </td></tr>
</rdf2html>
</table>
</html>

creator_index.r2h

<rdf2html page=1 filename=index\creator_index.html />
<html>
<p><h3> Creators index </h3></p>
<table>
<rdf2html condition='select * from
    {resource} photo:creator {} rdf:type {foaf:PersonalProfileDocument}; foaf:maker {maker} foaf:name {name}
    using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>,
    foaf = <http://xmlns.com/foaf/0.1/>'
    orderBy=creator distinct=creator>
  <tr><td>
    <rdf2html type=link linkText=name href='$urlprefix$/index/index_+name+.html' />
  </td></tr>
</rdf2html>
</table>
</html>

date_index.r2h

<rdf2html page=1 filename=index\date_index.html />
<html>
<p><h3> Dates index </h3></p>
<table>
<rdf2html condition='select * from
    {resource} photo:date {date}
    using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
    orderBy=date distinct=date>
  <tr><td>
    <rdf2html type=link linkText=date href='$urlprefix$/index/index_+date+.html' />
  </td></tr>
</rdf2html>
</table>
</html>

groupby_coverage.r2h

<rdf2html
    page=*
    condition='select distinct coverage from
    {resource} photo:coverage {coverage}
    using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
    filename=index\index_+coverage+.html/>
<html>
<p><h3> Photos took at <rdf2html type=text property=coverage /> </h3></p>
<table>
<rdf2html type=generic
    condition='select * from
    {resource} photo:coverage {"+coverage+"},
    {resource} photo:identifier {identifier},
    {resource} photo:title {title}
    using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
    orderBy=title ascending=true
    template='<tr><td><a href="$urlprefix$/resource/resource_+identifier+.html" target=resource >+title+</a></td> <td>+title+</td></tr>'>
</table><br>
<rdf2html type=generic template='<a href="$urlprefix$/index/coverage_index.html">coverage index</a>' />
</html>

groupby_creator.r2h

<rdf2html page=*
    condition='select distinct name from
    {resource} photo:creator {} rdf:type {foaf:PersonalProfileDocument}; foaf:maker {maker} foaf:name {name}
    using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>,
    foaf = <http://xmlns.com/foaf/0.1/>'
    filename=index\index_+name+.html/>
<html>
<p><h3> Photos took by <rdf2html type=text property=name /> </h3></p>
<table>
<rdf2html type=generic
    condition='select * from
    {resource} photo:creator {} rdf:type {foaf:PersonalProfileDocument}; foaf:maker {maker} foaf:name {"+name+"},
    {resource} photo:identifier {identifier},
    {resource} photo:title {title}
    using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>,
    foaf = <http://xmlns.com/foaf/0.1/>'
    orderby=title ascending=true
    template='<tr><td><a href="$urlprefix$/resource/resource_+identifier+.html" target=resource >+title+</a></td> <td>+title+</td></tr>'>
</table>
<br>
<rdf2html type=generic template='<a href="$urlprefix$/index/creator_index.html">creator index</a>' />
</html>

groupby_date.r2h

<rdf2html page=*
    condition='select distinct date from
    {resource} photo:date {date}
    using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
    filename=index\index_+date+.html/>
<html>
<p><h3> Photos took on <rdf2html type=text property=date /> </h3></p>
<table>
<rdf2html type=generic
    condition='select * from
    {resource} photo:date {"+date+"},
    {resource} photo:identifier {identifier},
    {resource} photo:title {title}
    using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
    orderBy=title ascending=true
    template='<tr><td><a href="$urlprefix$/resource/resource_+identifier+.html" target=resource >+title+</a></td> <td>+title+</td></tr>'>
</table>
<br>
<rdf2html type=generic template='<a href="$urlprefix$/index/date_index.html">date index</a>' />
</html>

groupby_subject.r2h

<rdf2html page=*
    condition='select distinct subject from
    {resource} photo:subject {subject}
    using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
    filename=index\index_+subject+.html/>
<html>
<p><h3> <rdf2html type=text property=subject /> photos </h3></p>
<table>
<rdf2html type=generic
    condition='select * from
    {resource} photo:subject {"+subject+"},
    {resource} photo:identifier {identifier},
    {resource} photo:title {title}
    using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
    orderby=title ascending=true
    template='<tr><td><a href="$urlprefix$/resource/resource_+identifier+.html" target=resource >+title+</a></td> <td>+title+</td></tr>'>
</table>
<br>
<rdf2html type=generic template='<a href="$urlprefix$/index/subject_index.html">subject index</a>' />
</html>

heading.r2h

<html>
<table>
<tr><td colspan='3'>
<h2>A photo album generated by rdf2html ...</h2>
</td></tr>
<tr>
<td><rdf2html type=generic template='<a href="$urlprefix$/index/coverage_index.html" target="groupby" >' /> coverage link</a></td>
<td><rdf2html type=generic template='<a href="$urlprefix$/index/creator_index.html" target="groupby" >' /> creator link</a></td>
<td><rdf2html type=generic template='<a href="$urlprefix$/index/date_index.html" target="groupby" >' /> date link</a></td>
<td><rdf2html type=generic template='<a
href="$urlprefix$/index/subject_index.html" target="groupby" >' /> subject link</a></td>
</tr>
</table>
</html>

index.r2h

<html>
<frameset rows="20%,80%">
  <rdf2html type=generic template='<frame name=heading src="$urlprefix$/heading.html">' />
  <frameset cols="15%,15%,70%">
    <rdf2html type=generic template='<frame name=all src="$urlprefix$/index/all_index.html">' />
    <rdf2html type=generic template='<frame name=groupby src="$urlprefix$/index/coverage_index.html">' />
    <frame name=resource src= >
  </frameset>
</frameset>
</html>

resource_template.r2h

<rdf2html page=* filename=resource\resource_+identifier+.html
    condition='select * from
    {resource} photo:identifier {identifier},
    [{resource} photo:description {description}],
    [{resource} photo:format {format}],
    [{resource} photo:subject {subject}],
    [{resource} photo:coverage {coverage}],
    [{resource} photo:title {title}],
    [{resource} photo:date {date}],
    [{resource} photo:creator {creator}],
    [{resource} photo:creator {} rdf:type {foaf:PersonalProfileDocument}; foaf:maker {maker} foaf:name {name}]
    using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>,
    foaf = <http://xmlns.com/foaf/0.1/>' />
<html>
<h1> <rdf2html type=text property=title /> </h1>
<rdf2html type=if property=subject op=like value=Panorama
    then='<img src=+resource+ width=1280 height=480>'
    else='<img src=+resource+ width=640 height=480>' />
<table>
<tr>
<td>Creator:</td><td><rdf2html type=generic template='<a href="+creator+"+>+name+</a>' /></td>
</tr><tr>
<td>Description:</td><td><rdf2html type=text property=description /></td>
</tr><tr>
<td>Coverage:</td><td><rdf2html type=text property=coverage /></td>
</tr><tr>
<td>Date:</td><td><rdf2html type=text property=date /></td>
</tr><tr>
<td>Subject:</td><td><rdf2html type=text property=subject /></td>
</tr><tr>
<td>Format:</td><td><rdf2html type=text property=format /></td>
</tr><tr>
<td>
<rdf2html type=generic template='<a href="$urlprefix$/rdf/+identifier+.rdf" > <img
src="$urlprefix$/img/rdfdownload.png" border="0" alt="rdf download"></a>' />
</td>
</tr>
</table>
<table>
<tr><td> Photos that have the same subject and coverage </td></tr>
<rdf2html condition='select * from
    {resource} photo:identifier {identifier},
    {resource} photo:title {title},
    {resource} photo:subject {"+subject+"},
    {resource} photo:coverage {"+coverage+"}
    where not resource like "*+resource+*"
    using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
    orderBy=title >
  <tr><td>
    <rdf2html type=generic template='<a href="$urlprefix$/resource/resource_+identifier+.html" >+title+</a>' />
  </td></tr>
</rdf2html>
</table>
</html>

subject_index.r2h

<rdf2html page=1 filename=index\subject_index.html />
<html>
<p><h3> Subjects index </h3></p>
<table>
<rdf2html condition='select * from
    {resource} photo:subject {subject}
    using namespace photo = <http://www.w3.org/2000/PhotoRDF/dc-1-0#>'
    orderBy=subject distinct=subject>
  <tr><td>
    <rdf2html type=link linkText=subject href='$urlprefix$/index/index_+subject+.html' />
  </td></tr>
  <rdf2html type=generic template='<tr><td>one of the photo on +subject+ is +resource+</td></tr>' />
</rdf2html>
</table>
</html>

SystemConfig.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<!--the configuration file for rdf2html-->
<properties>
  <!--the file extension of rdf file in regular expression-->
  <entry key="rdfExt">.*\.[rR][dD][fF]</entry>
  <!--the file extension of template file in regular expression-->
  <entry key="templateExt">.*\.[rR]2[hH]</entry>
  <!--the default file extension of output file (only for template without page option)-->
  <entry key="defaultOutputExt">.html</entry>
  <!--the operation tag to match.
  a / will be added at the front for closing tag-->
  <entry key="matchingTag">rdf2html</entry>
  <!--the supported query type are rdql, rql and serql currently-->
  <entry key="supportQueryType">rdql,rql,serql</entry>
  <!--the default query type-->
  <entry key="defaultQueryType">serql</entry>
  <!--the directory holding the rdf files-->
  <entry key="rdfSourceDir">F:\My Documents\FYP\soft\work\rdf2html\resources\rdf\</entry>
  <!--the directory holding the template files-->
  <entry key="templateSourceDir">F:\My Documents\FYP\soft\work\rdf2html\resources\r2h\</entry>
  <!--the output html directory-->
  <entry key="outputDir">F:\TEMP\output\</entry>
  <!--the option to overwrite existing file. default is true-->
  <entry key="overwrite">true</entry>
  <!--the following variables are the user defined variables for simple string replacement use.-->
  <!--usage example: -->
  <!--<entry key="urlprefix">file:///f:/temp/output</entry>-->
  <!--$urlprefix$ will be replace to file:///f:/temp/output in the template-->
  <!--the url prefix u can define urself-->
  <entry key="urlprefix">file:///f:/temp/output</entry>
</properties>

Bibliography

Cited Reference in the report:
[9] Tim, Berners-Lee. (2002). The Semantic Web - A Simple Extension to the Current Web. [Online]. W3C. Available: http://www.w3.org/2002/Talks/04-sweb/slide6-1.html [2004, Sep]
[10] Introduction to Semantic Web Technologies: Standard Syntax – RDF. [Online]. HP Labs. Available: http://www.hpl.hp.com/semweb/swtechnology.htm#Standard%20Syntax%20-%20RDF
[11] Extensible Markup Language (XML) – Introduction. [Online]. W3C. Available: http://www.w3.org/XML/ [2004, Sep]
[12] Introduction to XSLT. [Online]. W3 Schools. Available: http://www.w3schools.com/xsl/xsl_intro.asp [2004, Oct]
[13] Streaming Transformations for XML (STX). [Online]. Available: http://stx.sourceforge.net/ [2004, Oct]
[14] Becker, Oliver., Brown, Oliver. and Cimprich, Petr. (2003, 26 February). An Introduction to Streaming Transformations for XML. [Online].
O’Reilly xml.com Available: http://www.xml.com/pub/a/2003/02/26/stx.html [2004, Oct]
[15] Becker, Oliver. Et al. (2004, 1 July). STX transformation language specification. [Online]. http://stx.sourceforge.net/documents/spec-stx20040701.html [2004, Oct]
[16] RDF Schema. [Online]. W3 Schools. Available: http://www.w3schools.com/rdf/rdf_schema.asp [2004, Sep]
[17] BrownSauce RDF Browser. [Online]. BrownSauce RDF Browser. Available: http://brownsauce.sourceforge.net [2004, Oct]
[18] Spectacle. [Online]. Aduna. Available: http://aduna.biz/products/spectacle/ [2005, March]
[19] Alison, Cawsey. (2002, 5 July). Presenting tailored resource descriptions: Will XSLT do the job?. [Online]. Department of Computing and Electrical Engineering, Heriot-Watt University. Available: http://www.cee.hw.ac.uk/~alison/www9/paper.html [2004, Sep]
[20] Arttu, Valo. Eero, Hyvönen., Kim, Viljanen., Markus, Holi. (2004, 6 October). Publishing Semantic Web Content as Semantically Linked HTML Pages. [Online]. Department of Computer Science, University of Helsinki. Available: http://www.cs.helsinki.fi/u/eahyvone/publications/xmlfinland2003/swehg_article_xmlfi2003.pdf [2004, Oct. 6].

Reference books:
Hjelm, Johan. (2001). Creating the semantic Web with RDF: professional developer’s guide. USA: John Wiley & Sons, Inc.
Shelley Powers. (2003). Practical RDF. USA: O’Reilly

Reference web resources:
Joost. [Online]. Joost. Available: http://joost.sourceforge.net/ [2004, Nov]
Dave, Beckett., ed. (2004, 2 February). RDF/XML Syntax Specification (Revised). [Online]. W3C. Available: http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/ [2004, Sep]
Dave, Beckett., ed. (2004, 10 February). RDF Vocabulary Description Language 1.0: RDF Schema. [Online]. W3C. Available: http://www.w3.org/TR/rdf-schema/ [2004, Sep]
Tim, Berners-Lee. (1998, 17 September). What the Semantic Web can represent. [Online]. W3C.
Available: http://www.w3.org/DesignIssues/RDFnot.html [2004, Sep]
Azad, Bolour. (2003, 3 July). Notes on the Eclipse Plug-in Architecture. [Online]. Eclipse.org. Available: http://www.eclipse.org/articles/Article-Plugin-architecture/plugin_architecture.html [2005, Jan]