City University of Hong Kong Department of Computer Science BSCS Final Year Project 2004-2005 Final Report 04CS020 Project Title Real-time Web-based Collaborative Editor (Volume 1 of 1) Student Name: Ng Ming Hong Student No: 50307104 Programme Code: BSCCS Supervisor: Dr Chun, H W Andy 1st Reader: Dr Liu, W Y 2nd Reader: Dr Deng, Xiaotie Page 1 of 73 For Official Use Only Abstract Nowadays, revision control systems are used to support editing of documents by multiple authors. However, these systems do not provide an easy way to know what other authors are doing. To address this problem, real-time collaborative editing systems (CES) have been developed to enable editing of document by a group of authors over the network at the same time. The purpose of this project was to design and implement a real-time web-based collaborative XML editor that would be easier to use and deploy than the previous real-time collaborative editors. The overwrite problems and consistency problems were reduced by using the approaches of sectionlevel locking and centralized architecture. Our current implementation ran successful on Mozilla-based browsers, with near real-time syndication, where the maximum delay was adjusted by the server administrator. The basic objectives were met and the editor was usable. Page 2 of 73 Table of Contents Abstract.......................................................................................................................... 2 1. Introduction............................................................................................................. 5 1.1. 1.2. 1.3. 1.4. The Problem.................................................................................................... 5 The Solution.................................................................................................... 6 Objectives of the Project................................................................................. 6 Scope of the Project.........................................................................................6 2. Review of Related Works....................................................................................... 7 2.1. 2.2. 2.3. 2.4. 2.5. 2.6. 2.7. 2.8. Iris....................................................................................................................7 REDUCE......................................................................................................... 9 WebDAV.......................................................................................................11 iStorm............................................................................................................ 13 SubEthaEdit...................................................................................................14 MediaWiki.....................................................................................................15 Comments about Related Works...................................................................17 Inspirations to This Work..............................................................................18 3. Methodology......................................................................................................... 23 3.1. 3.2. 3.3. 3.4. 3.5. 3.6. 3.7. 3.8. 3.9. Document Model...........................................................................................23 System Architecture...................................................................................... 25 Operations..................................................................................................... 26 Locking Mechanism...................................................................................... 27 Syndication Mechanism................................................................................ 28 Interactions of Components...........................................................................29 Communication Protocol...............................................................................32 Database Schema...........................................................................................37 User Interfaces...............................................................................................38 4. Results...................................................................................................................47 4.1. Initial Loading Time......................................................................................47 4.2. Maximum Number of Clients........................................................................49 5. Discussion............................................................................................................. 50 5.1. 5.2. 5.3. 5.4. 5.5. 5.6. Explanation of Results...................................................................................50 Comparison with Related Works.................................................................. 51 Benefits..........................................................................................................53 Limitations.....................................................................................................55 Applications...................................................................................................59 Suggestions....................................................................................................60 6. Conclusion............................................................................................................ 63 7. References............................................................................................................. 64 Page 3 of 73 8. Acknowledgment.................................................................................................. 66 9. Appendix...............................................................................................................67 9.1. 9.2. 9.3. 9.4. Appendix A: DSS 0.1 DTD Specification.....................................................67 Appendix B: Cache Database SQL Definition.............................................. 70 Appendix C: Web Development Resources..................................................72 Appendix D: Product Website.......................................................................73 Page 4 of 73 1. Introduction The research of collaborative editing system (CES) began one to two decades ago. Because of the rise of Internet in the 90s, more and more computers were interconnected. In a workplace, people often need to work on a same set of documents simultaneously. For example, a group of programmers writing a software and a group of project members writing a report. Due to this strong need, many systems had been proposed to support collaborative editing. Some of them support asynchronous work while some support synchronous work. 1.1. The Problem Currently, we rely on revision control systems like CVS and SVN for syndicating the documents that reside on multiple sites. Revision control system is a kind of asynchronous CES based on shared repository, where documents in the repository can be locked and accessed separately. The system itself does not provide editing functions. It simply provides document files for the standard single-user editors. While these systems are widely used, there exists a major shortcoming: there is no easy way to know what other authors are doing. Information like what files others are editing and what part of the file others are editing is very useful for an author: he / she can avoid editing those files and parts. Also, without communicating with other authors, there will be conflict when committing a file that was already committed by someone else. These conflicts need to be resolved manually by updating the file to be committed before committing. Also, the result of syndicating a conflicted file is also unknown until the file is syndicated. To prevent commit conflicts, authors often need to make agreements before making changes. While it is possible to do so in small groups, it would be much harder if the number of authors is huge or unexpected. This is not impossible. An example is Wiki, a kind of collaborative website that allows visitors to freely create and edit contents of website using just the web browsers. Page 5 of 73 1.2. The Solution One of the possible answers to this problem is synchronous CES, where work flow is synchronous. Ideally, the editor has to be real-time, meaning that any change made in the local replica will be visible immediately, and any change made by one author will be visible to other authors as soon as possible. Authors should also be notified of the status and progress of other authors. For example, there should be visual identification of the parts that are being edited. Communication facility like instant messaging and chatting should also be available, as it can assist collaborative work. Authors can discuss about their work when editing collaboratively. 1.3. Objectives of the Project The objectives of this project are to design and implement a real-time web-based collaborative editor in which authors can edit XML (eXtensible Markup Language) documents concurrently. Though the primary target XML dialect is XHTML, the editor will be designed to be generic enough to adopt other XML dialects like MathML easily. As a minor objective, the project will also investigate the possibility of generalizing the idea to plain text documents. 1.4. Scope of the Project The scope of the project includes the followings: • Study publications and articles about CES. • Study the DOM methods. • Design and implement a protocol for syndicating XML documents. • Design and implement the software in both client-side and server-side. • Design and implement the user interface of the editor. • Design and implement an API for easy integrating with other web applications. Page 6 of 73 2. Review of Related Works Many real-time collaborative editors or editing systems had been developed in the past 10 years. Some were highly academic while some were highly commercial. Many different approaches had been taken to solve the problem of document syndication. They provide good basis for this project. To see how CES evolves over the years, the following related works are arranged in chronological order: 2.1. Iris The Project Iris (Koch, 1995) was one of the earliest attempts to create a collaborative multi-user editor. Iris was not just one program, but a collection of application that forms a collaborative editing environment. There were no central components. The documents were accessed and modified by different application, including standard single-user editors. It was first implemented with C, C++ and Tcl/Tk (Iris-1), but then re-made with Java (Iris-2). The software was divided into 2 layers: • Access layer It was responsible for managing data objects of the document. Instead of preventing conflict by restricting access, it used an optimistic replication control that will not avoid conflicting updates. Conflicting updates were identified by temporarily exchanging the update histories through multicasting. • User interface layer It was a family of specialized editors which handled different media types. While the project was stopped in 1998, some observations that are still very valid. These would be used to justify the robustness of this project. Page 7 of 73 A group editor has to be usable for the individual user. An author should be able to: • Read and write any displayed text • With no technical access restriction • With the ability to create private areas, and adjust granularity and time for making updates public • Get immediate response to their actions • Customize the UI according to preference According to Koch, notion of awareness is very important. The application should provide awareness of the state and actions of other users automatically and continuously. The awareness of actions includes the indication of what the user is accessing and what kind of access is being made. In order to support this notion of awareness, the following items are needed: • History of changes • Online information about the authors and their changes • Communication between authors (synchronous, asynchronous, 1:1, 1:n) • Interface for integrating with external applications for information Page 8 of 73 2.2. REDUCE REDUCE (Sun et al, 1997) stands for REal-time Distributed Unconstrained Collaborative Environment. As suggested by its name, there were 3 important goals: • Real-time • Distributed • Unconstrained These goals were essentially the same as those of the Project Iris. Also, the program was implemented with Java. However, the paper described the ways to keep data across servers consistent. When editing operations are sent to other servers over the network in a multicast way, 3 consistency problems arise: • Divergence Time Site A Site B O1 O2 Site C O4 O3 Figure 2.2.a – A scenario of a real-time collaborative editing session Due to the indeterminable latency in network transmission, operations may arrive and be executed in different order, resulting in different results. Consider Figure 2.2.1, the order of the four operations are different for the 3 sites: for site A, O1, O2, O4, O3; for site B, O2, O1, O3, O4,; and for site C, O2, O4, O3, O1. Unless the operations are commutative, which are obviously not, the replicas will not be identical. Page 9 of 73 • Causality violation Similarly, the operations may be out of their natural cause-effect order. For some sites, answer may appear before the question. This will certainly lead to confusion for authors. • Intention violation As any part can be freely edited by any author, this creates another problem when two authors are editing the same part of the document at the same time. Suppose the two authors are editing the same string “ABCD”. One author inserts an “X” between B and C. Another author deletes the letter “B”. Naturally, the intended result should be “AXCD”. The two operations O1, O2 are insert( “X”, 1 ) and delete( 1 ) respectively, where the number is the index of the character to be inserted and deleted. They are then propagated to the other sites almost at the same time. If O1 is executed before O2, everything fine. However, if O2 is executed before O1, the resulting string will be “ACXD”, which is clearly not what the authors intended. In REDUCE, the site servers are connected together directly without any immediate server. There exist a session manager, which is only used when joining and leaving session. When editing, the character-based operations are kept in a buffer. Before multicasting to the other servers, the accumulated operations may be converted into a single string-based operation for efficiency reasons. Operations are also serialized and reordered in order to preserve causality and intention. Page 10 of 73 2.3. WebDAV WebDAV (RFC 2518, 1999) is a set of extensions to the HTTP protocol that supports web-based distributed authoring and versioning (hence the name WebDAV). In WebDAV, there are two main objects: resource and collection, which basically means file and directory in a file system. Every resource also has a set of properties attached. Operations in WebDAV are defined as some new HTTP methods. These provide facilities for manipulations of property, resource and collection, and the prevention of overwrite. Methods Descriptions LOCK To lock a resource UNLOCK To unlock a resource GET To get the content of a resource PUT To set the content of a resource PROPFIND To get the properties of a resource PROPPATCH To create or remove the properties of a resource COPY To copy a resource, property or collection MOVE To move a resource, property or collection DELETE To delete a resource, property or collection MKCOL To create a new collection Table 2.3.a – WebDAV methods In the corresponding HTTP requests and responses, instead of URL encoded string, XML is used for holding the data to be transmitted. As shown in table 2.3.1, WebDAV uses a locking mechanism (LOCK and UNLOCK) to prevent a document from being overwritten by another copy of the document. Locking mechanism has been proved to be practical, as seems in relational databases and revision control systems. In a typical application use of WebDAV, it involves the locking, reading, writing and unlocking of resources as illustrated in Figure 2.3.a. Page 11 of 73 Client File > Open Method LOCK Server Lock resource PROPFIND Send properties to client GET Send resource contents to client PUT Save new contents Edit File > Save Exit UNLOCK Unlock resource Figure 2.3.b – Application use of WebDAV Write lock is the only type of lock defined in WebDAV. The scope of the lock is an entire resource, as the protocol was designed to support every possible media type. Without the knowledge of the content type being locked, it is impossible to create a sub-resource lock. Therefore, it is clearly that the protocol was not designed for realtime and sub-resource collaboration. However, the protocol itself is a good reference for designing the protocol used in this project. Page 12 of 73 2.4. iStorm In contrast to previous researches and products, iStorm (MathGameHouse, 2002), a collaborative tool for Mac OS X, took a very simple approach: one and only one author can edit at a time. When an author want to start editing, he / she need to make a request to the current document holder, if any. If no, the author will become the document holder who has complete control on editing the document. Apart from user switching, there is no significant difference from a single-user editor. Figure 2.4.a – A screenshot of iStorm's editing session (source: iStorm product website) Figure 2.4.a shows an editing session in iStorm. A color of the circle button (at the bottom) indicates the status of the document. Blue means you are editing; red means the other is editing; and green means no one is editing. Undoubtedly, this method is very undesirable. In practical, the editor can only support a very small group of authors, as there would be too much time wasted in waiting for the approval of requests. Nevertheless, this method has an advantage that the development time of the software should be relatively short. Page 13 of 73 2.5. SubEthaEdit SubEthaEdit (TheCodingMonkeys, 2003) is a text editor for Mac OS X. The editor, formerly known as “Hydra”, was probably one of the most exciting CESs ever exist. Like REDUCE, it is non-locking and non-blocking: anyone can type anywhere anytime. In addition, it supports true, real-time, distributed collaboration. The notion of awareness is clearly indicated in several areas: • Each author can see the cursors and edits of other authors. • A list of authors in different colors • The edits are colored according to the authors. Figure 2.5.a – A screenshot of SubEthaEdit's editing session (source: SubEthaEdit product website) Unfortunately, since the product is commercial, there is no documentation about the ways or algorithms to implement the editor. Nevertheless, SubEthaEdit provided a good reference for designing the user interface of real-time collaborative editors. Page 14 of 73 2.6. MediaWiki MediaWiki (Wikimedia Foundation, 2002) is a widely used Wiki engine created for the Wikipedia website (a Web-based free content encyclopedia). Like other wikis, contents are collaboratively edited using just the web browsers. As authors can edit any part of the document at any time, edit conflict will occur when an user tries to issue a save that would overwrite the someone else's work, as illustrated: 1. User A opens a document and starts to edit. 2. User B open the same document as similar time and starts to edit. 3. User A makes some changes. User B also makes some other changes. 4. User B saves. Then user A saves. 5. Changes made by user B is then overwritten by user A. These kind of edit conflicts, if any, are detected and aborted when the user saves the document. One of the distinguishing features of MediaWiki is the use of headings to separate a document horizontally and hierarchically, where the location of the heading defines the border of the section, and level of the heading defines the hierarchy (Figure 2.6.a). This reduce the chance of having edit conflicts as multiple authors can edit on different sections. As shown in the screenshot below, an “edit” link is associated with each headings. A section is edited by following its “edit” link. Its subsections, if any, will also be edited. To facilitate communication between authors, a “talk page” (one of the special wiki pages) is associated with each article and user page that is used to discuss various issues about that article and user (Figure 2.6.b). This kind of communication, however, is static. This means that an author will not know any new changes and information unless he/she explicitly visit the corresponding webpage. Page 15 of 73 Figure 2.6.a – A screenshot of a Wikipedia article Figure 2.6.b – A screenshot of a Wikipedia talk page Page 16 of 73 2.7. Comments about Related Works As we can see, Iris and REDUCE used an optimistic approach. Instead of avoiding conflicting updates, they tried to merge them and give the intended effect of the conflicted operations. However, the effect may not be desired for average users unless there is a clear notion of awareness (like SubEthaEdit). An author may be confused when parts of the document were suddenly (and magically) being changed. This is especially undesirable when it comes to 2 edits that are very close. Consider a sentence that it grammatically incorrect: “I found a solutions”. One author may like to correct it by removing the “s”; while another author may like to change “a” to “some”. Hence, the resulting string will become “I found some solution”, which is still grammatical incorrect. According to Sun, “to the best of our knowledge, none of the existing cooperative editing systems has attempted to maintain semantical consistency automatically”. And more importantly, character-level syndication may not be suitable for XML documents. Consider the following case of editing XHTML: Suppose the two authors are editing the same string “ABCD”. One author want to help “BC” bold. Another author want to make “CD” italic. The intended result will then be “A<b>B<i>C</b>D</i>”, which is invalid as there are overlapping tags. While many CESs have been developed, they did not seem to be widely used in organizations. This may due to the following reasons: • Adherence to a single platform iStorm and SubEthaEdit were for Mac OS X only. Interoperability is important for organization which often has more than one type of computers. • The use of Java Applet Both Iris and REDUCE were implemented in Java. While Java Applet can be considered as a web-based solution, its usage is not welcomed. It is likely due to the lengthy initial loading time of JVM (Java Virtual Machine) and Applet. • Little exposure to the general public Page 17 of 73 2.8. Inspirations to This Work Except the third one, other shortcomings would be solved or reduced by using alternative approaches: • Multi-platform Instead of adhering to a single platform, the editor should be available to as many platforms as possible. A web-based editor will easily fulfill this requirement as the only software required is the web browser. • Improve the client-side experience Instead of using external plugins like JVM and ActiveX, JavaScript should be used as it is natively supported by the browser. Although the program may not run as fast as Java, JavaScript has a big advantage of short initial loading time. Due to the constraints on client-side technologies and time, it would be difficult to build a web-based editor that meet all of the requirements of an ideal CES. Instead of building a complete editing system, this project will focus on the editor. Functions like document management, user management, category/directory management and access control would be provided by the content management system or the web application in which the editor is used. As distributed architecture is not possible in HTTP, a centralized approach would be used instead. Although it would be less efficient, the centralized approach enjoy the following benefits: • Reduced inconsistency problems As the operations can be serialized in the central server, the problem found in REDUCE would be reduced. • Work well behind firewalls Unlike socket that requires some special ports, HTTP and HTTPS use the standard port 80 and 443 respectively, which are not blocked in most firewalls. Page 18 of 73 Currently, most web applications use (X)HTML to mimic the look and feel of local applications. However, this will greatly hinder accessibility, readability, and semantic meaning of the document, as the original purpose of HTML is for use as a hypertext documents, not as an user interface language. In addition, a large proportional of time would be wasted in writing code for emulating some common UI widgets like tabs, trees and lists. Therefore, XUL (XML User-interface Language) would be used instead. It is an application of XML used exclusively on Mozilla browser and its derivatives for defining user interface. The Mozilla browsers are mostly cross-platform in nature, with the exception of Camino that is Mac OS X only. The XUL specification covers most of the commonly used UI widgets. This reduces the software development effort in a way analogous to the savings offered by 4GL tools like Visual Basic. Quick prototyping and rapid application development is also possible with XUL. XUL reuses some of the existing web technologies like CSS (Cascading Style Sheets), DTD (Document Type Definition), RDF (Resource Description Framework) and JavaScript. An XUL-based application is usually separated into these layers: • Structure layer XUL. The flexible box model makes it easy to layout and resize the widgets. • Presentation layer CSS. The syntax is exactly the same, with several Mozilla-specific properties. • Control layer JavaScript. It is a powerful prototype-based scripting language. • Localization (l10n) layer DTD and property files. DTD defines entities for use in XUL documents, while property file (with the extensions of .properties) is a plain text file containing strings for use in JavaScript (e.g. the strings that appear in alert boxes). Page 19 of 73 • Data layer RDF. It is often used as a data source for XUL applications. An interesting thing about XUL applications is that many can simply be opened and run from the browser directly, just like a website. Hence the integration with other web applications could be seamless. Besides XUL, Mozilla browser has outstanding support of the W3C DOM (Document Object Model), which is used to manipulation XML document. In this model, a document is viewed as a tree structure, where each element, attribute and text content is viewed as element node, attribute node and text node respectively. An array of DOM methods are available for reading and writing the DOM tree. DOM is accessible by JavaScript due to the DOM binding in the JavaScript engine. Due to restricted privileges, most privileged objects are not accessible by remote script (remote script means any script that is not found in the browser chrome), unless the script is digitally signed, which is very expensive. One of the privileged objects is the rich HTML editor (also known as “midas”) implemented natively by Mozilla. Interestingly, it is accessible in remote HTML document, but not remote XUL document. In addition, it is also impossible to load remote XUL, DTD, property and RDF documents. To address these problems, the following workarounds were used: • Server-side include (SSI) for combining XUL documents together. • Inline DTD instead of remote DTD. • A JavaScript wrapper class for loading and parsing property file. • eDOM (editor DOM, a JavaScript module for XHTML editing) of mozile (stands for mozilla inline editor) for editing XHTML content. RDF was not used in this project. Page 20 of 73 The following is an example of “Hello World” remote XUL document, showing the various files and the workarounds: <?php // Send correct MIME-type header( "Content-Type: application/vnd.mozilla.xul+xml" ); ?><?xml version="1.0"?> <?xml-stylesheet type="text/css" href="chrome://global/skin/"?> <?xml-stylesheet type="text/css" href="index.css"?> <!-Mozilla cannot load external DTD: https://bugzilla.mozilla.org/show_bug.cgi?id=22942 --> <!DOCTYPE window [ <?php require_once( "index.dtd" ) ?> ]> <window xmlns="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul" id="index" title="Hello World"> <script type="application/x-javascript" src="index.js"/> <hbox flex="1" align="center" pack="center"> <label control="say" value="&say.value;"/> <button id="say" label="&say.label;" oncommand="hello()"/> </hbox> </window> Text 2.8.a – Code listing of “index.php” (PHP/XUL layer) :root { background-color: yellow; } button { font-weight: bold; } Text 2.8.b – Code listing of “index.css” (CSS layer) <!ENTITY say.value "Say:"> <!ENTITY say.label "Hello World!"> Text 2.8.c – Code listing of “index.dtd” (DTD layer) function hello() { window.alert( "Welcome to XUL!" ); } Text 2.8.d – Code listing of “index.js” (JavaScript layer) Page 21 of 73 The following is a screenshot of the document (when the button is pressed): Figure 2.8.e – Execution of the “Hello World” remote XUL program As shown in the screenshot, the layers incorporation together nicely. The XUL defined a label and its button, nested inside a horizontal box. The XUL was styled with the CSS document, giving it a yellow background. The custom entities were resolved by the DTD, hence were substituted with the corresponding string. Finally, the action of the button is defined inside the JavaScript file, just like ordinary web applications. Page 22 of 73 3. Methodology This section will give the detail about how this editing system was actually be built. As the editing system was a data-centric application, the focus in this section is the definition of document structure and the XML-based syndication protocol. A modulebased and object-based approach was used in the design of the system. To illustrate the methodology used in this project, XHTML is used as an example of application of XML. However, the protocol was not limited to just XHTML. But note that the editor was limited to other ways due to browser limitation. This editing system was named as “LivePad”, licensed under MPL/GPL/LGPL trilicense. 3.1. Document Model The document model was inspired by MediaWiki: A document was defined as areas called regions and sections. A region was an editable part of the document. It contained one or more sections, each containing exactly one document element. In other words, the regions were horizontally partitioned into sections. Each section had a globally unique identifier (GUID) which was used for random accessing and manipulating the sections. The GUIDs were generated by the central server, so they must be globally unique. Each document must had a mandatory “content” region, consisting of the document content, and an optional “meta” region, consisting of the document meta data. The areas that were not covered by either of the 2 regions were hence not editable. Page 23 of 73 <?xml version="1.0" encoding="utf-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> Meta Section Section <head> <title>Hello World!</title> <meta name="keywords" content="a, b, c" /> </head> Content Section Section <body> <p>Lorem ipsum dolor sit amet.</p> <ul> <li>Quisque nec est ultrices justo tincidunt.</li> <li>Vivamus quis erat. <em>Quisque</em> aliquet.</li> </ul> </body> </html> Example 3.1.a – An XHTML document with the corresponding regions and sections In fact, the model was not limited to just XML document. It could be generalized to plain text document, where each line was a section. function foo( bar ) { var a = new Array( 1000 ); for ( var i in a ) { a[i] = 0; } } Section Content Section Section Section Section Section Section Section Example 3.1.b – A plain text document with the corresponding regions and sections Page 24 of 73 3.2. System Architecture Content Provider API call Central server Cache storage Server-side Client-side HTTP call Client 1 Client 2 Client 3 Figure 3.2.a – System architecture The system was divided into 3 parts, 1) the content provider, who provided the document to be edited; 2) the clients, who viewed and edited the documents; and 3) the central server, who coordinated the 2 parties. The central server was tightly coupled with the content provider as it was an integral part of the whole content management system. The communication way was by API call. On the other side, the clients communicated with the central server via HTTP calls. The central server contained a cache storage used for storing immediate data like sessions, log and cache. When an operation was done in the client, it was also propagated to the central server where operations were serialized. From time to time, a client issued syndication operations, which in turn retrieved the operations that need to be redo for that particular client. As a result, all the clients were being syndicated. The details about various operations, syndication mechanism and locking mechanism can be found in the subsequent sections. Page 25 of 73 3.3. Operations An operation was the smallest unit of command in this system. Most of the operations were editing operations and operations related to editing: Type Session Editing Chatting Name Description join To join a session quit To quit a session lock To lock a section for editing unlock To unlock a section for editing append To append a new section as the last child of the region insertBefore To insert a new section before a reference section replace To update a section with a new section remove To remove a section save To save the document back to the content provider post To post a message Resource getPageinfo To get the HTTP header information of an URI getUserinfo To get the user information (e.g. name) getGuidset To get a set of GUIDs getOperationset To get a set of recent operations Table 3.3.a – A table of operations Each operation included, not just the data related to the operation, but also the user ID of the sender and receiver. A special user ID “*” was used to denote the central server. For most operations, the sender was the user while the receiver is the central server. An exception was “post” (for chatting), where the receiver was set as the central server for public message; while it was set as an user ID for private message. Further details are available in section 3.7, “Communication Protocol”. Page 26 of 73 3.4. Locking Mechanism The locking mechanism was straight forward. Before updating and deleting a section, a lock needed to be acquired. There was no need to acquire a lock when creating a section (as the section does no exist yet). Similarly, when finished editing a section, it needed to be unlocked, unless the section no longer exist. A lock was applied to an entire workspace. Like WebDAV, only write lock was available. Locks were acquired explicitly (e.g. pressing button, double-clicking, etc). An implicit way to acquire lock (e.g. when the caret (“text cursor”) was over the workspace) was not suitable as an author may accidentally acquired a lock without noticing. The problem is that other authors can not edit that workspace unless the author realizes the mistake. Undoubtedly, for most of the operations, this locking mechanism worked. But there was an exceptional case: appending a new section (prepending was not used in this system). Consider the following example: At T = 1, client A appended a section SA to the document and at the similar time, client B appended a section SB. At T=2, both clients syndicated, hence the replica in client A became SASB, while that of client B became SBSA. Hence there was inconsistency in the order of the sections. Therefore, the append operation was not used in the client-side if ordering of the elements is important (i.e. for the “content” region). Otherwise, it could be used (i.e. for the “meta” region). Page 27 of 73 3.5. Syndication Mechanism C1 … O2 O3 O1 null O5 O6 O4 O7 … S O2 C2 O3 O5 O1 O6 O1-5 null O7 … Time Key: syndication operation other operation online period (vertical bars mean timeouts) Figure 3.5.a – Syndication mechanism Because of the locking mechanism, the order of arrival of operations was not important. In other words, editing operations were independent. Hence, the clients could issue operation at any time without including inconsistency. Because of the limitation of HTTP, server cannot push information to the clients. Hence, the clients had to pull information from server regularly. The time interval was set as 10 seconds in this implementation. This value was adjustable by modifying the source codes. At each syndication, zero or more operations that needed to be redo on that client were retrieved. As a client did not keep any information about the previous editing sessions, whenever it joined an existing session, a list of the previous operations (if any) was retrieved. This kept the client up-to-date with the current status. Hence, the cache related to a document needed to be kept in the central server until no one was editing that document. Page 28 of 73 3.6. Interactions of Components modify Localization return cache DTD files Session Cache database include include include request include modify response Property files XUL files JavaScript files Coordinator include include refer CSS files save document Image files Adapter Content provider load document Presentation Client-side Server-side Figure 3.6.a – Interactions of components In contrast to the traditional JavaScript-enhanced, form-based web application model, JavaScript was used not just for manipulating the user interface, but also as a bridge between the client and server-sides. This model first appeared around year 2000, as pointed out by Koch (2005), a freelance web developer who experts in JavaScript, CSS and other web technologies. However, the model was not widely known until recently where it was used extensively on several Google services like Gmail and Google Maps. This model was sometimes known as Ajax (a shorthand for “Asynchronous JavaScript + XML”), as coined by Garret (2005). The Ajax model is more responsive as typically less data is involved in each HTTP round trip. The more important point is that it also allows content updating without page reloading, making the web application looks and feels more like a local application. The key difference from the traditional model is that the HTTP requests are issued by not just the web browser, but also by the JavaScript code attached with the rendered document. As JavaScript code can be executed at anytime once the document is loaded, it can retrieve data from server on demand, say when an user interacts with the application interface. Page 29 of 73 In the design of the editor in the client-side, it followed the classical Mozilla application structure: where structure, presentation, control and localization were separated into XUL, CSS/images, JavaScript and DTD/properties respectively. On the other side, the server-side also followed the classical web application structure, where functions were divided into classes and relational database was used to store data. Three notable classes are shown in figure 3.6.1: 1) Adapter, for loading/saving document from/to the content provider; 2) Session, for keeping the session/cache data; and 3) Coordinator, which encapsulates various classes and interacts with the clientside. PHP was used for the server-side scripts and SQLite (a zero-configuration ACIDcompliant SQL database engine) was used for storing session/cache data. PHP was chosen over other server-side scripting languages due to its popularity in open source community and simplify to use. Similarly, SQLite was chosen over other relational databases mainly due to its simplify (easy to setup and use) and adherence to the SQL standard. It was also the database API bundled with PHP since version 5, the latest version at the time of writing. Page 30 of 73 :User :LivePad join :Coordinator request :Session :Adapter add user load document document create cache add cache response Time-driven syndicate request response User-driven (optional) lock/unlock request response append / insert before / update / delete request get operations operations check availability availability lock/unlock check availability availability get cache cache modify cache save cache response quit request response remove user get cache cache save document remove cache Figure 3.6.b – Sequence diagram showing the major server-side message passing. Figure 3.6.b shows a more detail sequence diagram with an emphasis on server-side. Many minor parts are not shown for illustration purpose. The “time-driven” frame refers to the actions that occur regularly when timeout; while the “user-driven” one refers to the editing actions that can occur at anytime between “join” and “quit”. Page 31 of 73 3.7. Communication Protocol The client and server communicated with the use of an application of XML named as Document Section Syndication (DSS), where the full DTD schema can be found in the appendix. This XML-based protocol allows a third-party to implement an editor client that is compatible with the server more easily. The protocol supported all the operations listed in table 3.3.1. The followings are examples of XML contents being sent and received between a client and a server. Assuming: • User ID: “user-c” • Document URI: “hello.xhtml” For most operations (except syndication and miscellaneous ones), if the operation can be done, the response will be: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <response> <status success="true"/> </response> </dss> Otherwise, the response will be: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <response> <status success="false"/> </response> </dss> 3.7.1. Joining a session <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <request> <join uri="hello.xhtml" sender="user-c"/> </request> </dss> Page 32 of 73 3.7.2. Posting a message For public message: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <request> <post uri="hello.xhtml" sender="user-c" receiver="*">Anyone here?</post> </request> </dss> For private message: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <request> <post uri="hello.xhtml" sender="user-c" receiver="user-z">Still here?</post> </request> </dss> 3.7.3. Locking a section <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <request> <lock uri="hello.xhtml" sender="user-c" region="content" section="ws-d2a15bd761a7fec5c45943ce06c8ed9f"/> </request> </dss> Page 33 of 73 3.7.4. Modifying the section Appending a new section: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <request> <append uri="hello.xhtml" sender="user-c" region="meta" section="ws-7e8bdf4baad3a0083fafbbca36319899">&lt;meta/&gt;</append> </request> </dss> Inserting a new section before another section: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <request> <insertBefore uri="hello.xhtml" sender="user-c" region="content" section="ws-08d30e97bcbc1fdd7889ceed8774637d" refSection="ws-c17c732f48886d56d37fa6d1000a36a0">&lt;p&gt;Paragraph 1&lt;/p&gt;</insertBefore> </request> </dss> Replacing a section: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <request> <replace uri="hello.xhtml" sender="user-c" region="content" section="ws-c17c732f48886d56d37fa6d1000a36a0">&lt;p&gt;Paragraph 2&lt;/p&gt;</replace> </request> </dss> Removing a section: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <request> <remove uri="hello.xhtml" sender="user-c" region="content" section="ws-c17c732f48886d56d37fa6d1000a36a0"/> </request> </dss> 3.7.5. Unlocking the section <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <request> <unlock uri="hello.xhtml" sender="user-c" region="content" section="ws-08d30e97bcbc1fdd7889ceed8774637d"/> </request> </dss> Page 34 of 73 3.7.6. Syndicating <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <request> <getOperationset uri="hello.xhtml" sender="user-c"/> </request> </dss> which will return this when there are no new operations: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <response> <resource> <operationset/> </resource> </response> </dss> or something like this when there are new operations: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <response> <resource> <operationset> <replace uri="hello.xhtml" sender="user-c" region="content" section="ws-c17c732f48886d56d37fa6d1000a36a0">&lt;p&gt;Paragraph 2&lt;/p&gt;</replace> <insertBefore uri="hello.xhtml" sender="user-c" region="content" section="ws-08d30e97bcbc1fdd7889ceed8774637d" refSection="ws-c17c732f48886d56d37fa6d1000a36a0">&lt;p&gt;Paragraph 1&lt;/p&gt;</insertBefore> <unlock uri="hello.xhtml" sender="user-c" region="content" section="ws-c17c732f48886d56d37fa6d1000a36a0"/> <post uri="hello.xhtml" sender="user-z" receiver="user-c">Yes.</post> </operationset> </resource> </response> </dss> 3.7.7. Miscellaneous To get 3 GUIDs: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <request> <getGuidset length="3"/> </request> </dss> which will return: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <response> <resource> <guidset> <guid value="ws-c17c732f48886d56d37fa6d1000a36a0"/> <guid value="ws-08d30e97bcbc1fdd7889ceed8774637d"/> <guid value="ws-08d30e97bcbc1fdd7889ceed8774637d"/> </guidset> </resource> </response> </dss> Page 35 of 73 To get user information: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <request> <getUserinfo user="user-z"/> </request> </dss> which will return: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <response> <resource> <userinfo name="Zulu" color="#ff3"/> </resource> </response> </dss> To get URI information: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <request> <getPageinfo uri="http://www.example.com"/> </request> </dss> which will return: <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <response> <resource> <pageinfo type="text/html" length="438" status="200" statusText="OK"/> </resource> </response> </dss> 3.7.8. Quiting the session <dss version="0.1" xmlns="http://livepad.klogms.org/livepad/spec/dss0.1.dtd"> <request> <quit uri="hello.xhtml" sender="user-c"/> </request> </dss> Page 36 of 73 3.8. Database Schema documents PK id content accesses PK,FK1 document_id PK,FK2 user_id sections PK PK,FK1 PK id document_id region_id FK2 user_id last_accessed operations PK PK,FK1 PK,FK2 PK,FK3 id document_id sender_id receiver_id time_created content users PK id name color Figure 3.8.a – The database schema in form of ER diagram The “documents” and “users” tables were the parent tables of the other tables. As suggested by their names: Table Each row represents documents The content of a document, identified with its ID. users The information of an user (name and color), identified with his/her ID. sections The user ID of the user who locked a section (can be null), identified with the section ID, region ID and document ID. operations The information of the operation (content and creation time), identified with the operation ID, document ID, sender ID and receiver ID. accesses The last access time of an user, identified with document ID and user ID. Table 3.8.b – Summaries of the tables Page 37 of 73 3.9. User Interfaces The user interface was similar to that found in most word processors like Microsoft Office and OpenOffice.org's Write. The main interface contained most of the frequently used functions, while the additional functions were accessible via the menus and popup windows. 3.9.1. Main interface Figure 3.9.1.a – The main interface The main interface displayed the document content, with menubar and toolbars on the top of the interface. In addition, a sidebar was used for instant messaging. Page 38 of 73 Figure 3.9.1.b – The content being edited, with sections being locked The sections were laid out vertically in the content area, in the order of appearance in the document. When a section was double-clicked, it was locked by the corresponding user. This section then became editable. Sections locked by other users were grayed out with border colored with their user colors respectively. The section locked by the user himself/herself was indicated with a black border. When the locked section was blurred, it was unlocked and became available to other users. Before unlocking, if the section contained more than one elements, it was then split into multiple sections, so as to keep each section containing exactly one element. If the section contained no element at all, it was removed. Hence, the user could only lock (and edit) one section at a time. Page 39 of 73 Figure 3.9.1.c – The instant messaging interface In the sidebar, the user list showed all the users that were editing the same document. And like most instant messengers, the chat room consisted of one or more tab panels for displaying the message log and for entering the message. Like IRC (Internet Relay Chat), there was always a public chat room, while private chat rooms could be opened and closed by individual users. Page 40 of 73 3.9.2. Properties window Figure 3.9.2.a – The properties window The properties window provided a categorized list of meta data of the document. Except the document title, other properties including <meta>, <link>, <style> and <script> were listed in form of data grid in which users are familiar with. Element meta Description A name/value pair used typically for meta information like author name, generator name, keyword, description, etc. link A link to document related to current document, for example, CSS, index page, previous/next page, etc. style A link to stylesheet (typically CSS) or container of inline styles. script A link to client-side script (typicaly JavaScript) or container of inline script. Table 3.9.2.b – Description of each document property Page 41 of 73 3.9.3. Insert windows Most of the complex objects were inserted via popup windows. These included symbol, hyperlink, table, and multimedia objects. Previewing of the result, if needed, was also available. Figure 3.9.3.a – The insert symbol window Figure 3.9.3.b – The insert link window Page 42 of 73 Figure 3.9.3.c – The insert table window Figure 3.9.3.d – The insert object window Page 43 of 73 3.9.4. Format windows In addition to modification of structural information, there were also popup windows for manipulating the presentational information (i.e. CSS). Unlike word processors, the style formatting was previewed lively using placeholder text like Lorem Ipsum. Figure 3.9.4.a – The format character window Figure 3.9.4.b – The format paragraph window Page 44 of 73 3.9.5. Localization With the use of DTD and property files, localization was done in a structured and organized way. Figure 3.9.5.a – The format character window under British English (en-GB) Figure 3.9.5.b – The same window under Traditional Chinese (zh-TW) Page 45 of 73 3.9.6. Theme The default theme (a set of CSS and image files) blended well with various default browser themes, like Winstripe in Mozilla Firefox and Modern/Classic in Mozilla. It was also possible to have multiple themes for the same editor. But in this implementation, only one theme was available. Figure 3.9.6.a – The insert symbol window under Modern (an official Mozilla theme) Figure 3.9.6.b – The insert symbol window under rein (a third-party Mozilla Firefox theme) Page 46 of 73 4. Results A number of tests were carried out to test the performance of the system. The configuration of the server and the client machines: Machine CPU RAM OS Server Pentium® 4 3.00 GHz 1024MB Windows XP SP2 Client Pentium® 3 800 MHz 256MB Windows 2000 SP4 Table 4.a – The configuration of the machines 4.1. Initial Loading Time This test tried to find out the relationship between the initial loading time (for loading document) and the number (and complexity) of the sections. The document contained a number of identical sections. Three test cases were used: 1. Small paragraph (simple structure) 2. Large paragraph (simple structure) 3. Large table (nested structure) Initial loading time with respect to number of sections 27.5 25 22.5 Time (s) 20 17.5 15 Case A (paragraph; 0.45KB/section) Case B (paragraph; 10.5KB/section) Case C (table; 9.5KB/section) 12.5 10 7.5 5 2.5 0 0 10 20 30 40 50 60 70 80 90 100 Number of sections Figure 4.1.a – The initial loading time with respect to the number of sections Page 47 of 73 Test case Section data Section size Case A <p>The big brown fox jumps over a lazy dog.</p> 0.45KB Case B <p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Mauris sem. Donec tellus. Nam odio sem, eleifend ac, tristique nec, vehicula vitae, enim. Sed sed arcu. Vestibulum sed nulla. Aenean a eros. Fusce mauris. Vivamus pretium sem non erat. Donec et neque. Suspendisse consequat molestie metus. Fusce eu urna vitae libero malesuada consectetuer. Donec magna. In hac habitasse platea dictumst. Quisque nibh. Nullam velit est, luctus nec, venenatis et, dignissim eget, urna. Praesent sit amet lectus eu lacus cursus eleifend. Nulla luctus velit non nulla. Integer in lacus. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Donec odio. Donec quis lectus vitae sapien fringilla varius. Nullam in ante eget nisl posuere feugiat. Duis felis nisl, tincidunt vel, congue ut, tincidunt vel, felis. Pellentesque sit amet ipsum ac felis egestas molestie. Vestibulum dui dolor, molestie et, interdum a, tempor vitae, quam. Nulla pellentesque sapien vel enim. Proin ante mi, ultricies non, vehicula sit amet, malesuada in, ante. Morbi at ipsum.</p> 10.5KB Case C <table><caption>Lorem ipsum dolor sit amet</caption><tr><th>consectetuer adipiscing elit</th><th>Mauris sem. Donec tellus. Nam odio sem</th><th>eleifend ac, tristique nec, vehicula vitae, enim</th><th>Sed sed arcu. Vestibulum sed nulla</th><th>Aenean a eros. Fusce mauris</th></tr><tr><td>Vivamus pretium sem non erat</td><td>Donec et neque</td><td>Suspendisse consequat molestie metus</td><td>Fusce eu urna vitae libero malesuada consectetuer</td><td>Donec magna. In hac habitasse platea dictumst</td></tr><tr><td>Quisque nibh. Nullam velit est, luctus nec</td><td>venenatis et, dignissim eget, urna</td><td>Praesent sit amet lectus eu lacus cursus eleifend</td><td>Nulla luctus velit non nulla</td><td>nteger in lacus</td></tr><tr><td>Vestibulum ante ipsum primis</td><td>in faucibus orci luctus et ultrices</td><td>posuere cubilia Curae; Donec odio</td><td>Donec quis lectus vitae sapien fringilla varius</td><td>Nullam in ante eget nisl posuere feugiat</td></tr></table> 9.5KB Table 4.1.b – The test cases used in comparison of initial loading time The experimental results showed the initial loading time is directly related to the number of sections, but not the size and complexity of sections. Comparing case A and B, the section size in case A is was times of that in case B. Yet, the loading time was just 1.25 times of that in case A. Comparing case B and C, the section sizes were similar, but the content structure was more complex in case C as it was a table (a nested structure with multiple levels of elements). Yet, the loading time was very similar. Page 48 of 73 4.2. Maximum Number of Clients This test tried to try out the maximum number of clients allowed without overloading the server or causing malfunctioning. Due to the lack of machines and limitations in the computing power of the client machine, multiple clients were created in not just the client, but also the server. To prevent overloading the browser program, only a limited number of clients were created per browser. Multiple browsers were used to create a bigger number. After opening 10 clients in the client machine and 40 clients in the server machine, the system still functioned properly (while being slower). The test was then stopped. Machine Client Server Browser Number of clients Mozilla Firefox 1.0.2 5 K-Meleon 0.9 5 Mozilla Firefox 1.0.2 10 Mozilla 1.8 Beta 2 10 K-Meleon 0.9 10 Netscape Browser 8.0 Beta 10 Table 4.2.a – The distribution of clients Page 49 of 73 5. Discussion 5.1. Explanation of Results 5.1.1. Initial loading time The experimental results showed that the loading time is largely related to the number of sections. This was largely because when the document was loaded into a DOM tree, it was scanned for sections, which were used in processing like creating rows in database tables and creating operations in DSS. Clearly, the more the sections, the more amount of processing was required in the server-side. The loading time was not much affected by the section size and complexity. It was due to the fact that a node in a DOM tree is just a reference (or pointer). The processing time for processing a node does not related to the its size and its number of subtrees (if any). The small increase in loading time was mostly contributed by the time taken in writing/reading/transferring the actual data (XML, when serialized, is just a string of characters). 5.1.2. Maximum number of clients Although the maximum number of clients was not found, the possible number of clients should be sufficiently large. Other than the strong computing power of the testing server, one possible explanation is that for each request only a small amount of processing needed to be done. And in most time, the syndication returned no new operations as the human users were not as quick as the syndication interval. Page 50 of 73 5.2. Comparison with Related Works As most of the related works were not available to modern PC or Windows operating system, the following comparison was based on documentation or papers of the corresponding products. Product Real-time Transmission Operation Access Conflict level restriction resolution method LivePad Yes Polling Section-level Yes Prevention Iris Yes Socket Unknown No Recovery REDUCE Yes Socket Character-level No N/A iStorm Yes One-to-one Document- Prevention Yes level SubEthaEdit Yes Socket Character-level No Unknown MediaWiki N/A N/A Recovery No No Table 5.2.a – Comparison of related works Product LivePad Private area Notion of awareness Yes (changes are not Yes (sections are Need central server Yes visible until unlock) colored with user color) Iris Yes Yes Unknown REDUCE Unknown Unknown Partial iStorm No Yes No Yes (text modified are No SubEthaEdit No colored with user color) MediaWiki Yes No Yes Table 5.2.b – Comparison of related works (cont.) This project (LivePad) was more similar to iStorm, as both used locking mechanism to prevent edit conflicts. However, LivePad was more robust as the locking was section-level, not document-level as seen in iStorm. Page 51 of 73 But compared to character-level, socket-based CESes like REDUCE and SubEthaEdit, LivePad was less robust as it required a lock ahead edit. In characterlevel ones, authors could edit naturally at any place as he/she wants. The changes were visible to everyone immediately without waiting or committing. Since socket was used, a central server was not need in most cases. However, character-level CES was not possible over HTTP as the amount and frequent of data transfer would be overkilling. Hence section-level CES remaind the better option when web-based and interoperability is the major concern (when Java is not an option). Page 52 of 73 5.3. Benefits 5.3.1. Benefits of HTTP Besides being friendly to firewalls and routers, HTTP is a high-level protocol. Hence less coding was needed for dealing with communications. 5.3.2. Benefits of document model Thanks to the region/section model, locking was not as restricted and exclusive as document-level locking where only one author can edit at a time. In this model, one author only edited one or more sections at a time. Of course, compared to characterlevel one, access was more restrictive. 5.3.3. Benefits of centralized approach Because of the centralized approach and locks, there was no need to deal with complex situations as seen in character-level editing system like REDUCE. As operations were independent and centralized in the central server, there was only a need of operation serialization. Computing was then much faster in the server-side. 5.3.4. Benefits of XML-based protocol XML, unlike URL-encoded string (query parameters), can contain structured data. As the server-side was a black box from client-side point of view, it was possible to implement another client that was compatible with this implementation, as long as it obeyed the rules and XML format defined in the protocol. Also, as the manipulation of the section content (e.g. how the text is input, how the attributed is modified, etc) was independent of the protocol, it was possible to implement a client with better editing features than the current implementation without the need to alter the protocol. Page 53 of 73 5.3.5. Benefits of Mozilla platform The Mozilla platform provides good support of web standards and technologies. There was much less need to deal with the quirks and bugs in the the rendering engine (e.g. Internet Explorer). Mozilla's XUL is an user-interface-centric markup language. Hence it can do a lots more than HTML, which is a document-centric markup language. There is no need to use HTML to mimic UI widgets. It is also possible to mix multiple XML namespaces together, hence HTML elements can be reused inside XUL document, vica versa. Page 54 of 73 5.4. Limitations 5.4.1. Limitations of HTTP The main drawback of HTTP is that data transmission is limited to the request/response model, meaning that sockets and multicasting is not possible. Hence the workaround was to use polling, in which the clients sent requests to the server in regular time intervals. It was hoped that by minimizing the time interval, the users would perceive a pseudo real-time syndication. However, this may not be feasible for website with limited bandwidth or heavy traffic, as the amount of web traffic can slow down, or even overload the web server (similar to DoS attack and “Slashdot effect”). Another problem of HTTP is that the protocol is stateless, meaning that the server cannot find out if a client is still online or not. The current workaround was that whenever there was new request, the sever would check for idle users and force them to quit, if any. As all properly functioning clients polled the server regularly, an idle client is defined as a client that did not issue any request after a threshold amount of time (which was set as a value bigger than the polling internal). As a client was always anonymous, in this implementation, the client information like user information and document information was sent in every request (similar to other “web API”s like Atom API). 5.4.2. Limitations of document model One of the objectives of this project was to make the system general enough to be usable on any dialects of XML. However, it was found that the region/section model may not fully fulfill this objective. In the model, each section can contain one and only one document element, no matter how broad (e.g. a long list) or how heavily nested it is (e.g. a large table). For semantic-rich XHTML document, this model was ok, as the width and the level of nesting was usually small. Page 55 of 73 However, some XML-based documents are unavoidably heavily nested, and no section can be defined as the whole document consists of one section/region only. An example is MathML, which can be used to express complex mathematical formula. For example, the following MathML code (source: W3C MathML Test Suite) <math xmlns="http://www.w3.org/1998/Math/MathML"> <apply> <int/> <bvar> <ci>x</ci> </bvar> <apply> <fn> <ci>f</ci> </fn> <ci>x</ci> </apply> </apply> </math> would be rendered like this: ∫ f x dx Layout elements like rows and tables exist in MathML. But they are used in a nested, not serial way. Hence it is not possible to apply the region/section definition on this kind of document. In this implementation, plain text documents could also be used. However, the results was not satisfactory as the initial loading time was much longer than that of the XHTML document of the similar file size, as plain text documents typically contain hundreds or thousands of lines (each line is treated as a section). It is also inconvenient to edit as editing involves manipulation of multiple lines. 5.4.3. Limitations of locking mechanism A problem of the locking mechanism was that when a region was initially empty or became empty during editing session, there was no way to ensure that the new section added afterward would be consistent among various clients. In the defined list of editing operations, only “append” could be used to add new section when a region was empty. As no locking was required before appending section and clients could append new section at similar time, there was no guarantee about the order of the newly added sections that appeared in the clients. Page 56 of 73 5.4.4. Limitations of Mozilla platform Because of the strict security restrictions imposed on remote documents, the native HTML editor could not be used. The workaround used was eDOM, a JavaScript module for editing XHTML elements. Besides being slower than the native editor, it also did not support input methods other than direct keyboard input, for example IME in Windows. Hence it was not possible to input non-ASCII characters directly. However, there were some downside in using native HTML editor (“midas”) provided by Mozilla. Firstly, it did not provide all the necessary functions in a word processing system. Secondly, it was not possible to control its behavior, for example, the action to do when the Enter key is pressed. Thirdly, the HTML code generated by the editor was not always valid, for example, deprecated HTML elements were used. Finally, it is not possible to make just a portion of the document editable. Either the whole document is editable, or not. In Internet Explorer, there is an attribute called “contenteditable” which controls the whether an element can be edited or not. In one of the CSS Level 3 drafts, there was a CSS property called “user-modify” which more or less did the same thing. In both “midas” and eDOM, form elements and contents inside frames were not editable. But note that these should not be used in the first place. Last but not least, some commonly used widgets were missing in the XUL specification, noticeably spinbox (a numeric textbox with arrows that change the value by a offset) and slider (a scrollbar-like widget that modifies the values by dragging the thumb). These widgets would most likely to be implemented in the next iteration of the Mozilla rendering engine (known as Gecko). Page 57 of 73 5.4.5. Limitations due to client-side implementation In the user interface, the sections were laid out vertically by their order of appearance. This assumption was not always correct as XHTML elements could also be absolutely or fixed positioned using CSS. This assumption was also not applicable to elements of many other XML-based documents like MathML and SVG, where the layout is independent to its position in the document. Some of the ways to determine the position: • Explicitly defined using X/Y position • Using their semantic meaning (e.g. integration, differentiation) • Using layout elements like tables and boxes Also, because of the lock-ahead-edit restriction, it was more time consuming in deleting multiple sections. Each sections to be deleted had to be locked first before deleting. It was also hard to merge multiple consecutive sections together as the user could only lock one section at a time (this is a restriction imposed by the user interface, not the syndication protocol and mechanism). Another limitation was that authorship was only reflected when the section is locked. All unoccupied sections look the same. There is no indication of authorship of characters like SubEthaEdit. Page 58 of 73 5.5. Applications Given these characteristics, this editing system is most suitable for use in intranet. The system could be used to in any area where chatting and editing are needed at the same time. Some of the possible scenarios: • Collaborative editing • Internet conferencing (document could be used as a memo area) • Group discussion • Brainstorming • Code review Page 59 of 73 5.6. Suggestions 5.6.1. Suggestions on document model In the current model, each section could contain one and only one document element. However, it could be modified so that it allows a maximum of M elements and a minimum of M/2 elements, just like B-tree. Similarly when the section overflows or underflows, it would be split or merged like B-tree. The problem of this suggestion is that it may not hard to determine the right value of M. Also, the segmentation of the document will still look skewed as the size of each section varies. 5.6.2. Suggestions on locking mechanism 5.6.2.1. Solutions to empty region problem There are two ways to avoid of the problem of empty region: • Prevention Both client-side and server-side can prevent the happening of empty region by preventing the deletion of last remaining section and adding a dummy section of the empty region respectively. • Lock region In this implementation locks were applied to sections only. It could be modified to allow locking of the whole region, only as to enable that only one client can append a new section to an empty region. Page 60 of 73 5.6.2.2. Hierarchical locking To get rid of the inconvenience of using sections, it is possible to drop the model and allows the clients to lock any element in the document tree. Before acquiring a lock, the client needs to check for three important prerequisites: • The element itself is not locked This is fundamental in locking mechanism. • No element in its subtrees is locked If this is allowed, this edit would overwrite the other edits. Node to be locked Node to be checked Other node Path to check Figure 5.5.2.a – Element nodes to be checked in its subtrees • No element along the path toward the document root is locked If this is allowed, other edits would overwrite this edit. Node to be locked Node to be checked Other node Path to check Figure 5.5.2.a – Element nodes to be checked in its supertree Page 61 of 73 However, it would be more complicate to handle the locking. In this implementation, the checking of the section status can be done simply by issuing a SQL query. However, if hierarchical locking is allowed, SQL could not be used as the server needs to check prerequisites other than the first one. It need to load the whole document into memory and do the corresponding checking, which would be much more time consuming. 5.6.3. Suggestions on system architecture In this implementation, the editing system was deeply integrated with the content provider. However, it is possible to go into another direction where the content provider, central server and client are separated as follows. • Content provider The content provider could expose its editing interface via not native programming API, but standard web APIs like WebDAV and Atom API. • Central server No change in the coordinator, but the adapter would be modified to issue web API calls. Also, the central server is not longer dedicated to one content providers, but any number of content providers, just like public IRC server. • Client The client would not longer be launched from the web application. It would be modified as a Mozilla/Firefox extension, which is a small XUL/JavaScript program running on top of the browser. Since the program is part of the browser, it can access a variety of privileged objects (e.g. the native HTML editor and numerous other objects) which could improve the user experience. However, the major problem is that the extension manager interface is subject to change as Mozilla/Firefox evolves. An extension would be broken when there is a major release of the browser. Page 62 of 73 6. Conclusion In conclusion, this editing system was feasible for editing XHTML documents collaboratively across intranet or internet (for website with smaller amount of traffic). The primary objectives of the project were met, but further improvement is required before editing other XML-based documents. As a side note, the implementation of web-based CES is not limited to XUL/Mozilla, but XUL was chosen as it is a markup language designed for user interface (where XHTML is a markup language designed for hypertext document). Personally, I had learned much of the XUL and the basic of PHP, which I did not know before the project. My knowledge in JavaScript and W3C DOM was also enhanced. As a whole, CES should be more widely promoted, more widely deployed and more widely available on as many platforms as possible. Web-based CES, while not being ideal, is a simple and feasible solution for large scale adoption of the CES technology. Page 63 of 73 7. References Garrett, J.J. (2005, February 18). Ajax: A New Approach to Web Applications. Available: http://www.adaptivepath.com/publications/essays/archives/000385.php [2005, Feb 28] Koch, M. (1995, August). The Collaborative Multi-User Editor Project Iris. Technical Report TUM-I9524. Institut für Informatik. Universität München. Available: http://www11.informatik.tu-muenchen.de/publications/html/Koch1995/ [2004, Sep 9]. Koch, P.P. (2005, March 13). Ajax, promise or hype? Available: http://www.quirksmode.org/blog/archives/2005/03/ajax_promise_or.html [2005, Mar 17]. MathGameHouse (2002, October). iStorm. Available: http://www.mathgamehouse.com/istorm/ [2004, Sep 9]. RFC 2518 (1999, February). HTTP Extensions for Distributed Authoring – WEBDAV. Available: http://www.ietf.org/rfc/rfc2518.txt [2004, Nov 16]. Sun, C., Jia, X., Yang, Y., and Zhang Y. C. (1997, August). REDUCE: a prototypical cooperative editing system. Proceedings of the 7th International Conference on Human-Computer Interaction, San Francisco, USA, 89-92. Available: http://citeseer.ist.psu.edu/sun97reduce.html [2004, Sep 7]. Sun, C., Jia, X., Yang, Y., Zhang Y. C., and Chen, D. (1998, March). Achieving convergence, causality-preservation, and intention-preservation in real-time cooperative editing systems. ACM Transactions on Computer-Human Interaction 5(1), 63-108. Available: http://citeseer.ist.psu.edu/sun98achieving.html [2004, September 7]. Page 64 of 73 TheCodingMonkeys (2003). SubEthaEdit. Available: http://www.codingmonkeys.de/subethaedit/ [2004, Sep 6]. Wikimedia Foundation. (2002, March 20). MediaWiki. Available: http://en.wikipedia.org/wiki/MediaWiki [2005, Mar 24]. Page 65 of 73 8. Acknowledgment I would like to thank Dr Andy Chun, my supervisor, for giving me much a large degree of freedom and flexibility during the past year. While I had changed my mind about the direction of the project several times in the previous semester, he was being supportive in each decision I made. He had given me important comments and guidances when I was in doubt. I would also like to thank to the Mozilla development teams and the Mozilla communities (including some very talented programmers, writers and artists) for creating such great products (e.g. Mozilla, Firefox), great infrastructures (e.g. bugzilla), great documentations (e.g. XULPlanet) and great support (e.g. XULPlanet forums, mozdev mailing list). Without these, there would not be such project. Page 66 of 73 9. Appendix 9.1. Appendix A: DSS 0.1 DTD Specification <!-The DSS (Document Section Syndication) 0.1 Specification Date: Initial author: Contributors: --> 2005/03/02 Ming Hong Ng <minghong@gmail.com> <!--========================= Document Structure ============================--> <!ELEMENT dss(request|response) <!ATTLIST dss version CDATA #REQUIRED > -- Top-level element --> -- The version number -- <!--=========================== Request Message =============================--> <!ELEMENT request(join|quit|save| lock|unlock|append|insertBefore|update|remove|post| getPageinfo|getUserinfo| getGuidset|getOperationset) -- Request to server --> <!ELEMENT getPageinfo(EMPTY) <!ATTLIST getPageinfo uri CDATA > <!ELEMENT getUserinfo(EMPTY) <!ATTLIST getUserinfo user CDATA > <!ELEMENT getGuidset(EMPTY) <!ATTLIST getGuidset length CDATA > -- Get page information --> #REQUIRED -- The URI of the page --- Get user information --> #REQUIRED -- The user to be viewed --- Get a set of GUIDs --> #REQUIRED <!ELEMENT getOperationset(EMPTY) <!ATTLIST getOperationset uri CDATA #REQUIRED sender CDATA #REQUIRED > -- The number of GUIDs to get --- Get a set of operations --> -- The URI of the documet --- The user who requests -- <!--=========================== Response Message ============================--> <!ELEMENT response(status|resource) -- Response to request --> <!ELEMENT status(EMPTY) <!ATTLIST status success (true|false) > -- Commission status --> #REQUIRED <!ELEMENT resource(pageinfo|userinfo| guidset|operationset) <!ELEMENT pageinfo(EMPTY) <!ATTLIST pageinfo type CDATA length CDATA status CDATA statusText CDATA > -- Operation success or not -- -- Queried server resource --> -- Page information --> #IMPLIED #IMPLIED #IMPLIED #IMPLIED ----- Page 67 of 73 The content The content HTTP status HTTP status type -length -code -text -- <!ELEMENT userinfo(EMPTY) <!ATTLIST userinfo name CDATA color CDATA > <!ELEMENT guidset(guid)+ <!ELEMENT guid(EMPTY) <!ATTLIST guid value CDATA > -- User information --> #REQUIRED #REQUIRED -- The name --- A CSS color --- A set of GUIDs --> -- A global unique ID --> #REQUIRED -- The value -- <!ELEMENT operationset(join|quit|save|lock|unlock| append|insertBefore|update|remove|post)* -- A set of operations --> <!--============================= Operations ================================--> <!ELEMENT join(EMPTY) <!ATTLIST join uri CDATA sender CDATA > <!ELEMENT quit(EMPTY) <!ATTLIST quit uri CDATA sender CDATA > <!ELEMENT save(EMPTY) <!ATTLIST save uri CDATA sender CDATA > <!ELEMENT lock(EMPTY) <!ATTLIST lock uri CDATA sender CDATA region (content|meta) section CDATA > <!ELEMENT unlock(EMPTY) <!ATTLIST unlock uri CDATA sender CDATA region (content|meta) section CDATA > <!ELEMENT append(#PCDATA) <!ATTLIST append uri CDATA sender CDATA region (content|meta) section CDATA > <!ELEMENT insertBefore(#PCDATA) <!ATTLIST insertBefore uri CDATA sender CDATA region (content|meta) section CDATA refSection CDATA > <!ELEMENT replace(#PCDATA) <!ATTLIST replace uri CDATA sender CDATA region (content|meta) section CDATA > -- Join a session --> #REQUIRED #REQUIRED -- The URI of the documet --- The user who joins --- Quit a session --> #REQUIRED #REQUIRED -- The URI of the documet --- The user who quits --- Save a document --> #REQUIRED #REQUIRED -- The URI of the documet --- The user who saves --- Lock a section --> #REQUIRED #REQUIRED #REQUIRED #REQUIRED ----- The The The The URI of the documet -user who locks -section's region -section to be locked -- -- Unlock a section --> #REQUIRED #REQUIRED #REQUIRED #REQUIRED ----- The The The The URI of the documet -user who unlocks -section's region -section to be unlocked -- -- Append a section --> #REQUIRED #REQUIRED #REQUIRED #REQUIRED ----- The The The The URI of the documet -user who appends -section's region -section to be appended -- -- Insert a section --> #REQUIRED #REQUIRED #REQUIRED #REQUIRED #REQUIRED ------ The The The The The URI of the documet -user who inserts -section's region -section to be inserted -reference section -- -- Replace a section --> #REQUIRED #REQUIRED #REQUIRED #REQUIRED ----- Page 68 of 73 The The The The URI of the documet -user who replaces -section's region -section to be replaced -- <!ELEMENT remove(EMPTY) <!ATTLIST remove uri CDATA sender CDATA region (content|meta) section CDATA > <!ELEMENT post(#PCDATA) <!ATTLIST post uri CDATA sender CDATA receiver CDATA > -- Remove a section --> #REQUIRED #REQUIRED #REQUIRED #REQUIRED ----- The The The The URI of the documet -user who removes -section's region -section to be removed -- -- Post a message --> #REQUIRED #REQUIRED #REQUIRED -- The URI of the documet --- The user who sends --- The user who receives -- Text 5.1.a – DTD specification of Document Section Syndication 0.1 Page 69 of 73 9.2. Appendix B: Cache Database SQL Definition ------------------------------------ ***** BEGIN LICENSE BLOCK ***** Version: MPL 1.1/GPL 2.0/LGPL 2.1 The contents of this file are subject to the Mozilla Public License Version 1.1 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.mozilla.org/MPL/ Software distributed under the License is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for the specific language governing rights and limitations under the License. The Original Code is LivePad. The Initial Developer of the Original Code is Ming Hong Ng <minghong@gmail.com>. Portions created by the Initial Developer are Copyright (C) 2004-2005 the Initial Developer. All Rights Reserved. Contributor(s): Alternatively, the contents of this file may be used under the terms of either the GNU General Public License Version 2 or later (the "GPL"), or the GNU Lesser General Public License Version 2.1 or later (the "LGPL"), in which case the provisions of the GPL or the LGPL are applicable instead of those above. If you wish to allow use of your version of this file only under the terms of either the GPL or the LGPL, and not to allow others to use your version of this file under the terms of the MPL, indicate your decision by deleting the provisions above and replace them with the notice and other provisions required by the GPL or the LGPL. If you do not delete the provisions above, a recipient may use your version of this file under the terms of any one of the MPL, the GPL or the LGPL. ***** END LICENSE BLOCK ***** -- Database creation script. -- Author: Ming Hong Ng <minghong@gmail.com> -- Version: 0.1, 2005/02/13 -- Since: 0.1 BEGIN; -- Begin transaction -- The documents currently being edited CREATE TABLE documents( id VARCHAR(255), -- Document ID (e.g. URI) content TEXT NOT NULL, -- The content (XML document) CONSTRAINT documents_id_pk PRIMARY KEY(id) ); -- The users who joined the sessions CREATE TABLE users( id VARCHAR(255), name VARCHAR(255) NOT NULL, color VARCHAR(255) NOT NULL, CONSTRAINT users_id_pk PRIMARY KEY(id) ); -- User ID -- Screen name -- A CSS color served as avatar -- The access log CREATE TABLE accesses( document_id VARCHAR(255), -- Document ID (e.g. URI) user_id VARCHAR(255), -- User ID last_accessed INTEGER NOT NULL, -- The last time user accessed CONSTRAINT accesses_did_uid_pk PRIMARY KEY(document_id, user_id) CONSTRAINT accesses_document_id_fk FOREIGN KEY(document_id) REFERENCES documents(id) CONSTRAINT accesses_user_id_fk FOREIGN KEY(user_id) REFERENCES users(id) ); Page 70 of 73 -- The operations made during the sessions CREATE TABLE operations( id CHAR(35), -- Operation ID (e.g. GUID) document_id VARCHAR(255), -- Document ID (e.g. URI) sender_id VARCHAR(255), -- User ID of the sender receiver_id VARCHAR(255), -- User ID of the receiver time_created INTEGER NOT NULL, -- The time when it was created content TEXT NOT NULL, -- The content (XML element) CONSTRAINT ops_id_did_sid_rid_pk PRIMARY KEY(id, document_id, sender_id, receiver_id), CONSTRAINT ops_document_id_fk FOREIGN KEY(document_id) REFERENCES documents(id), CONSTRAINT ops_sender_id_fk FOREIGN KEY(sender_id) REFERENCES users(id), CONSTRAINT ops_receiver_id_fk FOREIGN KEY(receiver_id) REFERENCES users(id) ); -- The sections inside regions CREATE TABLE sections( id VARCHAR(32), -- Section ID (e.g. GUID) document_id VARCHAR(255), -- Document ID (e.g. URI) region_id VARCHAR(255), -- Region name user_id VARCHAR(255), -- User ID of the locker CONSTRAINT sections_id_did_rid_pk PRIMARY KEY(id, document_id, region_id) CONSTRAINT sections_document_id_fk FOREIGN KEY(document_id) REFERENCES documents(id), CONSTRAINT sections_user_id_fk FOREIGN KEY(user_id) REFERENCES users(id) ); COMMIT; -- Commit transaction Text 5.2.a – SQL definition of cache database Page 71 of 73 9.3. Appendix C: Web Development Resources The followings are some of the software tools that I had used in this project. They are free of charge (and free as in freedom, except Crimson Editor which is a freeware). • Mozilla/Firefox/Camino (http://www.mozilla.org) Besides being a requirement of running the editing system, these browsers are very standard-compliant, with many user-friendly features not found in other browsers. • SQLite Database Browser (http://sqlitebrowser.sourceforge.net) A GUI front-end for SQLite 2.x database. • Web Developer (http://www.chrispederick.com/work/firefox/webdeveloper/) A must-have Mozilla/Firefox extension for web developer. • LiveHTTPHeaders (http://livehttpheaders.mozdev.org) View HTTP headers in HTTP requests and responses. • Crimson Editor (http://www.crimsoneditor.com) A professional text editor for Windows. The followings are websites that provides information useful when developing Mozilla-based application: • XULPlanet (http://www.xulplanet.com) The largest XUL site on the web, with comprehensive tutorials, detailed references, application downloads and more. • mozilla.org Bugzilla (https://bugzilla.mozilla.org) A public database recording bugs of Mozilla products. Page 72 of 73 9.4. Appendix D: Product Website The product website can be publicly accessed via http://livepad.klogms.org. LivePad is available for download, licensed under MPL/GPL/LGPL tri-license. Figure 8.4.a – A screenshot of the product website (also showing the toolbar of Web Developer extension) Page 73 of 73