Final Report

Jen-Cheng Huang
Vinit Shah
Mike Wilt

CS8803: Advanced Internet Application Development
Prof. Ling Liu
Spring 2008

Project Overview

Project Title

BuzzPost: A new architecture for online forums

Demo Websites

http://www.buzzpost.net and http://www.buzzpost.net/vinit/buzzpost3/

Summary

BuzzPost is fast and fluid like a chat room but stable like a traditional message board. Ajax programming techniques enhance the user experience. For example, when viewing a thread, the conversation updates as soon as a reply is posted, like an AIM conversation. In addition, a separate page refresh is not required to log in or sign up for a new account; all database queries are executed in the background.

In a future iteration of this project, individuals will be able to create their own online forums using the BuzzPost architecture. They will be able to customize the design of their forum using CSS, predefined “skins,” and/or a variety of layouts. Forums will be hosted for free by BuzzPost, with revenue from advertisements offsetting the hosting cost. To the extent feasible, revenue will be shared with the individuals who create and administer forums.

Technologies

Many technologies make BuzzPost work, including:

• Ajax
• PHP
• MySQL
• JavaScript
• JavaScript Object Notation (JSON)
• CSS
• Cookies
• SHA-1 hashing
• CAPTCHA authentication

Objectives

Most online forums, even the most vibrant ones, rely on a static message board. A separate page load is required to log in, start a thread or post a reply. Threads are not updated automatically. Conversely, most chat applications update instantaneously. While a traditional chat room might bridge this gap, we find that chat rooms are poorly structured: conversations overlap, which makes them difficult to follow or return to at a later date. Thus, the goal of BuzzPost is to combine the stability and structure of a traditional message board with the fluidity and speed of a chat application.

Users appreciate instantaneous feedback.
Imagine if AOL’s instant messenger required a page refresh to send and receive messages. No doubt its appeal would be diminished. The best part about chatting with someone is that you can interact with them in real time. Chats are usually private conversations between two people and tend to occur between people already acquainted with each other.

Online forums possess a different dynamic. There are many participants at once. Participants share a similar interest, be it a sports team, programming language or television show, but rarely is the entire community on a first-name basis. Nonetheless, we believe that applying the “instant” nature of chat applications to an online forum will enhance the user experience. We intend to change expectations for the behavior of an online forum in the same way that Gmail altered expectations for webmail applications.

Chats tend to be ephemeral. Once the chat window is closed the conversation is effectively over. On the other hand, exchanges that occur in an online forum don’t necessarily recede into the ether once someone leaves. BuzzPost is designed to support concurrent, synchronous communication as well as asynchronous communication from users who visit (or revisit) the thread at a later date.

Table 1. Comparing a chat application to an online forum

Attribute         Chat Application                      Online Forum
Participants      Two, typically                        Many
Feedback          Instant                               Static; requires page reload
Relationship      People you know in real life          People with similar interest(s)
After you leave…  Conversation disappears               Thread remains
Examples          AIM, Google Chat, Yahoo! Messenger,   phpBB, vBulletin, ProBoards
                  MSN Messenger, etc.

A static approach befits the pace of most message boards, particularly those with low traffic. But it’s easy to imagine the opposite situation, where many users flock to a particular thread and the nature of the conversation changes quickly. For this reason, a message board that automatically updates can be advantageous.
It transforms a static experience into one that is more dynamic. Even in low traffic situations, the user experience improves if Ajax is employed to make the application more seamless. This might seem trivial to some; after all, a fast server can reduce the overhead from page refreshes. However, one only has to look at the praise garnered by Google Maps and Flickr, and at the rising popularity of online office suites, to know that Ajax, if employed tastefully, can make the user experience considerably more satisfying.

One problem with existing forum software is that the user must install the software on his or her own server. What’s more, it can be difficult to customize the layout. To lower these barriers, we propose to host all forums that use the BuzzPost architecture for free and to provide tools that enable users to customize their own forums. Blogger does this already for blogs; in the late 1990s, ONElist (which later merged with eGroups and was purchased by Yahoo!) provided a free mailing list service to anyone who wanted to start a listserv. Our strategy is similar. To offset the cost of hosting many virtual communities, we expect to generate revenue from advertising. Ideally, we will share revenue with forum administrators.

In summary, the objectives¹ for BuzzPost are as follows:

1. Design an online forum that is fast, responsive and scalable.
2. Enable users to create and customize their own forums.
3. Host forums for free.
4. Develop a sustainable revenue model to support the service.

¹ This semester we achieved the first objective and will continue to work on Objectives 2-4.

Architecture

How does BuzzPost work? In a nutshell, clients repeatedly make HTTP requests to BuzzPost asking whether new content has been added to the database. If new content exists, that data is returned to the client in JSON format. The client parses this data and appends it to the existing content.
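
The polling cycle described above, together with the cached-value check laid out step by step in Appendix A, can be sketched roughly as follows. This is a minimal illustration rather than the project's code: it is written in Python instead of the PHP and JavaScript used by the prototype, the server is an in-memory stand-in for the real HTTP endpoint and MySQL database, and every name in it (ForumServer, ForumClient, poll_once, and so on) is hypothetical.

```python
import json


class ForumServer:
    """In-memory stand-in for the BuzzPost server and database (hypothetical)."""

    def __init__(self):
        self.messages = []       # plays the role of the message table
        self.cached_version = 0  # cheap cached value, refreshed on every insert
        self.queries_run = 0     # counts the "real" queries, for illustration only

    def post(self, author, text):
        """Insert a message and refresh the cached value."""
        message = {"id": len(self.messages) + 1, "author": author, "text": text}
        self.messages.append(message)
        self.cached_version = message["id"]
        return message

    def version(self):
        """Cheap check: read the cached value without querying the messages."""
        return self.cached_version

    def fetch_since(self, last_id):
        """Full query: return messages newer than last_id, encoded as JSON."""
        self.queries_run += 1
        newer = [m for m in self.messages if m["id"] > last_id]
        return json.dumps(newer)


class ForumClient:
    """One browser viewing a thread (hypothetical)."""

    def __init__(self, server):
        self.server = server
        self.last_id = 0   # id of the newest message already displayed
        self.thread = []   # messages currently shown to the user

    def poll_once(self):
        """One polling cycle; the prototype repeats this every 1000 ms."""
        if self.server.version() == self.last_id:
            return 0  # nothing changed, so no query is executed
        new_messages = json.loads(self.server.fetch_since(self.last_id))
        if new_messages:
            self.thread.extend(new_messages)  # append to the existing content
            self.last_id = new_messages[-1]["id"]
        return len(new_messages)
```

In a real deployment poll_once would be driven by a timer in the browser; the point of the sketch is only that an idle poll costs one cheap read of the cached value, while the full query runs only after something has actually been posted.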
To minimize strain on the database, every time a new message is added to the database the result of the insertion is cached. This way other clients can determine that the dataset has changed without executing an actual query. A diagram of this architecture, known as “traditional polling,” is contained in Appendix A.

During the course of this project, we examined Comet, or “Ajax Push,” as an alternative to traditional polling. With Comet, once a user sends a message the server pushes it to the clients immediately. This reduces the burden on the server since the clients wait for new messages rather than pinging the server over and over again. We installed and tested two flavors of Comet: the Bayeux protocol and Orbited. Unfortunately, we did not have time to convert the BuzzPost architecture to Comet. A diagram of the Comet architecture is contained in Appendix B.

Evaluation and Testing

BuzzPost’s current architecture can support at least 100 simultaneous users. With 100 active users, the server’s CPU usage rises to about 20 percent (see Figure 1). Since our host (linode.com) charges $20 per month, we estimate the average cost to be at most $0.20 per user per month, that is, $20 spread over at least 100 users.

After the initial round of testing, we found that performance decreased at about 75 users. We discussed this issue with our host, and our account was then migrated to a more powerful server with dual quad-core Xeon processors and RAID 1 mirroring for the same cost of $20 per month. This change increased BuzzPost’s capacity; performance did not decrease even with 100 users.

Figure 1. BuzzPost CPU usage for 100 active users

We believe that additional improvements could be made to the database architecture to reduce the strain on the server even more. However, we believe the real key to lowering the cost per user is to make the jump from “traditional polling” to Comet.
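
The cost estimate above is simple arithmetic, worked through here in Python. The $20 monthly fee, the 100 users and the 20 percent CPU figure come from the test results; the projection beyond 100 users is only an illustration and rests on a loud assumption that was not tested: that CPU usage grows linearly with the number of users and that CPU is the only limit. The function names are hypothetical.

```python
MONTHLY_HOSTING_COST = 20.00  # dollars per month, as charged by the host
TESTED_USERS = 100            # simultaneous users in the load test
CPU_AT_TESTED_LOAD = 0.20     # about 20 percent CPU usage at 100 users


def cost_per_user(monthly_cost, users):
    """Monthly hosting cost divided evenly across users."""
    return monthly_cost / users


def projected_users(tested_users, cpu_fraction):
    """Users at 100 percent CPU.

    ASSUMPTION (not measured): CPU usage scales linearly with users and
    nothing else (memory, bandwidth, database locks) becomes a limit first.
    """
    return round(tested_users / cpu_fraction)


# Measured case: $20 per month spread over the 100 users actually tested.
measured = cost_per_user(MONTHLY_HOSTING_COST, TESTED_USERS)

# Hypothetical best case under the linear-scaling assumption.
ceiling = projected_users(TESTED_USERS, CPU_AT_TESTED_LOAD)
projected = cost_per_user(MONTHLY_HOSTING_COST, ceiling)
```

Under these figures the measured cost is $0.20 per user per month, and the untested linear projection would bring it down to a few cents. Only the first number is supported by the load test.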
Contributions

The BuzzPost architecture emphasizes a user experience that is fast, simple and reliable. Specifically, we developed the following components:²

² Most components have been implemented in the final version at http://www.buzzpost.net. However, some are still in “beta” at http://www.buzzpost.net/vinit/buzzpost3/

1. Automatic updates. The list of threads and the individual conversations update automatically, like an AIM conversation.

2. Format content. Users can format their submissions.

Figure 2. Text formatting and emoticon example

3. Seamless log in and user authentication system. Users can log in, log out and/or register without loading a separate page. Database lookups and insertions are done “on the fly.” For example, if the user enters a username that already exists, he or she is notified immediately.

Figure 3. Instant feedback when the user selects a username that already exists

4. CAPTCHA authentication. When a user registers for the first time, he or she receives an email with a CAPTCHA code that must be entered at the site to fully activate the account. A more conventional solution would be to embed a CAPTCHA in the log in form and to conduct the email verification by sending a link the user can click on to verify his or her email address. Although conducting the CAPTCHA test via email makes the process slightly more cumbersome, we believe there is a benefit: the system makes it slightly more difficult for a malicious user or bot to set up multiple accounts.

Figure 4. Validation email from BuzzPost with CAPTCHA code to enter at the site

5. Moderation system. A moderation system is available for users to rate comments and flag inappropriate content. Checks and balances are in place so that users cannot rate their own posts or rate a post more than once.

Figure 5. User can rate each post using the moderation buttons

6. Edit profiles. Users can edit their profiles.
We employed Ajax so that users can toggle between “edit” and “save” mode without a separate page refresh.

Figure 6. A user edits his profile

7. Search engine. BuzzPost includes a simple search engine. The search mechanism is fast and easy to use; a separate page refresh is not required. Next, we would like to implement a search algorithm that takes the forum’s moderation ratings into account.

8. Private messaging system. As on Facebook, users can send private messages to one another.

Figure 7. A user sends a private message

9. Cross-platform compatibility. It is important that everyone be able to access BuzzPost forums regardless of web browser or operating system, and BuzzPost adheres to this principle.

Extensions

1. Convert to Comet. As noted above, converting the architecture from traditional polling to Comet will be key to the scalability of this project.

2. Alternate layouts. We will develop alternate layouts to display content.

3. Alternate versions. We will implement a JavaScript-free version and a version compatible with mobile devices.

4. Content syndication. We will enable RSS feeds of the forum and/or individual threads. In addition, users will be able to track threads through email notifications.

5. GUI for webmasters. We will create a GUI for webmasters to manage and design their communities. Forum statistics will also be available here.

6. Advertising model and business plan. We will develop a business plan to assess the feasibility of supporting many forums via advertising. We will consider the pros and cons of “open sourcing” the BuzzPost code.

7. Additional testing. Many of the themes discussed in class are relevant to BuzzPost, particularly those relating to social networks and Web 2.0. For example, one paper we read, “Expertise Networks in Online Communities: Structure and Algorithms” [1], analyzes how expertise is disseminated in an online forum. We believe similar experiments could be conducted using the BuzzPost architecture.
In addition, since BuzzPost relies heavily on Ajax and Ajax lowers the overhead associated with each page request, we believe BuzzPost lends itself to experiments on data caching and replication in a Web 2.0 environment. This semester we read a paper entitled “Enhancing the Web’s Infrastructure: From Caching to Replication” [2]. Perhaps this paper could serve as the foundation for additional studies involving BuzzPost.

Related work

BuzzPost provides a unique user experience and numerous innovations compared to existing forums. However, it is wise to analyze online forums that already exist. Wikipedia contains descriptions of nearly one hundred forums.³ We examined each one in detail. Some are open source, others are proprietary; some are free, others are not. A variety of programming languages are used: PHP, ASP, Ruby, Perl and others. Most forums do not offer remote hosting, as we plan to offer someday; instead, users must install the code on their own server.

The first table contains details about popular forum software. The second table points to existing forums that appear to have been developed independently.

Table 2. Popular Forum Software

Name                 Website                            Cost                                       Technology  Remote Host?
phpBB                http://www.phpbb.com               Free                                       PHP         No
vBulletin            http://www.vbulletin.com           $85/yr                                     PHP         No
EZ Board (now Yuku)  http://www.ezboard.com             Free (ad supported); “pro” version $36/yr  Smalltalk   Yes
Vanilla              http://www.getvanilla.com          Free                                       PHP/Ajax    No
ProBoards            http://www.proboards.com           Free (ad supported)                        PHP (?)     Yes
Orca                 http://www.demozzz.com/orca/demo/  Free                                       PHP/Ajax    No

³ http://en.wikipedia.org/wiki/Comparison_of_Internet_forum_software_%28ASP%29,
  http://en.wikipedia.org/wiki/Comparison_of_Internet_forum_software_%28PHP%29,
  http://en.wikipedia.org/wiki/Comparison_of_Internet_forum_software_%28Other%29

Table 3. Existing Forums

Name                Website                                                      Comment
Adobe               http://www.adobe.com/cfusion/webforums/forum                 Similar to phpBB with some unique features.
FLV Media Player    http://www.jeroenwijering.com/?forum=All                     Sleek but simple. Cool emoticons and search engine.
Phantasy Tour       http://www.phantasytour.com/phish/boards.cgi                 Clean layout and simple functionality.
QBN                 http://www.qbn.com/filter/all/public_voice/recent/24_hours/  Innovative layout.
I Love Everything   http://www.ilxor.com/ILX/index.jsp                           Proof that a popular forum doesn’t have to look pretty.
Think Tank          http://www.thinktankforums.com/                              Minimalist design.

References

[1] J. Zhang, M. Ackerman, and L. Adamic. “Expertise Networks in Online Communities: Structure and Algorithms.” WWW 2007.

[2] M. Baentsch, L. Baum, G. Molter, S. Rothkugel, and P. Sturm. “Enhancing the Web’s Infrastructure: From Caching to Replication.” IEEE Internet Computing, vol. 1, no. 2, pp. 18-27, April 1997.

Appendix A: Traditional Polling Architecture

A diagram of the traditional polling architecture used by BuzzPost.

Step 1: Clients repeatedly check the value contained in a document cached by the database. Our prototype polls at an interval of 1000 milliseconds.

Step 2: One client inserts a new message into the database. A new result is cached.

Step 3: The clients detect that the cached value has changed.

Step 4: The clients query the database and the new content is returned to them. Return to Step 1.

Appendix B: Comet or “Ajax Push” Architecture

A diagram of the Comet or “Ajax Push” architecture that could be implemented by BuzzPost.

Step 1: Clients wait for updates from the server.

Step 2: A client inserts a new message into the database. The cached value is updated.

Step 3: The new value is immediately pushed to the clients.

Step 4: Clients return to “listening mode.”
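
The four Comet steps above can be imitated in a few lines of Python. This is a sketch of the push idea only, under stated assumptions: it runs inside a single process, with a thread-safe queue standing in for each client's open connection, and it does not implement Bayeux, Orbited or any real HTTP transport. BuzzPost itself was not converted to Comet, so all names here (PushServer, subscribe, listen) are hypothetical.

```python
import queue
import threading


class PushServer:
    """In-process stand-in for a Comet-style server (hypothetical)."""

    def __init__(self):
        self.messages = []      # plays the role of the message table
        self._subscribers = []  # one inbox per listening client
        self._lock = threading.Lock()

    def subscribe(self):
        """Register a client; the returned inbox models its open connection."""
        inbox = queue.Queue()
        with self._lock:
            self._subscribers.append(inbox)
        return inbox

    def post(self, author, text):
        """Steps 2 and 3: store the message, then push it to every client."""
        with self._lock:
            message = {"id": len(self.messages) + 1, "author": author, "text": text}
            self.messages.append(message)
            for inbox in self._subscribers:
                inbox.put(message)
        return message


def listen(inbox, timeout=None):
    """Steps 1 and 4: block until a message is pushed.

    Returns the message, or None if the timeout (in seconds) expires first.
    """
    try:
        return inbox.get(timeout=timeout)
    except queue.Empty:
        return None
```

A client would call listen in a loop, handling each message and then returning to “listening mode.” Unlike the polling design in Appendix A, an idle client generates no requests at all, which is the source of the expected savings in server load.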