
Final Report
Jen-Cheng Huang
Vinit Shah
Mike Wilt
CS8803—Advanced Internet
Application Development
Prof. Ling Liu
Spring 2008
Project Overview
Project Title
BuzzPost: A new architecture for online forums
Demo Websites
http://www.buzzpost.net and http://www.buzzpost.net/vinit/buzzpost3/
Summary
BuzzPost is fast and fluid like a chat room but stable like a traditional
message board.
Ajax programming techniques enhance the user
experience. For example, when viewing a thread, the conversation
updates as soon as a reply is posted, like an AIM conversation. In
addition, a separate page refresh is not required to log in or sign up for a
new account; all database queries are executed in the background.
In a future iteration of this project, individuals will be able to create their
own online forums using the BuzzPost architecture. Individuals will be
able to customize the design of their forum using CSS, predefined “skins,”
and/or a variety of layouts. Forums will be hosted for free by BuzzPost.
Revenue generated by advertisements will offset this cost. To the extent
feasible, revenue will be shared with individuals who create and
administer forums.
Technologies
Many technologies make BuzzPost work, including:
• AJAX
• PHP
• MySQL
• JavaScript
• JavaScript Object Notation (JSON)
• CSS
• Cookies
• SHA-1 Hashing
• CAPTCHA Authentication
Objectives
Most online forums—even the most vibrant ones—rely on a static message
board. A separate page load is required to log in, start a thread or post a reply.
Threads are not updated automatically. Conversely, most chat applications
update instantaneously. While a traditional chat room might bridge this gap, we
find that chat rooms are poorly structured: conversations overlap, which makes
them difficult to follow or return to at a later date. Thus, the goal of BuzzPost is
to combine the stability and structure of a traditional message board with the
fluidity and speed of a chat application.
Users appreciate instantaneous feedback. Imagine if AOL’s instant messenger
required a page refresh to send and receive messages. No doubt its appeal
would be diminished. The best part about chatting with someone is that you can
interact with them in real time. Chats are usually private conversations between
two people and tend to occur between people already acquainted with each
other.
Online forums possess a different dynamic. There are many participants at
once. Participants share a similar interest, be it a sports team, programming
language or television show, but rarely is the entire community on a first name
basis with one another. Nonetheless, we believe that applying the “instant”
nature of chat applications to an online forum will enhance the user experience.
We intend to change expectations for the behavior of an online forum in the
same way that Gmail altered expectations for webmail applications.
Chats tend to be ephemeral. Once the chat window is closed the conversation is
effectively over. On the other hand, exchanges that occur in an online forum
don’t necessarily recede into the ether once someone leaves. BuzzPost is
designed to afford concurrent, synchronous communication as well as
asynchronous communication from users who visit (or revisit) the thread at a
later date.
Table 1. Comparing a chat application to an online forum
| Attribute | Chat Application | Online Forum |
| --- | --- | --- |
| Participants | Two, typically | Many |
| Feedback | Instant | Static; requires page reload |
| Relationship | People you know in real life | People with similar interest(s) |
| After you leave… | Conversation disappears | Thread remains |
| Examples | AIM, Google Chat, Yahoo! Messenger, MSN Messenger, etc. | phpBB, vBulletin, ProBoards |
A static approach befits the pace of most message boards, particularly those with
low traffic. But it’s easy to imagine the opposite situation where many users flock
to a particular thread, and the nature of the conversation changes quickly. For
this reason, a message board that automatically updates can be advantageous.
It transforms a static experience into one that is more dynamic.
Even in low traffic situations, the user experience improves if Ajax is employed to
make the application more seamless. This might seem trivial to some; after all, a
fast server can reduce the overhead from page refreshes. However, one only
has to look at the praise garnered by Google Maps, flickr, and the rising
popularity of online office suites to know that Ajax—if employed tastefully—can
make the user experience considerably more satisfying.
One problem with existing forum software is that the user must install the
software on his or her own server. What’s more, it can be difficult to customize
the layout. To lower these barriers, we propose to host all forums that use the
BuzzPost architecture for free and provide tools that enable users to customize
their own forums. Blogger does this already for blogs; in the late 1990s, ONElist
(which later merged with eGroups and was purchased by Yahoo!) provided a free
mailing list service to anyone who wanted to start a listserv. Our strategy is
similar. To offset the cost of hosting many virtual communities, we expect to
generate revenue from advertising. Ideally, we will share revenue with forum
administrators.
In summary, the objectives¹ for BuzzPost are as follows:
1. Design an online forum that is fast, responsive and scalable.
2. Enable users to create and customize their own forums.
3. Host forums for free.
4. Develop a sustainable revenue model to support the service.
¹ This semester we achieved the first objective and will continue to work on Objectives 2-4.
Architecture
How does BuzzPost work? In a nutshell, clients repeatedly make HTTP requests
to BuzzPost asking if new content has been added to the database. If new
content exists, that data is returned to the client in JSON format. The client
parses this data and appends it to existing content. To minimize strain on the
database, every time a new message is added to the database, the result of
this insertion is cached. This way, other clients can determine that the dataset
has changed without executing an actual query. A diagram of this architecture,
known as “traditional polling,” is contained in Appendix A.
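The polling cycle described above can be illustrated with a minimal in-memory sketch. It is not the BuzzPost source: the class names, method names and the idea of caching the latest message id are assumptions made for illustration, and a real client would issue HTTP requests from a timer rather than call the server object directly.

```javascript
// Hypothetical in-memory stand-in for the BuzzPost server and database.
class ForumServer {
  constructor() {
    this.messages = [];      // stands in for the table of posts
    this.cachedLatestId = 0; // cached result of the most recent insertion
    this.queryCount = 0;     // number of full reads served
  }

  // Insert a post and cache the id produced by the insertion.
  post(author, text) {
    const id = this.messages.length + 1;
    this.messages.push({ id, author, text });
    this.cachedLatestId = id;
    return id;
  }

  // Cheap check: reads only the cached value, never the message table.
  latestId() {
    return this.cachedLatestId;
  }

  // Full read: posts newer than afterId, serialized as JSON.
  fetchSince(afterId) {
    this.queryCount += 1;
    return JSON.stringify(this.messages.filter((m) => m.id > afterId));
  }
}

class PollingClient {
  constructor(server) {
    this.server = server;
    this.lastSeenId = 0;
    this.thread = []; // posts already displayed
  }

  // One polling tick; a browser would call this from a timer.
  poll() {
    if (this.server.latestId() === this.lastSeenId) return 0; // nothing new
    const fresh = JSON.parse(this.server.fetchSince(this.lastSeenId));
    this.thread.push(...fresh); // append to existing content
    if (fresh.length > 0) this.lastSeenId = fresh[fresh.length - 1].id;
    return fresh.length;
  }
}

// Demo: two ticks after one new post trigger only one full read.
const forum = new ForumServer();
const reader = new PollingClient(forum);
forum.post("bob", "Hello, BuzzPost!");
const firstTick = reader.poll();
const secondTick = reader.poll();
console.log(firstTick, secondTick, forum.queryCount);
```

In this sketch the second tick costs only a comparison against the cached value, which is the saving the cache is meant to provide.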
During the course of this project, we examined Comet or “Ajax Push” as an
alternative to traditional polling. With Comet, once a user sends a message the
server pushes it to the clients immediately. This reduces the burden on the server
since the clients wait for new messages rather than ping the server over and
over again. We installed and tested two strains of Comet: the Bayeux
protocol and the Orbited server. Unfortunately, we did not have time to convert
the BuzzPost architecture to Comet. A diagram of the Comet architecture is
contained in Appendix B.
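The difference in delivery pattern can be sketched as follows. This models only the push idea with in-memory callbacks. It does not use Bayeux or Orbited, and the class and method names are hypothetical. A real Comet server would hold an HTTP connection open for each waiting client.

```javascript
// Hypothetical push hub: clients register once, then wait for messages.
class PushHub {
  constructor() {
    this.listeners = new Set();
    this.connectionsOpened = 0; // one per subscribing client
  }

  // A client "opens a connection" once; returns a function to close it.
  subscribe(onMessage) {
    this.connectionsOpened += 1;
    this.listeners.add(onMessage);
    return () => this.listeners.delete(onMessage);
  }

  // A new post is serialized once and pushed to every waiting client.
  publish(author, text) {
    const payload = JSON.stringify({ author, text });
    for (const listener of this.listeners) listener(payload);
  }
}

// Demo: two clients receive posts without asking for them.
const hub = new PushHub();
const inboxA = [];
const inboxB = [];
hub.subscribe((payload) => inboxA.push(JSON.parse(payload)));
const leave = hub.subscribe((payload) => inboxB.push(JSON.parse(payload)));
hub.publish("bob", "First post");
leave(); // the second client closes its connection
hub.publish("bob", "Second post");
console.log(inboxA.length, inboxB.length, hub.connectionsOpened);
```

Here each client makes one request to start listening, whereas a polling client makes one request per interval whether or not anything has changed.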
Evaluation and Testing
BuzzPost’s current architecture can support at least 100 simultaneous users.
With 100 active users, the server’s CPU usage rises to about 20 percent. (See
graph below.) Since our host (linode.com) charges $20 per month, we estimate the
average cost to be at most $0.20 per user per month.
After the initial round of testing, we found that performance decreased with about
75 users. However, we discussed this issue with our host, linode.com. Our
account was then migrated to a more powerful server with dual quad-core Xeon
processors and RAID 1 mirroring for the same cost of $20 per month. This
change increased BuzzPost’s capacity; performance did not decrease even with
100 users.
Figure 1. BuzzPost CPU usage for 100 active users
We believe that additional improvements could be made to the database
architecture to reduce the strain on the server even more. However, we believe
the real key to lowering the cost per user is to make the jump from “traditional
polling” to Comet.
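The load and cost figures quoted in this section can be reproduced with a few lines of arithmetic. The sketch below uses the report's own numbers (100 users, $20 per month, and the 1000 millisecond polling interval noted in Appendix A). The function names are hypothetical, and the estimate assumes each client sends exactly one request per interval.

```javascript
// Requests per second the server receives when every client polls on a timer.
function pollingRequestsPerSecond(users, intervalMs) {
  return users * (1000 / intervalMs);
}

// Hosting cost attributed to each user for one month.
function monthlyCostPerUser(monthlyHostingCost, users) {
  return monthlyHostingCost / users;
}

const requestsPerSecond = pollingRequestsPerSecond(100, 1000);
const costPerUser = monthlyCostPerUser(20, 100);
console.log(requestsPerSecond, costPerUser);
```

Under these assumptions the server handles 100 polling requests every second even when no one is posting, which is why a push model is attractive for lowering the cost per user.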
Contributions
The BuzzPost architecture emphasizes a user experience that is fast, simple and
reliable. Specifically, we developed the following components:²
1. Automatic updates. The list of threads and individual conversations
updates automatically, like an AIM conversation.
2. Format content. Users can format submissions.
Figure 2. Text formatting and emoticon example
3. Seamless log in and user authentication system. Users can log in, log
out and/or register without loading a separate page. Database lookups
and insertions are done “on the fly.” For example, if the user enters
a username that already exists, he or she is notified immediately.
Figure 3. Instant feedback when the user
selects a username that already exists
4. CAPTCHA Authentication. When a user registers for the first time, he or
she receives an email with a CAPTCHA code that must be entered at the
site to fully activate his or her account. A more conventional solution
would be to embed a CAPTCHA in the log in form and conduct the email
verification by sending a link the user can click on to verify his or her email
address. Although conducting the CAPTCHA test via email makes the
process slightly more cumbersome we believe there is a benefit in that the
system makes it slightly more difficult for a malicious user or bot to set up
multiple accounts.
² Most components have been implemented in the final version at http://www.buzzpost.net.
However, some are still in “beta” at http://www.buzzpost.net/vinit/buzzpost3/
Figure 4. Validation email from BuzzPost with
CAPTCHA code to enter at the site
5. Moderation system. A moderation system is available for users to rate
comments and flag inappropriate content. Checks and balances are in
place so that users cannot rate their own posts or rate a post more than
one time.
Figure 5. User can rate each post
6. Edit profiles. Users can edit profiles. We employed Ajax so that users
can toggle between “edit” and “save” mode without incurring a separate
page refresh.
Figure 6. A user edits his profile
7. Search engine. BuzzPost includes a simple search engine. The search
mechanism is fast and easy to use; a separate page refresh is not
required. Next, we would like to implement a search algorithm that takes
the forum’s moderation ratings into account.
8. Private messaging system. Similar to Facebook, users can send private
messages to one another.
Figure 7. User is sending a private message
9. Cross-platform compatibility. BuzzPost is designed to work across web
browsers and operating systems. It is important that everyone be able to
access BuzzPost forums regardless of the software they use.
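The checks and balances described for the moderation system (item 5 above) could be sketched as follows. This is an illustrative model rather than the BuzzPost implementation: the class name, the field names and the result objects are all assumptions, and a real deployment would enforce the same rules on the server against the database.

```javascript
// Hypothetical ledger that enforces the two moderation rules from item 5.
class ModerationLedger {
  constructor() {
    this.ratings = new Map(); // postId -> Map(userId -> score)
  }

  // Record a rating unless the user wrote the post or already rated it.
  rate(post, userId, score) {
    if (post.authorId === userId) {
      return { ok: false, reason: "own-post" };
    }
    const byUser = this.ratings.get(post.id) || new Map();
    if (byUser.has(userId)) {
      return { ok: false, reason: "already-rated" };
    }
    byUser.set(userId, score);
    this.ratings.set(post.id, byUser);
    return { ok: true };
  }

  // Sum of all accepted ratings for a post.
  total(postId) {
    const byUser = this.ratings.get(postId);
    if (!byUser) return 0;
    let sum = 0;
    for (const score of byUser.values()) sum += score;
    return sum;
  }
}

// Demo: the author and a repeat voter are both turned away.
const ledger = new ModerationLedger();
const post = { id: 7, authorId: "alice" };
const own = ledger.rate(post, "alice", 1);
const first = ledger.rate(post, "bob", 1);
const repeat = ledger.rate(post, "bob", 1);
console.log(own.reason, first.ok, repeat.reason, ledger.total(post.id));
```

Keying the ratings by post and then by user makes the "one rating per user per post" rule a simple lookup.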
Extensions
1. Convert to Comet. As noted above, converting the architecture from
traditional polling to Comet will be key to the scalability of this project.
2. Alternate layouts. We will develop alternate layouts to display content.
3. Alternate versions. We will implement a javascript-free version and a
version compatible with mobile devices.
4. Content syndication. We will enable RSS feeds of the forum and/or
individual threads. In addition, users will be able to track threads through
email notifications.
5. GUI for webmasters. We will create a GUI for webmasters to manage
and design their communities. Forum statistics will also be available here.
6. Advertising model and business plan. We will develop a business plan
to assess the feasibility of supporting many forums via advertising. We
will consider the pros and cons of “open sourcing” the BuzzPost code.
7. Additional testing. Many of the themes discussed in class are relevant
to BuzzPost, particularly those relating to social networks and Web 2.0.
For example, one paper we read entitled “Expertise Networks in Online
Communities: Structure and Algorithms” [1] analyzes how expertise is
disseminated in an online forum. We believe similar experiments could be
conducted using the BuzzPost architecture. In addition, since BuzzPost
relies heavily on Ajax and Ajax lowers the overhead associated with each
page request, we believe BuzzPost lends itself to experiments that have to
do with data caching and replication in a Web 2.0 environment. This
semester we read a paper entitled “Enhancing the Web’s Infrastructure: From
Caching to Replication” [2]. Perhaps this paper could serve as the
foundation for additional studies involving BuzzPost.
Related work
BuzzPost provides a unique user experience and numerous innovations
compared to existing forums. However, it is wise to analyze online forums that
already exist. Wikipedia contains descriptions of nearly one hundred forums.³
We examined each one in detail. Some are open source, others are proprietary;
some are free, others are not. A variety of programming languages are used:
PHP, ASP, Ruby, Perl and others. Most forums do not offer remote hosting as
we plan to offer someday; instead users must install the code on their own
server. The first table contains details about popular forum software. The
second table points to existing forums that appear to have been developed
independently.
Table 2. Popular Forum Software
| Name | Website | Cost | Technology | Remote Host? |
| --- | --- | --- | --- | --- |
| phpBB | http://www.phpbb.com | Free | PHP | No |
| vBulletin | http://www.vbulletin.com | $85/yr | PHP | No |
| EZ Board (now Yuku) | http://www.ezboard.com | Free (Ad supported). Can upgrade to “pro” version for $36/yr. | Smalltalk | Yes |
| Vanilla | http://www.getvanilla.com | Free | PHP/Ajax | No |
| ProBoards | http://www.proboards.com | Free (Ad supported) | PHP (?) | Yes |
| Orca | http://www.demozzz.com/orca/demo/ | Free | PHP/Ajax | No |
³ http://en.wikipedia.org/wiki/Comparison_of_Internet_forum_software_%28ASP%29,
http://en.wikipedia.org/wiki/Comparison_of_Internet_forum_software_%28PHP%29,
http://en.wikipedia.org/wiki/Comparison_of_Internet_forum_software_%28Other%29
Table 3. Existing Forums
| Name | Website | Comment |
| --- | --- | --- |
| Adobe | http://www.adobe.com/cfusion/webforums/forum | Similar to phpBB with some unique features. |
| FLV Media Player | http://www.jeroenwijering.com/?forum=All | Sleek but simple. Cool emoticons and search engine. |
| Phantasy Tour | http://www.phantasytour.com/phish/boards.cgi | Clean layout and simple functionality. |
| QBN | http://www.qbn.com/filter/all/public_voice/recent/24_hours/ | Innovative layout. |
| I Love Everything | http://www.ilxor.com/ILX/index.jsp | Proof that a popular forum doesn’t have to look pretty. |
| Think Tank | http://www.thinktankforums.com/ | Minimalist design. |
References
[1] J. Zhang, M. Ackerman, and L. Adamic. Expertise Networks in Online
Communities: Structure and Algorithms. WWW 2007.
[2] M. Baentsch, L. Baum, G. Molter, S. Rothkugel, and P. Sturm. Enhancing
the Web’s Infrastructure: From Caching to Replication. IEEE Internet
Computing, vol. 1, no. 2, pages 18-27, April 1997.
Appendix A: Traditional Polling Architecture
A diagram of the traditional polling architecture used by BuzzPost.
Step 1: Clients repeatedly check the value contained in a document
cached by the database. Our prototype sets an interval of 1000
milliseconds.
Step 2: One client inserts a new message into the database. A new result
is cached.
Step 3: The clients detect that the cached value has changed.
Step 4: The clients query the database and new content is returned to the
client. Return to Step 1.
Appendix B: Comet or “Ajax Push” Architecture
A diagram of Comet or “Ajax Push” architecture that could be implemented by
BuzzPost.
Step 1: Clients wait for updates from the server.
Step 2: Client inserts a new message into the database. Cache value is
updated.
Step 3: Immediately, the new value is pushed to the clients.
Step 4: Clients return to “listening mode.”