CMS, Wikis, and Wikipedia
  Ahmed Sameh

Wikis
  Original vision / implementation
    Ward Cunningham: 1994/1995
    See c2.com/cgi/wiki?WikiWikiWeb
  Idea: open editing of web content
  Lots of instances, lots of tools
    en.wikipedia.org/wiki/
    en.wikipedia.org/wiki/List_of_wikis
    en.wikipedia.org/wiki/List_of_wiki_software
    en.wikipedia.org/wiki/Comparison_of_wiki_software

Wikipedia
  Anyone can edit
    Well... en.wikipedia.org/wiki/Wikipedia:Flagged_revisions
    Interesting read: http://www.independent.co.uk/lifestyle/gadgets-and-tech/features/is-wikipedia-crackingup-1543527.html
  Recent changes
  Watchlists
  Massive amounts of discussion and massive social conventions

Wikipedia stats
  One of the 10 most visited web sites
  http://en.wikipedia.org/wiki/Wikipedia:Statistics
  Over 2.7 million articles (English)
  Users have made over 282 million edits, with an average of 17.86 per page, since July 2002.
  User statistics
    Over 8.8 million registered user accounts
    Over 162 thousand active in the last thirty days
    About 1600 administrators

Interesting Wikipedia URLs
  http://en.wikipedia.org/wiki/Talk:Bob_Dylan
  http://en.wikipedia.org/w/index.php?title=Bob_Dylan&action=history

Studying Cooperation and Conflict between Authors with history flow Visualizations
  Fernanda Viegas, Martin Wattenberg, and Kushal Dave

Summary
  Developed history flow visualization
  Applied it to a (presumably) hand-selected sample of 70 or so Wikipedia pages
  Identified some interesting patterns
    Contribution of different authors
    Vandalism + repair
    Edit wars
  Did some statistical analysis
    Mean/median time to repair one type of vandalism

history flow
  http://www.research.ibm.com/visual/projects/history_flow/
  http://www.research.ibm.com/visual/projects/history_flow/gallery.htm

Interlude...
  From Priedhorsky et al., GROUP 2007

Rapidity of damage repair
  Persistence      # incidents
  ≥ 1 view         58% (i.e., 42% had no impact)
  ≥ 10 views       31%
  ≥ 100            11%
  ≥ 1,000          0.75% (16k)
  ≥ 10,000         0.06% (1.3k)

Discussion
  The impact of damage is low but nonzero ... and rising (?)
  1. n humans review each revision quickly
  2. Ensure each article is on n watchlists
  n × 28k humans at 5 mins daily work

More quantitative research on Wikipedia
  Coordination and conflict
    kittur.org/research.html
  Aaron Halfaker
  Quantifying value of edits, impact of damage
  Wikipedians are born, not made
  Reid Priedhorsky
  Katie Panciera
  Edit quality
  Michael Ekstrand, Jilin Chen

SeeSoft
  The Seesoft software visualization system allows one to analyze up to 50,000 lines of code simultaneously by mapping each line of code into a thin row. The color of each row indicates a statistic of interest, e.g., red rows are those most recently changed, and blue are those least recently changed. Seesoft displays data derived from a variety of sources, such as version control systems that track the age, programmer, and purpose of the code (e.g., control ISDN lamps, fix bug in call forwarding); static analyses (e.g., locations where functions are called); and dynamic analyses (e.g., profiling). By means of direct manipulation and high interaction graphics, the user can manipulate this reduced representation of the code in order to find interesting patterns. Further insight is obtained by using additional windows to display the actual code. Potential applications for Seesoft include discovery, project management, code tuning, and analysis of development methodologies.

WikiScanner
  http://en.wikipedia.org/wiki/WikiScanner

Issues
  history flow visualizations – useful?
  How confident are you in the results?
    “this visualization helps identify patterns, not explain them; and it is not clear how it extends onto a larger scope than just one page at a time”
  Does the paper tell us just about controversial topics?
    “My only questions revolve around the articles chosen for inspection”

Issues
  What do we know about fixing “vandalism” on Wikipedia after reading this article?
    Why do users watch?
    How many users have to be watching?
    How many users have to be beneficent?
  Metrics? (see Reid’s work)
  Computational definitions of “edit wars”, “vandalism”, etc.?
    See http://en.wikipedia.org/wiki/User:AntiVandalBot

Issues
  Coordination and conflict
    See kittur.org/research.html
  How to achieve consensus
    Limited privileges vs. open privileges + social conventions
    Note: Flagged Revisions
  How to avoid “vandalism”
    Technical approaches vs. limited privileges vs. social norms (aided by mechanisms)
  Social approaches require scale to work?
    Large vs. small wikis

Issues
  Tyranny of the first (editor)
    Do the history flows give any evidence about this?
  Wikipedia and adversarial collaboration
    NPOV == POV-X + POV-Y?

Issues
  Other mechanisms
    Slashdot mass moderation
    ???
  Scenarios – what high-level tool would you choose for…

Content Management Systems and Drupal
  http://en.wikipedia.org/wiki/Content_management_system
  http://en.wikipedia.org/wiki/Comparison_of_content_management_systems

CMS
  “Publish” web sites without (much) programming (required)
  Content separated from presentation
    Templates
  Workflow
  Administrative interfaces
  Database backend
  Modular architecture
    Can program as necessary

Drupal – a powerful CMS
  Blogging
  Syndication: RSS
  Forums
  “Books”
  News + comments
  Polls
  User roles + permissions
  Web-based administration

Drupal requirements
  Web server (Apache)
  PHP
  Database server (MySQL)
  Installation not trivial(?)
    Already installed on ITLabs machines

Drupal information
  http://en.wikipedia.org/wiki/Drupal
  http://drupal.org/
  http://drupal.org/about (etc.)

Drupal examples
  Communitylab.org
  Grouplens.org
  glaros.dtc.umn.edu/
  volunteermatch.com
  civicrm.org
  www.defectivebydesign.org/en/node
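
The CMS idea of "content separated from presentation" via templates can be illustrated with a small sketch. This is not how Drupal implements it (Drupal is written in PHP and has its own theming layer); it is a minimal Python illustration using the standard library's string.Template. The content record, the two templates, and the render helper are all hypothetical names invented for this example.

```python
from string import Template

# Hypothetical content record, standing in for a row in the CMS database.
# It carries no markup: only the data an editor would type into a form.
article = {
    "title": "Wikis and Wikipedia",
    "author": "Example Author",
    "body": "Anyone can edit. Well...",
}

# Two hypothetical templates for the SAME content: a web page and a feed item.
HTML_PAGE = Template(
    "<article>\n"
    "  <h1>$title</h1>\n"
    '  <p class="byline">by $author</p>\n'
    "  <p>$body</p>\n"
    "</article>"
)
RSS_ITEM = Template(
    "<item>\n"
    "  <title>$title</title>\n"
    "  <author>$author</author>\n"
    "  <description>$body</description>\n"
    "</item>"
)


def render(template: Template, content: dict) -> str:
    """Fill a template's $placeholders from a content record.

    Note: this sketch does no HTML/XML escaping; a real CMS must escape
    user-supplied content before inserting it into markup.
    """
    return template.substitute(content)


html_page = render(HTML_PAGE, article)
rss_item = render(RSS_ITEM, article)

print(html_page)
print(rss_item)
```

Changing the look of the site means editing a template; changing what the site says means editing the content record. Neither change touches the other, which is what lets a CMS "publish" without (much) programming.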