The PhD School of Science
FACULTY OF SCIENCE
UNIVERSITY OF COPENHAGEN
DENMARK

PhD thesis
Rut Jesus

Cooperation and Cognition in Wikipedia Articles
A data-driven, philosophical and exploratory study

Academic advisor: Claus Emmeche
Center for the Philosophy of Nature and Science Studies
Submitted: 15/7/2010

The picture on the cover uses Wikipedia's puzzle logo, mapped onto a Buckminster Fuller projection. From Wikipedia: "The Dymaxion map or Fuller map is a projection of a World map onto the surface of a polyhedron, which retains most of the relative proportional integrity of the globe map." It represents the art of mapping a thesis, of mapping ideas into a thesis: the idea of combining new technologies and approaches, projections and pieces of a puzzle into a new endeavor. Pieces are missing.

A. Summary

The new socio-technological systems of the internet involve complex collaborative behaviors, of which Wikis in general, and Wikipedia especially, are a particularly successful case. This encyclopedia has created and harnessed new social and work dynamics, which can provide insight into specific aspects of cognition, as amplified by a multitude of editors and their ping-pong style of editing, and by the spatial and temporal flexibility that its unique technology-community fostering features allow. Wikipedia's motto "The Free Encyclopedia That Anyone Can Edit" is analyzed to reveal human, technological and value actors within a theoretical context of distributed cognition, cooperation and technological agency. As this work is part of an emergent field of Wiki Studies, with an interdisciplinary approach, three avenues of inquiry are used to research cooperation and cognition in Wikipedia articles.

1. In the section on data-driven studies, data from Wiki log pages reveal special patterns of editor-article networks. The patterns analyzed come from the pool of Wikipedia content articles and also from the pool of coordination and policy articles in the English, Portuguese and Spanish Wikipedias. Bicliques are used to detect overlapping clusters and to extract particular themes of collaboration. Network visualization and bicliques are used and developed to focus more closely on the process of collaboration in the production of the article "Prisoner's dilemma" and the policy article "Neutral Point of View". The tools used reveal clusters of interest, dense areas of coordination, and a blend between coordination and direct editing work, and they point to Wikipedia's dynamic stability in content and form.

2. In the section on philosophical-cognitive studies, a distinction between Cognition for Planning and Cognition for Improvising is proposed to account for Wikipedia's success and its mode of editing, whereby many small edits are used for its improvement. This theoretical distinction highlights the role of two forms of cognition for Wikis, insofar as a surplus of Cognition for Improvising is harnessed. The implications of this proposal for cognition are also investigated, including the possibility that the Wiki-article participates in a milestone of cognitive evolution. While there seems to be a strong case that Wikis tap Cognition for Improvising, there seems to be only a weak and agnostic case for how much Wikis change or enhance cognition.

3. The exploratory part takes the questions through the WikiWay: by installing a 'live-Wiki' at relevant conferences, characteristics of Wikis and cooperation are investigated from an artistic angle. As the debate on Free Culture calls for participation and for taking charge of one's own education, the 'Our Coll/nn/ective Minds' piece reflects on several aspects of Wikis, free culture, open source and Do-It-Yourself by engaging in the debate in a more creative and participative form, engaging the participants and crystallizing topics of discussion.

These studies contribute to constructing an ecology of the article, a vision of bottom-up humanities, and a better understanding of cooperation and cognition within socio-technological networks.

B. Resumé (på Dansk)

Internettets nye socio-teknologiske systemer involverer kompleks kollaborativ adfærd. Wiki, herunder især Wikipedia, er et særligt vellykket eksempel på dette. Encyklopædien har skabt nye sociale dynamikker og nye sociale arbejdsformer, som kan give indsigt i de aspekter ved kognition, der bliver tydelige, når mange redigerer teknologisk-kollaborativt, når redigeringsprocessen udfolder sig i en ”ping-pong-stil”, og når der er en rumlig og tidslig fleksibilitet indenfor rammerne af et teknologisk fællesskab. Wikipedias motto "Den Frie Encyklopædi Som Alle Kan Redigere" analyseres for at indkredse de menneskelige, teknologiske og værdimæssige aktører inden for en teoretisk sammenhæng af distribueret kognition, kooperation og teknologisk handlen. Da denne undersøgelse er en del af et nyt videnskabeligt felt, de såkaldte Wiki Studies, som generelt har en udpræget tværfaglig tilgang, er der i undersøgelsen valgt tre forskellige metodiske veje, som tilsammen udforsker kooperation og kognition i Wikipedia-artikler.

1. Afsnittet data-driven studies inddrager data fra wiki log-sider. Her viser undersøgelsen en række særlige mønstre i de netværk, der afspejler relationerne mellem redaktører og artikler. Både mønstre i indholdsartikler såvel som i koordinerings- og policy-artikler i den engelske, portugisiske og spanske Wikipedia analyseres. Der anvendes bicliques for at opdage overlap mellem klyngedannelse og for at uddrage specifikke eksempler på kollaboration. Netværksvisualisering og bicliques bruges og udvikles endvidere med særligt henblik på at undersøge den kollaborative proces i forbindelse med artiklen “Prisoner’s Dilemma” (fangernes dilemma) og policy-artiklen ”Neutral point of view” (Skriv Wikipedia fra et neutralt synspunkt).
De forskellige undersøgelsesværktøjer viser interesse-klynger, fortættede områder med koordinering, en blanding af koordinering og direkte redigering, som peger mod Wikipedias dynamiske stabilitet i indhold og form.

2. I afsnittet om filosofisk-kognitive studier foreslår undersøgelsen, at et skel mellem Cognition for Planning (planlæggende kognition) og Cognition for Improvising (improviserende kognition) er grundlaget for encyklopædiens succes og redigeringsform, hvor mange små ændringer er afgørende for forbedringen af encyklopædien. Dette teoretiske skel sætter fokus på betydningen af to former for kognition i wiki’er, idet et overskud i den improviserende kognition udnyttes. Implikationerne af den teoretiske distinktion undersøges yderligere, herunder den mulighed at Wiki-artiklerne er en milepæl i den kognitive evolution.

3. I afsnittet med en explorative vinkel arbejdes spørgsmålene igennem the WikiWay: igennem en installation af en "live-wiki” på relevante konferencer udforskes de særlige træk ved wiki’er og kooperation ud fra en kunstnerisk vinkel. I dette tilfælde er det debatten om Free Culture, der inspirerer. Værket "Our Coll/nn/ective Minds” reflekterer over forskellige aspekter af fænomenet wiki, Open Source, Do-It-Yourself. Og dette finder sted i kraft af en kreativ og engagerende deltagelse, der inddrager andre aktører og som udkrystalliserer diskussioner af emner i tydelige former.

Undersøgelsen bidrager til en beskrivelse af wiki-artiklernes ”økologi”, en vision om humanvidenskabernes bottom-up-dynamik, samt en bedre forståelse af kooperation og kognition i socio-teknologiske netværk.

C. Resumo (em Português)

Os novos sistemas sócio-tecnológicos da internet envolvem complexas colaborações, entre as quais os wikis são em geral um caso de sucesso e, em particular, a Wikipedia.
Esta enciclopédia criou e explorou uma nova dinâmica social e de trabalho, que permite obter informações sobre aspectos específicos da cognição, amplificada por um grande número de editores, pelo estilo editorial de revisão múltipla, e pela inerente flexibilidade espacial e temporal característica da tecnologia-comunitária utilizada. O lema da Wikipédia "a enciclopédia livre que todos podem editar" é analisado para revelar os actores humanos, os processos tecnológicos e os valores subjacentes, no contexto teórico da cognição distribuída, cooperação e tecnologia. A presente tese resulta de um trabalho interdisciplinar, englobado no vasto campo de estudos Wiki actualmente emergente, pelo que apresenta três distintas direcções de investigação para a compreensão da cooperação e cognição nos artigos da Wikipedia.

1. Na secção de análise de dados das páginas históricas da wiki revelam-se padrões especiais subjacentes às redes artigo-editor. Os padrões descritos resultaram da análise conjunta de artigos de conteúdo da Wikipedia, e de artigos de coordenação das Wikipedia em Inglês, Português e Espanhol. A utilização de bicliques possibilitou a detecção de sobreposição de clusters e a descoberta de temas específicos de colaboração. Paralelamente, utilizaram-se redes de visualização e bicliques para uma maior compreensão do processo de colaboração na produção de artigos específicos como o "Prisoner’s dilemma" (Dilema do Prisioneiro) e o "Neutral Point of View" (Princípio da Imparcialidade). As várias ferramentas descritas permitiram detectar a existência de grupos de interesse, de densas áreas de coordenação, e de misturas entre coordenação e trabalho de edição directa, que realçam a estabilidade dinâmica da Wikipedia em conteúdo e em forma.

2. Na secção de estudos filosófico-cognitivos, propõe-se uma distinção entre a Cognition for Planning (Cognição de Planeamento) e Cognition for Improvising (Cognição de Improvisação) contribuindo para a compreensão do sucesso da Wikipédia e do modo de edição-múltipla, baseado em pequenas contribuições para a melhoria de qualidade dos artigos. Esta distinção teórica destaca o papel das duas formas de cognição referidas e a importância da utilização do excedente de Cognition for Improvising. As implicações desta proposta para a cognição também são investigadas, incluindo a potencial participação de um wiki-artigo numa etapa de evolução cognitiva. Neste contexto, realça-se de um modo marcante o aproveitamento da cognição de improvisação, contrastando com a menor relevância e convicção apresentada em relação à importância dos wikis na mudança ou melhoria da cognição.

3. A secção experimental baseia-se na análise de características de wikis e das relações de cooperação, através de um enquadramento artístico e utilizando os processos da WikiWay. Este estudo fundamentou-se na realização de uma instalação de "live-wiki" em conferências, o que permitiu demonstrar que o debate sobre a Cultura Livre apela à participação e a uma educação própria. A "Our Coll/nn/ective Minds” reflecte sobre vários aspectos dos wikis, “open source”, “Do-It-Yourself”, e analisa a forma criativa e participativa de envolvimento dos participantes e de cristalização de tópicos de discussão.

Finalmente considera-se que os estudos apresentados contribuem para a construção de uma ecologia do artigo, uma visão das ciências humanas de baixo para cima, e um melhor entendimento da cooperação e da cognição no âmbito de redes sócio-tecnológicas.

D. Resumo (en Esperanto)

La novaj socio-teknologiaj sistemoj de Interreto inkluzivas kompleksajn kunlaborajn kondutojn, inter kiuj vikioj ĝenerale, kaj precipe Vikipedio, estas aparte sukcesa kazo.
Vikipedio kreis kaj utiligis novajn sociajn kaj laborajn dinamikojn, kiuj povas provizi ekkomprenon en specifaj flankoj de kogno, kiel amplifigita de homamaso de redaktoroj kaj ilia tablotenisa stilo de redakto, spaca kaj tempa fleksebleco ene de unika teknologikomunumo. La Vikipedia devizo "La Libera Reta Enciklopedio Redaktebla de Ĉiuj" estas analizita per la homaj, teknologiaj kaj valoraj aktoroj ene de teoria kunteksto de distribuita kogno, kunlaboro kaj teknologia perado. Ĉar ĉi tiu laboro estas parto de ekaperanta fako de Vikiaj Studoj, kun interdisciplina aliro, tri vojoj estas uzitaj por esplori kunlaboron kaj kognon en Vikipediaj artikoloj.

1. En la sekcio pri datum-pelitaj studoj uzante datumojn de la protokoloj de viki-paĝoj, malkovreblas specialaj formiĝoj de aŭtoro-artikolaj retoj. La formiĝoj analizitaj estas el la aro de Vikipediaj enhav-artikoloj sed ankaŭ el la aro de kunordigaj artikoloj en la angla, portugala kaj hispana Vikipedio. Duklikoj estas uzitaj por eltiri apartajn temojn de kunlaboro. Videblaj retoj kaj duklikoj estas uzitaj kaj evoluitaj por enfokusigi la procezon de kunlaboro en la produktado de la artikoloj “Prisoner’s Dilemma” (La dilemo de la malliberulo) kaj la meta-artikolo ”The Neutral Point of View” (Neŭtrala Vidpunkto). La pluraj iloj uzitaj malkovris grupiĝojn de interesiĝo, densajn regionojn de kunordigado, mikson inter kunordigado kaj rekta redaktolaboro kaj indikas la dinamikan stabilecon de Vikipedio enhave kaj forme.

2. En la sekcio pri filozofi-kognaj studoj, distingo inter kogno por planado kaj kogno por improvizo estas proponita por esti la kialo de la sukceso de Vikipedio, kie multaj malgrandaj redaktoj estas uzitaj por ĝia plibonigo. Ĉi tiu teoria distingo akcentas la rolon de du formoj de kogno, kaj vikioj prenas superfluon de kogno por improvizo. La konsekvencoj por kogno de ĉi tiu propono estas ankaŭ esploritaj, inkluzive la eblecon, ke la viki-artikolo partoprenas en mejloŝtono de la kogna evoluo.

3. La esplora parto faras la demandojn tra la VikiVojo: farante instalaĵon de 'viva vikio' dum taŭgaj konferencoj helpis kompreni karakteristikojn de vikioj kaj esplori ilin de arta angulo. Kiel la debato pri la Libera Kulturo postulas partoprenon, ĉiu prenante sian propran instruon, la peco 'Our Coll/nn/ective Minds' (Niaj Ko/lektiv/nektit/aj Mensoj) pripensigas pri pluraj aspektoj de vikioj, open source, do-it-yourself, kaj kreaj partoprenaj formoj, okupante la partoprenantojn, kaj kristaligante temojn de diskuto.

Ĉi tiuj studoj kontribuas por konstrui ‘ekologion’ de la artikolo, homec-studoj el fundo al supren, kaj komprenante pli bone kunlaboron kaj kognon en soci-teknikaj retoj.

E. Abstract

Wikipedia has created and harnessed new social and work dynamics, which can provide insight into specific aspects of cognition, as amplified by a multitude of editors and their ping-pong style of editing, and by the spatial and temporal flexibility that its unique technology-community fostering features allow. Wikipedia's motto "The Free Encyclopedia That Anyone Can Edit" is analyzed to reveal human, technological and value actors within a theoretical context of distributed cognition, cooperation and technological agency.

In the data-driven studies, which use data from Wiki log pages, network visualization and bicliques are used and developed to focus more closely on the process of collaboration in articles and meta-articles, and inside the article "Prisoner's dilemma" and the policy article "Neutral Point of View". The tools used reveal clusters of interest, dense areas of coordination, and a blend between coordination and direct editing work, and they point to Wikipedia's dynamic stability in content and form.

In the philosophical-cognitive studies, a distinction between Cognition for Planning and Cognition for Improvising is proposed to account for Wikipedia's success and its mode of editing, whereby many small edits are used for its improvement.

In the exploratory part, an installation of a 'live-Wiki', the 'Our Coll/nn/ective Minds' piece, reflects on several aspects of Wikis, free culture, open source and Do-It-Yourself by engaging in the debate in a more creative and participative form.

These studies contribute to constructing an ecology of the article, a vision of bottom-up humanities, and a better understanding of cooperation and cognition within socio-technological networks.

F. Table of Contents

A. Summary
B. Resumé (på Dansk)
C. Resumo (em Português)
D. Resumo (en Esperanto)
E. Abstract
F. Table of Contents
G. Preface
H. Acknowledgements
I. INTRODUCTION
1 A motto
1.1 Motivation for the structure of the introduction
1.2 Wikipedia’s motto
1.3 The
1.4 Free
1.5 Encyclopedia+
1.6 That
1.7 Anyone
1.8 Can
1.9 Edit
2 Wikipedia’s Conversation Game
2.1 The Board
2.2 The Deck
2.3 Objectives
2.4 The Cards
2.5 A game
2.6 NYAQ
2.7 Reflections
3 Research Questions
3.1 Meta-thoughts about questions
3.2 Two Sets of Research Questions
3.3 Thesis’ objective
3.4 Process of Investigation
3.5 Research Paths
3.6 Roadguide
II. THEORETICAL PERSPECTIVES
1 Theories
1.1 Distributed Cognition
1.2 Hypotheses and Positioning
1.3 Cognitive Artifacts
1.4 In the Wild and Doing/Situation and Action
1.5 Time&Space
1.6 Shortcomings
2 C5
2.1 Cooperation & Collaboration
2.2 &Coordination
2.3 Comparing cooperation, collaboration and coordination
2.4 Contribution
2.5 Community
3 Wikis Do More Because They Do Less
3.1 Wikis’ agency
3.2 Transparencies
3.3 Actors
3.4 Conversations
3.5 Mediations
3.6 Architecture
3.7 Technology
III. DATA-DRIVEN STUDIES
1 Introduction to the data Studies
1.1 Bottom-Up Humanities
1.2 Methodological choices
1.3 The Use of Networks and Bicliques
1.4 The Following Studies
2 Bipartite Networks of Wikipedia’s Articles and Authors: a Meso-level Approach
2.1 Abstract
2.2 Introduction
2.3 Method
2.4 Results
2.5 Discussion
2.6 Limitations and Recommendations for Future Research
2.7 Conclusion
2.8 Acknowledgements
3 We Coordinate, Nosotros Eligimos, Nós Administramos: Articles and Editors of MetaWiki and in Meta Communities of the Ibero-South American Wikipedias
3.1 Abstract
3.2 Introduction
3.3 Methodology
3.4 Results
3.5 Discussion
3.6 Limitations and Directions for Future Research
3.7 Conclusion
3.8 Acknowledgments
4 Context Networks of the Articles ‘Prisoner’s Dilemma’ and ‘Wikipedia:Neutral Point of View’
4.1 Abstract
4.2 Introduction
4.3 Methodology
4.4 Results
4.5 Discussion
4.6 Acknowledgments
5 History of the ‘Prisoner’s Dilemma’
5.1 Pre-history
5.2 History of the article
5.3 Discussion
5.4 Present/Future Research
6 Networks of Wikipedia Article: Insides of ‘The Prisoner’s Dilemma’
6.1 Abstract
6.2 Introduction
6.3 Method
6.4 Results
6.5 Discussion
6.6 Limitations and Recommendations
6.7 Conclusion
6.8 Acknowledgments
7 Inside the Policy Article ‘The Neutral Point of View’
7.1 Abstract
7.2 Introduction
7.3 Methodology
7.4 Results
7.5 Discussion
7.6 Conclusion
7.7 Acknowledgments
IV. PHILOSOPHICAL/COGNITIVE
1 Introduction to Cognition Studies
1.1 How does Wikipedia work in practice
1.2 Wikis and Cognition
2 What Cognition Does for Wikis
2.1 Cognition for Planning vs. Cognition for Improvising
2.2 Cognition for Improvising Surplus
3 What Wikis Do For Cognition
3.1 Cognitive milestones
3.2 Information’s support
3.3 Information production’s consequences
4 Implications for Cognition
4.1 ESD Cognition Umbrella
4.2 Cognition vs. Cognizing
4.3 Cognition is Thinking and/or Problem-Solving and/or Information-Processing?
4.4 Extension
4.5 Ecological
4.6 Understanding Cognition
4.7 Transience Consequences
V. EXPLORATION: ARTISTIC/EXPERIMENTAL/EXPERIENTIAL
1 Motivation and Explanation, serving as “The Introduction”
1.1 Art and Science
1.2 WikiWay
1.3 Action/Design-based research
2 Our Coll/nn/ective Mind Proposal & Summary
2.1 Summary
2.2 The Theoretical Framework
2.3 Methodology
2.4 The Scope, The Hypothesis, The Questions
3 OCM Report “The Results”
3.1 WikiWars
3.2 WikiSym 2010
3.3 Wikimania 2010
3.4 Comparison of the reception at WikiWars, WikiSym and WikiMania
4 OCM Thoughts “The Discussion & Evaluation”
4.1 Revisiting visualization-image-mindmap-inspiration
4.2 Revisiting Open Space Technology inspiration
4.3 Revisiting it as a Live Wiki
4.4 Thoughts on
5 OCM Future
5.1 Visualization/Communication
5.2 Discussion time
5.3 Game
5.4 How-to guide?
VI. DISCUSSION
1 Method & methodology
2 Ecology of the article
2.1 Some Sketched Steps for an Ecology of the Article
3 Cooperation and Cognition
3.1 Reshuffling
3.2 Cognition supports Cooperation
VII. CONCLUSIONS & PERSPECTIVES FOR FUTURE RESEARCH
VIII. REFERENCES AND BIBLIOGRAPHY
W. APPENDIX W
1 On whole datasets and Projects that fall through
2 Wiki-Writing Collaborative Patterns
2.1 Goal
2.2 Framework
2.3 Genesis
2.4 Methodology
2.5 Contributions
2.6 Discussions
2.7 Conclusion
2.8 Bibliography
3 Nostalgia
3.1 Networks
4 Dansk
4.1 Statistik
5 Esperanto
5.1 Rakonto
5.2 Kelkaj statistikaj informoj
5.3 Retoj
X. APPENDIX X
1 Reflections inside
2 Reflections on methodology
2.1 About abduction
2.2 On the process from idea to ‘idea that works’
2.3 On improving the theory and the tools while researching
Y. APPENDIX Y
1 Bipartite Networks of Wikipedia’s Articles and Authors: a Meso-Level Approach
2 What Cognition Does for Wikis
Z. APPENDIX Z
1 Cognitive Onto-Play
2 Ontological Labyrinth
2.1 Setting/Invitation/Acknowledgments
2.2 Introduction
2.3 Networks / Cooperation
2.4 Treasure Hunt
2.5 Gathering in the end
3 The Featured Task Game
META-PhD Description
256

G. Preface

It is now more than four years since I spent a concentrated, cold month of March pondering a project that I would care to carry through during the years ahead. Many ideas and issues had been with me for a long time, and others had just appeared from around the corner, from random email conversations, and from serendipity. I spent that month consolidating a project that was truthful to the questions I had in mind and that simultaneously made sense and would be relevant to the world around me.

I have always cherished cooperation. As I grew up among brothers, friends, and a house full of people, it was obvious that there was magic in sharing, in building on each other’s conversations and in constructing things together. My education in physics and philosophy made me very aware of the importance of detail and of describing situations in precise equations, while not forgetting the big questions, for which there were perhaps no answers but which were nonetheless important to consider. It must be from there that, all along, I have wanted to ask ‘a big old question’ of ‘a contemporary defined issue’. Perhaps keeping one foot on each chair (and why not, circuses are there to prove one can), I wanted to investigate something that was relevant in the past and in the future and, simultaneously, something that was relevant in the present, right here, right now.

Wikis fascinated me from the start, though I can’t even say that I was aware of them from their start. But the principles they ran by, the communities they engaged and their attitudes reminded me of many of the paths that I had also followed, be it learning Esperanto, traveling alone and trusting people, or being fascinated by what can happen in a living community with its permanent games of Nomic.
Wikipedia was already a phenomenon, one that was still to grow immensely and gain enormous relevance in the years I followed it closely, and one that, with its design and idea, invited unconventional working processes. There is no doubt that its success invites the easy stance of standing on top of the mountain and explaining it in an ad hoc fashion; still, its success is notable, and understanding it from the point of view of cooperation and cognition was a challenge I wanted to undertake.

A thesis is also a report of work done and a process with many turns. This thesis was originally called “Cooperation in Emergent and Distributed Cognition of Socio-Technological Networks, the case of the Wiki(Pedia) article”, but it has evolved since, not changing paths but becoming more precise, which gave rise to the simpler, current title: “Cooperation and Cognition in Wikipedia articles”. As will soon become clear, with its three paths (data-driven, philosophical-cognitive and exploratory-artistic), this project seeks a balance between depth and experimenting, scholarship and questioning. I wonder and worry whether the holes that are certainly present in its academic scholarship are impairing or are, as I wish, leaving space for the uniqueness of combining other processes of enquiry. To compensate, I have put extra effort into helping with road maps of various kinds, from the word clouds [1] between each major section to the detailed tables of contents and the descriptions of what is ahead at the start of each chapter.

I hope I’ve been able to convey some of the enthusiasm, struggle, discovery, and thought that have been behind this project as the studies ahead go through different paths of investigation. In a way, this is also just a start. I look forward to the sequels.

[1] Using www.wordle.net.

H. Acknowledgements

In these last days of the writing process, it has become evident that a thesis is just a part of life, and that life is an essential element of a thesis, in abstract ways but also in very concrete ways: life didn’t stop just because there was a thesis; life came with the death of dear ones, with moving, with constructing new projects, with falling in love, and with solving conflicts. For a long time, I was convinced I should dedicate this thesis to ‘All and everything that took me away from the PhD. As much as this has been fun, life lies elsewhere.’ Although I still think it is true that those people and those things that took me away from the thesis helped me dare to undertake this struggle, and although I think it is rather funny to mention ‘everything else’, I can’t avoid a much greater holistic attitude about my life being one and whole. Len, cuddle instructor, told me “you’ve been preparing for your next talk/presentation/exam since the day you were born”. This thesis is an important step in my biography, one that has been possible because of it, and not despite it.

Then, how to acknowledge all that have been part, the nights of hope and the days of sorrow? And in particular, in my condition as a fox, ‘who knows many things’ [2], how to pinpoint the important collaborators? I’ll try the distinction of actors between ‘values’, ‘technologies’ and ‘people’, which I also apply to Wikipedia.

[2] Read the beginning of Isaiah Berlin’s “The Hedgehog and the Fox”, which elaborates on “The fox knows many things, but the hedgehog knows one big thing.” Archilochus (7th century b.c.e.)

Values

I’ll start with the most abstract. Inspiration has been a crucial part of my existence. A necessary element. I am thankful to the books, the notes and the blogs that I’ve read. The talks and the conversations I’ve listened to. I am thankful to all those Wiki-writers and Wikidreamers, communities, freedoms, adventures and hopes.

Technologies

Then, a word to non-human actors: the grant from the Portuguese Foundation for Science and Technology, the internet, pen and paper, wikis, computers, Esperanto, art and science, oxytocin and Akrida-bicycle. Places: Copenhagen, Jomsborg, KoToPo, NydeHave, Christiania’s sauna and morgenstedet. And also, to the body: the lying position, contact-impro and acrobatics.

People

Family: A word for my academic parents and the dialectic of being wrong to study because of them, and wrong to study despite them, but who still inculcated in me the pleasure of thinking and learning. Besides those impossible models, a word for MãeWanda’s warm nurture. And then my brothers, Nuno’s permanent availability and Miguel’s consistent depth. And my extended family, siblings of blood and not, half-parents, grandma, Chandini and kids, and Robin.

Friends: Very concrete were my friends’ presence and care, both in Copenhagen, at the garden, on the phone and elsewhere: Renato Negô, Thoughtful Linda, Ja’Cometto, RasmusSumsar, Unævnlig Carsten, Peter Squirrel, Henrik BjørnØ, CassPi, Marta, A+K+3, Georg Øko, Katrine, Peter K, Sylvie, AnDi, Alex TE, Tomio Kŭono+Karin, Sis-Sofia+SebastianBro, SJ Care-boy, Tom Elephant-Keeper, MigLito I+ZiZa, André Exótico, Brendan Mr. Spinach, Lene ħ, Joshua, Kragen and Bee-Teea. I am also grateful to Mathias and Jakobo for last-minute revisions. In the end, through their help and support for the defense and offense, Nuno Mel and Jannik Tender were just invaluable.

Colleagues: To those colleagues and friends who embarked on the projects with me: to Sune and our funny Skype brainstorms; and to Anne, a companion on the other side of the Atlantic, sharing the same dreams and visions. To my colleagues at CPNSS, Marcela’s vegan precision, Christian’s queer support, Mikkel’s reliability, and to the greater context of SiV, in our steady discussions.
The NBI environment was also highly improved by Flemming’s structural maintenance, Thomas’ readiness for philosophically spiced discussions and Jesper’s existential responses.

Claus: And purposefully last, I would like to thank Claus Emmeche, my advisor with the charm and wisdom of the Little Prince. I wouldn’t have done this thesis without Claus who, much beyond the duties of support, revision and discussion, was a presence and a role model for seriousness, respect, depth and listening.

I hoped for much in doing this thesis: for time to reflect on a theme, for constructive conversations and for a place to grow. I got more.

I. INTRODUCTION

1 A motto ... 4
1.1 Motivation for the introduction’s structure ... 4
1.2 Wikipedia’s motto ... 4
1.3 The ... 7
1.4 Free ... 10
1.4.1 Utopias? ... 10
1.5 Encyclopedia+ ... 12
1.5.1 How ‘Pedia’ influences the ‘Wiki’ ... 12
1.5.2 How ‘Wiki’ influences the ‘Pedia’ ... 13
1.6 That ... 15
1.7 Anyone ... 17
1.7.1 Equality ... 17
1.7.2 From ‘Openness’ to ‘Expertise’ ... 17
1.7.3 From ‘Expertise’ to ‘Quality’ ... 18
1.7.4 Credibility & Authorship ... 18
1.7.5 Who writes Wikipedia ... 19
1.8 Can ... 21
1.8.1 Cooperation and Social Norms ... 21
1.8.2 Emergence, complexity, networking ... 22
1.9 Edit ... 25
2 Wikipedia’s Conversation Game ... 26
2.1 The Board ... 27
2.2 The Deck ... 28
2.3 Objectives ... 28
2.4 The Cards ... 29
2.5 A play ... 30
2.6 NYAQ ... 30
2.7 Reflections ... 31
3 Research Questions ... 32
3.1 Meta-thoughts about questions ... 32
3.2 Two Sets of Research Questions ... 32
3.3 Thesis’ objective ... 33
3.4 Process of Investigation ... 33
3.4.1 Diversity of Methods ... 33
3.4.2 Theory and Data ... 34
3.5 Research Paths ... 34
3.5.1 Data-Driven ... 34
3.5.2 Philosophical/Cognitive ... 35
3.5.3 Experimental/Experiential ... 35
3.6 Roadguide ... 36

This introduction comprises three parts.
In the first part, the motto “The Free Encyclopedia that Anyone Can Edit” sets the background for introducing Wikipedia (THE); some of its values (FREE); its lineage and novelty in being an encyclopedia+ (ENCYCLOPEDIA); its contingency (THAT); the people involved and discussions about ‘who edits Wikipedia’ (ANYONE); considerations about the commons (CAN); and the major aspect that is studied in the remainder of the thesis (EDIT). In the second part, after the motto has served as a point of departure for analyzing some essential aspects of Wikipedia and its study, the WikiWay of participation and education is followed, and the Wikipedia Conversation Game is introduced. Finally, setting the stage for the remainder of the thesis, the ‘Research Questions’ are put forth.

Publication status:
- 1: ‘Dissecting the Motto of Wikipedia’ was the main idea of a poster accepted at e-Social Science, Manchester 2008.
- 2: “The Wikipedia Conversation Game” has been presented at the Seminar in Science Theory, and at Open Spaces in WikiSym 2009, WikiSym 2010 and WikiMania 2010.
- 3: “The research questions” are thesis specific.

1 A MOTTO

1.1 Motivation for the structure of the introduction

Writing about Wikipedia, whichever of its dimensions one focuses on, makes two demands that are not easy to fulfill simultaneously: on the one hand, it is of the utmost importance to write about and consider deeply the general features of Wikipedia’s project, if one is to set the stage where the show can begin; on the other hand, Wikipedia is not only a collection of millions of articles but has also been accompanied by media coverage and scholarship focused on several of its facets, making it a huge source of interest for research that is almost impossible to tackle in all its aspects.
The most common way to introduce a study on Wikipedia is to give some historical background and speak of Nupedia, the predecessor of Wikipedia, which aimed for reviewed articles but whose pace was so slow that the founders, Jimmy Wales and Larry Sanger, decided to open it to the public. It is also odd to write about a project that describes itself thoroughly. Wikipedia is a main reference and therefore also has good documentation about itself. Introducing Wikipedia, which means drawing on its resources and pages [3], could hardly avoid presenting the exponential growth and stabilization of articles, contributors, impact, controversies and importance of Wikipedia in the age of Web 2.0.

I will refrain from presenting Wikipedia in such a descriptive way, both to avoid exhausting the topic before it even starts to be discussed, and to be able to concentrate on a strategy of presentation that sets the stage from the viewpoint of an (actor-network) analysis of the slogan, bastion and claim “The Free Encyclopedia That Anyone Can Edit”. Moreover, a call to read material at Wikipedia itself can be fruitful: not only can it help the informed reader make a better personal judgment by learning about a dispute, understanding its history and contextualizing it in the culture of Wikipedia, but Wikipedia also asks for participation. The argument that Wikipedia is not perfect (“I found an error in Wikipedia”) is easily met by a call to action (“fix it yourself”: in Wikipedia lingo, SOFIXIT). The best way to get acquainted with the diverse actors at play in Wikipedia, and therefore to make good use of “The Free Encyclopedia That Anyone Can Edit”, is to read attentively, contribute, and discover hands-on the values, the guidelines, the tools, and the practices.

[3] Some good entry point pages are: http://en.Wikipedia.org/Wiki/Wikipedia, http://meta.Wikimedia.org/Wiki/Wiki_is_not_paper, http://en.Wikipedia.org/Wiki/Five_pillars_of_Wikipedia, http://en.Wikipedia.org/Wiki/Free_content, http://en.Wikipedia.org/Wiki/Special:Statistics.

1.2 Wikipedia’s motto

A motto is, in itself, a strategy of presentation that sets the stage for the claims of a particular project. Wikipedia has been called dead by Carr (2006) because its protection of some pages from anonymous users’ edits was inconsistent with the idea expressed in its main phrase “that anyone can edit”. The motto is also at stake in a parody of Wikipedia, as Uncyclopedia [4] calls itself “the content-free encyclopedia that anyone can edit”. Actor-Network Theory (Latour 1991) emphasizes the agency of different actors and would support the idea of regarding the motto as an actor from which the complex network of actions and actors can be drawn. Looking behind Wikipedia’s motto can help to extract Wikipedia’s premises, contextualize its ideals and reveal its policies and guidelines.

This introduction is inspired by Actor-Network Theory: it breaks the phrase “The Free Encyclopedia that Anyone Can Edit” into its constituent words and presents and relates the scholarly discussions, values and relevant information connected to these words. This strategy of breaking the phrase into little pieces will not achieve a complete unwrapping of the full network of meanings, but it will open up the structures and unveil some of the actors involved. A complete exploration would go beyond the identification and characterization of the actors and would draw the network of power, exchanges, and influences between the actors. For the purposes of this study, however, it is enough to use the notion of actor without inheriting a strict ANT approach, that is, as an inspiration rather than as a full-fledged ANT study. The notion of actor, a semiotic process that does not privilege humans (Latour 1991), can be used to unwrap some of the editorial patterns and their relationships.

The main actors at play in Wikipedia belong to three types: PEOPLE, TECHNOLOGY, and VALUES. The human actors are the producers of the content and the discussions and ultimately also the producers of both the technological and value actors. The technological actors are features that the Wiki technology provides, such as discussion pages for every article, and also a kind of transparency that allows for new research and demands new research methods, as one can now follow almost every step in a collaborative enterprise. The value actors include the NPOV (neutral point of view); together with the five pillars of Wikipedia, they constitute the matters of concern. The five pillars are: Wikipedia is an encyclopedia; Wikipedia has a neutral point of view; Wikipedia is free content; Wikipedians should interact in a respectful and civil manner; Wikipedia does not have firm rules.

These actors are essential for an informed and informative reading of Wikipedia; at the least, acquaintance with them puts one in a better position to understand the project, use its pages, and contribute. All of them are interrelated and necessary actors in what happens. When unwrapping the different aspects of Wikipedia through the analysis of its motto “The Free Encyclopedia that Anyone Can Edit”, it is important to keep in mind that these actors are the creators of the social patterns, the technological infrastructure and the structure of discourse. The strategy of this introduction is to go from the simplified phrase “The Free Encyclopedia that Anyone Can Edit” to a more complex network of meanings behind the Wikipedia project, and to show how some of these meanings are investigated in this thesis.

[4] uncyclopedia.Wikia.com

THE

1.3 The

The motto starts with the definite article ‘the’. This usage can be taken to refer to the uniqueness of the Wikipedia project: Wikipedia stands out as the biggest reference web site, the biggest encyclopedia, the biggest Wiki, a Top 10 website.
Some statistics [5] from WikiMania 2009 point to 13 million articles in 271 languages and 17 million pages in total, which get 330 million visitors monthly, and about 100,000 active contributors. Fifty books have been written about it. Eight languages have 500,000 or more articles, 27 have 100,000 or more, 90 have 10,000 or more, and 177 have at least 1,000 articles. As of May 2010, the English Wikipedia had more than 3 million articles and more than 20 million pages in total. It had seen 385 million edits and had more than 12 million registered users, of whom about 1,700 were administrators. Wikipedia is part of the non-profit Wikimedia Foundation, which manages the Wikipedias and their sister projects (such as Wikiversity and Wikinews).

Wikipedia is of iconic importance nowadays. It belongs to western internet culture in ways that everyone in the knowledge sector knows about. Wikipedia is unique in several aspects. It is the open-content encyclopedia that is freely available, easily accessible and editable by everyone, and that fosters participation, communication, interaction and cooperation between its users, who comprise editors, community builders and readers. Its public face is accessible at www.Wikipedia.org, even though there are more hidden relationships and information being exchanged in other channels. Nonetheless, its radical transparency and the access it gives to the history and the process of construction and discussion have been changing the understanding and production of knowledge.

Wikipedia runs on Wiki software called MediaWiki. Wikis are an easy means of online collaborative production of knowledge, in which those who are permitted can edit and change a page immediately. Wikipedia is one of the paradigmatic cases of Web 2.0, the trend by which people have moved from interacting with the internet to interacting through it, forming networks of people who cooperate.

Wikipedia has a multitude of technology-community fostering features and no centralized command, and every user is free to edit the articles they want, as much as they want, or not to edit at all. Some of the basic structure that makes this possible can be seen by browsing the different tabs and options of a Wikipedia article. One will find that every article has a discussion page, which allows the contributors/editors to discuss, organize, structure and resolve conflict in relation to the article in question, and a history page, where all the contributions by all the editors are recorded, allowing anyone to see the development of each article. Another feature, for example, is to ‘watch’ a page, so that it is added to one’s watchlist (if one is logged in), which makes it possible to track what is happening with the watched page. The main feature, the one that defines a Wiki, is ‘edit this page’, which permits everyone (or those with access) to add, delete, reformulate, reshape, and so on, all the content of a given article.

For some, it is still surprising that Wikipedia works, because "the problem about Wikipedia is, that it just works in reality, not in theory" (discussed later in greater depth), which makes Wikipedia a very interesting topic of research. The under-layer with the records and the changes of articles plays an important role in this research project, because the possibility of reading the discussion pages, of screening their internal disputes and guidelines, of mapping the contributions and of finding structures in the cooperative efforts of the editors, the technology and their relationships fuels new thoughts about knowledge production, cooperation and cognition. It will be a challenge to dig up this buried information and to treat and understand it, bringing to light the hidden relationships and networks.

"And what is the most important lesson Wikipedia teaches us? That Wikipedia is possible. A miscellaneous collection of anonymous and pseudonymous authors can precipitate knowledge." (Weinberger 2007)

[5] From Jimmy Wales’ talk, gathered from resources such as http://en.Wikipedia.org/Wiki/Special:Statistics.

FREE

1.4 Free

Free has two meanings, gratis (for zero price) and libre (with few or no restrictions), also easily described as “free as in free beer” and “free as in free speech” [6]. Wikipedia is both gratis to access (supporting the access to knowledge, or ‘a2k’, movement) and libre, in the tradition of FLOSS (Free/Libre Open Source Software) and in the new movement of Free Culture. This is embedded in the copyleft licenses that apply to Wikipedia. Wikipedia started under the GNU Free Documentation License (GFDL) from the Free Software Foundation, and in 2009 migrated to also carry the Creative Commons Attribution-Sharealike 3.0 Unported License (CC-BY-SA) from Creative Commons. These licenses grant that Wikipedia content can be copied, modified and redistributed if and only if the new version is made available under the same terms.

The value of freedom, insofar as there are no restrictions from private ownership of intellectual property, places Wikipedia in the genealogy of hacker culture and open source & free software (Lih 2009; Suoranta & Vaden 2010), while Wikis and their ease of use also follow a tradition of hypertext (Leuf & Cunningham 2001). Freedom is one of the deep cores of Wikipedia, making an appearance in its five pillars (“Wikipedia is free content”), but there are also important value discussions about utopias, which are mentioned below.

1.4.1 Utopias?

Along with the motto, Wikipedia relies on another utopia, from the Foundation website [7]: “Imagine a world in which every single human being can freely share in the sum of all knowledge. That's our commitment.” This statement has been received with great awe, on the one hand, and great criticism, on the other. Jimmy Wales has addressed his philosophical inspiration by referring to Ayn Rand’s Objectivism. New avenues are always more or less utopian and quite open-ended at first, while with time they become more institutionalized. This has certainly been true of Wikipedia: a utopia that was easy to ride on (Navot 2008) is now also a bureaucratic castle, plus a foundation. Much of what Wikipedia and other internet successes play on are the utopias of a new structure. When they come, they come to revolutionize, to change all practices, until their place is found and their limitations are found as well. This is pointed out by Weinberger (2007): "Wikipedia is not as purely bottom-up as the media keeps insisting it's supposed to be. It's a pragmatic utopian community that begins with a minimum of structure, out of which emerge social structures as needed."

[6] In this spirit the artist group SuperFlex created “Free Beer” (http://www.freebeer.org/), which is ‘free as in speech’ and not ‘free as in beer’. In other words, its source code (the recipe) is available and reusable, modifiable and redistributable.

[7] http://Wikimediafoundation.org/

ENCYCLOPEDIA

1.5 Encyclopedia+

Wikipedia has a clear goal: to be an encyclopedia. Two aspects of this characteristic are relevant to discuss: the importance of a clear goal, which has helped to hold Wikipedia together, and the idea that it is an encyclopedia with some important similarities to and differences from previous encyclopedias, and is therefore called here encyclopedia+.

1.5.1 How ‘Pedia’ influences the ‘Wiki’

It could be said that Wikipedia is ‘just’ an encyclopedia written on a Wiki, where the process differs from the one used for traditional encyclopedias but the goal of being a reference work is similar (the process is new, but the product is of the same type). This is, however, not the whole story, as the Wiki and the Pedia parts have influenced each other a great deal.
The goal of producing a reference work shapes many of the ways in which Wikipedia operates: for example, the efforts of cooperation and consensus (and the origin of conflict, too) are (or should be) geared towards agreeing on how to make an account, or ‘how to tell’, rather than on agreeing on the subject itself. The consensus to be found is not about the content but a working consensus about writing that content. People don’t have to agree on the matters per se; they have to agree on how to present them. It is probably easier to achieve this sort of consensus, and it might be for this type of common effort that this sort of social software works or is best suited.

Also, the study by Elia (2006) describes the double language present in Wikipedia: WikiSpeak (a version of NetSpeak), which is found in the discussion pages and meta pages, and WikiLanguage (an encyclopedic jargon), which is found in the articles. It shows that the language used in the articles aims at being encyclopedic, that is, an ‘objective’ account, and that the division between article and discussion page also supports the encyclopedic aspect, by separating a front page with ‘authority’ from other pages.

A major way in which the ‘encyclopedia’ goal determines the features of Wikipedia is that it strives for a Neutral Point of View.
This aim of neutrality is comparable with that of traditional encyclopedias, which lay claim to the ‘right’ answer (a more scientific stance, as it were, corresponding to the idea that science may achieve a final consensus about what the objective truth is about a topic, disregarding subjective or culture-specific prejudices – as if in Wikipedia a high-quality ‘featured’ article could reflect such an ideal). Yet Wikipedia still supports a never-ending, ongoing discourse, more aligned with the ideal of the humanities, in which rationality is seen as an indefinite conversation where each generation has to make its own interpretations of the works of the previous ones (as reflected in the transparency of the whole of Wikipedia, the history pages, and especially the discussion pages). Neutrality in Wikipedia is clearly seen as a goal, in the best scientific tradition, but acknowledged to be an ongoing process, in the best humanities tradition.

1.5.2 How ‘Wiki’ influences the ‘Pedia’
But the Wiki technology has also greatly influenced the ‘encyclopedia’ that is being created. While Wikipedia is conservative in its goal of writing a reference work and in its organization into articles, much else is novel: there is a low ‘publishing cost’ (participation), there are WikiLinks (connection) which easily send readers to related aspects (showing the strength of hypertext), there is no need for a top-down organization or alphabetical order (miscellaneous organization of information), there is a short editorial cycle (updated articles), there are no size limits (fewer requirements on what to include), and it is not paper (damaging the environment less).
Moreover, Wikipedia has a different stance than most encyclopedias, where transparency is changing the nature of authority, as explained by Weinberger (2007): "it would seem that Wikipedia does everything in its power to avoid being an authority, yet that seems only to increase its authority – a paradox that indicates an important change in the nature of authority itself. (…) By announcing weaknesses without hesitation, Wikipedia simultaneously gives up on being an Oz-like authority and helps us better decide what to believe. A similar delaminating of authority and knowledge would have serious consequences for traditional sources of information because their economic value rests on us believing them." This is related to ideals of communicative rationality, the force-less force of the good argument, given that the communicative context is a non-authoritarian one. A useful distinction is between authoritarianism (arguments forced by particular parties) and authoritative statements (that anyone can discuss).

1.6 That
“That” is an indexical in the phrase. This means that it can mean different things depending on the phrase where it is present. In this case it refers to “the free encyclopedia”. This is an opportunity to make a semiotic bridge (from that, an indexical, depending on the phrase, depending on the context) and mention something important: the phenomenon (and success) of Wikipedia is context-dependent. Wikipedia is very much a product of present-day reality: its history is inspired by its predecessors from hacker ethics (a culture that arose at MIT in the 50s and 60s) and Free/Libre Open Source Software (FLOSS); its ease of use is dependent on the availability of Wikis; its boom is contextualized in the Web 2.0 trend.
Wikipedia’s existence is dependent on many contingencies, which can be illustrated by proposing the following counterfactuals: What if Wikis didn’t exist, and submissions had to be sent in for centralized editing – would Wikipedia still be showing bottom-up emergence⁸? What if Britannica had been freely available 10 years ago? What if there hadn’t been a dot-com crash, and Jimmy Wales hadn’t decided to try something non-profit? What if Wikis didn’t have discussion pages, recent changes or history logs – would it still be possible for the discussion process to be so transparent? Joseph Reagle has called the appearance of Wikipedia "a happy accident", and I would like to add that Wikipedia has also profited from adjusting well to realities. An example of such wise adjustment was the episode in 2002 when the addition of ads was mentioned as a possibility on the Wikipedia email list. Some Spanish contributors were so angry that the possibility of ads was even mentioned that they threatened and ultimately carried out a fork, that is, they took the content from the Spanish Wikipedia and founded the Enciclopedia Libre, which was bigger than the Spanish Wikipedia until 2004⁹. This reaction raised concerns; from then on, ads were never mentioned again, and in 2003 the Wikimedia Foundation was started as a nonprofit.

8 That is, emergence of new patterns of collaboration and ways to organize a complex system of texts such as an encyclopedia. Here, emergence should be read as “the unpredictable coming into existence of something genuinely new on a higher level of organization than the level of the individual component actors of the system.”
9 http://en.Wikipedia.org/Wiki/Enciclopedia_Libre_Universal_en_Espa%C3%B1ol

1.7 Anyone
‘Anyone’ is a radical openness. It is the main factor of both controversy and success in such a project, namely that ‘anyone’ can edit and how that is enabled by the technology.
In relation to ‘anyone’, issues of equality, openness, authorship, expertise and credibility are considered.

1.7.1 Equality
The idea of openness sometimes leads to an understanding of equality. But, as Weinberger (2007) says, "the media is continually shocked to find that Wikipedia is not egalitarian – well, it was never meant to be." The better focus is not on whether Wikipedia, or even the internet, is egalitarian, but on the fact that it offers more opportunities for equality than previous encyclopedias or reference works. The project is open to everyone, as a starting point. With time, a hierarchy developed, as well as some limitations on editing – for example, closing the option of creating new pages to anonymous users. The hierarchy made up of Jimmy Wales, administrators, registered and anonymous users, and then banned users, speaks to openness as a starting point but not necessarily as a destination.

1.7.2 From ‘Openness’ to ‘Expertise’
One of Wikipedia’s most debated features is its openness regarding who contributes. The first claim is that Wikipedia is successful not despite, but because of, its openness. Not only does it have no system to check contributors’ credentials, but it also allows contributions by anonymous people. This raises the issue of respect for, and the necessity (or deficiency) of, expertise. The project has preferred consensus over credentials. In the words of Sanger (2004), Wikipedia’s dissident co-founder and founder of Citizendium, Wikipedia lacks respect for experts: “As a community, Wikipedia lacks the habit or tradition of respect for expertise. As a community, far from being elitist (which would, in this context, mean excluding the unwashed masses), it is anti-elitist (which, in this context, means that expertise is not accorded any special respect, and snubs and disrespect of expertise is tolerated)”. But, on the other hand, it is its open structure that accounts for its popularity and immense success, at least in terms of numbers of contributors.
It is also important to discuss what it is to be an expert regarding Wikipedia. Is it to be an expert in the matters written about in a single article, to hold a PhD in the area, or to be someone who can write about it? Certainly it also helps to be an expert in ‘Wikipedia matters’, i.e. in knowing the policies, knowing how to deal with discussions, or simply being comfortable with editing a Wiki. This notion of expertise can profit from the notion of interactional expertise of Collins et al. (2006). Interactional expertise is the ability to talk about a field in the same way that someone with contributory expertise (a physicist or a plumber, for example) would be able to. Someone with interactional expertise about plumbing or physics knows enough to be able to answer questions and write about the field, but while the interactional expert can only do that, the contributory expert can also plumb or carry out physics experiments. This Wikipedian-contributory expertise (or a slightly lighter version, because it only involves writing knowledgeably about a field) might be the level of expertise that is needed for the making of Wikipedia articles.

1.7.3 From ‘Expertise’ to ‘Quality’
The discussion on expertise is intimately related to the quality that is found in, and attributed to, the work done by the contributors. A comparison with Encyclopedia Britannica on equivalent articles found that these two encyclopedias, which are made in different ways, are comparable in terms of quality (Giles 2005). It has also been shown that there is an increase in structured citations of the scientific literature, which should create confidence in Wikipedia as an encyclopedia strongly sourced in articles from high-impact journals (Nielsen 2007). One way to reflect on quality issues is to consider the similarities and differences between Wikipedian quality control and standard scientific peer review. Comparing Wikipedia to science publishing raises issues of openness in publishing and of different quests for recognition.
One can describe Wikipedia as an open-review process. Mainguy (2007), in “Wikipedia and science publishing. Has the time come to end the liaisons dangereuses?”, compares scientific publishing, which many researchers want to become more open, with Wikipedia’s publishing, which little by little accumulates peer review in the tags or in the election of featured articles. Also, Forte & Bruckman (2005), in “Why do People Write for Wikipedia: Incentives to Contribute to Open-Content Publishing”, compare the incentives of Wikipedians and scientists, in that both groups want to contribute to truth and that publication works as an incentive. Both groups seek credit, but for Wikipedians the quest for credit is pursued by striving for recognition within the community. In support of this there are many internal structures, such as the attribution of barnstars, which are symbols for ‘good work’ (“I attribute you a barnstar for your contributions in featuring article X”) and can be understood as marking valued work (Kriplean et al. 2008). Hirschauer has studied the editorial judgments for publication in a journal, where many processes are at play, such as communication with the authors (directly, and indirectly through the text), and argues that peer review consists “in the mutual monitoring of expert judgments, complementing and controlling, supervising and competing with each other” (Hirschauer 2010). Wikipedia’s quality control, although not relying on ‘formal expertise’, is also based on communication between the editors, directly (on discussion pages) and indirectly (through the text).

1.7.4 Credibility & Authorship
A surprising study, “An empirical examination of Wikipedia’s credibility” by Chesney (2006), reported on the expert and non-expert position towards the credibility of Wikipedia, articles’ authors and articles’ content.
It yielded the result that “no difference was found between the two groups [expert or non-expert] in terms of their perceived credibility of Wikipedia or of the articles’ authors, but a difference was found in the credibility of the articles – the experts found Wikipedia’s articles to be more accurate than the non-experts.” There is a new notion of authorship in Wikipedia when compared to more traditional encyclopedias or scientific articles, because the articles are not signed. At the same time, the technology of Wikis and their history pages provides a record of every step and each change, making it possible to know who did what. The data about edit counts are used when comparing different contributors, which has a significant role in the internal structure and ‘economy’ of Wikipedia; for example, when applying to be an administrator (a user with more rights and access to otherwise forbidden features, such as blocking other users in case of vandalism). Moreover, by allowing anonymous contributions, the sense of authorship gets linked to a username that may or may not reveal the real name, or even just to an IP address. Also, IP addresses might not be that anonymous overall, as a program is capable of tracking their origin and uncovering information such as Wal-Mart changing their wages (Borland 2007). Wikipedia can be seen as the typical example of “remix text”, as there are multiple collaborators remixing media to generate the (temporary) final entry. Therefore, the Wiki-article is a place where the notion of authorship is being negotiated between a non-signed work seen at the public level and the possibility of tracking the contributions in the accessible but hidden layer of the history of the article.

1.7.5 Who writes Wikipedia
Kittur et al. (2007) in “Power of the Few vs.
Wisdom of the Crowd: Wikipedia and the Rise of the Bourgeoisie” investigated empirically the “open question whether the success of Wikipedia results from a “wisdom of crowds” type of effect in which a large number of people each make a small number of edits, or whether it is driven by a core group of “elite” users who do the lion’s share of the work. The results suggest that although Wikipedia was driven by the influence of “elite” users early on, more recently there has been a dramatic shift in workload to the “common” user.” There has been a lot of informal discussion about how Wikipedia works and who these masses are. Jimmy Wales has put the focus on the community: “Wikipedia is first and foremost an effort to create and distribute a free encyclopedia of the highest possible quality to every single person on the planet in their own language. Asking whether the community comes before or after this goal is really asking the wrong question: the entire purpose of the community is precisely this goal.” (Wales 2005), while Aaron Swartz showed that large portions of content were added by anonymous editors: “When you put it all together, the story becomes clear: an outsider makes one edit to add a chunk of information, then insiders make several edits tweaking and reformatting it.” (Swartz 2006)

1.8 Can
‘Can’ is the possibility to participate, although only a small percentage of readers also edit Wikipedia. The possibility of editing Wikipedia raises interesting issues about collective action, free-riding and public goods.

1.8.1 Cooperation and Social Norms
Cooperation is important in the way Wikipedia works. Reagle (2005), in “A case of mutual Aid: Wikipedia, Politeness and Perspective taking”, argues that Wikipedia has some norms of ‘mutual aid’ such as ‘assume good faith’.
He also “expressed hesitation about viewing interaction as simplistic agreement or disagreement, and how the productive character of this interaction is not readily adapted to traditional models based on quantifiable valuations of interests and divisible joint products.” Wikipedians¹⁰ contribute because they know there are other contributors and that they do it on the same premises – volunteer work – accessible to everyone. It has also worked because no one felt exploited, as no one was earning money from Wikipedia (today there are more people making a living from Wikipedia, but they are mostly running the Foundation). Wikipedia can then be taken as an instance of a public good: “In economics, a public good is a good that is non-rivalrous and non-excludable. Non-rivalry means that consumption of the good by one individual does not reduce availability of the good for consumption by others; and non-excludability that no one can be effectively excluded from using the good.[1] In the real world, there may be no such thing as an absolutely nonrivaled and non-excludable good; but economists think that some goods approximate the concept closely enough for the analysis to be economically useful.”¹¹ In dealing with what is going on in Wikipedia regarding social norms, public goods and cooperation, one can conclude that we are facing something different from the definition of community given, for example, by Ostrom, who found that one of the most important features of successful communities is that they have clearly defined boundaries: "Without defining the boundaries of the [collective good] and closing it to 'outsiders,' local appropriators face the risk that any benefits they produce by their efforts will be reaped by others who have not contributed to those efforts. At the least, those who invest in the [collective good] may not receive as high a return as they expected. At the worst, the actions of others could destroy the resource itself".
(Ostrom 1990) A better term, ‘network sociality’, is discussed in Suoranta & Vaden (2010). It is understood in contrast to the notion of community, which evokes meanings such as stability, coherence, common history, embeddedness, strong interaction and long-lasting ties. Network sociality involves informational acts and social bonds created on a project-by-project basis.

10 Not easy to define – self-identification, contribution to the project, following the spirit are all possible avenues for the definition.
11 http://en.Wikipedia.org/Wiki/Public_good

Another term worth considering is ‘conditional cooperation’ – a social norm that prescribes cooperation if the other members also cooperate. At one extreme, where the relevant group is taken to be all Wikipedia readers, this notion cannot be used, because Wikipedians don’t wait for all the users to cooperate in order to do so – fortunately, since there wouldn’t be a Wikipedia otherwise. At the opposite extreme, where Wikipedians only cooperate if other Wikipedians do, this notion is possibly true, but not directly – the nature of the open community permits people to go in and out without much perturbing others’ contributions. But certainly, Wikipedians get motivated (but not prescribed) to cooperate when others do. Wikipedia works because conditional cooperation works, except that it should be taken in a more abstract way: Wikipedians cooperate because others do (unspecified others, and groups with loosely defined borders). Or better, the more people contribute to Wikipedia, the more will contribute. It is a positive feedback loop: as the number of contributors increases, so does the usefulness of the whole encyclopedia, which attracts more contributors. Concerning the problem of the free rider, Wikipedia is built so that many free-riders exist, by definition. Wikipedia is built to have readers, readers who will never contribute, because they can’t, won’t or wish not to do so.
With that as a starting point, the concept of the free-rider somehow vanishes. This is also related to the fact that there is a difference between a zero-sum game of material goods of a given size and a knowledge economy, where my giving you my knowledge does not mean that I have less of it left for myself. Although the goal is clearly defined – making an encyclopedia – it is not so specific that the nonparticipation of someone affects the project; it might just move more slowly. Wikipedia is about the creation of a public good. Certainly Wikipedians use Wikipedia and are glad of its existence, but its importance expands far beyond Wikipedians’ use of Wikipedia. Actually, fairness and virtue might very well be their own rewards. Anthony et al. (2005), in “Explaining Quality in Internet Collective Goods: Zealots and Good Samaritans in the Case of Wikipedia”, point out that: “other factors motivating contributions to open source goods include the very low costs of contributing (Lerner and Tirole 2002), and possibly because contributors are “zealots”, Coleman’s (1990) term for true believers in a collective good who contribute for purely intrinsic value beyond rational expectations (see, e.g., Lakhani and Wolf 2005; Raymond 1999)”. Fairness, virtue and openness might be motivating reasons for the growth in the number of articles and languages.

1.8.2 Emergence, complexity, networking
To understand the commons, one can also look at collective action, for example at the email thread “Wikipedia, emergence and the wisdom of the crowds”¹², which discusses whether crowds are, or can be, smart. Although the community behind Wikipedia is essential to its functioning well, emergence played a big role in the success of Wikipedia.

12 http://lists.Wikimedia.org/pipermail/Wikipedia-l/2005-May/021753.html
In an earlier version of the emergence article in Wikipedia itself, someone makes the point (later removed): “Popular examples for emergence are Linux and other open source projects, the World Wide Web (WWW), and the Wikipedia online encyclopedia. Emergence is, besides the efforts of the Wikipedia founders Jim Wales and Larry Sanger, the major reason for the great success of Wikipedia. All of these decentralized and distributed projects are not possible without a huge number of participants and volunteers. No participant alone knows the whole structure; everyone knows and edits only a part, although everybody has the feeling of participating in something larger than themselves. The top-down feedback increases motivation and unity, the bottom-up contributions increase variety and diversity. This unity in diversity causes the complexity of emergent structures.”¹³ The notion of high complexity located in networking is intimately related to cooperation; as Margulis & Sagan (2001) put it, "Life did not take over the globe by combat, but by networking." As in humans, organisms that cooperate with others of their own or different species often out-compete those that do not.

13 http://en.wikipedia.org/wiki/Emergence

1.9 Edit
Editing is the great feature of Wikipedia and of Wikis. The notion of editing has been expanded to become the new blend of what I would call editing-writing-tweaking-commenting. According to the classic notion of editing, to edit a text is to prepare the written material for publication by correcting, condensing, or otherwise modifying it. In the new, expanded version, as editing-writing-tweaking-commenting, editing is not only a matter of fixing typos; editing is also managing the community. Editing is producing text. Editing is agreeing. Editing is disagreeing. Editing is also the topic that won’t be expanded here, because it is treated at greater length in other parts of the thesis.
In the chapter “Theoretical Perspectives”, the argument “Wikis do more because they do less” focuses on the characteristics of editing and on their consequences for understanding Wikis from a philosophy of technology point of view. Editing is the central piece that ties together the C5 terms: cooperation, collaboration, coordination, contribution and community. In the data studies, edits are used as the links between people and articles to construct networks of co-editing and meaning. Editing is also the theme of the cognition chapter, where the notion of a cognition for improvising surplus tackles the idea that Wikipedia and Wikis are harnessing a capacity for editing which wasn’t harnessed before (or not at this scale). Also, editing is the central figure in describing the ‘Wikiway’ – a non-precise term for the trend towards participation and the possibility to contribute. It is in this lineage that the following part of the introduction follows: the construction of an interactive game about Wikipedia, which would support much of the learning about Wikipedia that can be used to read it more wisely, to motivate people to contribute, and to introduce theses such as this one.

2 Wikipedia’s Conversation Game
As it is essential to familiarize readers with Wikipedia in order to set the necessary background for the research that follows, I decided to do that in a form that can also be useful beyond the thesis. Ladies and gentlemen, the Wikipedia Game! As this thesis concerns Wikipedia, it is essential to write about Wikipedia: to give context and to set the stage, be it by dwelling on Wikipedia’s mission, telling Wikipedia’s often-told history, or contextualizing it in the history of encyclopedias, the history of personal computing, the Free Software Movement, or the history of Wikis and online communities.
This game serves the purpose of presenting introductory material in a way that provides context for the rest of the research, but it also serves an independent and useful purpose of outreach: to provide context to the many Wikipedia users, be it in schools, universities, or in other public and private sectors. All of these are also potential contributors, and reaching out to them seems to be an important step. Wikipedia is used more and more, and it is important to familiarize Wikipedia users (who isn’t one?) with some of its features. A game has the property of engaging people actively in the learning process, instead of leaving them sitting back and listening to a talk. In that sense, it also mimics the incentive that is meant to be passed on – that people would be interested in contributing to Wikipedia, which is a kind of active participation. Moreover, the game also has elements that support reflection. With reflection it is possible to bring a critical perspective into the learning, which helps to make people aware that there is a necessary critical layer in receiving (and creating) information. In a sense, the hands-on and the reflective parts of the Wikipedia Conversation Game map the two traditions that I’ve found myself immersed in through the research for this project: the DIY (Do It Yourself), SOFIXIT (So Fix It) spirit characteristic of the Free Software and Open Source movement, including the spirit lived at Wiki conferences and Wikimania, and the academic, reflective, thoughtful perspective characteristic of universities and peer-reviewed systems. The Wikipedia Conversation Game is inspired by Actor-Network Theory, the Glass Plate Game, Mind Maps and Open Space, and has a very easy structure of play.
The purpose is to acquaint players with several elements of Wikipedia, with different aims:

Learn about WIKIPEDIA
- to break down jargon about policy and editing
- to demystify the use of Wikipedia
- to promote discussion about sources, reliability and the origins of information
in order to:
USE it BETTER
MOTIVATE to EDIT

The Wikipedia Conversation Game promotes the understanding of Wikipedia’s several elements, which helps to understand the layer of transparency so that readers can have better access to knowledge. By de-mystifying its use, the game also aims at attracting new contributors and increasing awareness that Wikipedia is a source written, or possible to be written, ‘by all of us’ (which can be taken as a public service). The game also builds on the assumption that there is scattered knowledge pertaining to Wikipedia and that this can be harnessed in a group by a semi-structured conversation. I will introduce the game and then revisit the positions that can be addressed from the game.

2.1 The Board
The game consists of a card game with two boards. The goal is to use the cards to collaboratively make a mind-map of the components and issues of a Wikipedia article, and to uncover major discussion topics related to Wikipedia. The two boards are: the ‘article’, the mindmap being constructed about Wikipedia itself; and the ‘discussion’, where a forum for discussion is constructed to provide reflection about some of the topics that Wikipedia raises.

Figure 1-2-1 (board titled WIKIPEDIA’S CONVERSATION GAME, with the panels DISCUSSION and ARTICLE): the board is made of two boards which are very simple and mimic two of the tabs present in each Wiki page: the content – the article; and the space for negotiation, the discussion.

2.2 The Deck
A card deck is divided between the players.¹⁴ There are two types of cards: normal cards and discussion cards. A normal card has a big word printed on one side, and a small contextual explanation on the other side.
Normal cards represent actors involved in the making of Wikipedia. They come in three different colors that represent three types of actors: people, technology and values. Discussion cards state a main issue on the front side, and on the other side have questions to generate discussion. In Figure 2 one can see an example of a normal card (both sides) and a discussion card (both sides).

2.3 Objectives
The individual objective of the game is to get rid of the cards in one’s hand. The collective objective is to construct a mindmap that addresses all the relevant topics, and to construct discussions that are relevant for the group. For example, a school class playing may want to address the topic of repeated, bored student vandalism to pass time in class, and its consequence, the blocking of the whole school’s IP address, while a company may address the COI (conflict of interest) policy and the best practice of raising a concern on the talk page rather than changing the page directly.

To contribute to the mindmap one only has to suggest a card to play in relation to one already placed.¹⁵ Then, one has to present one’s card, and explain the relation between the previous card and the card just played (it can be a relation of consequence, of opposition, of inspiration, or many others). Once the card is played and explained, others may contribute with explanations or relations. If all agree that the card and the explanation make sense, then another person can suggest a new card.

To contribute to the discussion, one can play one or two discussion cards on the discussion board. Then, the questions on the back of the cards are divided between a number of discussion groups of 2-4 people, and these groups have to write a short phrase (within 2-3 minutes) addressing the question, which gets posted below the discussion card(s) on the discussion board.

14 This can be done in several ways.
The easiest consists in dealing the same number of cards to all the players, until there are no more cards. The most complex consists in letting people choose a number of cards, and letting the remaining ones belong to a pile, from which people can take more cards as they please. In the latter form, ‘participation’ becomes a choice, not limited by the number of cards.
15 It is possible to play Wikipedia’s Conversation Game following some sort of rule for the order of participation. Nonetheless, it is highly recommended that no such rule is imposed. “Who wants to start?” helps to get the first two cards on the table, and from there it is easy for someone to contribute with a card. That way makes the game more lively, and follows more closely the patterns of a conversation or a debate – a person who has something to say gets the word.

Figure 2-2-2: two examples of the types of cards present: ‘content’ cards, front and back; and ‘discussion’ cards, front and back.

There are a few empty cards to allow for topics not present to be taken up. The presence of a facilitator, Wikipedian or educator may smooth the play – the role of the facilitator is to put the suggested cards in the mindmap and to add relations when a card is played. A facilitator can also act as a judge, by accepting or rejecting cards.
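The two ways of dividing the deck described in footnote 14 can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the game's rules: the function names `deal_evenly` and `deal_with_pile`, the shuffling step, and the sample card names and player labels are assumptions made for the example.

```python
import random


def deal_evenly(cards, players, seed=None):
    """Easiest form: deal one card at a time to each player in turn
    until there are no more cards (hand sizes differ by at most one)."""
    deck = list(cards)
    random.Random(seed).shuffle(deck)  # shuffling is an assumption of this sketch
    hands = {player: [] for player in players}
    for i, card in enumerate(deck):
        hands[players[i % len(players)]].append(card)
    return hands


def deal_with_pile(cards, requested, seed=None):
    """More complex form: each player chooses how many cards to take;
    the remaining cards form a pile to draw from later."""
    deck = list(cards)
    random.Random(seed).shuffle(deck)
    hands = {}
    for player, count in requested.items():
        hands[player], deck = deck[:count], deck[count:]
    return hands, deck


if __name__ == "__main__":
    # Hypothetical mini-deck using a few card names from the game.
    sample = ["Bots", "Sandbox", "Stub", "Barnstar", "Nupedia"]
    print(deal_evenly(sample, ["A", "B"], seed=1))
    print(deal_with_pile(sample, {"A": 2, "B": 1}, seed=1))
```

In the second form the pile returned alongside the hands is what makes ‘participation’ a choice: a player who wants to say more simply draws more cards.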
2.4 The Cards

People: Administrators Wikimedia Foundation Board of trustees Village Pump Vandals Wikipedia Signpost Donor Reader Larry Sanger Janitors Jimbo Wales Editor Anonymous Wikify’ers CheckUser ArbCom …

Technology: Featured Article Template Bots Citation Needed Wiki (quick) Revert Barnstar MetaWiki “Watch this page” Edit Template Info Box Protected Article Nupedia Sandbox Flagged Revisions SockPuppet Category Stub WYSIWYG Commons …

Values: Creative Commons ShareAlike 3.0 License Consensus “Don’t Bite Newcomers” “WP has a code of conduct” “WP does not have firm rules” “WP:NPOV” 5 pillars “Timeocracy” “Ignore all rules” “Be bold” “The Free Encyclopedia that Anyone can edit” Bureaucracy Love WP is Free Content WP is an encyclopedia 3RR Wiki Projects GFDL Wikipedia is not paper …

Table 1-2-1: List of cards in three different groups.

Discussions: Quality Reliability Expertise Cognitive Surplus Trust Community Contribution Wisdom of Crowds Civility WikiTrust Content: what is in Wikipedia? …

2.5 A game
Introduction: someone introduces the game (the facilitator, the educator, the Wikipedian, the one who has found the game): the board, the cards, the purpose. The cards are divided between the participants.
Player A plays “Bot” & “Vandalism”; explains succinctly that a Bot is a little program (“robot”) that can make edits; explains succinctly that vandalism is one of the problems of Wikipedia, for which several solutions have been devised, and that usually vandalism does not stay for long in a page; explains the relation between “Bot” and “Vandalism”: nowadays much of the vandalism-fighting is done using bots.
Player B plays “Sandbox” next to “Vandalism”. Explains that the sandbox is an experimental area to try the Wiki technology and mark-up. Explains that the relation to “vandalism” is that some seeming acts of vandalism are people just trying to see how a Wiki works, but they do it in an article instead of in the sandbox, which is meant for such experiments.
Player C plays “WYSIWYG” and explains.

Player D plays “Don’t Bite the Newcomers” and explains.

Player B plays “Anonymous Editors” and explains.

Player D plays the discussion “Expertise”. Explains that it is related to the previous cards: if anyone is allowed to write, and is allowed to write anonymously, it is important to discuss “Expertise” and the open position of Wikipedia in relation to expertise. Three groups of three persons are made, and the questions are divided between the groups. The groups discuss briefly and write a quick answer on a piece of paper. They read it aloud for everyone. The card “Expertise” and the three pieces of paper are posted in the discussion part of the board.

(And it goes on…)

2.6 NYAQ

Wikipedia’s Conversation Game NYAQ (Not Yet Asked Questions):

Why a card game? - A card game is fun, is interactive, and can be played within one hour or so, making it good outreach material.

Why not online? - Much of life is still offline. And for those ‘scared’ of online, one more online game is just one more geek thing they can avoid.

Why simple? - If it is to spread, it should be easily graspable. Most rules one does away with anyway, and the purpose is learning about Wikipedia, not about intricate rules. Also, the game’s purpose is to be inclusive, to attract different people to play it. Like a Wiki, it provides a space and allows for the content to be generated.

Why collaborative? - To keep in the spirit of Wikipedia.

Why a game, and not writing an article? - Different formats serve different purposes. There is already a good book, “How Wikipedia Works”; this game fills a space by introducing Wikipedia in a kind of workshop.

Why quick? - It can be played within 1 hour, which is the most time most schools, companies and other institutions are willing to spend on an idea such as this one.

Was it done collaboratively? - Somewhat.
Drawing upon ideas from many different people, it has nonetheless been started by me, because an academic degree is unfortunately personal, but there is a Wiki here: vulpeto.wiki.nbi.dk.

Accessible? - Under GFDL or CC. Ideally under a license where any use is OK if reported back (that makes it easier to compile different usages and additions).

Changeable? - Welcome. Add, play.

Other similar projects? - So far as I know, there is the Wikipedia trading card game [16], which hoped to make a game that is competitive and an insider thing.

For schools: the mind map that is constructed while playing the game can be turned into a poster for the classroom.

2.7 Reflections

Constructing, developing, playing and proposing the game has been an interesting point of reflection about Wikipedia and about writing theses. Constructing the cards helped to see which ‘actors’ are present; although many, they are not infinite. Playing the game showed how these different ‘actors’ construct a ‘network’ of meaning that is quite dense. It has also been useful to consider how Wikipedia is a fascinating contemporary topic that brings many discussions to the table, such as those about trusting sources, about freedom, about participation, and about modes of governing. The feedback has often been of the kind ‘I go away from here knowing much more about Wikipedia and about its issues, and learned all that almost without noticing’.

[16] http://en.Wikipedia.org/Wiki/Wikipedia:Trading_card_game

3 RESEARCH QUESTIONS

If we knew what we were doing, it wouldn't be called research, would it? -- Albert Einstein

After a short introduction with meta-thoughts about questions, this chapter defines the research questions for the thesis as well as the thesis’ objective.
Furthermore, there are some thoughts on the processes of investigation (theory and data, diversity of methodologies), and finally a road guide to the remainder of the thesis, which explains how the three research paths (data-driven, philosophical, experiential) lead to the next 4 sections.

3.1 Meta-thoughts about questions

Many were the questions that inspired this study. Many more were the questions that were born out of this study. These questions, both those that inspired the thesis and those that were inspired by the thesis, have different levels of abstraction and can, therefore, be investigated to equivalent levels of hands/mind(s)-on-research. Some of the questions are so abstract that they only lie in the realm of inspiration. Other questions can be formulated precisely so that they can be investigated in the real world, using specific tools and research methods. And then, most of the questions that end up being the concern of a study such as this one are those in between: they are concrete enough to be specific to the case at hand, but also somehow difficult to investigate in a conclusive way.

In this study, broad questions such as "What are the similarities and differences between form and content?", "What is the interplay between know-what and know-how?", "What is cooperation?", "How is cognition connected between minds, bodies and worlds, or would it be better to ask, how does cognition connect minds, bodies and worlds?" informed the investigation of emergent and distributed cognition in/through Wiki-articles. So the approach has been to let these big abstract questions motivate the studies of direct, touchable cases. And not just in the top-down direction, but also from the bottom up: to apply a direct, touchable case to give some hints towards answering the big abstract questions.
The central tags in this work are COOPERATION, COGNITION and WIKIS, and more broadly, general human cognitive and cooperative processes and new socio-technological tools. It is in their vicissitude that these studies will be addressed.

3.2 Two Sets of Research Questions

Below, it is shown how the two initial research questions gave rise to three avenues of investigation. Following that, the general thoughts on 'the process of investigation' are presented.

The two driving sets of questions are:

(a) How is cognition expressed in socio-technological networks? In particular, how to characterize the emergence and distribution of cognition? What is the role of cooperation in the cognitive process, in the interplay between different actors and cognitive tools?

(a-b) How to represent and analyze the cognitive task of writing an article together through a Wiki? How do Wikis (and in particular, Wikipedia) work?

(b) What are the mechanisms in article-writing and in governance that support the Wikipedia-phenomenon? What are the characteristics of Wikis, of their architecture, of their norms and values that make them work the way they do?

3.3 Thesis’ objective

The aim of this project is to elaborate an empirically grounded study of the notion of cooperation within cognitive aspects of such a complex social and cognitive system as Wikipedia and its context of socio-technological network.

3.4 Process of Investigation

“Rare indeed is the man who knows what his thesis is about before he has written it.” -- Barley

Here I make an effort to spell out two approaches that led to this thesis’ creation, while more specific methodological claims (humanities bottom-up, meso-level) are presented in the introduction to the data-driven studies, and more general reflections in appendix X and in the ‘Meta-PhD’. One of these approaches consists in the diversity of methods, while the other is to make explicit the back-and-forth mode between data and theory, experimentation and discussion.
3.4.1 Diversity of Methods

To use several different methods can be either a weakness or a strength, depending on whether the study fails by dispersion or gains by triangulation. In the case of these studies, it was necessary to study the phenomena of Wikis and their cognitive status through diverse angles, as only that approach made it possible to account both for the novelty of the technology (there aren’t set procedures on how to look at Wikis) and for the interdisciplinarity of the research on cognition (cognitive science is itself constituted by many disciplines).

3.4.2 Theory and Data

There was a permanent dialog between theory and data, and discussion and experimentation, and experimentation and theory, and discussion and data, and theory and discussion, and discussion and theory, and theory and experimentation, and data and discussion and…

The greatest advantage of having a concrete case study and simultaneously a set of questions at a more abstract level is that both the case study [i.e. the Wiki(pedia)-article] and the theories applied (cognitive theories) needed each other. Learning about a theoretical framework only becomes meaningful if it is applied to a case. The data only have meaning if informed by a theory. Many cycles of this kind were present: some of the cycles of sharing information and inspiration between theory and data were very deliberate, while other cycles and influences were more subtle and perhaps not even within the field of awareness.

3.5 Research Paths

Below are introduced the three research paths that constitute the three parts of this thesis.
These have slightly different zoom levels and objects of study: in the data-driven studies the objects of study are Wikipedia articles, from which data are gathered and networks are constructed; in the cognitive chapter, cognitive theory is developed to encompass the cognitive processes in contributing to Wikis, and to understand, in part, the success of Wikipedia; and in the ‘experiential/experimental’ part, Wiki characteristics are studied by developing a ‘live-Wiki’.

3.5.1 Data-Driven

This section's main objects of focus are the Wikipedia articles. What is the process of production of a Wikipedia article? One can use data from the recording of edits to an article. These data are easily available, but they are far too numerous to be understood with the naked eye. Visual tools, network analysis and biclique analysis are used to process the large quantity of data and to reveal patterns of collaboration. The studies are also performed in complementary ways by changing the subjects both in terms of number and kind, spanning editor-article networks from physics and philosophy articles in the English Wikipedia, as well as from the Meta-Wiki and from the coordination pages of the English, Portuguese and Spanish Wikipedias, and studying some articles at greater length, namely 'Prisoner's Dilemma' and 'Neutral Point of View'.

This part of the thesis follows the tradition of data-driven studies where patterns may arise from the processing of data, as well as categorizations and insights into the structures of the objects of research represented by the data. To counteract the scientific tradition of calculating parameters of a network such as clustering coefficients, power-law equations, or exponents (which can give little insight into 'what was going on'), the approach here is to attempt a kind of 'humanities bottom-up' (confer section III.1.1), or in other words, data-driven social patterns research.
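As an illustration of the editor-article networks and bicliques mentioned above, the following is a minimal sketch in Python. It is not the procedure used in the data-driven studies: the edit log, the editor names and the function names are invented for the example, and the brute-force enumeration is an assumption that only suits very small networks. A biclique here is a group of editors who have all edited the same group of articles; it is maximal when no editor or article can be added without breaking that property.

```python
from itertools import combinations

# Hypothetical edit log of (editor, article) pairs, invented for illustration.
EDITS = [
    ("ana", "Prisoner's dilemma"),
    ("ana", "Game theory"),
    ("bo", "Prisoner's dilemma"),
    ("bo", "Game theory"),
    ("bo", "Nash equilibrium"),
    ("cy", "Nash equilibrium"),
    ("cy", "Game theory"),
]


def editors_by_article(edits):
    """Map each article to the set of editors who edited it."""
    network = {}
    for editor, article in edits:
        network.setdefault(article, set()).add(editor)
    return network


def maximal_bicliques(edits, min_editors=2, min_articles=2):
    """Brute-force enumeration of maximal bicliques (toy-sized data only)."""
    network = editors_by_article(edits)
    articles = sorted(network)
    found = set()
    for size in range(min_articles, len(articles) + 1):
        for group in combinations(articles, size):
            # Editors who edited every article in this group.
            shared = set.intersection(*(network[a] for a in group))
            if len(shared) < min_editors:
                continue
            # Close the article side: all articles edited by every shared editor.
            closed = frozenset(a for a in articles if shared <= network[a])
            found.add((frozenset(shared), closed))
    return found


if __name__ == "__main__":
    for editors, arts in maximal_bicliques(EDITS):
        print(sorted(editors), "<->", sorted(arts))
```

In this toy log the editor "bo" and the article "Game theory" belong to both bicliques found, which is the sense in which bicliques give an overlapping clustering rather than a partition.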
The level of analysis is neither macro nor micro: it stays at a meso-level, where the networks studied can reveal information about collaboration in the making of articles.

3.5.2 Philosophical/Cognitive

What are the cognitive processes involved in the cooperation mechanism of Wikipedia writing? How do articles get written? A distinction between Cognition for Planning and Cognition for Improvising is proposed and applied to the writing of Wikis, suggesting that a surplus in Cognition for Improvising is harnessed and enables a large encyclopedia to be written modularly, accounting for the many small and fewer large edits by which Wikipedia’s articles grow. This theory includes the understanding of Wikis as cognitive tools and of cognition also as the handling of concrete situations. It is also investigated whether Wikis could or should be thought of as examples of a milestone of cognitive evolution. The theoretical framework can be a starting point for a cognitive discussion of Wikis, peer-produced commons and new patterns of collaboration.

This part of the thesis follows the scholarly tradition of constructing arguments, showing examples that function as thought experiments, and discussing positions. Because the theme is quite concrete and real, and following the claim that philosophy should be applied, the examples are from the 'real world' and the analogies are also from the internet, economics, sociology and the like.

3.5.3 Experimental/Experiential

What are the roles and characteristics of Wikis? How can these be studied by changing the support of investigation, both the internet-software and the paper-presentation-investigation? Here several of the characteristics of Wikis are investigated by constructing a live-Wiki, as an artistic interactive installation at a conference.
The live-Wiki, by supporting open participation and creativity, serves as a medium to contribute to and reflect upon the discussion on freedom and openness, and on creative and participative processes.

This part of the thesis follows the approach of directly engaging with the WikiWay by adopting it. The WikiWay is an umbrella for ‘a way of doing things’ that points to the way that people interact with Wikis and to the kind of values and behaviors that Wikis support.

3.6 Roadguide

This thesis is a hybrid between a monograph and a collection of papers. While it should read like a monograph, the section “data-driven studies” is organized as a collection of papers, as one of them has been published and two others were conference papers.

There are five sections ahead. The first section consists of one chapter, “Theoretical Perspectives”, where the theory of distributed cognition which influenced the later work is spelled out, as well as a clarification of the terms C5 = cooperation, collaboration, coordination, contribution and community. In the ‘theoretical perspectives’ chapter, a philosophy of technology detour is made to propose that “Wikis do more because they do less”.

The following section handles the data-driven research path. It starts with an introduction which clarifies a number of methodological choices and explains how the next six papers hold together. After the introduction to the data studies, the six papers are presented.

The third section ahead consists of one chapter on cognition and Wikis, where the philosophical/cognitive research path is given space. The fourth section consists of one chapter presenting and reflecting on the live-Wiki “Our Coll/nn/ective Minds”, which is the exploratory/experiential/experimental part of the thesis. And finally, in the fifth section there is the discussion/conclusion, where the work is evaluated and new paths are proposed.

II.
THEORETICAL PERSPECTIVES

1 Theories
1.1 Distributed Cognition
1.2 Hypotheses and Positioning
1.3 Cognitive Artifacts
1.4 In the Wild and Doing/Situation and Action
1.5 Time&Space
1.6 Shortcomings
2 C5
2.1 Cooperation & Collaboration
2.2 &Coordination
2.3 Comparing cooperation, collaboration and coordination
2.4 Contribution
2.5 Community
2.5.1 Communities of practice and epistemic communities
3 Wikis Do More Because They Do Less
3.1 Wikis’ agency
3.2 Transparencies
3.3 Actors
3.4 Conversations
3.5 Mediations
3.6 Architecture
3.7 Technology

In this section, which opens perspectives in theoretical terms, a number of theories are visited and used as inspiration for the remainder of the thesis. This chapter addresses the three major themes of the thesis: Cognition, Cooperation and Wikis, in order of appearance. Distributed Cognition and cognitive artifacts set the stage, as well as thoughts on Actor-Network Theory. Moreover, some terms are investigated, such as cooperation and community, and then, in a philosophy-of-technology introduction, the role of Wikis is studied, with the claim that Wikis do more because they do less.

Publication status:
- 1 “Theories” & 2 “C5” have been written exclusively for the thesis.
- 3 “Wikis Do More Because They Do Less” has been presented as a short paper at the course “How to Analyze IT”, Aarhus, 2008.

1 THEORIES

I would like to make a preliminary point about the challenges of interdisciplinarity. One such challenge is not simply to use a sequence of different theoretical perspectives, rooted in separate disciplinary traditions, but also to let those perspectives communicate and mutually inspire each other. That may be a process of going beyond using theories, to allow the theories to inform a new and perhaps more coherent perspective, and thus, in that pursuit, to extend what one could say if one were to use only one perspective.
1.1 Distributed Cognition

Distributed cognition (d-cog) is a theoretical framework developed by Edwin Hutchins and his colleagues in the 1980s and 90s which, like any other cognitive theory, seeks to understand cognitive systems. It departs from the computational/information processing metaphor of cognitive science, in which systems are considered in terms of their inputs and outputs and tasks are decomposed into representational states. This processing metaphor can be seen as a limit to a correct understanding of d-cog. Besides the cognitive science origin, d-cog takes the unit of a cognitive process to be greater than what is containable in one mind or brain, and it aims at providing rich descriptions studied in real-world settings. Therefore, it studies the interactions between people, artifacts and both internal and external representations.

In 1996, Hutchins published the reference work of the field, “Cognition in the Wild”, where he gives an extensive analysis of the navigation tasks on a ship. He focuses on the cognitive and communicative processes that happen when a ship needs to be steered into the harbor. It involves a deep analysis of the coordination of representational states across media for each small task. This approach has been used to analyze collaborative work in different application areas (Yvonne Rogers 2004), including cockpits (Hutchins 1995) and design and engineering teams. It has been shown to be useful as a theoretical framework in the field of Human Computer Interaction (HCI) (Hollan et al. 2000). More specifically, it has been evaluated as useful in the field of Computer Supported Cooperative Work (CSCW) (Halverson 2002).

Distributed Cognition is surely about cognition. Cognition is any process that involves simple manipulation of symbols, those processes involved in memory, decision-making, inference, reasoning, learning and so on, whenever there is a propagation and transformation of representations.
Magnus (2007) distinguishes between the task, which could in principle be carried out in a single mind, and the process by which that task is carried out, which is not enclosed in a single individual. Such phenomena, where the process is not enclosed in a single individual, would be characterized as distributed cognition (“d-cog”). In this sense, the Wiki-article satisfies the conditions to be d-cog. On the one hand, the task of writing an encyclopedic article is a task that could be carried out by a single individual (brain or mind), but on the other hand, the process by which it is done (the Wiki format allowing several contributions by several people at several times and levels) is a process not enclosed within the boundary of a single organism. Moreover, strictly speaking, not even a lone individual writing is enclosed within the boundary of a single organism, as thoughts get externalized and materialized. Writing is distributing one’s thoughts to the screen or paper, making a much more extended self-dialogue possible. Also, the task of writing an article relies on an idealized specification of the behavior to be achieved, and in this sense ‘the ideal article’ also plays a role.

To define precisely whether d-cog is a distribution of an already extended cognitive process into a greater number of people, or whether it is the distribution of cognition into the world and its artifacts, it is important to look at some origins. D-cog’s origin is cognitive science, which classically attempted to discuss cognition in terms of information processing, with the brain as a CPU that can do logic and combine strings of symbols. This view implied a very individualized perspective upon cognition, as if a cognitive process were localized within the skin borders of a person.
A new movement in the 1980s made cognitive science more independent from the classical AI paradigm, which sees the information processing system at work in cognition as formal computations within one rule-based, formal symbol system. Cognitive science became more attentive to other computational architectures such as connectionist models, allowing for a better treatment of the role of context in tasks such as pattern recognition and categorization. But cognition as treated, for example, by Clark & Chalmers (1998) becomes even more expanded, as they suggest that cognition is extended to, for example, a shopping list.

Artificial Intelligence has also adopted a similar theory: Distributed Artificial Intelligence (DAI). This perspective takes robotics as the challenge of creating an animate machine that can interact and coordinate perception and action. Cognitive Science has always been more interested in how people think, while AI has been interested in solving the engineering problem of how to make machine intelligence. Likewise, DAI can be seen as an outcome of the aim for more efficient computational architectures, while d-cog as a field of research can be seen as an attempt to understand the process of thinking, following the increasing realization that cognition is socially and materially embedded.

In the light of where d-cog stands, Hutchins points out which attributions are the concern of d-cog:

“If we ascribe to individual minds in isolation the properties of systems that are actually composed of individuals manipulating systems of cultural artifacts, then we have attributed to individual minds a process that they do not necessarily have, and we have failed to ask about the processes they actually must have in order to manipulate the artifacts. This sort of attribution is a serious but frequently committed error.” (Hutchins 1996)

D-cog is already a multi-faceted, interdisciplinary theory that touches different fields.
I would like to make a sketch of some of the interactions and how they will inform the work that is carried out later.

1.2 Hypotheses and Positioning

I’d like to distinguish between a strong and a weak version:

Strong version: in the examples studied by the d-cog field of research (e.g., ship navigation) the cognitive process is extended and exemplifies the thesis of the existence of extended minds sensu Clark & Chalmers (1998).

Weak version: those same examples of distributed cognition are examples of single cognizing minds using externalized cognitive tools, and other minds, to help their internal cognitive processing sensu Dror & Harnad (2008a).

In this thesis, I support the weak version, as it is all that is necessary to investigate distributed cognitive processes in Wikis. Also, an agnostic position towards this metaphysical debate is all that is needed to pursue the research goals. Although I find the strong hypothesis interesting and worthy of debate, I can only be agnostic towards it, as it seems to postulate stronger claims than can be justified by current research.

1.3 Cognitive Artifacts

D-cog takes external and internal representations into account and makes use of the idea of cognitive artifacts. Cognitive artifacts can range from physical objects, to behaviors, to processes that are used to aid, enhance or improve cognition. Some examples are a calendar, a shopping list and a computer. Cognitive artifacts are embedded in a larger socio-cultural context. This greater system organizes the practices in which they are used and may crystallize frequently encountered problems in practices, knowledge or material artifacts.
The reason to study cognitive artifacts is that in d-cog they are a part of the cognitive process, as Hutchins (1996) emphasizes: “Local functional systems composed of a person in interaction with a tool have cognitive properties that are radically different from the cognitive properties of the person alone.”

There is a clear analogy to Actor-Network Theory (ANT) (Latour 2005), which wants to map networks, taking into account all possible actors, ranging from material actors to semiotic actors, relations between things, concepts and people. It insists on the agency of all these actors, including the non-human ones. Insofar as Actor-Network Theory is concerned, the whole network is taken into consideration (including both what in d-cog would be called cognitive artifacts and the larger socio-cultural system), but it can also serve for the characterization of the problem at hand. The commonality between these two theories will, in the present case, serve to understand the actors at play beyond the human editors of the articles. To begin with, there is the characterization of the cognitive artifacts present in the cognitive task of writing a Wiki(pedia) article: to name a few, the Wiki technology, the discussion pages, the ‘watch this page’ button and ‘the five pillars of Wikipedia’.

It should also be mentioned that it is not only ANT that shares bridges with d-cog in the case of cognitive artifacts. Cyborg theory, stemming from technoscience studies, already studies either organisms that have enhanced capacities due to technology (in its weaker version) or a hybrid of machine and organism. Haraway (1991), in “A Cyborg Manifesto: Science, Technology, and Socialist-Feminism in the Late Twentieth Century”, discusses how humans mesh with technology. Although the theory contributes to understanding what is distinctive about humans by understanding the complementary contributions of both biology and technology, in its stronger version it melts the two.
Cyborg theory is not that appropriate for the study of the Wiki (article), which seems to be a weaker case than a person with a pacemaker or a voice synthesizer, which are the paradigmatic examples of cyborgs.

Also, Clark & Chalmers (1998), in their philosophy-grounded “Extended Mind” paper, propose an active externalism, which is based on the active role of the environment in driving cognitive processes. They extend cognitive processing beyond the individual mind, arguing that Otto, who suffers from Alzheimer’s disease and carries a notebook with all the information, forms together with the notebook a case of an extended mind. Features of constancy, reliance and accessibility are particularly important to the consideration. A Wiki-article is not a case of an extended mind, because it doesn’t qualify under their definition: the definition includes a constancy, reliance and accessibility that is possible for Otto’s notebook, but not for the momentary, transient uses of Wiki articles. That said, Clark and Chalmers also pose the question of the possibility of a socially extended cognition, which could apply to the Wikipedia article if one considers that the high degree of trust, reliance and accessibility can be found in the whole system and not in the individuals.

1.4 In the Wild and Doing/Situation and Action

D-cog puts a great emphasis on the study of functional systems done ‘in the wild’. Associated with the interest in studying phenomena where they happen, it shares a border with theories that emphasize action, or doing, as the main participatory act. Theoretically this follows a tradition of situated cognition, or theories like Vygotsky’s cultural-historical psychology, which wants to understand how cognition develops embedded in a given place and time, and Lave & Wenger (1991)’s Communities of Practice, which focuses on the process of social learning where a common interest makes people collaborate over an extended period of time.
1.5 Time&Space

Two important dimensions to consider and characterize regarding the studies in d-cog are their spread across Time and Space. The field of Computer Supported Cooperative Work, which is concerned with the use of technology to support people in their work, organizes the tools on the so-called CSCW Matrix along the lines of space and time and their remote/collocated and synchronous/asynchronous dimensions, i.e. activities that occur in the same time/same place (ex: wall displays), same time/different place (ex: chats), different time/same place (ex: groupware) and finally different time/different place. Wikis are an example of tools in “different time/different place” because they are tools for collaborative work that don’t imply people sitting down in the same room or at the same time. Nonetheless, this seems to be too simple a categorization, especially when compared with other methods of collaborative article writing, for example the traditional passing of drafts around the contributors. A paper written by different people would land in one person’s hands first, then get to the second set of hands and so on and so forth, while in a Wiki (given the number of collaborators) it will develop bit by bit, in little changes that get reactions. This seems to be an important feature because Wikis, although not requiring the simultaneous presence of the contributors, allow the collaboration to happen in a flexible and relatively open-ended (no deadline) interactiveness that most collaborations lack. This interactiveness allows for the current version to be always available and changeable. This feature of a fast ‘ping-pong’ played by several players (and recorded, unlike what would happen in a ‘ping-pongy’ conversation) might contribute to bigger consequences.
Harnad (2005), in “Distributed Processes, Distributed Cognizers and Collaborative Cognition”, argues that even though email and the web have permitted a new form of collaborative cognition that allows individual brains to interact in real time in ways that are new compared to oral, written and print interactions, these interactions are still cases of collaborative cognition and not of distributed cognition. The threshold for distributed cognition would be the ability of a system to say ‘I feel a headache’, which is possible in the internal distributed cognition of one mind. Confronted with this argumentation, one can either a) give up on the term distributed cognition, b) keep it, bearing in mind that what is stipulated by the term distributed cognition is a complex of cognitive processes, some of which take place by the cognizing agents’ use of cognitive artifacts [as indicated in the weak version of d-cog above], or c) challenge his way of making the parallel between distributed cognition in a mind and outside of a mind.

In the case of the navigation system, even though there is a network of several interactions and representations, the output is still turning to port or starboard. The case of the Wiki-article has a less precise output: article quality doesn’t compare well with the navigational output. On the other hand, when compared with taking ‘science’ as a form of distributed cognition (“Distributed Cognition and the Task of Science”), a Wiki-article has a much more precise output than ‘science’ does. And to compare back to Harnad, navigational outputs don’t have headaches, nor do Wiki-articles. Science, though, may well be having a headache.

1.6 Shortcomings

D-cog seems to be a good foundation stone in the theoretical framework, but it also has its shortcomings. I will identify the two most important and propose a way to deal with them. As identified in Heylighen et al.
(2007) d-cog is so far just a collection of ideas, observation techniques, preliminary simulations and ethnographic studies. Moreover, it focuses on the ‘how’, but it is essential to study the ‘why’ and ‘what’ and how distributed cognition emerges from a system. The computational metaphor in cognitivist theories departs somewhat from the cybernetic view, according to which a system cannot be understood by a mere collection of beliefs, procedures and representations. It is more similar to an evolving process and to understand it fully it is important to take into account the semiosis, that is, the interpretation that occurs. As Ipsen (2003) says, the semiosis process needs the user: “A close examination will show us that, without the user as a component of the system, information networks are not capable of implementing semiosis”. In the case of Wikipedia, there is a double user – the readers and their interpretation of the information and the contributors and their process of semiosis when interacting with the hypertext. Both the text and the users are subject to change, and that change is also a process of semiosis according to Ipsen as, “It is not only the hypertext base that is subject to possible change. The interpretation of that information base depends on individual and societal semiosis”. Speaking of functional systems provides quite a goal-oriented task which misses semiotic triadic relations. These semiotic approaches depart from mere sociological accounts such as Communities of Practice (Lave & Wenger 1991). In terms of d-cog, the account of functional systems seems to fall short in its treatment of interpretants – which surely are involved. Thus, one can argue that the perspective of d-cog lacks a theory of the semiotic function of representations (e.g., pieces of text being edited) as standing for something (e.g., pieces of knowledge) to some user (doing the interpretation). 
In relation to this it is relevant to ask a question about the limits in the scale of analysis of a theory. Why isn’t d-cog just cognitive science, actor-network theory or cultural semiotics? We have seen why the boundary of one individual has to be extended analytically, but it could be extended so far that it would be more appropriate to study it within cultural studies or sociology. Although these approaches could yield important information, as users of a theory we should be concerned with the scale it provides for looking upon particular objects. For the making of the ecology of the Wiki-article (and, as will be argued later, a ‘bottom-up humanities method’) the relative size of analysis is still too small for sociology (insofar as it is the analysis of norms and general claims). Actor-network theory, as shown, shares important features but lacks the cognitive aspect. In short, d-cog is a source of inspiration for these studies because the studies deal with a cognitive phenomenon (collaboratively writing an encyclopedia), it is a distributed process (different people, technologies and values are at play), and the scale of the study fits the cognitive discussion better than too general theories from the broad social sciences would.

2 C5

What is cooperation? And how is it different from collaboration? And what is the role of coordination in Wikipedia, and what is the relation of coordination to cooperation and collaboration? A community performs most of the work contributed in Wikipedia, they say. By a community? What is a community? As viewed by sociologists? By network analysts? How does this community get formed, and what counts as a contribution? These are just some of the questions that inevitably arise when pursuing the thesis’ objective. These questions, which deal with the notions of cooperation, collaboration, coordination, contribution and community (= C5), are commented upon in the following remarks.
What follows is a way to better define the territory of exploration, although precise definitions, especially at the crossroads of interdisciplinarity, are, if not undesirable, at least impossible to draw. Each discipline carries a whole history of the concepts that construct the building of knowledge it is based upon, and finding ways in which the nomenclature fits is a challenge beyond the scope of this project. That said, and without going into greater detail, it should be possible to talk about some aspects of the terms, and how they are used (or abused) in these studies. In what follows, I position the remaining work in relation to the ideas of cooperation, collaboration, coordination, contribution and community. I’ll present some reflections on the nature of these concepts both from a more or less everyday perspective and from an informed perspective drawing upon some scientific contributions.

2.1 Cooperation & Collaboration

In one understanding, cooperation is a type of interaction between many agents, which stands in contrast to competition. Competition is when these agents act 'against each other', and cooperation when they act together. This usage is very broad, and can be applied to agents, simpler beings like ants, and also to humans. It does not entail consciousness about the process, and it is the direct interaction that is at stake, not the level of discussion, agreement, or awareness. In this understanding, collaboration is then taken as an 'advanced cooperation', which can only be done between human beings. In the current usage of the word collaboration, it wouldn't make sense to speak of ‘collaboration of ants’ while building an anthill because collaboration entails knowledge and awareness about what is being done. It entails a level of free will, of decision to, well, to collaborate.
In a sense, collaboration (in the old-fashioned style) seems to require a continuous common awareness among the participants about the object that the collective is working on; yet, in the world of Web 2.0, continuous real-time and common awareness seems to be less needed or rather broken, making collaboration more flexible (and making the 'collective' always, in principle, open) but maybe also more fragmented. Are mutuality, reciprocity, and an "awareness of the other author's perspective" (in contrast to simply an awareness of the text one would like to edit) preconditions for talking about true collaboration? It is, though, difficult to assess to what degree a community around an article is deliberately aware of the trajectory of that article in the making. Following this understanding of cooperation and collaboration, the studies in this thesis fall in between these two notions. On the one hand, Wikipedia and Wikipedia articles are made by humans who are aware of their interactions, which naturally makes this a study of collaboration; on the other hand, the extracted data that connects people and articles is less specific to concerted efforts, and therefore this is also a study of lower-level cooperation. One of the paradigmatic cases of cooperation is the construction of anthills and the way ants cooperate. One process that has been discovered in this setting is ‘stigmergy’ – the communication between agents through pheromones left in the environment. The Prisoner’s Dilemma is also a case of cooperation, which Axelrod (1997) studied at length in the form of the Iterated Prisoner’s Dilemma, finding that the Tit for Tat strategy is an evolutionarily stable strategy for agents playing with each other over several rounds. An enormous body of work using agents and simulations builds on these cases.
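To make the agent-based setting concrete, the Iterated Prisoner's Dilemma mentioned above can be sketched in a few lines of Python. This is only an illustrative sketch and not Axelrod's tournament code: the function names are hypothetical, and the payoff values (3, 0, 5, 1) are the conventional ones, assumed here because the text does not state them.

```python
# Payoff to a player for (own_move, other_move); "C" = cooperate, "D" = defect.
# The numeric values are an assumption (the conventional textbook payoffs).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(own_history, other_history):
    """Cooperate in the first round, then copy the other player's last move."""
    return "C" if not other_history else other_history[-1]


def always_defect(own_history, other_history):
    """A non-cooperative baseline strategy: defect in every round."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play two strategies against each other and return their total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each strategy sees only the moves made in earlier rounds.
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b
```

Under these assumed payoffs, two Tit for Tat players cooperate throughout (`play(tit_for_tat, tit_for_tat)` gives 30 points each over 10 rounds), while Tit for Tat against a constant defector is exploited only in the first round (9 points against 14). Note that no awareness of a common goal is modeled: each agent reacts only to the trace left by the other's previous move, which is what places this case under cooperation rather than collaboration.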
Collaboration, on the other hand, has been studied more from the humanities perspective: Computer-Supported Collaborative Work, for example, but also studies of interaction (human-computer interaction). Writing for Wikipedia is without doubt a collaborative act, as people know consciously that they are collaborating, and contributing to a greater good, and to the commons. That said, the specific practice of writing in a Wikipedia article can follow a stigmergic path (Susi & Ziemke 2001) or follow a specific collaborative practice (Viégas, Wattenberg, Kriss et al. 2007).

2.2 &Coordination

Coordination is the agreed act to organize behavior. Wikipedia builds enormously on this process as well. Meta-pages, Wikiprojects, and categorizations such as 'important pages missing' are part of this process of coordinating efforts and orienting them. ‘Important pages missing’ is a Wiki page that collects pages that are not yet written but seem essential for the project, or for the encyclopedic achievement; it draws contributors who may not know what to do to something specific that needs to be done. In coordination, people do something together but care about who is committed, what the trust levels are, etc., while in collaboration people do something together caring less about who is doing it, with more focus on the process and the product than on the origin.

In rough terms, cooperation is the minimum process happening in the very writing and editing of the articles, collaboration is the process happening in the discussion pages that entails a level of awareness of a common goal, and coordination is the process happening at the meta-pages. This is, of course, a very rough division, and it points to the importance of all these processes in the making of Wikipedia, since articles, discussions and meta-pages are all essential to what is happening.
It is also very rough because there are also coordinating activities happening in discussion pages and because there are articles and discussion pages in the meta-sections, so there is plenty of cross-pollination between these processes of interaction. In the studies done here, all three of these processes are addressed, with a different focus depending on the study. For example, in the biclique studies, it is cooperation that is the main focus, as the study takes the data of who edited what articles and constructs biclique ‘communities’. In the meta-Wikipedia studies, the coordination pages are the focus of the study.

2.3 Comparing cooperation, collaboration and coordination

Figure 2-1: schema that relates cooperation, collaboration and coordination. (Labels in the figure: bottom-up, smaller level of awareness, smaller degree of semiotic freedom; cooperation; collaboration; top-down, greater level of awareness, greater degree of semiotic freedom; coordination.)

The diagram above outlines how cooperation, collaboration and coordination are related, following the remarks above on the common meaning and scientific studies of some aspects of these phenomena. Cooperation is seen here as a greater umbrella term that can be used across the continuum from interactions of simple agents to ‘international cooperation’. For these relations it is important to draw attention to:

- degree of semiotic freedom; syntax vs. semantics: cooperation can be a lower-level process and therefore can more easily apply, even in cases where the degree of semiotic freedom [17] is small, for example, with simpler organisms, such as ants and bees, or with agents. In a sense, collaboration requires a semantic level that is not necessary with cooperation, as collaboration entails ‘cooperation between people’, therefore needing to include the tendency of meaning-seeking that humans have. Coordination requires, like collaboration, consciousness and the presence of a semantic level.
- bottom-up vs. top-down: cooperation, in the stigmergic sense, can be understood as a bottom-up phenomenon. Coordination, by definition, is the structuring of activity, which is a process that happens top-down. The focus of this study is clearly on 'bottom-up' patterns, even though several top-down processes are also at play in Wikipedia, and are acknowledged here.
- level of awareness/consciousness: as already mentioned, the level of awareness changes between the phenomena denoted by these concepts. Cooperation is a greater umbrella for all kinds of processes, of which collaboration is an ‘aware state of cooperation’. In that way, cooperation is not necessarily non-conscious, or a lower-level process, but it includes those lower-level, non-conscious processes of doing things for mutual benefit.
- level of anonymity: cooperation at the lowest level seems to work fine with pseudonymity, perhaps even with anonymity. It is possible to contribute to an article even if one has *no idea* of who else is contributing; in other words, it is possible (or even typical) for editors of an article to ascribe full collaborator status to another editor of the same article, even if the latter is anonymous. An ant can follow a trail even if it doesn’t know which fellow ant left the pheromone. Coordination, on the other hand, certainly needs at least a level of pseudonymity, as it is important to be able to account for and delegate work. As for collaboration, which seems to fall in between low-level cooperation and coordination, a certain level of knowing with whom one is discussing seems to be useful as well.
- counter vs. non-counter: it is also interesting to distinguish work that is ‘countercollaborative’ from work that is ‘non-countercollaborative’. In this sense, vandalism would belong to the first, and a ‘cooperative’ edit would count as the second, even if done ‘unconsciously’, and therefore not being ‘collaborative’.

[17] Semiotic freedom is a concept taken from the biosemiotics of Jesper Hoffmeyer. In Hoffmeyer (1996) it is discussed how evolution not only leads to more complex organic forms, but also increases the communicative complexity of organisms in their inner and outer relational workings, meaning that their sign actions will have a higher combinatorial and interpretational richness. Biosemiotics argues that the mechanisms of biological evolution are not only mutations and natural selection, but also communicative (sign) action, interpretation and metacommunication, and human language with its unique properties and very high degree of semiotic freedom is not the only semiotic code, even though it is unsurpassed in expressive complexity.

2.4 Contribution

For the studies presented here, it is important to have a clearer notion of contribution. The notion of contribution helps to define when an edit is an improvement. While an edit is what is done any time the edit button is pressed, a contribution entails the addition of value. As an example, a vandal act does not count as a contribution, but is certainly an edit. Fortunately, Anne Goldenberg (2010) has already done this defining work in her thesis, where she studies the process of negotiation in public Wikis such as the Debian Wiki and Wikipedia. From interviews, she has gathered that a ‘good contribution’ is a useful contribution for the project. Contributions have been considered as an object of study in History Flow (Viégas et al. 2004), where the visualization permits a contribution to be traced, and it is possible to see what remains, and what happens to a given contribution. WikiTrust (Adler & de Alfaro 2007a) has also dealt with contributions: understanding contributions in the light of the reputation of the editors and the newness of the contribution, it colors the text in order to show a degree of trust.
Text that has been written by trustworthy editors (who have written other text that has survived), and that has survived for a long time, is usually more trustworthy than a recent edit by an anonymous editor. A recent paper (Ekstrand & Riedl 2009) on the ‘tree’ histories of editing activity can also be said to be particularly concerned with contributions, as it is interesting to see what text the reversions relate to. By tracking the versions that survive and constitute the article, a notion of contribution as ‘what lasts’ is put forth. This distinction is important in the Wikipedia community: for example, when contributors apply to be administrators, their ‘edit history’ is examined. What kinds of edits did this contributor make? Were they ‘contributory’? In this thesis, the notion of contribution is more relevant than that of ‘edits’, even though, computationally, it is difficult to extract ‘contributions’. Nonetheless, in many of the data-studies, a partial solution is found: by filtering the data to include only editors with 10 or more edits to an article, we indirectly ensure that only those ‘committed’ enough to have edited 10 times appear in the networks, in other words, those who ‘contribute’.

2.5 Community

One of the most central terms, though, is the notion of community. But community is a poorly defined term across disciplines. In sociology, it usually entails some sort of closeness, in that belonging to a community implies a certain number of commitments. It usually entails a level of consciousness, of agreement, as in intentional communities or communities of practice. In the world of networks, a community has been defined as having more intra-links than outside links (Ahn et al. 2009), but even this notion is being revised because clustering techniques have done poorly in human networks, where there is a high degree of overlap.
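The link-counting definition of a network community just mentioned can be illustrated with a minimal Python sketch. It assumes a network given as a plain list of undirected links between named nodes; the function names and the toy network are hypothetical and are not taken from the cited work or from the data of this thesis.

```python
def link_counts(edges, group):
    """Return (intra, outside): links inside `group` and links crossing its border."""
    group = set(group)
    intra = outside = 0
    for a, b in edges:
        if a in group and b in group:
            intra += 1      # both endpoints belong to the group
        elif a in group or b in group:
            outside += 1    # exactly one endpoint belongs to the group
    return intra, outside


def is_community(edges, group):
    """Apply the simple criterion: more intra-links than outside links."""
    intra, outside = link_counts(edges, group)
    return intra > outside


# Hypothetical toy network of five people (illustrative data only).
toy_edges = [("ana", "bo"), ("bo", "cy"), ("ana", "cy"), ("cy", "di"), ("di", "ed")]
```

In this toy network the group {ana, bo, cy} has three internal links and one outside link, so it passes the criterion, whereas {cy, di} has one internal link and three outside links and fails it. The sketch also shows the weakness of the criterion: a person such as cy, who belongs to several overlapping groups, quickly accumulates ‘outside’ links for each of them.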
A recent post explaining why Google Buzz is a failure (Lehmann 2010) argues that the notion of community shouldn’t be defined by a greater number of intra-links either, as one can be part of several overlapping communities, and being part of ‘family’ doesn’t mean there are more links within ‘family’ than there are with ‘job’, ‘hobbies’ or ‘sports’.

One of the reasons why the notion of community is so central to Wikipedia concerns the several discussions about who writes Wikipedia. These discussions are important because the credibility of Wikipedia is put into question. As mentioned before, Jimmy Wales (2005) focuses on the community, noting that most work is done by a core group of people, while Aaron Swartz (2006), in a famous blog post, reminds us that there are many substantial edits by anonymous users, and Dalby (2009) strikes a balance, suggesting that many of those anonymous users are possibly members of the community who didn’t sign in. Community is also a source of trust and shared values, which I have certainly witnessed live at WikiMania, the Wikipedia conference, which has much the same atmosphere as Esperanto conventions or ecovillages. Jimmy Wales addresses the intricacies of communities filled with interpersonal relationships:

"Community sometimes is almost meaningless; it just means there's people out there doing stuff. But in Wikipedia, what community means is that they're people who have met each other; they know each other; they've had arguments; they've made up; they've had different kinds of controversies; they've banded together to take care of some problems; they like each other; they don't like each other; sometimes people are dating and then they break up and then there's rumors and scandals, and all of the stuff that makes a rich human community is what goes inside Wikipedia. It's a complete soap opera actually inside our community." Jimmy Wales, quoted by Lessig (2008).
2.5.1 Communities of practice and epistemic communities

The notion of community has also been the central term in theories such as theories of communities of practice and theories of epistemic communities. Depending on the focus, the Wikipedia community is both. A community of practice is characterized by:

"[...] its joint enterprise as understood and continually renegotiated by its members, [...] mutual engagement that bind members together into a social entity [and] [...] the shared repertoire of communal resources (routines, sensibilities, artifacts, vocabulary, styles, etc.) that members have developed over time." (Lave & Wenger 1991)

This view serves well the learning process of becoming a Wikipedian (Bryant et al. 2005). Epistemic communities, in turn, are defined as follows:

“They are small groups of agents working on a commonly acknowledged subset of knowledge issues and who at the very least accept a commonly understood procedural authority as essential to the success of their knowledge activities.” (Cowan et al. 2000)

Epistemic communities can be defined as groups sharing a common objective of creating or exposing knowledge and a common structure, which allows a shared comprehension among the members of the community. The communities around each article are indeed concerned with a ‘subset of knowledge issues’ and accept ‘a commonly understood procedural authority’. As for the studies in this thesis, they are mostly concerned with cooperation, as it is the broadest term, and one which also allows for bottom-up research.

3 WIKIS DO MORE BECAUSE THEY DO LESS

Here is where we discuss the claim that Wikis do more because they do less, because Wikis give interaction back to humans, in the light of Latour, Suchman and Ihde/Verbeek’s work. Architecture, and how technology influences behavior, are also considered.

3.1 Wikis’ agency

"Wiki is the simplest online database that could possibly work."
Ward Cunningham

Wikis are, in the simplest formulation, easily editable websites. Nowadays, they are one of the most widespread collaboration tools. They are efficient in unique ways in this quest for supporting collaboration. Here it is argued that there is a kernel of truth in the paradoxical expression that “Wikis do more because they do less”. In other words, Wikis are a technology that doesn’t take its humanity too far; rather, it sets up a medium where humans can better interact, collaborate, and write encyclopedias, for example. By ‘humanity’ I mean a set of hopes of being a ‘cognitive’ technology that would do the ‘cognitive work’ for the people. Wikis do very little in enforcing what is possible and not possible, and by stepping back, give more space to the human interactions, the discussions, the creations, and the policies. In this sense, they ‘do more’. Doing more means that they allow people to get to the core of collaboration straight away. It is not even necessary to create a user account on the website of a Wiki. Open Space Technology, a facilitation technique that also emphasizes ‘stepping back’, has been compared to Wikis. In a sense, Wikis play as little a role as possible, and are a kind of facilitator that ‘holds space’.

“There’s very, very little in the software that serves as rule enforcement. It's all about dialogue, it's all about conversation, it’s all about human making decisions... it's about leaving things open-ended, it's about trusting people...” Wales in Lih (2009)

3.2 Transparencies

Transparency is one of the key elements of Wikis. There are no hidden texts, no initial bureaucracy, and no other places to go. (Of course there are other places, such as email lists and IRC channels. But the bulk of what happens is in written form and accessible by everyone.)
It is important to make clear the following distinctions:

I. TRANSPARENT: (a) immediately present [the article]; (b) less immediately accessible, through the history and discussion pages; (c) processed data made comprehensible (e.g., in diagrams) by analyzing (b) [the less immediately accessible].
II. OPAQUE: patterns not completely accessible through analysis of I.
III. HIDDEN: e.g., secret email communication outside discussion pages about editorial decisions.

Below, two levels of transparency (I) are identified which contribute to the claim that Wikis do more because they do less.

One: Wikis record: Wikis keep track of everything that happens and this information is easily accessible: through log pages, one can observe who did what and when. First, the record allows for an extensive tracking of basic features of human-machine-human interaction (data used in the following studies); second, Wikis naturally become ideal fora for the incremental increase of work contributed by several editors. A way to see this is to reflect on the ease with which vandalism (or incorrect use) can be reverted. The reversion facility supports the rapid reinstatement of the page content. It isn’t costly to correct or revert to a previous version, which makes the content already in place difficult to destroy. Lih (2004) attributes significance to this feature, noting that “This crucial asymmetry tips the balance in favor of productive and cooperative members of the Wiki community, allowing quality content to emerge”. This asymmetry is a consequence of the nature of information or, better, of the fact that information can be stored in very little space (compared to the rest of the material world), and therefore many copies (such as copies of all the revisions) are easily kept.
By way of comparison, other acts of vandalism, such as some forms of graffiti, are much harder to erase, delete, or revert, as it takes time, money and effort to clean a wall of graffiti.

Two: Wikis step back: Wiki is software that somehow gives the voice back to humans. It not only provides a space for the production of the articles, but, through discussion pages, the ease of creating new pages (which can always be used for discussion), and the simple fact that any content can be written by anyone, it allows humans to interact with a high degree of semantic depth. In other words, Wikis allow for a space of negotiation and discussion. Wikis are constituted by a web of actors (human, technological, and actors in the value plane, such as policies) that enhance the human-human collaboration that lies at the center of this system. As will be expanded later, this follows Suchman’s emphasis on human-human conversation, as well as an understanding of Wikis as mediating technologies.

An important consequence of these two kinds of transparency, one that allows for history and one that allows for discussion, is that working together allows new things to emerge, as the Wiki platform now makes it possible to collaborate simultaneously with hundreds of people in a way that is constructive and time- and space-independent. This possibility would corroborate claims of a new cognitive step. If this ‘phase transition’ is real, and Wikis+humans can produce new outcomes, it is, in principle, investigable through the Wiki’s feature of keeping a record of (most of) what happens. These speculative claims will be taken up again in the chapter about cognition and Wikis, while here, in the light of Suchman (2006), Ihde/Verbeek (2005), Benkler (2006) and Lessig (2005), I will describe the possible relation between the technological artifact Wiki, humans and their distributed cognition.

3.3 Actors

Actor-Network Theory has already been mentioned as one of the inspiring theories for this thesis.
Here too, it comes back. Although it is interesting in terms of analysis to give agency to the technology and values, I still find it important to qualify them differently, agreeing with Pickering (1995), who says with respect to humans and nonhumans: “Semiotically, these things can be made equivalent; in practice they are not,” and with Suchman (2006) insofar as persons and artifacts do not constitute each other “in the same way”. This distinction is also important when unpacking the point that Wikis do more because they do less, because making that claim requires distinguishing between the human and non-human actors. So, keeping the agency of both technology and humans seems useful, while acknowledging that these agencies are quite distinct. Because Wikis found their place as ‘dumb technology’, they are more useful, as they give more space to the agency of humans and promote their interaction. Other non-cognitive ‘dumb’ technologies, such as bicycles, can also do this.

3.4 Conversations

Suchman asks 'How to account for the difference between humans and machines in a nonessentialist way', and in her book “Human-Machine Reconfigurations” (Suchman 2006) she shows how human-machine interactions with ‘Eliza’, ‘ALICE’ and the ‘Head’ cannot yet be fully taken as conversations in the way that there are conversations between humans. Particularly important is the point that this is not because 'they are machines' but because conversation requires more than mere interaction. Compared to the stories about human-machine interactions, it is easy to claim that Wikis are not intentional in the way we want robots to be. They are quite passive, constructing and being constructed by the people around them, like any other technology, but their greatest strength is in being somehow, at least in a first instance, transparent, and in focusing on being a support where other things happen.
If we were to ascribe intentions to Wikis, we could say that Wikis want to be the background and not the primary presence. They want their presence to go largely unnoticed and they want to foster the collaboration between humans. Wikis are not that transparent after all, as it can be difficult for the normal user to access the history and discussion pages and make sense of them (not to forget that there are other channels of communication), and few Wikis are user-friendly, as most don’t provide WYSIWYG editing (an absence which has been claimed to be a feature that screens for engaged people). To stretch the point further, Wikis are so successful because they are not trying to be human. In that sense a Wiki is more of a communicative tool, more similar to a telephone than to a radio. A very useful distinction is the one that separates the ways in which technological artifacts and humans relate.

3.5 Mediations

Verbeek (2005) explains Don Ihde’s thought as distinguishing three different ways in which humans relate to technology: by mediation, by alterity and by having the technological artifacts in the background. The relation between Wikis and humans seems to be one of mediation, while the interactions between Suchman and talking machines were ones of alterity. This distinction accounts for the way Wikis are successful in bringing humans to converse and create with each other. Moreover, one shouldn’t forget Verbeek’s (2005) point that “Mediation does not simply take place between a subject and an object, but rather coshapes subjectivity and objectivity,” so mediation doesn’t have to keep the rigidity of the previously mentioned positions, where there is a deep separation between humans and technology.
Wikis are a technology that does not call attention to itself, is technically serviceable (usable from any computer with an internet connection), needs a certain skill to be used (Wiki and editing skills), and aims at making mediated perception (through a mediating technology) similar in measure to unmediated perception (without a mediating technology). Discussion pages, for example, become quite ‘unmediated’, as everyone can hold their own opinion and write it as they please (unlike pressing buttons to ‘agree’, ‘disagree’ or ‘vote’). Human and nonhuman actors play their different and essential roles; conversation motivates discussions aimed at real collaborations; and understanding Wikis as mediators shows their useful and successful place, in which they do little but mediate a lot of collaboration. As a consequence of doing little, little can be said about Wikis’ politics or egalitarian values. Doing little allows for ‘human nature’ and a willingness to collaborate, but it also allows for governing structures that may not be the fairest. Also, Wikis do nothing regarding the participation of their users, with the consequence that male, young, nerdy, white and western people still predominate.

3.6 Architecture

After looking at some of the characteristics of Wikis, it is important to fit this discussion within the bigger framework of the role of architecture in constraining/enabling human behavior, and within the greater discussion of the deterministic (or not) view of technology. Lessig (2005) in "Free Culture" lists the four ways in which it is possible to alter, or condition, human behavior. One of these constraints is the market: market laws, such as pricing via supply and demand, define some of the options in our societies. One is the law: by defining certain activities as in accordance with the law and others as against the law, behaviors change.
In his view, current copyright law turns all teenagers into criminals by defining the copying and sharing of files as ‘stealing’. Norms are another way in which behavior is conditioned: social norms demand certain behaviors that we learn from our peers and other citizens. And a fourth way to influence behavior is 'architecture'. Architecture influences the possibilities that are available, and this extends from building architecture to software architecture.

Figure 3-1: Lessig’s 4 constraints for people’s behaviors.

All of these constraints influence each other, sometimes peripherally, sometimes in an overwhelming way. This is a useful categorization for thinking about the subject of concern in this chapter. Wikis are a specific architecture, insofar as the building of a Wiki follows specific principles and allows for certain behaviors. They extend the hypertext principle, inspired by the connected mind (Venners 2003), and support discussion, pseudo-anonymity, direct editing, etc. At the same time, several norms are in place, some coming from a hacker/internet culture, others more generally from western culture. Some of these norms are explicit 'principles' and guidelines, which turn into an internal law that the site follows. Other norms are also at play, for example the huge attention Wikipedia gets because it is (and which causes it to be) the 5th biggest web site. Although market and law constraints are present, the studies in this thesis depart mostly from an interest in the architecture of Wikis. Architecture is really the way 'form' influences content, or the way 'form' is really a form of content. Architects, designers and advertisers know this all too well. Media analysts know this also, understanding that media is really an architecture that influences, is influenced by, determines and is determined by the content: the message.
Wikipedia, being a Wiki, pertains to all these questions and, on top of that, has created a whole meta-structure on how to edit Wikipedia, how to coordinate, etc.: a community was formed bottom-up. These norms, usually the realm of sociology, are essential to understand what is 'going on'; understanding some of the 'behind-the-scenes' is also important in this study.

3.7 Technology

Wikis transformed the process of written collaboration. Is it just that collaboration can now be broken apart and shared among many people? Is it just that collaboration can be done step-wise? Are the ‘justs’ just ‘just’? While technology and architecture can change the process of doing things, it isn’t easy to acknowledge the role of technology without being overly deterministic. One extreme is to follow McLuhan, who in the classical example argued that ‘the medium is the message’. If the medium is the message, there is total superimposition between those two realms: the form and the content, or architecture and change. The other extreme is to say that architecture and content have no influence on each other. But the way technology and particular architectures influence people's behavior doesn't have to be understood so deterministically. The information and platforms we have influence what is easy to do and what is difficult. So technology influences but does not determine action. Yochai Benkler (2006) mentions this: "Different technologies make different kinds of human action and interaction easier or harder to perform. All other things being equal, things that are easier to do are more likely to be done, and things that are harder to do are less likely to be done. All other things are never equal." Another insight from this interplay between technology and action is that new technologies come without practices and norms, and these need to be borrowed from what came before or invented anew.
So, at the beginning of a new technology, there are many ways to use it, and many utopian visions are possible. In other words, different utopian goals can be aimed at when the technology arrives. But once the technologies settle, there are fewer options for what to do with them and how to use them. Part of the excitement of Wikipedia-like things is the realm of possibilities that are out there. It remains to be seen if this thesis will also be understood as part of the optimistic initial wave.

III. DATA-DRIVEN STUDIES

1 Introduction to the Data Studies
1.1 Bottom-Up Humanities
1.2 Methodological choices
1.2.1 No interviews.
1.2.2 No experiments/No simulation
1.2.3 No disruption
1.2.4 Where do I stand
1.3 The Use of Networks and Bicliques
1.3.1 Visualizations
1.3.2 Bicliques
1.4 The Following Studies
1.4.1 The Two Levels
1.4.2 The Two Zooms
1.4.3 The combinations
2 Bipartite Networks of Wikipedia’s Articles and Authors: a Meso-level Approach
3 We Coordinate, Nosotros Eligimos, Nós Administramos: Articles and Editors of MetaWiki and in Meta Communities of the Ibero-South American Wikipedias
4 Context Networks of the Articles ‘Prisoner’s Dilemma’ and ‘Wikipedia:Neutral Point of View’
5 History of the ‘Prisoner’s Dilemma’
6 Networks of Wikipedia Article: Insides of ‘The Prisoner’s Dilemma’
7 Inside the Policy Article ‘The Neutral Point of View’

In this section all the studies that use data are presented. After an introduction, which explains methodological choices and how the studies fit together, the six studies follow.

Publication status:

2. “Bipartite Networks of Wikipedia’s Articles and Authors: a Meso-level Approach” was published at WikiSym '09, October 25-27, 2009, Orlando, Florida, U.S.A (original paper in appendix Y). I was the main author of this study, while Martin Schwartz contributed programming and data-harvesting, and Sune Lehmann acted as instigator, interlocutor, and editor.
3. “We Coordinate, Nosotros Eligimos, Nós Administramos: Articles and Editors of MetaWiki and in Meta Communities of the Ibero-South American Wikipedias” was presented as a conference paper at WikiMania 2009.
4. Working paper “Context Networks of the Articles ‘Prisoner’s Dilemma’ and ‘Wikipedia:Neutral Point of View’” (earlier version submitted).
5. “History of the ‘Prisoner’s Dilemma’” was presented as a conference paper at Internet Research 2009.
6. Working paper “Networks of Wikipedia Article: Insides of ‘The Prisoner’s Dilemma’” (earlier version submitted).
7. Working paper “Inside the Policy Article ‘The Neutral Point of View’” (earlier version submitted).

1 INTRODUCTION TO THE DATA STUDIES

This section contains the research that is based on data harvesting. Wikis record the footprints of their pages, and Wikimedia gives access to full downloads of the different projects. In the following studies, the data is analyzed, but the methods are also studied to understand their usefulness and breadth. When engaging with Wikipedia and Wiki studies as a new field of research, it is imperative to pay attention both to being innovative and to being critical of the ways of studying these new phenomena with their particularities. Below are some thoughts on the methodology, encompassing some of the methodological choices and the greater idea of pursuing a kind of ‘humanities bottom-up’. Then some thoughts on the use of networks and bicliques are presented, and finally the six studies that follow are presented and related to each other.

1.1 Bottom-Up Humanities

Research can be done in different fashions, following different disciplinary traditions. It can be data-driven, where data mining then gives access to clustering, pattern finding and categorization. This is the process aimed at in these studies: while Latour would have said “follow your actors” to mean that one should follow the threads of agency and not be too determined in advance by some agenda (in a sense not an approach invented by ANT but inherited in a modified form from anthropology), others are interested in investigating what arises from the data on textual traces of those actors, and would therefore say “follow your data”.
One of the great innovations when studying an article embedded in a socio-technological network like Wikipedia is that information on many (if not all) steps in the editorial changes from one version of the article to the next is available, in addition to some (but not all) parts of the motivational and contextual grounds for the changes, which can be seen in the discussion pages. Not only does Wikipedia record all the contributions over time, but most of the interactions also happen online (and are saved), either as postings in discussion pages and talk pages or on open-access mailing lists. This new mode of communication leaves traces of what is happening. These studies take advantage of the fact that one of Wiki(pedia)’s simple and major innovations (from a researcher’s point of view) is the accessibility of the written products of the process, which allows us to investigate its development. So, it is possible to trace who did what and when. This higher degree of transparency is exploited in these studies. Availability of data, though, can be misleading, as much important data is slightly hidden behind the scenes. Although it is possible to access the history of each article, and all the editors that have contributed to it, this is not accessible in any immediately comprehensible form to readers and researchers, for two reasons already pointed out in section II/3.2. One simple reason is that the data is not on the front page, and some of it is a number of mouse ‘clicks’ (and tricks) away. The other reason is that these data (a list with names and times, and ‘diffs’, the comparisons of text from two different versions of the article in which the changes can be seen) are not very meaningful when looked at directly. They need to be processed, cleaned, visualized, and interpreted for a story to emerge.
The hope is to analyze what is going on half-behind the scenes, because even though all the uploaded information is accessible, not everything is really just there to be seen, nor is it readily viewable or even of interest to the common passer-by. There are different levels of transparency, and the making of the Wiki-article is still somewhat of a black box to the average person. It is, though, less black than in the case of classic encyclopedia articles, because it is more accessible (let’s say a ‘dark shade of grey’), and we could say that these studies attempt to transform it into a ‘lighter shade of grey’. To be able to ‘see’ these patterns behind the data, it is possible to use ‘trace ethnography’ as defined by Geiger & Ribes (2010), which builds on the fact that one can follow the many traces, or footprints, left by editors when editing the Wiki-articles. Studying these traces can be quite revealing, as much of what happens is recorded, either in the edit history of the articles, in their discussion pages, or in the user pages. This methodology is partially used here. Other tools were also used (see “History of the Prisoner’s Dilemma”) to reveal patterns of Wiki article editing.

1.2 Methodological choices

I triangulate methods in such a manner that they do not stand alone but each contribute to the investigation. The structure of this project moves from more analytic and descriptive approaches to inform more normative, general aspects. The challenge is to break the information from the history of the article (who did what and when) into digestible pieces and then to put the pieces together again. By studying the history and discussion pages, as well as how they interact, I will be studying the networks between people, which will inform questions on cooperation.
The methodological choices come at the expense of some possible complementary methodologies that aren’t followed, for lack of either relevance or research time.

1.2.1 No interviews.

The focus in these studies is to treat humans and their relationships in a more abstract sense and to ponder the appearance of masses and their contribution to new forms of collaboration, without falling into accounts of motivation or dealing with internal states. Interviews can certainly give a lot of information about how contributors understand their own activities. Short informal interviews were performed at WikiMania 2008 in order to seek out the areas of ‘greatest collaboration’. WikiProjects seemed to be a place to look for heavy collaboration, and this claim is confirmed in one of the results from the first paper below. Besides, results from qualitative interviews that help the understanding of the general context and incentives have been published in studies such as the one by Forte & Bruckman (2005).

1.2.2 No experiments/No simulation.

Some lines of research support the idea that by constructing an artificial setting, more variables can be controlled. One possibility would be to instruct people to use a Wiki and monitor their response to the challenge in real time. Other lines of research support the idea of simulation, in which case one would construct an artificial Wiki article to which programmed agents would contribute (only syntactically, I suppose, as the construction of meaning is still best attributed to humans). The reason not to follow these lines of research is embedded in the approach: I am concerned with cognition in the wild, and therefore manufacturing phenomena, as experiments and simulations do, wouldn’t help.

1.2.3 No disruption.

Some experiments, though, could be done in the wild, by introducing small disruptions, or other kinds of ‘artificial interventions’, and monitoring what happens thereafter.
A good reason not to do so is Wikipedia’s page on ‘Do Not Disrupt Wikipedia to Make a Point’. Beyond that, there are clear ethical worries: no ‘informed consent’ form would be filled out by the editors and readers who would be subject to ‘disruptive’ acts carried out in order to show something. And I do not want to disrupt a project that I respect.

1.2.4 Where do I stand

It is important to clarify the position of the researcher, not only to promote transparency but also to understand where the study fits in relation to the initial standpoint. One part is the extent to which I participate in the project. I don’t actively contribute to Wikipedia. Nonetheless, I am an informed observer: I have made myself known (and available) on my personal page in Wikipedia. I have met many Wikipedians and administrators, I am part of the Wikimedia Chapter in Denmark, and I have attended 3 of the 6 community conferences (WikiMania), in which I participated enthusiastically (that is the second part of the disclosure: I like and respect Wikipedia).

1.3 The Use of Networks and Bicliques

To use networks is to look for interactions. Latour (1991) points to this: “They have the impression that network analysis recreates ‘that night when all the cows are grey’ ridiculed by Hegel. Yet network analysis tends to lead us in exactly the opposite direction. To eliminate the great divides between science/society, technology/science, macro/micro, economics/research, humans/non-humans, and rational/irrational is not to immerse us in relativism and indifferentiation. Networks are not amorphous. They are highly differentiated, but their differences are fine, circumstantial and small; thus requiring new tools and concepts. Instead of ‘sinking into relativism’ it is relatively easy to float upon it.” Most studies about Wikipedia using networks [e.g. Capocci et al. (2006)] choose the pages and the links between them as the nodes and edges, which focuses on the network of concepts.
In this work, the networks studied are those formed by the editors and the articles that they worked on, and the networks of co-editing on an article, eventually with editor, paragraph, action, and discussion nodes. Studying these networks and the cliques in them reveals layers of dense activity that can bring insight into the patterns and clusters of editing in Wikipedia. To construct these networks, data was gathered from Wikipedia by downloading the dumps of several Wikipedias (the English, the Portuguese and the Spanish) and of MetaWiki, and by copying the meta-data regarding the editing history of a specific article. In order to understand the activity of these editorial sequences, in two of these studies the edits were coded to describe what kind of action the editors had performed (clarify information, format, etc.). Networks, which are analyzed here as data organized in nodes and links, seem to naturally underlie these interactions, in particular as bipartite networks, where nodes in one group (editors) link, by editing, to nodes in another group (articles or parts of articles). Because these studies focus on cooperation and ‘indirect interaction’, where editors ‘work together’, it is natural to look for clusters and dense areas. Most of the literature and algorithms on clusters construct clusters that exclude their nodes from belonging to other clusters, but these techniques are not particularly appropriate for studying Wiki articles, where editors can belong to different clusters simultaneously: it is obvious that editing one article, and being part of that ‘dense agglomeration of work’, does not exclude one from being part of another.

1.3.1 Visualizations

The visualizations in the following studies all use the ‘organic’ mode, in which the location of nodes signifies a degree of connectedness. These ‘maps’ capture important things about a network, not all of them translatable into text.
Comparing different ones, one can see similar patterns (such as the umbrella clouds), and also see which networks are tighter or have more nodes, and how these organize into concentric shapes, hub-shapes, etc. Different attitudes can exist towards these kinds of mappings: while, on the one hand, those familiar with using data and making graphs have a certain ‘data-acceptance’ and understand that the patterns shown ‘were there’, others, with a certain ‘data-reluctance’, are more inclined to be dissatisfied at not seeing anything ‘conclusive’. While I am aware of these attitudes, I keep to a certain ‘data-acceptance’, by which the claim for ‘humanities bottom-up’ can make sense in providing the conditions for the structures to emerge and be seen before a filter of expectations is laid upon them.

1.3.2 Bicliques

Bicliques are cliques (maximally connected areas of the graph) in bipartite networks. Biclique-finding algorithms find the dense areas in graphs, which can be overlapping. Allowing for overlap means that editors can be part of several cliques and that the bicliques can reveal interesting areas of collaboration. Bicliques are used in these studies to reveal groupings of subjects connected by co-editors, areas of concern in the coordination of Wikipedias, and editing behaviors in article writing. An important difference between clustering and bicliques is that the process of revealing bicliques is one of ‘discovery’ and not of ‘clustering’, in which an algorithmic recipe shows which nodes are closer together. For a specific graph (which, surely, is created by many decisions on what to include or not, how to filter data, etc.), the cliques are there.

1.4 The Following Studies

Here the six studies are introduced, and it is explained how they span two levels and two zooms.

1.4.1 The Two Levels

Wikipedia has different namespaces, which are groupings of articles with different roles.
The main namespace is the one that includes all the articles that are part of the encyclopedia. There are articles like "Christiania", "Esperanto", and "Portugal". It is this corpus that is studied in order to understand how knowledge is linked, and from which examples of encyclopedic articles can be drawn. Other namespaces, especially the Wikipedia: namespace, are concerned with the pages that address coordination and discussions, pages with policies, pages with internal projects for that language’s Wikipedia, and other similar pages. Typical pages in this namespace are "Community Portal", "FAQ", and "Department of Fun". These pages can be called meta-pages because they go behind the project and support it, being 'about it'. The Wikimedia projects have a Wiki that is concerned with similar issues across projects, which is called Meta-Wiki. In the data analysis, two kinds of networks of editors and pages are used: one from the encyclopedic pages, and the other from the meta-pages. So, one level is the one with articles, and the other is the one with meta-articles.

1.4.2 The Two Zooms

There are different zooms at which to study the activity of editing. A natural one is 'one article', which gathers a group of people actively contributing to and discussing one particular issue, namely 'how to write the encyclopedic article about X'. In these studies the zoom has been both 'zoomed in' and 'zoomed out' from the unit boundary of the article. When the zoom is "out", the article is part of a greater group of articles, and these clusters of articles and co-authors can be investigated using bicliques to find dense zones of collaboration. When the zoom is "in", the inside of the article is studied: the relationship between the editors and the different paragraphs, and the actions that led to the collaborative editing.
1.4.3 The combinations

In the table above, one can see how the two zooms that comprise the inside and the outside of an article and the two levels of articles and meta-articles can be combined. There are several possible orders in which to present this work. The one chosen for this thesis is to start with the most general (groups of articles), which gives context for the research and the first application of bicliques, in "Bipartite Networks of Wikipedia's Articles and Authors: a Meso-Level Approach". In this study, articles from the categories of Physics and Philosophy are extracted from the English Wikipedia and studied using bicliques. In the second study, “We Coordinate, Nosotros Eligimos, Nós Administramos: Articles and Editors of MetaWiki and in Meta Communities of the Ibero-South American Wikipedias”, the pages and editors of the meta-communities of the Portuguese and Spanish Wikipedias are extracted, along with the pages and editors of MetaWiki. The networks and clusters of these are formed and provide insight into the organization of the community. These two studies have a ‘general’ approach because they study groups of articles. In the third study, “Context Networks of the Articles ‘Prisoner’s Dilemma’ and ‘Wikipedia:Neutral Point of View’”, the focus shifts to the particular articles ‘Prisoner’s Dilemma’ and ‘Neutral Point of View’, but still from a ‘groups of articles’ perspective. In the fourth and fifth studies, “History of the ‘Prisoner’s Dilemma’” and “Networks of Wikipedia Article: Insides of ‘The Prisoner’s Dilemma’”, the zoom goes inside the article Prisoner's Dilemma, first with some historical background and then with network visualizations in which the methodology is expanded to include nodes of paragraphs, editors and actions performed by the editors.
And in the last study, “Inside the Policy Article ‘The Neutral Point of View’”, the focus is on the maturation of the policy page "Neutral Point of View", the main article in Wikipedia's set of values, studied through the ‘inside’ networks of the article during the year 2009.

2 BIPARTITE NETWORKS OF WIKIPEDIA’S ARTICLES AND AUTHORS: A MESO-LEVEL APPROACH

Rut Jesus
Center for the Philosophy of Nature and Science Studies, University of Copenhagen, Blegdamsvej 17, 2100 Copenhagen, Denmark
+45 61339903
vulpeto@gmail.com

Martin Schwartz
IT University of Copenhagen, DK-2300 Copenhagen S and Informatics and Mathematical Modelling, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
+45 50571799
the1schwartz@gmail.com

Sune Lehmann
Center for Complex Network Research and Department of Physics, Northeastern University, Boston and Center for Cancer Systems Biology, Dana-Farber Cancer Institute, Harvard University, Boston, MA 02115, USA
+1(617) 3738806
sune.lehmann@gmail.com

2.1 Abstract

This exploratory study investigates the bipartite network of articles linked by common editors in Wikipedia, ‘The Free Encyclopedia that Anyone Can Edit’. We use the articles in the categories (to depth three) of Physics and Philosophy and extract and focus on significant editors (at least 7 or at least 10 edits per article). We construct a bipartite network and, from it, overlapping cliques of densely connected articles and editors. We cluster these densely connected cliques into larger modules to study examples of larger groups that display how volunteer editors flock around articles, driven by interest, real-world controversies, or the result of coordination in WikiProjects. Our results confirm that topics aggregate editors, and show that highly coordinated efforts result in dense clusters.

2.2 Introduction

Wikipedia is a good example of social production of knowledge. Authors and articles constitute a network, which we study here at the meso-level.
Investigations of knowledge-producing agents and their networks are of interest to network and quantitative analysis studies as well as to the social sciences. Moreover, it is particularly interesting to try to understand the network structure and dynamics inferred from low-level information subsequently complemented with higher-level information. Wikipedia’s network of authors and articles is more horizontal than other networks (for example, those of the peer-reviewed scientific literature); e.g., it has more edits per person and per article.

2.2.1 Related Literature

2.2.1.1 Network analysis

Network analysis has previously been used to describe Wikipedia’s growth. For instance, Capocci et al. (2006) delineate the properties of the growth of Wikipedia as a network, with topics modeled as vertices and hyperlinks between them represented as edges. This study shows how the growth of Wikipedia can be described with local rules such as preferential attachment, while contributors are still free to act globally in the network. It has also been discovered that many network characteristics are similar between different language versions of Wikipedia; examples are degree distribution, growth, reciprocity and clustering [Buriol et al. (2006); Zlatic et al. (2006)].

2.2.1.2 Quantitative analysis

Quantitative analysis of Wikipedia users has been carried out by Ortega & Gonzalez-Barahona (2007) in a framework where editors were classified by their activity during specific time periods. A comparison between imposed classifications and real clustering was performed by Capocci et al. (2008).

2.2.1.3 Cooperation

The level of cooperation in Wikipedia has been carefully analyzed by Viégas, Wattenberg, Kriss et al. (2007), who stress the need to study Wikipedia’s growth in terms of its clusters and namespaces beyond the articles. These authors emphasized that the fastest growing areas (namespaces) in Wikipedia are devoted to coordination of article-writing and conventions.
They create a grid of categories and code the contents of discussion pages according to that grid. They discover that these pages mostly act as a place for strategic planning of edits and enforcement of standard guidelines. A study by Wilkinson & Huberman (2007) sheds light on the stochastic mechanism by which articles accrete edits. They show that there is a positive correlation between article quality and number of edits, thereby validating Wikipedia as a successful collaborative effort. A more recent study by Kittur & Kraut (2008) specifies more precisely the impact of adding editors on the quality of articles: the addition of editors improves the quality of an article in its formative stage, and when the coordination is done directly in the writing of the article, but it can be harmful when the coordination is done explicitly in talk pages.

2.2.1.4 Visualizations

The visualization of collaboration within Wikipedia is also an active field of research; the tools include: (1) history flow (Viégas et al. 2004), an application that can be used to visualize the contributions to an article; (2) visualization of whole co-authorship networks (Biuk-Aghai 2006); and (3) revert graph visualizations (Suh et al. 2007).

2.2.2 Conceptual Framework

2.2.2.1 Meso-zoom

Most of the work referenced above focuses either on the global statistics of the entire Wikipedia project or on atomic descriptions of individual articles. However, collaboration in Wikipedia occurs at the meso-level, where groups of people collaborate in order to create articles. Here we focus on the meso-level, not only in terms of scale but also in terms of analysis. This is a study in which low-level phenomena (i.e., agents and their interactions and behaviors) inform a higher level: that of clusters of articles and editors.

2.2.2.2 Meso-approach

We stay in the middle.
Modules of articles and editors are investigated, rather than whole Wikipedias and their statistics or single discussions and their descriptive sociologies. Moreover, we stay in the middle regarding our approach, supported by our interdisciplinary skills in physics and philosophy: we employ network visualizations, but we make neither comprehensive statistical analyses nor detailed ethnographic studies. Although this approach may appear lacking from the point of view of either of these ‘pure fields’, we believe that the interdisciplinary nature of this study lets us integrate mathematical tools and sociological methodologies and so see general patterns without the oversimplification that is often the result of a purely quantitative approach. In the following sections we introduce bipartite networks and present and defend our choices concerning data and visualization. Subsequently we show several case studies and examples of bipartite modules surrounding various controversies, interests and projects. We consider the network formed by overlapping clusters of articles and editors and use it to detect isolated cliques. We also present the clusters that are not bounded by content. Finally, we discuss the results and propose lines for future research.

2.3 Method

2.3.1 Bipartite Networks

A bipartite network is a graph G = (U, V, E) whose vertices (or ‘nodes’) can be divided into two disjoint sets U and V such that every edge (or ‘link’) in E connects a vertex in U to a vertex in V; that is, U and V are independent sets. When we consider articles in Wikipedia and their editors, a bipartite network is a convenient representation: U is the set of editors and V is the set of articles in Wikipedia. The bipartite network formalism is ideal for studying collaboration, because the network structure encodes knowledge about which articles editors have edited together.
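The representation G = (U, V, E) described above can be illustrated with a minimal sketch. The edit log, the editor names and the helper functions below are hypothetical and serve only as an illustration; they are not the data or the code used in this study.

```python
from collections import defaultdict

# Hypothetical edit log: one (editor, article) pair per edit.
edits = [
    ("alice", "Quantum mechanics"),
    ("alice", "Prisoner's dilemma"),
    ("bob", "Prisoner's dilemma"),
    ("bob", "Game theory"),
    ("carol", "Prisoner's dilemma"),
    ("carol", "Game theory"),
]


def build_bipartite(edit_pairs):
    """Return (U, V, E): editors, articles, and editor-article edges."""
    U = {editor for editor, _ in edit_pairs}
    V = {article for _, article in edit_pairs}
    E = set(edit_pairs)  # repeated edits collapse into a single edge
    return U, V, E


def articles_by_editor(E):
    """Adjacency view: editor -> set of articles that editor has edited."""
    adjacency = defaultdict(set)
    for editor, article in E:
        adjacency[editor].add(article)
    return dict(adjacency)


def co_edited(adjacency, editor_a, editor_b):
    """Articles that both editors have edited."""
    return adjacency[editor_a] & adjacency[editor_b]


U, V, E = build_bipartite(edits)
adjacency = articles_by_editor(E)
shared = co_edited(adjacency, "bob", "carol")
```

Every edge joins an editor in U to an article in V, never two editors or two articles, and the intersection computed by `co_edited` is the kind of "edited together" information that the network structure encodes.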
By studying the clusters (or ‘modules’) in the bipartite network, we are able to discover clustering of editors and articles and smaller patterns of collaboration. We choose to call dense groups clusters or modules rather than ‘communities’, because the latter is an ill-defined concept across disciplines and may imply structures at the macro-level not present in this meso-level study. These dense groups could also be called ‘epistemic communities’ as used by Roth (2006), where epistemic communities are understood as a descriptive instance only, not as a coalition of people who have some interest to stay in the community: it is a set of agents who participate in building the same knowledge.

Bicliques, under their various names (closed sets, closed couples, formal concepts, maximal rectangles, bipartite communities), were initially studied by mathematicians Birkhoff (US) and Barbut (F) together with Monjardet (F), and by computer scientist Rudolf Wille (DE). They continue to be explored in formal concept analysis and by mathematical sociologists. We do not review these mathematical formulations of bicliques at length, and focus instead on their use in network research, where they are the building blocks of clusters/communities/groups.

One method for detecting modules in bipartite networks, grounded in the physics of networks and expanding the work by Palla et al. (2005), was developed by Lehmann et al. (2008). This method is based on detecting the densest areas of the graph (called maximal bi-cliques) and then agglomerating overlapping bi-cliques into larger modules. More formally, a biclique is a complete subgraph of a bipartite network: every node on one side is linked to every node on the other. A ‘maximal’ biclique is a biclique that is not a subgraph of any larger bi-clique. We use the notation K_{u,v} to describe a bi-clique with u nodes in node-set U and v nodes in node-set V.
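The two steps of this method (finding maximal bi-cliques, then agglomerating overlapping ones) can be sketched in Python for very small networks. This is a brute-force illustration only, not the algorithm implemented by Lehmann et al. (2008): it enumerates every subset of articles and therefore scales exponentially. The toy network, the function names, and the overlap rule used for merging (at least a-1 shared editors and b-1 shared articles) are assumptions made for the sake of the example.

```python
from itertools import combinations

# Hypothetical toy network: editor -> set of articles edited.
ADJ = {
    "e1": {"A", "B"},
    "e2": {"A", "B", "C"},
    "e3": {"B", "C"},
    "e4": {"X", "Y"},
    "e5": {"X", "Y"},
}


def maximal_bicliques(adj):
    """Enumerate maximal bicliques as (editors, articles) frozenset pairs."""
    articles = sorted(set().union(*adj.values()))
    found = set()
    for size in range(1, len(articles) + 1):
        for subset in combinations(articles, size):
            wanted = set(subset)
            # All editors who edited every article in the subset.
            editors = frozenset(e for e, arts in adj.items() if wanted <= arts)
            if not editors:
                continue
            # All articles shared by those editors (closes the pair).
            shared = frozenset(set.intersection(*(adj[e] for e in editors)))
            found.add((editors, shared))
    return found


def merge_into_modules(bicliques, a, b):
    """Union bicliques of size at least K_{a,b} that overlap enough."""
    big = [bc for bc in bicliques if len(bc[0]) >= a and len(bc[1]) >= b]
    parent = list(range(len(big)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(big)), 2):
        (eds_i, arts_i), (eds_j, arts_j) = big[i], big[j]
        # Assumed overlap rule: a-1 shared editors and b-1 shared articles.
        if len(eds_i & eds_j) >= a - 1 and len(arts_i & arts_j) >= b - 1:
            parent[find(i)] = find(j)

    merged = {}
    for i, (eds, arts) in enumerate(big):
        module_eds, module_arts = merged.setdefault(find(i), (set(), set()))
        module_eds |= eds
        module_arts |= arts
    return list(merged.values())


BICLIQUES = maximal_bicliques(ADJ)
MODULES = merge_into_modules(BICLIQUES, 2, 2)
```

In this toy network the bi-cliques ({e1, e2}, {A, B}) and ({e2, e3}, {B, C}) share one editor and one article, so they are merged into a single module, while ({e4, e5}, {X, Y}) remains a module of its own.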
Connecting this to the network of editors and articles in Wikipedia, a K_{3,5} bi-clique describes a structure where three editors have all edited the same five articles. Two bi-cliques of size K_{a,b} are adjacent if they share at least a K_{a-1,b-1} bi-clique. A K_{a,b} module (or ‘cluster’) is the union of all K_{a,b} bi-cliques that can be reached from one another through a series of adjacent bi-cliques. One important feature of this definition is that nodes can belong to more than one cluster; that is, two distinct modules may overlap. Furthermore, changing the values of a and b allows for different zooms.

2.3.2 Data and Visualization

2.3.2.1 Subset

We analyze a subset of the English language Wikipedia, namely the articles in the categories Philosophy and Physics to depth level three. The choice of a subset is, after all, arbitrary, but our sample was motivated by familiarity with the topics (given our educational background, which is important for making semantic claims about them) and by the size of the disciplines and their representation in Wikipedia. As categories in Wikipedia can be nested recursively, the set of articles includes not only articles inside Physics and Philosophy but also those in different subjects up to three steps of association from the main categories. The decision to include sub-categories and sub-sub-categories is similar to the choice of Halavais and Lackaff (2008). The authors assume that a ‘core’ of the disciplines can be sampled in this way.

2.3.2.2 Filtering

Similarly, editors were filtered by the number of edits they had contributed to each article. Editors who had edited an article 7 or more times, or 10 or more times, were included (both thresholds were applied, but for different purposes). This filtering helped to avoid clutter (and to keep the computation tractable), and helps us concentrate on the most engaged editors and articles in dense clusters.
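The threshold filter can be sketched as follows. This is one illustrative reading of the rule, assuming an edit log that holds one (editor, article) pair per edit; the function name `filter_links` and the toy log are hypothetical.

```python
from collections import Counter

# Hypothetical edit log: one (editor, article) pair per individual edit.
EDIT_LOG = (
    [("ed_a", "Evolution")] * 12
    + [("ed_b", "Evolution")] * 8
    + [("ed_c", "Evolution")] * 2
)


def filter_links(edit_log, min_edits):
    """Keep editor-article links with at least `min_edits` edits."""
    counts = Counter(edit_log)
    return {pair for pair, n in counts.items() if n >= min_edits}


LINKS_7 = filter_links(EDIT_LOG, 7)    # analogue of the 7-edit threshold
LINKS_10 = filter_links(EDIT_LOG, 10)  # analogue of the 10-edit threshold
```

Raising the threshold can only remove links, so the network filtered at 10 edits is always a subset of the one filtered at 7 edits.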
Although sporadic edits can be important to Wikipedia as a whole, they are less relevant when considering the cooperation and interaction between editors of a small subset of articles. A few examples indicate that the lack of information regarding sporadic editors does not compromise the analysis of highly engaged clusters of editors and articles.

2.3.2.3 Anonymity

The ‘real nicknames’ of the editors are kept due to the public nature of their work; as is clear from the examples, their identity is no more at stake than it is when they create an account on the Wikipedia website.

2.3.2.4 Bi-clique visualization

The open source program BCFinder, developed by Lehmann et al. (2008), was used to calculate and visualize the modules that arise from combining adjacent bi-cliques. BCFinder allows one to visualize the articles and editors of each cluster (and also to access those pages and user pages in Wikipedia easily). In addition, it makes it possible to visualize the network of modules. Each K_{a,b} module can be thought of as ‘zooming’ into a relevant area of the network. Moreover, one can view the network of modules, where each cluster is a node and two module-nodes are linked if they share one or more editors or one or more articles (see example in Fig. 5). A distinct network of modules is created by each zoom and yields insight about which zooms divide the network into meaningful sub-parts. Further, the network of clusters allows one to identify isolated clusters.

With the minimum number of edits-per-article-per-editor set to 7, we worked with 33335 editors and 17643 articles (we call this the ‘7-edit network’); with the minimum set to 10, we worked with 19612 editors and 13241 articles (the ‘10-edit network’).

2.3.2.5 Typology

Once all possible clusters had been obtained, they were grouped in order to identify the specific examples below.
Although taken from a specific K_{a,b} cluster, these examples are fairly robust to changes in a or b, up to a certain point. The kinds shown below span the possible types found in the data.

2.4 Results

2.4.1 Controversies

2.4.1.1 Evolution/Creationism

The first type of collaboration in Wikipedia is the one fueled by deep disagreement. One example of such a cluster is the controversy between evolution and creationism. In Figure 1 we display the major players of this cluster, which ties controversial articles together. Here, we study the 10-edit network. The module is composed of adjacent bi-cliques of size K_{6,3} or greater. The articles present in this cluster show that a debate is taking place. For example, the same group of editors edits the two articles ‘Evolution’ and ‘Creationism’. The controversy here surrounds a religious/non-religious discussion that ultimately questions the validity of science. The presence of ‘Atheism’ and of ‘Pseudoscience’ supports the debate of religious values in relation to scientific values.

Figure 2-1: The Evolution/Creationism debate is mirrored in the way the articles ‘Evolution’, ‘Pseudoscience’, ‘Creationism’, ‘Atheism’ and ‘Creation science’ belong to the same cluster. These articles are edited by at least 13 active editors engaged in this controversy.

Figure 2-2: Zooming in on the Evolution/Creationism debate by including more edits. The vertices are scaled according to number of links. More articles and more editors are involved in this dispute. This cluster gives clues about some of the hidden players, for example ‘Richard Dawkins’ and the ‘Discovery Institute’.

In Figure 2, another cluster around the same topic is shown. In this figure, each vertex is scaled such that nodes with more links are larger; this makes it easier to see that the two major articles are ‘Evolution’ and ‘Creationism’.
Some of the smaller articles yield further insight into other actors participating in this dispute: ‘Richard Dawkins’ “is a British ethologist, evolutionary biologist and popular science writer. In addition to his biological work, Dawkins is well known for his views on atheism, evolution, creationism, intelligent design, and religion. He is a prominent critic of creationism and intelligent design”, as is stated in the first lines of the Wikipedia article [18]. An important concept in this controversy seems to have been heavily edited as well: ‘Irreducible complexity’, which “is an argument made by proponents of intelligent design that certain biological systems are too complex to have evolved from simpler, or “less complete” predecessors, through natural selection acting upon a series of advantageous naturally occurring chance mutations” [19]. On the other side of the debate, the major concept at stake is ‘Natural Selection’, which “is the process by which favorable heritable traits become more common in successive generations of a population of reproducing organisms, and unfavorable heritable traits become less common.” [20]

These clusters can also reveal players that would otherwise be hidden to those not involved. For example, the Discovery Institute “is a U.S. think tank based in Seattle, Washington, best known for its advocacy of intelligent design and its Teach the Controversy campaign to teach creationist anti-evolution beliefs in United States public high school science courses.” [21]

Investigating the other set of nodes (editors) involved in this discussion is also revealing. Their user pages show a range of attitudes. One editor states clearly that he was involved with the article ‘Intelligent Design’, which he started in 2001, but from which he was banned in 2008. Other editors decided to leave Wikipedia; it is not clear whether the controversy discussed here played a role.
Still other editors appear to have been highly involved in fighting vandalism; it is well known that controversies are more prone to vandalism (Viégas et al. 2004).

2.4.1.2 Intelligence and Global Warming

Several other controversies can be identified based on the modules in our subsection of Wikipedia. The controversy in Figure 3 is based on the 7-edit network, and displays a module based on K_{5,4} or greater bi-cliques. This group is engaged in a discussion of intelligence, the validity of intelligence tests, and some claimed correlations. In addition, ‘The Bell Curve’ is a controversial book on how intelligence can be a predictor of social factors. Likewise, ‘IQ and the Wealth of Nations’ is another controversial book, discussing the relation between IQ and the prosperity of nations.

[18] From Wikipedia, “Richard dawkins.” Retrieved on May 2nd 2008 from http://en.Wikipedia.org/Wiki/Richard Dawkins.
[19] From Wikipedia, “Irreducible complexity.” Retrieved on May 2nd 2008 from http://en.Wikipedia.org/Wiki/Irreducible complexity.
[20] From Wikipedia, “Natural selection.” Retrieved on May 2nd 2008 from http://en.Wikipedia.org/Wiki/Natural selection.
[21] From Wikipedia, “Discovery institute.” Retrieved on May 2nd 2008 from http://en.Wikipedia.org/Wiki/Discovery institute.

Figure 2-3: Controversy surrounding intelligence, its measures and correlations, comprising the articles ‘Race and intelligence’, ‘The Bell Curve’, ‘IQ and the Wealth of Nations’, ‘Intelligence quotient’, and ‘Flynn effect’.

Figure 2-4: Controversy surrounding global warming. It comprises the articles ‘Solar variation’, ‘El Niño-Southern Oscillation’, ‘Carbon dioxide’, ‘Sea level rise’, ‘Global warming controversy’, and ‘Fossil fuel’.

Another characteristic example of a ‘conflict-cluster’ is displayed in Figure 4. Here the controversy concerns global warming and the diverse factors surrounding this subject.
The network is based on the 7-edit filter and the module is slightly sparser than the ones considered so far, constructed from adjacent K_{4,3} bi-cliques. The central article in this cluster is ‘Global warming controversy’, but the pages ‘Solar variation’, ‘Carbon dioxide’, ‘Sea level rise’, ‘Fossil fuel’ and ‘El Niño-Southern Oscillation’ are all part of the discussion on the human contribution to global warming.

2.4.2 Isolated Clusters

In order to understand the significance of the next type of collaboration in Wikipedia, it is useful to first discuss the network of modules. The network of modules allows one to identify modules in the bipartite network of editors and articles that are not connected to any other modules. Figure 5 is an example of the network of modules in the 10-edit network, with modules constructed from K_{7,2} bi-cliques. Each module is represented by a pie chart colored according to its fraction of editors (red) and articles (blue). The modules are connected by red links (overlapping editors) and blue links (overlapping articles); the width of each link is proportional to the number of overlapping nodes.

Figure 2-5: Network of the clusters made of K_{7,2} bi-cliques. Circles represent modules, which share articles (blue links) and editors (red links) with each other. The numbers are labels that identify each cluster. The network-of-clusters view helps to understand the relationships between the clusters and to identify isolated clusters that do not share articles or editors with others. The clusters 780, 771, and 779 are displayed in Fig. 6; the clusters labeled 776 and 781 are displayed in Fig. 7.

Figure 5 shows three clusters, 780, 771 and 779 (these numbers are just labels), that do not share links (either articles or editors) with the others. Two other clusters, 776 and 781, are only sparsely connected. Let us investigate these modules and begin to understand the causes underlying this network topology.
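The construction of the network of modules, and the detection of isolated modules in it, can be illustrated with a short Python sketch. The module labels and contents below are hypothetical and do not correspond to the clusters in Figure 5; the sketch only assumes that each module is given as a set of editors and a set of articles.

```python
from itertools import combinations

# Hypothetical modules: label -> (set of editors, set of articles).
MODULES = {
    "m1": ({"ed_a", "ed_b"}, {"Art1", "Art2"}),
    "m2": ({"ed_b", "ed_c"}, {"Art3"}),
    "m3": ({"ed_d"}, {"Art4", "Art5"}),
}


def module_network(modules):
    """Link two modules if they share at least one editor or article.

    Each link stores (shared editors, shared articles), the counts
    that would set the widths of the red and blue links.
    """
    links = {}
    for name_1, name_2 in combinations(sorted(modules), 2):
        editors_1, articles_1 = modules[name_1]
        editors_2, articles_2 = modules[name_2]
        shared_editors = len(editors_1 & editors_2)
        shared_articles = len(articles_1 & articles_2)
        if shared_editors or shared_articles:
            links[(name_1, name_2)] = (shared_editors, shared_articles)
    return links


def isolated_modules(modules, links):
    """Return the labels of modules that take part in no link."""
    linked = {name for pair in links for name in pair}
    return sorted(set(modules) - linked)


NETWORK = module_network(MODULES)
ISOLATED = isolated_modules(MODULES, NETWORK)
```

Here the first two modules are joined by a single shared editor, while the third shares nothing with the others and is therefore reported as isolated.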
The three isolated clusters correspond to topics that gather focused and dedicated authors: Mormonism, Zionism and Scientology (Figure 6). It is not entirely surprising that all of those topics are isolated from other clusters, since it could be argued that their practice in the ‘real world’ is similar: organized in sub-cultures, highly active, but isolated from other areas of knowledge and/or society. In Figure 5, two other clusters are connected with each other but not with the remaining modules; these are plotted in Figure 7. Both modules are devoted to political ‘isms’ and share one editor and a single article, the one on ‘Anarchism’. One of these clusters is interested in the definition and background of anarchism, as its articles are ‘Individualist anarchism’, ‘Mutualism (economic theory)’ (an anarchist school of thought) and ‘Anarchism’. This cluster is related to another one interested in defining political ‘isms’: ‘Anarchism’, ‘Anarcho-capitalism’, ‘Socialism’ and ‘Capitalism’.

Figure 2-6: Isolated clusters. The left panel is a module focused on the topic of Mormonism, which comprises paradigmatic articles: ‘First Vision’, ‘Mormonism and Christianity’ and ‘Joseph Smith, Jr.’; the middle panel surrounds the topic of Zionism in all three articles: ‘Anti-Zionism’, ‘Zionist political violence’ and ‘Zionism’; the right panel surrounds the topic of Scientology: ‘Dianetics’, ‘Church of Scientology’ and ‘Fair Game (Scientology)’.

Figure 2-7: Two connected clusters that are disconnected from the remaining network of modules. (left) Cluster focused on Anarchism. (right) Cluster focused on political ‘isms’: ‘Anarchism’, ‘Anarcho-capitalism’, ‘Socialism’ and ‘Capitalism’.

2.4.3 Shared Interests

In all the clusters, the editors share the interest (and practice) of editing the same articles. Some of them can be grouped by a shared interest (or a number of related ones).
These groups are revealed by the bi-cliques, some of which turn out to be coordinated through a WikiProject.

2.4.3.1 Mantras

Figure 8 shows another example of a cluster arising from a shared interest or practice, although this one is not concentrated in a WikiProject. This cluster concerns the topics ‘Buddhism’, ‘Yoga’, ‘Tantra’, ‘Mantra’ and ‘Guru’. It reveals common interests between the 5 editors and the 5 articles in this K_{4,4} bi-clique cluster based on the 7-edit network. Although ‘Tantra’, ‘Yoga’ and ‘Guru’ are not related directly, they are part of the same vocabulary and interests of practitioners of yoga, followers of gurus and people interested in tantra. This cluster reflects a practice that happens beyond Wikipedia, but one that is mapped onto the way the articles are edited.

Figure 2-8: Cluster showing a relation between articles about related practices: ‘Buddhism’, ‘Yoga’, ‘Tantra’, ‘Mantra’, and ‘Guru’.

2.4.4 WikiProjects

2.4.4.1 Elements

Figure 9 displays 8 articles and 10 editors, which constitute a K_{7,3} module in the 10-edit network. This collaboration is a clear example of an orchestrated effort to improve the articles describing the elements of the periodic table. A WikiProject is “a collection of pages devoted to the management of a specific topic or family of topics within Wikipedia; and, simultaneously, a group of editors that use said pages to collaborate on encyclopedic work. It is not a place to write encyclopedia articles directly, but a resource to help coordinate and organize article writing and editing” [22]. The WikiProject about elements presents itself in the following manner: “This WikiProject has managed to standardize the articles on the known chemical elements (see Guidelines page). Now it is aimed at the maintenance of these at an agreed upon format discussed in Wikipedia talk:WikiProject Elements and at the expansion and improvement of each article to featured article quality (check out our Goals below).” [23]
In this cluster the editors are engaged in improving the following articles: ‘Hydrogen’, ‘Oxygen’, ‘Gold’, ‘Mercury’, ‘Magnesium’, ‘Lithium’, ‘Krypton’, ‘Potassium’. An investigation of their user pages reveals that the editors involved are several administrators whose daily occupations range from working mathematician to geologist and chemist. The various editors have different levels of (dis)comfort with anonymity: some use their real name, some keep it hidden but provide extensive information about their activities, and at least one copes with anonymity in an interesting manner: “Male, European, and already paranoid about giving away this much information”.

[22] From Wikipedia, “Wiki project.” Retrieved on May 2nd, 2008 from http://en.Wikipedia.org/Wiki/Wikipedia:WikiProject.
[23] From Wikipedia, “Wikiproject elements.” Retrieved on May 2nd 2008 from http://en.Wikipedia.org/Wiki/Wikipedia:WikiProject Elements.

Figure 2-9: Cluster revealing the coordinated effort to improve Wikipedia articles about the elements of the Periodic Table.

Figure 2-10: Cluster showing more elements that are part of the WikiProject concerned with improving the articles on the elements of the Periodic Table, obtained by decreasing the minimum number of edits required.

Additional data about this cluster can be obtained by considering the 7-edit network. For the same clique zoom of K_{7,3}, Figure 10 has 16 editors and 20 articles. As it is a coordinated effort, the additional information gained by lowering the edit threshold (and thus including more edits) is only that there are more people and articles involved in the same topic: we see 20 elements instead of the 8 elements that were visible in the previous cluster, which had fewer editors and articles.

2.4.4.2 Electronics

Another example of a cluster that reveals a WikiProject is displayed in Figure 11. This project surrounds the topic of electronics and several of its concepts (‘Alternating current’, ‘Decibel’) and tools (‘Oscilloscope’, ‘Electric motor’).
The K_{4,4} bi-clique cluster comprises 7 editors and 11 articles in the 7-edit network. The presentation of the WikiProject about Electronics is the following: “The aim of this project is to better organize information in articles related to electronics. This page contains only suggestions, with the hope to help other Wikipedians writing high-quality articles with the minimum effort” [24].

[24] From Wikipedia, “Wikiproject electronics.” Retrieved on May 2nd 2008 from http://en.Wikipedia.org/Wiki/Wikipedia:WikiProjectElectronics.

Figure 2-11: Cluster supported by the WikiProject Electronics around the topic of electronics: ‘Electrometer’, ‘Decibel’, ‘Potentiometer’, ‘Alternating current’, ‘Electrical engineering’, ‘Electronics’, ‘Oscilloscope’, ‘Resistor’, ‘Transistor’, ‘Electric motor’ and ‘Capacitor’.

2.4.5 Non-Content Bounded Clusters

The bipartite network of editors and articles also contains modules in which there is no apparent correlation between the topics. Figure 12 displays such a module with 5 articles and 9 editors around topics as diverse as ‘Joseph Stalin’, ‘Martin Luther King, Jr.’, ‘Tsunami’, ‘Ku Klux Klan’ and ‘Albert Einstein’. It is curious to observe which topics are included in these generalist clusters, which are heavily edited by a small group of editors.

Figure 2-12: There are also several clusters such as this one, which are not bounded by content but probably by editing style, maybe adding links or fighting vandals.
A more extensive list from a module of adjacent K_{4,1} bi-cliques with 45 articles is: Abortion, Jimmy Wales, Solar energy, Evolution, Fuck, Christianity, Ku Klux Klan, Beauty, Galileo Galilei, Racism, Stupidity, Black hole, Plato, Joseph Stalin, Sun, Volcano, Aristotle, Earthquake, Art, Rosa Parks, Nuclear power, Isaac Newton, Computer, Martin Luther King, Jr., Tsunami, Buddhism, Creationism, Bitch, Vietnam War, Tornado, Pi, Shit, Pope John Paul II, Albert Einstein, Internet, Thomas Jefferson, Vladimir Lenin, Love, Cunt, Renaissance, Islam, Slavery, Mother Teresa, Tropical cyclone, Music. One possible way to account for the variety of topics in this example of a cluster not bounded by content is that these articles have very general content; they are not highly specialized and are therefore more accessible to different kinds of editors. Another, complementary explanation is that articles are sometimes edited not in terms of topic but rather in terms of kind of edit. An editor who is concerned with making tables or fixing links would not be concerned with the specific topic; such edits are therefore the work of syntax, layout, or spelling editors.

2.5 Discussion

By applying clustering tools from social network analysis to a subsection of Wikipedia, we uncovered several interesting insights regarding the meso-level between single articles and global statistics. Although we were limited to the articles that were included in the subsections of the categories Physics and Philosophy and therefore related to these two primary topics, these boundaries gave us a certain familiarity with the topics. This facilitated the extraction of information in a manner that would not have been possible had the research been performed on random or unfamiliar topics. Controversies give rise to disputes that are not necessarily contained within one article.
In fact, controversies typically span multiple articles and form tightly connected modules of editors who edit related topics actively and sometimes in direct opposition to each other. Wikipedia, as expected, mirrors the discussions in society. Clustering tools allow us to probe structures other than the ‘web of knowledge’ that arises from the networks where the nodes are articles and hyperlinks connect them. The article on ‘Evolution’ links not only to ‘Darwin’ or ‘Wallace’, but also connects to ‘Atheism’, for example. This modular structure reflects the controversy currently taking place on the scale of the entire North American society, which is actively engaged in discussing whether creationism should be taught alongside evolution. In this manner, analyzing the modules in Wikipedia provides information about another layer of the construction of knowledge, one which is not necessarily tied to the topics closest in character, but to those that create issues which must be articulated and disputed in relation to each other.

In the case of coordinated efforts, such as the WikiProject Elements, the attempt to achieve Featured Article status seems to aggregate people (Viégas, Wattenberg & McKeon 2007); also, as previously shown by Wilkinson & Huberman (2007), the more edits an article has, the more it is likely to accrete. Therefore, the creation of WikiProjects is shown to be a good way to mobilize work in one direction, especially by trying to produce Featured Articles. WikiProjects are a good example of the work carried out at the meso-level: they rely neither on massive inputs by the ‘wisdom of the crowds’ nor solely on the dedication of one single editor. WikiProjects result in clusters of editors with common interests that have found a way to coordinate work successfully, aggregating people and resulting in highly developed articles.

As expected, some clusters reflect the way those same clusters manifest in ‘real life’.
If topics or practices aggregate tight and closed clusters, it is not surprising that the articles about those clusters are also edited by a closed cluster of editors. The bipartite clustering tools and the network of modules can be used not only to identify some of those modules in ‘real life’, but also to understand the relations between the modules and the most important players. For example, in the controversy between ‘Evolution’ and ‘Creationism’, there are people and groups who are quite outspoken (‘Dawkins’, ‘Discovery Institute’) and therefore their articles are edited along with the other articles present in the controversy. Another example is that specific properties of how some articles are edited can be related to some assumed properties of groups in the ‘real world’: the isolated groups on ‘Mormonism’, ‘Scientology’ and ‘Zionism’ may show that these groups in society are also quite isolated and dedicated to their cause. Although it is hard to prove the behavior of these groups in ‘real life’, a recent case of Wikipedia banning the Church of Scientology from editing (Singel 2009) supports the view that these editing patterns may reflect that these groups directly edit their own pages and that the discussions about them, some even controversial, are quite isolated.

These clusters can be seen as ‘epistemic communities’ in the weak sense, that of a group of people gathering around a knowledge topic (and not in the strong sense in which Roth & Bourgine (2004) define an epistemic community as the group of people that maximally share a number of concepts). These clusters are not strictly ‘Communities of Practice’ (Lave & Wenger 1991) because the authors need not be acquainted or involved in a common practical task. Regardless, a community of practice is certainly a special type of knowledge community.
Participation in Wikipedia as a whole, although a theme to be developed elsewhere, can be said to form a large community of practice where editors interact using shared paradigms, meanings, values and practices and where a lot of the learning is tacit: Wikipedians learn how to edit articles, how to fight vandals, how to use policies to get their points across, how to present themselves in user pages and so on. Bryant et al. (2005), in ‘Becoming Wikipedian: transformation of participation in a collaborative online encyclopedia’, argued that “observations of members’ behavior in Wikipedia reveals that the three characteristics of Communities of Practice identified by Wenger are strongly present on the site: community members are mutually engaged, they actively negotiate the nature of the encyclopedia-building enterprise, and they have collected a repertoire of shared, negotiable resources including the Wikipedia software and content itself.” Clustering allows us to zoom in on this community of practice and detect more specific modules bounded by shared interest. In WikiProjects, in particular, authors are involved in a common task and are, therefore, creating structures that are closer to mini-communities of practice than the other ‘epistemic communities’.

Finally, to complete the typology of clusters found in the data, it should be noted that some clusters contain articles on topics as diverse as ‘Tsunami’ and ‘Albert Einstein’. It is not surprising that this module is diverse. A K_{4,1} bi-clique is a group of 4 editors that have co-edited just one article. If they had co-edited two or more articles, then one would expect more similarity (in general the articles become more homogeneous). This type of bi-clique, with a low second index, means that the articles in a module need not have anything in common.
This type of cluster is a hint that Wikipedia is also a product of looser dedication by people who edit broader articles, and that there are different editing patterns, not all of which are content-driven. Editing to fix typos or to make tables of contents can also group people.

As this is a study with both quantitative and qualitative features, we used semantic categories top-down to describe the kinds of clusters found in the data, which were harvested bottom-up. This way we assessed qualitatively the nature of collaboration between editors in a subset of the English Wikipedia, grounded on network analysis.

2.6 Limitations and Recommendations for Future Research

A more systematic study could reveal possible ‘network signatures’, i.e. ways to identify controversies or WikiProjects directly from the network structure. It will be interesting to complement the insights discovered from our meso-level modules with a deeper probe into the specific connections through discussion pages, on the level of individual articles and paragraphs, to understand the patterns of distributed work and, perhaps, cognition in greater depth. Analyzing the network of modules informs us about the individual modules and their structural relation to each other. Further work could provide information to understand the network of Wikipedia in relation to other networks (scientific collaborations, open source projects). In the future, it will be interesting to expand the bipartite clustering technique and approach to other areas of Wikipedia and other datasets; to organize the algorithm so that the module surrounding any article can be visualized; and to contextualize the findings in the light of more abstract claims about the power of technology and clusters, specifically Wikis and Wikipedias, to organize knowledge, work together, and ultimately be part of a cognitive system that comprises humans, technologies and values.
It will also be interesting to pursue the interdisciplinary meso-level of analysis, as it seems to result in insights that inhabit the area between the quantitative patterns and the qualitative details usually found by other, more traditional disciplines.

2.7 Conclusion

Detecting modules of articles and editors in Wikipedia yields important insights into the nature of collaboration. The technique used in the present research probes a level where collaboration is surely taking place because people in fact gather around a number of articles and work intensely on them.

2.8 Acknowledgements

Rut Jesus acknowledges support by the Portuguese Foundation for Science and Technology with the grant SFRH/BD/ 27694/2006 and would like to thank Camille Roth for discussions and bibliography concerning biclique history. Sune Lehmann acknowledges support by the Danish Natural Science Research Council and James S. McDonnell Foundation 21st Century Initiative in Studying Complex Systems, the National Science Foundation within the DDDAS (CNS-0540348), ITR (DMR-0426737) and IIS-0513650 programs, as well as by the U.S. Office of Naval Research Award N0001407-C and the NAP Project sponsored by the National Office for Research and Technology (KCKHA005).

3 WE COORDINATE, NOSOTROS ELIGIMOS, NÓS ADMINISTRAMOS: ARTICLES AND EDITORS OF METAWIKI AND IN META COMMUNITIES OF THE IBERO-SOUTH AMERICAN WIKIPEDIAS

Rut Jesus
Center for the Philosophy of Nature and Science Studies, NBI, Blegdamsvej 17, University of Copenhagen, 2100 Copenhagen Ø, Denmark

3.1 Abstract

Every language Wikipedia has an increasing number of pages about the Wikipedia project itself including policy, discussions, coordination and processes. Moreover, the Wikimedia Foundation has a “Meta-Wiki” that is used for documentation and coordination of its projects. This exploratory study investigates the networks formed by those ‘meta’ pages “behind the scenes” in the Portuguese and the Spanish Wikipedias and in the Metawiki.
We analyze the bipartite networks of articles and editors in the namespaces “Wikipedia:” and “Help:” (and their talk pages) in the Portuguese and Spanish Wikipedias and from the main namespace in MetaWiki. From the bipartite networks, we gather overlapping cliques of densely connected articles and significant editors (who made at least 10 edits in each article) to study the communities that organize, discuss, and lead those Wikipedias. We see the appearance at a meso-level of coordination areas, policy making and general guidelines. The themes of interaction that arise through clustering are: featured articles (artigos/artículos destacados), deletion (páginas e imagens para eliminar), ballots (votaciones), new languages and translations, Wikimedia projects, and organizational projects such as board elections and Wikimedia chapter formation, including the organization of Wikimania 2009. This meso-level approach integrates bottom-level data with qualitative understanding, which enriches our understanding of community.

KEY WORDS: Community, Wikipedia, MetaWiki, Coordination, Networks
THEME & TRACK: Wikimedia Communities, Academic Track

3.2 Introduction

The notion of community is at the heart of the project to build ‘The Free Encyclopedia That Anyone Can Edit’, as Wikipedia relies on collaboration for the writing of its millions of articles. Studies of Wikipedia are now of many different kinds, cover different periods and take different perspectives. The Wikipedia phenomenon, both in terms of the content that has been produced and aggregated and the structures that lie behind it, is quite new, and Wikipedia is the biggest encyclopedia and the 5th biggest website. Wikipedia has grown immensely in number of articles, but, for this to happen, an accompanying community has grown even faster (Viégas, Wattenberg, Kriss et al. 2007), and, for the purposes of coordination and resolution, the community tackles conflicts between users and develops procedures, committees and rules.
The research focus to date has varied, spanning questions on how Wikipedia has grown (Viégas, Wattenberg, Kriss et al. 2007), where it is heading (Capocci et al. 2006), aspects of linking (Ciffolilli 2003), quality (Luciana Buriol et al. 2006), trust (Priedhorsky et al. 2007; Stvilia et al. 2008), incentives to write (Forte & Bruckman 2005; Schroer & Hertel 2009), and types of contributors (Kittur et al. 2007; Aniket Kittur & Kraut 2008; Anthony et al. 2009; Panciera et al. 2009). To accompany this new phenomenon it is natural to experiment with methods that encompass both its size and its detail: to see article evolution and collaboration (Nunes et al. 2008; Bongwon Suh et al. 2008) and editing patterns (Wattenberg et al. 2007). The present article tries to describe a little more of what is happening in the communities of the Spanish and Portuguese Wikipedias. MetaWiki will also be studied, as it is the umbrella of all coordination between the Wikimedia projects. This will be done using a meso-level approach: extracting data from the meta-parts of the Wikipedias and the relevant parts of MetaWiki, we look at what clusters are formed and try to find meaning in them. Needless to say, it is the English Wikipedia that has received the most attention, which is unfortunate even if understandable, as it is the biggest and tries some procedures before other Wikipedias. We know little of the community structures and the workings behind other Wikipedias, although there are quantitative studies such as Ortega et al. (2007), which compares the 10 biggest Wikipedias. Last year (WikiMania 2009 was held in Buenos Aires) it was also relevant to look at the Ibero-American Wikipedias, as they are the local Wikipedias in South America (and, surely, the Wikipedias of other Portuguese and Spanish speaking countries and populations such as Portugal and Spain).
3.2.1 Related Literature on Cooperation, Coordination, and Governance

Here we review some of the relevant studies that have focused on the coordination and governance pages, in order to give a background to the research in this paper, which analyses the meso-level of article and author networks in the meta-pages. These meta-pages are those behind the encyclopedic articles where coordination and governance discussions take place. The namespaces with pages about the project were started in February of 2002, about one year after Wikipedia started.

3.2.1.1 Cooperation

A good example in terms of cooperation and the relation between articles and accretion of edits is a study by Wilkinson & Huberman (2007). In their words, ‘they validate Wikipedia as a successful collaborative effort’ by showing a positive correlation between article quality and number of edits. Another relevant study of cooperation is the one by Kittur & Kraut (2008), in which the addition of editors to articles is positive in the article’s formative stage and when the coordination is done directly in the writing of the article, but can be harmful if the coordination is done explicitly in the talk pages.

3.2.1.2 Coordination

Viégas, Wattenberg, Kriss et al. (2007) have stressed the need to study Wikipedia’s growth in terms of its clusters in namespaces beyond the articles. These authors emphasized that the fastest growing areas (namespaces) in Wikipedia are those devoted to coordination and conventions. They focus on the talk pages, and show that these pages are used for strategic planning of edits and enforcement of standard guidelines. Kittur et al. (2007), in "He Says, She Says: Conflict and Coordination in Wikipedia", use the term ‘indirect work’ (coordination to diminish conflict), examine its growth, and find that indirect work is increasing. They define 'indirect work' as 'excess work in the system that does not directly lead to new article content'.
They also use 'conflict and coordination costs', which has a more negative tone, whereas 'indirect work' (which really sustains Wikipedia) seems more appropriate.

3.2.1.3 Governance

Viégas et al. (2007) note Wikipedia has “myriad guidelines, policies and rules” and “complex and bureaucratic process [that run] counter to naïve depictions of Wikipedia as an anarchic space.” Bryant et al. (2005) explain how this image can be changed: they found that people come to understand Wikipedia as richer than an editable resource through legitimate peripheral participation, a kind of ‘apprenticeship’ in rules, procedures and customs. Kriplean et al. (2007) found that editors carry on consensus-seeking work on talk pages, using and interpreting policies to legitimate their actions. Policies are one of the products of the workings of the community, which is supported by a governance structure with positions of authority (Forte & Bruckman 2008), editors who enforce policy (Forte & Bruckman 2008; Beschastnikh et al. 2008; Butler et al. 2008) and who participate in formal processes (Viégas, Wattenberg & Mckeon 2007) and informal praising processes. One of these informal praising processes has been studied by Kriplean et al. (2008), who analyzed the production of work in the Wikipedia community by studying barnstars (personalized tokens of appreciation given to participants) to reveal a range of valued work, from social support to administrative actions. Jimmy Wales described the governing characteristics of Wikipedia back in 2005: “Wikipedia is not an anarchy, though it has anarchistic features. Wikipedia is not a democracy, though it has democratic features. Wikipedia is not an aristocracy, though it has aristocratic features. Wikipedia is not a monarchy, though it has monarchical features.” (Wales, in Reagle 2007a).
Reagle (2007) studied the kinds and practices of leadership present in Wikipedia and formulated a theory of authorial leadership, while Konieczny (2008) characterized Wikipedia's governance as 'adhocratic': because policy pages are editable just like other pages, Wikipedia’s editors decide adhocratically on the governance model. In other words, they choose a governance model, be it consensus, democracy or meritocracy, depending on the situation.

3.3 Methodology

The main methodology in this study was the calculation of bicliques from the bipartite network of editors and meta-articles. This research was also informed by observation and reading of the traces left by the editors, general statistics and visualizations of the networks (using Cytoscape, see below). Below, the biclique methodology is explained, and the choices regarding the datasets and the filtering and thresholding of the data are presented.

3.3.1 Datasets

Three Wikis were analyzed for this study: two language editions particularly relevant to Iberia and South America (last year’s region for Wikimania), namely the Portuguese and the Spanish Wikipedias, and an umbrella Wiki called Meta-Wiki that presents itself as follows: “Welcome to Meta-Wiki, the global community site for the Wikimedia Foundation's projects, and coordination and documentation of related projects. For further discussion of the cross-project policies and events, see the Wikimedia mailing lists (particularly foundation-l) and IRC channels (particularly #Wikimedia), and individual sites of local Wikimedia chapters.”25 For the analysis of the Portuguese and Spanish Wikipedias, all the pages were extracted from ptWiki-20090128-stub-meta-history and esWiki-20090124-stub-meta-history in four namespaces, those that seem to play a role in coordination and policy-making.
In particular, the following namespaces were extracted from the Portuguese Wikipedia: “Wikipedia:”, “Wikipedia Discussão:”, “Ajuda:”, “Ajuda Discussão:” and the following namespaces were extracted from the Spanish Wikipedia: “Wikipedia:”, “Wikipedia Discusión:”, “Ayuda:”, “Ayuda Discusión:”. (These are the equivalents of “Wikipedia”, “Wikipedia Talk”, “Help” and “Help Talk” in the English Wikipedia). These four namespaces gather the work relating to coordination and community aspects. The ‘project namespace’, “Wikipedia:”, is where one can find pages connected with the Wikipedia project itself: information, policy, essays, processes, and discussion. For the analysis of MetaWiki, the pages from metaWiki-20090131-stub-meta-history were used, and the namespaces extracted were: the main namespace, “Meta:” (equivalent to “Wikipedia:”), and “Help:” along with their talk namespaces.

25 Meta-Wiki, http://meta.Wikimedia.org/Wiki/Main_Page, retrieved 30th of March 2009.

3.3.2 Bipartite Networks26

26 This paper uses the same methodology as the previous one, and this section is therefore repetitive.

Bipartite networks and bicliques have been used to study co-authorship in academic papers (Lehmann et al. 2008) and to study clusters of articles and authors in the English Wikipedia in the categories of physics and philosophy (Jesus et al. 2009). The bipartite network formalism is ideal for studying collaboration, because the network structure encodes knowledge about which articles editors have edited together. A bipartite network is a graph with nodes in two disjoint sets U and V connected by edges. U is the set of editors and V is the set of meta-articles in the Wikipedias. Lehmann et al. (2008) developed a method for detecting modules in bipartite networks, grounded in the physics of networks and expanding the work by Palla et al. (2005).
This method is based on detecting the densest areas of the graph (called maximal bi-cliques) and then agglomerating overlapping bi-cliques into larger modules. We use the notation Ku,v to describe a bi-clique with u nodes in node-set U and v nodes in node-set V. By studying the clusters (or ‘modules’) in the bipartite network, we are able to discover clustering of editors and articles and smaller patterns of collaboration. As in the previous study (Jesus et al. 2009), we choose to call dense groups “clusters” or “modules” rather than “communities”, because the latter is an ill-defined concept across disciplines and may imply structures at the macro-level not present in this meso-level study. One important feature of this definition is that nodes can belong to more than one cluster; that is, two distinct modules may overlap and many pairs do. This can better represent the patterns of editing, where an active editor can be part of several modules simultaneously. The open source program, BCFinder developed by Lehmann et al. (2008) was used to calculate and visualize the modules that arise from combining adjacent bi-cliques. Each Ka,b module can be thought of as ‘zooming’ into a relevant area of the network, by changing the values of a and b. 3.3.3 Filtering & Thresholding The edits were filtered to avoid clutter (and allow for computational capacity limitations), and in order to concentrate on the most engaged editors and articles in dense clusters. The filter focused on the editors ranked by the number of edits they had contributed to each article. In practice, editors that have edited the same article 10 or more times were included. Although sporadic edits can be important to Wikipedia as a whole, we hypothesize that they are less relevant when studying the cooperation and interaction between engaged editors of a small subset of meta-articles. 
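The filtering rule and the K(u,v) notation above can be illustrated with a minimal sketch. It is not BCFinder and does not reproduce its agglomeration step; it only applies the "10 or more edits to the same article" filter to a made-up edit log and then enumerates maximal bicliques by brute force, which is feasible only for tiny examples. The editor names, the edit counts and the function names `filtered_edges` and `maximal_bicliques` are hypothetical and introduced purely for illustration.

```python
from collections import Counter
from itertools import combinations

MIN_EDITS = 10  # threshold stated in the text: 10 or more edits to the same article

# Hypothetical edit log: one (editor, page) pair per edit.
edits = (
    [("ana", "Esplanada")] * 12 + [("ana", "Pages to delete")] * 10
    + [("rui", "Esplanada")] * 15 + [("rui", "Pages to delete")] * 11
    + [("eva", "Esplanada")] * 10 + [("eva", "Pages to delete")] * 3
)

def filtered_edges(edit_log, min_edits=MIN_EDITS):
    """Keep an editor-page edge only if that editor edited that page often enough."""
    counts = Counter(edit_log)
    return {pair for pair, n in counts.items() if n >= min_edits}

def maximal_bicliques(edges, min_editors=1, min_pages=1):
    """Brute-force enumeration of maximal bicliques (editor set, page set)."""
    pages_of = {}
    for editor, page in edges:
        pages_of.setdefault(editor, set()).add(page)
    editors = sorted(pages_of)
    found = set()
    for size in range(1, len(editors) + 1):
        for group in combinations(editors, size):
            common = set.intersection(*(pages_of[e] for e in group))
            if not common:
                continue
            # Close the editor side: everyone who edited all the common pages.
            closed = frozenset(e for e in editors if common <= pages_of[e])
            found.add((closed, frozenset(common)))
    return {(u, v) for u, v in found
            if len(u) >= min_editors and len(v) >= min_pages}

edges = filtered_edges(edits)
for editor_set, page_set in sorted(maximal_bicliques(edges, 2, 2), key=str):
    print(f"K{len(editor_set)},{len(page_set)}:", sorted(editor_set), sorted(page_set))
```

In this toy log the edge between "eva" and "Pages to delete" is dropped by the filter (3 edits), so with both size thresholds at 2 the only remaining biclique is the K2,2 formed by "ana" and "rui" on the two pages. Raising or lowering the two size thresholds plays the role of the 'zooming' described above.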
In order to visualize the core, we also threshold the values, minimally and maximally, of a (authors) and b (articles) in BCFinder. The values that keep enough bicliques and still ease the computation are 3-14 for both a and b. To make it easier to compare the Portuguese and the Spanish Wikipedias, the same minimal and maximal values were chosen for both. For the MetaWiki, which has a different profile in the distribution of bicliques and is, by nature, different and not quite comparable to the language Wikipedias, the threshold for a and b was set to 0-10. In other words, the choices were made in order to focus on the most interesting parts of the graph and to allow for computational capacity limitations.

3.3.4 Anonymity

We keep the ‘real nicknames’ of the editors because we consider their work public; as is clear from the examples, their identity is no more at stake than it was by virtue of their creating an account on the Wikipedia website.

3.3.5 Typology

It is not the goal here to make a full typology of the biclique groups that arise using BCFinder, although it would be interesting to develop this methodology further and be able to visualize the network typology according to its bicliques. The data from the Portuguese Wikipedia yield 1055 clusters in 119 biclique groups, the data from the Spanish Wikipedia yield 1941 clusters in 157 biclique groups, and the data from MetaWiki yield 696 clusters in 97 biclique groups. We choose relevant bicliques (many are ‘repeated’), group them by themes, and present those below.

3.3.6 Visualization

For the network graphs, the program ‘Cytoscape’27 was used. The mode chosen was ‘organic’ and the nodes are color-coded. Editor node labels have been excluded to avoid clutter that could make the reading of the pictures nearly impossible. Also, the goal of the visualizations is to give ‘an impression’ of the kinds of networks involved.
3.4 Results

The results are given in the following format: (a) an accumulation of data and the use of a tool to look at them, and (b) some reflections on the collected data.

3.4.1 Statistics28

Table 3-1 gives the general statistics for the three Wikis investigated in this study (Portuguese Wikipedia, Spanish Wikipedia and MetaWiki). It shows the number of content pages and other pages, and the numbers of active users, administrators and bureaucrats (in order of increasing privilege). Table 3-2 gives the statistics on the information extracted from each of the three Wikis, four namespaces in each, and how those numbers compare to the number of content pages and of all pages.

Table 3-1: Statistics from Wikipedia retrieved 30 March 2009

Statistic | MetaWiki | Portuguese Wikipedia | Spanish Wikipedia
Content pages (articles) | 14306 | 468999 | 457836
Pages (all pages in the Wiki, including talk pages, redirects, etc) | 131443 | 1825914 | 1610623
Page edits since coordinating namespaces were set up | 1736733 | 14888384 | 26715606
Average edits per page | 13.21 | 8.15 | 16.59
Registered users | 224171 | 527758 | 1021267
Active users (users who have performed an action in the last 30 days) | 1479 | 6051 | 15455
Bots | 19 | 121 | 186
Administrators | 83 | 66 | 133
Bureaucrats | 27 | 5 | 129
CheckUsers | 6 | 2 | 3
Stewards | 43 | - | -
Importers | 4 | - | -

27 www.cytoscape.org
28 "No statistical analysis is an alternative to thinking." - Andy Jackson

Table 3-2: Comparison of numbers of meta-articles in 3 Wikis (2 Wikipedias).

Wiki | Content articles | Number of articles in the whole | Articles in 4 namespaces (wp,wt,h,ht) | Editors present in the 4 namespaces | Editors/Meta-article
MetaWiki | 14306 | 131443 | 28541 | 82345 | 2.885
Portuguese Wikipedia | 468999 | 1825914 | 33690 | 35368 | 1.050
Spanish Wikipedia | 457836 | 1610623 | 14975 | 77036 | 5.144

It is interesting to notice the similarity in size between the Portuguese and the Spanish Wikipedias just before attaining their half million articles.
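The Editors/Meta-article column of Table 3-2 is simply the number of editors present in the four namespaces divided by the number of pages extracted from them. As a small worked check, the sketch below recomputes that column from the figures printed in the table; the dictionary layout and the function name `editors_per_meta_article` are illustrative choices, not part of the original analysis.

```python
# Figures copied from Table 3-2 (pages and editors in the four extracted namespaces).
table_3_2 = {
    "MetaWiki": {"meta_articles": 28541, "editors": 82345},
    "Portuguese Wikipedia": {"meta_articles": 33690, "editors": 35368},
    "Spanish Wikipedia": {"meta_articles": 14975, "editors": 77036},
}

def editors_per_meta_article(row):
    """Ratio reported in the last column of Table 3-2."""
    return row["editors"] / row["meta_articles"]

ratios = {wiki: editors_per_meta_article(row) for wiki, row in table_3_2.items()}
for wiki, ratio in ratios.items():
    print(f"{wiki}: {ratio:.3f} editors per meta-article")

# Direct comparison of the two language Wikipedias, also taken from the table.
es, pt = table_3_2["Spanish Wikipedia"], table_3_2["Portuguese Wikipedia"]
editor_ratio = es["editors"] / pt["editors"]
page_ratio = es["meta_articles"] / pt["meta_articles"]
print(f"Spanish/Portuguese editors: {editor_ratio:.2f}")
print(f"Spanish/Portuguese meta-articles: {page_ratio:.2f}")
```

The recomputed ratios agree with the table to three decimals, and the last two lines make the contrast between the two language Wikipedias explicit as a pair of quotients.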
The Spanish Wikipedia has more edits, and more edits per page, as well as more than double the number of active users, more bots, more administrators and more bureaucrats. However, there are more non-content pages in the Portuguese Wikipedia than in the Spanish Wikipedia (Table 3-1) and more than double the number of coordination pages, which are here called meta-articles (Table 3-2). As for the MetaWiki, it is worth noticing that it has proportionally more registered and active users (Table 3-1) and that the number of pages extracted is comparable to the number of meta-pages in the Portuguese Wikipedia. The number of editors present in the selection of pages from MetaWiki is much higher than the number of editors present in the selection of meta-pages in the Portuguese Wikipedia. The proportion of ‘editors per meta-article’ is even higher in the Spanish Wikipedia. In other words, for the same selection of meta-pages in the Portuguese and Spanish Wikipedias (which have about the same number of content articles), their ‘coordination behavior’, just from the statistics, is very different. The Spanish Wikipedia has less than half the number of meta-pages (perhaps a sign of less bureaucracy?), but more than twice the number of editors (Table 3-2), which perhaps increases the number of discussions and points to a higher engagement. Looking at the statistics is a way to get a ‘feel’ for the dimensions of the datasets and the dedication of such a large number of individuals, and suggests the importance of better understanding the processes behind the making of Wikipedias, although in this study only areas of intense activity are studied.

3.4.2 Visualizations

Below are presented the network views of the three networks that are part of this study.

Figure 3-1: The network view of the meta part of the Portuguese Wikipedia, filtered so that only editors that have edited 10 or more times in an article are included. Blue nodes are editors, and grey nodes are pages.
The big umbrella on the left is connected to the node provas (sandbox).

Figure 3-2: Network view of the network of editors and meta-articles in the Spanish Wikipedia. Blue nodes are editors and grey nodes are pages. The data is filtered to only include editors who have edited the article at least 10 times.

Figure 3-3: Network view of the network of editors and meta-articles of MetaWiki. Blue nodes are editors and grey nodes are pages. The data is filtered to only include editors who have edited the article at least 10 times.

All of these networks have a concentric shape, which shows the degree of engagement of the community. The concentric shape suggests that the network is quite centralized. Distance from the center reflects how spread out an editor's activity is: editors near the center have more spread-out editing activity (by way of which they act as ‘hubs’), while editors toward the periphery have more limited editing activity.

3.4.3 Portuguese and Spanish Wikipedias

Before we look at the cliques formed by editors and meta-articles in the Ibero-American Wikipedias, we take a short tour of the purpose of these coordination spaces. In the English Wikipedia, the central place for coordination in the meta-pages (in the namespace Wikipedia:) is the Village Pump. To understand its purpose, the easiest approach is to see how it is introduced in the Wiki:

Village pump: Welcome to the Village pump. This set of pages is used to discuss the technical issues, policies, and operations of Wikipedia, and is divided into four village pump sections. Please use the table below to find the most appropriate section to post in, or post in the miscellaneous section. You can view all village pump sections at once here. Please sign and date your post (by typing ~~~~ or clicking the signature icon in the edit toolbar).29
29 http://en.Wikipedia.org/Wiki/Wikipedia_Village_Pump, retrieved 30th of March 2009.

Looking at the Portuguese and Spanish Wikipedias, we find equivalents, though not exactly the same, starting with the choice of name, which already reveals both the reverence and the uniqueness of these Wikipedias in relation to one another. In the Spanish Wikipedia, the name ‘Café’ tries to capture the idea of a place where one hangs out and where discussions take place. It grasps the concept in the title of the English “Village Pump”, which is also where people gather, discuss village issues and organize the community, but it already reveals a cultural trait in calling it “Café”: more urban and adapted to the notion of community in modern times, especially in the Hispanic cultures, where cafés are definitely a place of gathering.

Bienvenido/a al Café de Wikipedia en español. Éste es el conjunto de páginas que usamos para discutir todo tipo de asuntos de interés general, incluyendo algunos importantes como las políticas o las propuestas, o comunicar problemas técnicos. Por favor, selecciona en las tablas que aparecen abajo la sección apropiada para el comentario que desees dejar. Siéntete libre de participar y no olvides firmar.30

(Translation: Welcome to the Café of the Spanish Wikipedia. This is the group of pages that we use to discuss all kinds of themes of general interest, including some important ones such as policies or proposals, or to communicate technical problems. Please select in the tables below the appropriate section for the comment you want to leave. Feel free to participate and don’t forget to sign.)

The Portuguese equivalent to the Village Pump and to the Café is introduced as follows:

Bem-vindo à Esplanada! Sente-se, peça um cafezinho e sinta-se à vontade. Esta página é um espaço de entrada para vários locais que servem para todo o género de conversas e perguntas sobre a Wikipédia lusófona.31

(Welcome to the Esplanade!
Sit down, order an espresso and make yourself comfortable. This page is a portal to many places that are used for all kinds of talks and questions about the Portuguese Wikipedia.)

The choice of the title “Esplanada” (meaning esplanade) is justified in relation to the Spanish Wikipedia in “Razões da criação desta página” (Reasons to create this page):

Pois é. Acho que já temos gente suficiente e suficientemente activa para justificar a criação deste espaço comunitário para conversas sobre o projecto que não se restrinjam às discussões de determinadas páginas. Até agora, este tipo de conversas tem passado sempre por discussões em páginas específicas, em especial na minha página, na da língua portuguesa, na da página principal e em mais algumas, mas julgo que é altura de "autonomizar" algumas dessas conversas para um local destinado a elas. Um espaço semelhante a este existe em praticamente todas as outras grandes Wikipédias (sim, a nossa já é uma das grandes). Na inglesa chama-se Village pump, na espanhola Café, etc. "Inspirado" pelo "café" dos "nuestros hermanos", resolvi baptizar a nossa de esplanada, ao ar livre, sítio onde é agradável estar em todo o sítio onde se fala português, julgo. Sentem-se aí. Os preços são baratinhos, e os cães ficam lá fora. --Jorge 00:38, 14 Jul 2004 (UTC)32

30 http://es.Wikipedia.org/Wiki/Wikipedia:Caf%C3%A9, retrieved 21st of August 2009.
31 http://pt.Wikipedia.org/Wiki/Wikipedia:Esplanada, retrieved 7th April 2009.
32 http://pt.Wikipedia.org/Wiki/Wikipedia:Esplanada/Julho#Raz.C3.B5es_da_cria.C3.A7.C3.A3o_desta_p.C3.A1gina, retrieved 7th April 2009.

(Well. I think we already have enough people, and sufficiently active ones, to justify the creation of this communal space for talks about the project that are not limited to the discussions of specific pages.
Until now, these kinds of talks were happening in specific pages, especially on my user page, in the Portuguese language page, in the main page and in some others, but I think it is time to “autonomize” some of those talks in a place dedicated to them. A space similar to this one exists in practically all the other big Wikipedias (yes, ours is now one of the big ones). In the English one it is called Village Pump, in the Spanish one Café, etc. “Inspired” by the “café” of “nuestros hermanos”, I decided to baptize ours as esplanade, open-air, a place where it is pleasant to be wherever Portuguese is spoken, I guess. Sit down; prices are low and dogs stay outside.)

The Wikipedias get inspiration from each other, be it in the choice of names for similar structures, or when comparing their achievements. In the summer of 2009, the Spanish Wikipedia passed the Portuguese in number of content articles, and that is prominently posted in the bulletin:

* 17 de agosto: la Wikipedia en inglés llega a los 3.000.000 de artículos.
* 12 de agosto: la Wikipedia en portugués llega a los 500.000 artículos.
* 5 de agosto: la Wikipedia en español llega a los 500.000 artículos.
* 5 de julio: la Wikipedia en español sobrepasa en cantidad de artículos a la Wikipedia en portugués.
* 4 de julio: la Wikipedia en español llega a los 490.000 artículos.33

(* 17th of August: the English Wikipedia reaches 3000000 articles.
* 12th of August: the Portuguese Wikipedia reaches 500000 articles.
* 5th of August: the Spanish Wikipedia reaches 500000 articles.
* 5th of July: the Spanish Wikipedia passes the Portuguese Wikipedia in number of articles.
* 4th of July: the Spanish Wikipedia reaches 490000 articles.)

3.4.3.1 Cliques and clusters

Below are shown a number of bicliques (dense areas of the graph) of editors and meta-articles. Thematically speaking, these bicliques are, on average, very broad, because an editor working on deletion, i.e.
interested in the process of nominating pages that should be removed and carrying out those procedures, may also be interested in board elections, i.e. in participating in the discussions, voting and eventually running for the board. Nonetheless, it is possible to see some groups brought together by common interests, and here we show many of the possible ones.

3.4.3.1.1 General

In the Portuguese and Spanish Wikipedias, the clusters of editors and meta-articles often have very general themes. These general themes suggest that communities are much tighter around coordination than around content-creation. By tighter, we mean that the networks are denser (the editors work on more issues simultaneously), and so these communities are also less specific or clusterable than the communities around content-creation. Figure 3-4 and Figure 3-5 show examples that represent these broad cliques in both the Portuguese and the Spanish Wikipedias.

33 http://es.Wikipedia.org/Wiki/Portal:Comunidad, retrieved 21st August 2009.

Figure 3-4: Cluster K8,3 of 9 editors and 4 articles in the Portuguese Wikipedia, with a general theme: ‘Pages to delete’, ‘Village Pump’, ‘Requests for administration’ and ‘Pages to Delete/Archive’.

Figure 3-5: Cluster K5,3 of 6 editors and 4 articles in the Spanish Wikipedia, with a general theme: ‘Sandbox’, ‘Requests for adminship’, ‘Reference Desk’, and ‘Village Pump’.

In these two clusters, which are representative of the majority of small clusters in the meta-parts of both the Portuguese and Spanish Wikipedias, we see that editing takes place on equivalent pages: on the ‘Esplanada’ (esplanade) and on the ‘Café’ (‘coffeeshop’), these Wikipedias’ equivalents to the Village pump. Also of note is how big a role administration plays in these coordination processes: in both the Portuguese and the Spanish Wikipedias there are many edits to the pages regarding requests to become an administrator.
3.4.3.1.2 Artigos Destacados (featured articles)

One topic that generates some clusters is the issue of featured articles (“artigos destacados” in Portuguese). Proposals, discussions, choices, nominations: all can be part of the process to feature an article.

Figure 3-6: Cluster K10,2 of 11 editors and 4 articles in the Portuguese Wikipedia surrounding article featuring: ‘featured articles’, ‘Village pump/proposals’, ‘Election of the featured article’, ‘Village pump/general’.

Figure 3-7: Cluster K7,3 of 11 editors and 5 articles in the Spanish Wikipedia surrounding article featuring: ‘Featured Articles’, ‘Vandalism now’, ‘Selection of good articles/nominations’, ‘candidates to featured articles’, ‘village pump/archive/miscellaneous/current’.

Figures 3-6 and 3-7 show the clusters from the Portuguese and the Spanish Wikipedias that relate to the featuring of articles. Featuring articles is a long process, and the decision of which articles ‘make it’ to the front page is filled with coordination and work that happens ‘behind-the-scenes’, namely in the meta/coordination spaces. As noted earlier, the community seems very tight, so the clusters don’t split into precise groupings of interest. That is why these clusters, which reveal a higher engagement with the featuring process, also show edits to pages that are more general, such as ‘village pump/general’ or ‘village pump/archive/miscellaneous/current’.

3.4.3.1.3 Deletion

Deletion of articles, and what is to be kept, is a theme that attracts activism and strong opinions, as the reasons can vary from quality to the old debate between deletionists and inclusionists. In the following examples from the Portuguese Wikipedia, in Figure 3-8 and Figure 3-9, we can see two clusters that surround the topic of deleting: pages, images and speedy deletion.
Figure 3-8: Cluster K6,3 of 7 editors and 4 articles in the Portuguese Wikipedia with two general articles, and two specific about deletion: ‘Village Pump’, ‘Pages to delete’, ‘Speedy deletion’, ‘Requests for administration.’

Figure 3-9: Cluster K8,4 of 9 editors and 5 articles in the Portuguese Wikipedia with two articles that are general, two about administration, and one, well: ‘pages to delete’, ‘village pump/general’, ‘requests of administration’, ‘things to not do’, and ‘images to delete’.

Other general issues are also present in these clusters, but there is a shared interest: in Figure 3-9, for example, there are 9 editors who have each edited coordination pages like ‘pages to delete’ or ‘images to delete’ at least 10 times. In these examples, we can also see how the clusters show different ‘zooms’ of activity: the two clusters share articles, but gather different people around them, as well as neighboring articles. They are distinct clusters given the chosen values of thresholds, but still around the same area when viewed with the semantic eye. In a sense, their distinctness is only a matter of the level or elevation of description, like counting peaks in a mountain landscape.

3.4.3.1.4 Ballots

Pages where there are ballots to make decisions, ranging from the ‘city of the week’ to articles for deletion, appear often in the clusters, but clusters aren’t only about ballots. In the cluster in Figure 3-10, from the Spanish Wikipedia, one can see the ballot pages for two decisions, “city of the week” and “country of the week”, along with two general pages.

Figure 3-10: Cluster K12,3 of 14 editors and 4 articles in the Spanish Wikipedia with two articles about ballots: ‘city of the week/ballot’, ‘country of the week/ballot’, ‘village pump/archive/proposals/actual’, ‘village pump/archive/miscellaneous/actual’.
It is also worth noting that this cluster is from a biclique K12,3, which means that many editors (at least 11) share the edits in these pages and, recalling the filtering, they have edited at least 10 times in each page. Therefore, the page ‘city of the week/ballot’, for example, has had at least 140 edits (14 editors with at least 10 edits each), just in this little cluster.

3.4.3.2 MetaWiki

MetaWiki is a special Wiki, which serves to coordinate the projects of Wikimedia. The next sample of clusters shows the diversity of issues that are discussed on MetaWiki – languages and translations, activism regarding specific projects, and political issues – from the election of administrators to the founding of local Wikimedia chapters.

3.4.3.2.5 Languages and translations

One of the important strengths of Wikipedia is that it exists in so many languages. This creates a need for coordination in relation to languages: how to accept new languages, and translation issues.

Figure 3-11: Cluster K6,1 of 9 editors and 2 articles about New languages.

Figure 3-12: Cluster K5,1 of 24 editors and 4 articles giving more context to the previous cluster about new languages. In this, ‘List of Wikipedias’ and ‘Translation of the week/Translation candidates’ are also edited.

In Figure 3-11, there is a cluster around “Requests for new languages” and its talk page. It is interesting to note in this example how the talk page is integrated with the main page as an important place for discussion. There are many issues regarding new languages, especially after the beginning of Wikipedias, when all the expected languages got their Wikipedia. Figure 3-12 shows a bigger cluster of bicliques, which really functions as a ‘zoom out’ from the previous cluster. It includes the same articles and editors, and it shows the surrounding two pages and fifteen people also associated with the discussions on ‘new languages’, even if slightly less engaged.
The other two articles that give context to the discussion on ‘Requests for new languages’ are ‘List of Wikipedias’, where the list of Wikipedias in all the languages is kept, and ‘Translation of the week/Translation candidates’, possibly a little less relevant but still within the topic of language and translation. Although it becomes impossible to visualize, it is possible to ‘zoom’ even further out and get a cluster formed by 49 editors and 14 articles, where the articles, in addition to the previous 4, are related to translations: Translation of the Week (and the archives of the years), a page on a specific language for which a Wikipedia is requested (“Dutch Low Saxon”), Steward requests/Permissions, Wikimedia News and Requests for Deletion.

3.4.3.2.6 Wikimedia Projects

Although Wikipedias are the best-known Wikimedia projects, there are several more projects under the Wikimedia umbrella, which show up on Meta-Wiki, the community coordination site of the Wikimedia Foundation. In Figure 3-13 there is a cluster that shows a number of these other Wikimedia projects, and in Figure 3-14 there is a cluster focused on the Wikimedia project Wikiversity.

Figure 3-13: Cluster K3,3 of 4 editors and 5 articles covering a number of Wikimedia projects: ‘Wikiquote’, ‘Table of Wikimedia Projects by Size’, ‘Wikinews’, ‘Wikisource’, ‘Wikibooks/Table’.

In this cluster, four of the projects are shown with an overview page listing all of them, “Table of Wikimedia Projects by Size”. It hints at what those 4 editors were editing, at least ten times in each of those pages: some statistical information about the projects’ size, as the pages are connected to/by the article ‘Table of Wikimedia Projects by Size’.

Figure 3-14: Cluster K4,1 of 11 editors and 3 articles about the Wikimedia project Wikiversity.

This cluster (Figure 3-14) indicates that behind any project there is a small community of engaged people trying to put it through.
All these editors were involved in the creation of Wikiversity, as they edited the three articles in this cluster, which are: ‘Wikiversity/Modified project proposal’, ‘Talk:Wikiversity’ and ‘Talk:Wikiversity/Modified project proposal’. In this cluster we also see the talk pages playing an important role in what gets decided.

3.4.3.2.7 Politics/Governance

Meta-Wiki is also important as the gathering place for decisions involving the community. There is also a community more focused on Meta-Wiki, and, similarly to the other Wikis, it needs administrators and stewards. But it is also where the board elections and the Wikimedia local chapters are discussed.

3.4.3.2.8 Steward/admin

In Figure 3-15, there is a cluster with some of the decisions that take place in MetaWiki regarding governance: Adminships, Stewardships and endorsements for the board. It is, in a sense, an eclectic cluster, and one which reveals some of the editors’ commitment to decision-making and participation in these political processes.

Figure 3-15: Cluster K4,2 of 6 editors and 3 articles with some relation to governing decisions.

3.4.3.2.9 Board Elections

Board elections are a very important part of the community, which elects two members to better bridge the possible gap between the board and the foundation and the community. These elections involve several steps and a great deal of discussion.

Figure 3-16: Cluster K2,7 of 3 editors and 12 articles with many political issues, particularly board elections.

In the cluster in Figure 3-16, 5 of the articles are about the board election of 2008. One is the talk page of the general article, ‘Talk:Board elections/2008’, while the others are subpages of Board elections/2008. One concerns translation, while the other three concern the candidates: submissions, questions and checklist.
3.4.3.2.10 Wikimedia Chapters

The Wikimedia Foundation oversees the activities in the several projects, but as of 2004 it started to set up ‘chapters’, which are national organizations with legal capacity that can take on some of the roles performed by the Foundation.

Figure 3-17: Cluster K6,1 of 9 dedicated Norwegians and 2 articles about Wikimedia Norway: “Talk:Wikimedia Norge/Draft for the Bylaws” and “Talk:Wikimedia Norway”.

In Figure 3-17, the cluster is about Wikimedia Norge (Wikimedia Norway), where two talk pages get many edits (just in this cluster, each of them has at least 60 edits). These talk pages are those related to Wikimedia Norway and the Bylaws, which are necessary steps before the establishment of the chapter.

Figure 3-18: Cluster K1,3 of 3 editors and 13 articles related to Wikimedia Sweden.

In Figure 3-18, the cluster shows the work happening at Wikimedia Sweden and the three very active Wikipedians behind this endeavor. It is interesting to note that the pages range over many details of the chapter Wikimedia Sweden: from the board and meeting agendas to the newsletter and the number of members.

Figure 3-19: Cluster K2,2 of 3 editors and 5 articles about Wikimedia Brazil: “Wikimedia Brazil”, “Talk:Wikimedia Brazil”, “Wikimedia Brazil/Projects”, “Wikimedia Brazil/News” and “Wikimedia Brazil/Projects/WikiBrazil/Registration”.

Figure 3-19 shows the cluster that reveals the efforts to make a Wikimedia chapter in Brazil (footnote 34). Three active Wikipedians are engaged in the discussions about the (possible) chapter in Brazil, its projects, news and registrations.

Figure 3-20: Cluster K1,2 of 2 editors and 3 articles concerning Wikimedia Argentina and Wikimania 2009 in Buenos Aires.

Last but not least, Figure 3-20 is a cluster showing the organization of the Wikimania 2009 conference, where this paper was presented.
The two main organizers, Barcex and Patricio.lorente, are very active in editing the bids for Wikimania 2009, Wikimedia Argentina, and the bylaws for Wikimedia Argentina.

Footnote 34: See more in the discussion.

3.5 Discussion

The first remark to be made concerns the structure of the network that is revealed by the examples above, in comparison to the network of editors and articles in Jesus et al. (2009): the communities in this study are much tighter and therefore less separated by topic. The communities around community-building/coordination are much denser than the communities surrounding topics of interest. Everyone is editing everyone else’s pages, and the subjects of interest can range over the whole scale from coordination to translation to deletion. In the content-building communities, there is a neater separation by disciplines, or areas of knowledge, analogous to neat departments in a university where there is little interdisciplinarity. At the meta-level, it would be more apt to compare editors to politicians, or logisticians, or administrators, with all kinds of practical issues at hand, helping here and there, depending on need and request.

The second remark is that the data also included more ‘law-like’ pages, like policy pages and guidelines on how to edit Wikipedia, but, as can be seen by omission, these pages didn’t form clusters and were not part of the dense parts of the graph. This is probably due to the fact that there is much more activity in dealing with the daily coordination than in settling principles that, although editable, are much less altered. There is more activity in housekeeping, so to speak, than in building the house. Likewise, not only is there more activity in housekeeping, housekeeping is also a broader activity – a comment about an election, a change to a policy, a discussion about deletion. It is also worth speculating about the nature of cooperation at this meta-level.
‘Community’ is the concept which is widely used to explain the success of Wikipedia, and the tightness of the clusters shows that there is indeed a great deal of engagement in the common effort. People also quickly get to know one another and request help in dealing with issues, which can happen directly (by writing on the user pages), but also indirectly, in a stigmergic way: if one sees that one’s ‘friend’ has just edited something, or written a comment, it will be more natural to respond (footnote 35). This is quite a different pattern of editing from that organized by interest in the topic, or by kind of activity.

Language Wikipedias are whole projects on their own, but highly connected to the whole project. The example of how the Portuguese and the Spanish Wikipedias refer to each other in the choice of community-gathering-place and in the achievements of number of articles reveals that there is an interplay, a conversation going on also between the different languages and projects. In the formation of chapters, which are national Wikimedia organizations, it is interesting to note that they use the local language, even in the pages of MetaWiki.

When I showed the above data patterns to the communities at Wikimania 2009, it yielded engaged feedback. It was positive for them not just to see the themes and issues that form such compact clusters of discussion, but also to identify themselves and their working peers. It was useful to identify the major actors in the communities of the Portuguese and Spanish Wikipedias, to hear stories behind the formation of Wikimedia Norway from one of the participants, and to confirm the engagement of the three main proponents of Wikimedia Sweden with two of them.

Footnote 35: Interviews at WikiMania 2008 pointed to dynamics of ‘replying to those one knows’. It is also common for Wikipedians to leave a note in someone’s talk page asking for advice about a particular issue.
Most revealing, though, was to speak with one of the people involved in Wikimedia Brazil, as this is not yet an official chapter. This is due to their resistance to making it a legal entity, which is demanded by the Foundation. This is a paradigmatic case where bottom-up research yielded something that would not have been possible to see from a top-down approach: the Brazilians work as if they had a chapter. Wiki-style this is possible, even defying the powers above. Also particularly appropriate was the small cluster revealing the work behind the organization of a conference and the establishment of a chapter in Argentina, which revealed the dedication of the two key organizers.

This approach, which uses networks, and cliques inside of them, to reveal intense collaborations, seems to respond to some of the difficulties and challenges in investigating these phenomena. On one hand, the process is much more transparent, and the traces left are numerous, calling for a data-driven approach; on the other hand, there is just too much data, and it needs to be processed in ways that enable one to ‘see’. Using these bicliques, we were first able to identify several areas of interest, and then to extract even more information by ‘zooming’ and getting more context, including the articles that were in the surroundings in the landscape of the network of co-authorship.

3.6 Limitations and Directions for Future Research

This meso-level analytical approach stands in the middle ground between top-down and bottom-up approaches; however, at times it seems neither deep enough nor general enough. Therefore it might be useful to expand the directions of this research by studying the specific patterns of the networks and bicliques and constructing a tree of interaction between all of the elements. Also, an ethnographic study following each of the bicliques would reveal more of what is behind the scenes.
Moreover, a more exhaustive study including more language Wikipedias would permit a broader comparison, perhaps yielding advice from bigger, more experienced Wikipedias to the smaller ones and/or showing particular patterns of coordination in the smaller ones that get undermined by the success of the English Wikipedia.

3.7 Conclusion

This study investigated important areas not yet studied, namely the meta-communities in the Ibero-American Wikipedias and MetaWiki. This meso-level approach embraces the use of bottom-up data analysis as well as integration with some qualitative understanding, revealing areas of dedicated work. Meta communities are tighter and less specific (everyone does everything) than the content communities.

3.8 Acknowledgments

I would like to acknowledge Fundação para a Ciência e Tecnologia for the grant SFRH/BD/27694/2006; Martin Schwartz for technical help with data extraction and filtering; and Sune Lehmann for inspiring support, presence and help.

4 CONTEXT NETWORKS OF THE ARTICLES ‘PRISONER’S DILEMMA’ AND ‘WIKIPEDIA:NEUTRAL POINT OF VIEW’

4.1 Abstract

This short study presents graphs of the networks of articles around the English Wikipedia content article ‘Prisoner’s Dilemma’ and around the English Wikipedia policy article ‘Neutral Point of View’. This study serves as a bridge from the previous studies, where groups of articles were chosen, to the following studies, where these specific articles are studied from the inside. The rationales for the choices of the Prisoner’s Dilemma and the Neutral Point of View are given. It uses data gathered by an algorithm developed by Kiran Jonnalagadda, and the graphs were made using Cytoscape. The biclique algorithm is also applied to the ‘Prisoner’s Dilemma’ data. These techniques are further explored in the following studies, taking the data sets from the activity in the making of the article, rather than the context articles.
4.2 Introduction

Network visualization, even without particular clustering methods applied to it, can be useful in several respects. The first use is to have a ‘picture’: are there many editors, many actions, and many paragraphs? Are they overlapping or disjoint? The second use is that some information can be noticed immediately, such as the amount of vandalism and reversion, or the position of the nodes: nodes far out in the periphery have fewer links than nodes in the middle. And the third use is that these networks can help to decide what to research in more depth.

4.3 Methodology

4.3.1 Rationales for the Choice of Articles and Time frames

The first article to be chosen was to be a ‘normal’ one. It was to be an article big enough and good enough so that the exploration of collaborative patterns would make sense. It wasn’t to be too much of a hot topic, because many of the interactions would then have an undesirable political layer. As it was desirable to look at cooperation ‘when it works’, it seemed appropriate to use Wikipedia’s own marking of Featured Article. Many options were considered (Climate Change, Esperanto, Queer, CO2, Philosophy of Mind, Io, Photon, Atheism, Baden-Powell, Macintosh, Tyrannosaurus), but the choice fell on the “Prisoner’s Dilemma” article (PD) at http://en.Wikipedia.org/Wiki/Prisoner’s_dilemma. Besides the topic of the ‘Prisoner’s Dilemma’ being meta-interesting in a research project about cooperation, it was also an article that was featured rather early on and that is scientific (so that current events have only a minor influence, unlike in ‘biographies of living people’). In other words, the PD fulfills the sense of a good, normal, precise article, where cooperation patterns could be investigated in depth with less disturbance from controversies, news, or being a hot topic. Also of note is the theme of the topic, which relates deeply to the idea of cooperation.
The chosen time frame was the first two years of the article – that is, until it got featured. This way it was possible to investigate the process of featuring, and also the mechanisms of Wikipedia in its earlier days. The other chosen article was to be an ‘Internal’ article – an article from the meta part of Wikipedia — from the section where there are articles that Wikipedians wrote about policy and guidelines for the writing of Wikipedia. Its study would help to reflect on what is going on inside Wikipedia, in Wikipedia – and not only through Wikipedia or at Wikipedia. From those pages used for coordination, under the namespace ‘Wikipedia:’, a policy would be of special interest, because it would allow reflecting also on the governing side of Wikipedia, its construction of a bureaucracy that rules the site. There is a core of policies (the ‘Five Pillars of Wikipedia’), but one policy, in particular, seems to have the central place, namely the ‘Neutral Point of View’ (NPOV). NPOV is often mentioned as the main goal of Wikipedia and it is also the only ‘compulsory’ policy when making new language editions. Wikipedias in new languages can otherwise differ in the modes of organizing, electing administrators, and deciding deletions, to name a few of the activities going on ‘behind the scenes’. The chosen time frame to study the NPOV was the year of 2009. Even though it is quite a central policy article, and it started back in the year 2001 – there are still almost 500 edits in the year 2009. Studying the year 2009 is a way to look at an article that is already ‘mature’, although it still undergoes many changes, and even more, many discussions. 4.3.2 Data extraction Kiran Jonnalagadda and Hans Matthews from the Center for Internet & Society in Bangalore have been working on a method to extract and cluster the articles that have been edited by the editors of a specific article for a specific window of time (Abraham 2010). 
In making this extraction algorithm they were interested, in particular, in seeing possible clusters of interest that could inform us about the participation of certain editors. These patterns are likely to be found in controversial articles, or in articles that draw a lot of attention. A chance meeting at a conference dinner suggested the possibility of using this method of extracting the 'surrounding articles' on the data for the two articles to be studied in depth. Kiran Jonnalagadda extracted the list of articles that were edited by the editors of the articles PD and NPOV during the chosen time frames. Specifically, the algorithm just mentioned was used on the 'Prisoner's Dilemma' article's edits up to March 2004, when the article was featured (the same 'early' data used in the ethnographic, networks and bicliques studies of the Prisoner’s Dilemma in the next studies), and on the 'Neutral Point of View' article for the year 2009 (the same 'mature' data used in the networks and bicliques studies of the Neutral Point of View in a following article).

First, we extracted the editors that contributed to the early period of the 'Prisoner's Dilemma', and to the late period of the 'Neutral Point of View'. Then those editors were 'followed' in those periods of time, and a link table was constructed between editors and the articles that they edited during that time. The PD data had 42 editors and 72233 articles (edited by those 42 editors in the time window), with a total of 91512 edges (number of edits), while the NPOV outer-context data had 188 editors and 424065 articles (edited by the 188 editors in the time window), with a total of 534052 edges (number of edits). But this data yields several 'single articles', edited by one of the editors but by none of the others. Such data can be filtered out to reduce clutter and the size of the dataset. The chosen filters were a minimum of 5, 10 and 20 edits for an article to be kept in the filtered data set.
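The link table and the minimum-edit filter described above can be illustrated with a minimal Python sketch. It is not the extraction algorithm used for this study: the function names and the toy edit log are hypothetical, and the filter assumes that an article is kept when its total number of edits by the followed editors reaches the threshold. A per-link threshold (each editor editing each article a minimum number of times) is another plausible reading of the filter.

```python
from collections import Counter


def build_link_table(edits):
    """Count edits per (editor, article) pair from (editor, article) records."""
    return Counter((editor, article) for editor, article in edits)


def filter_by_min_edits(link_table, min_edits):
    """Keep only the links to articles whose total edit count reaches min_edits.

    Assumption: the threshold applies to the article's total edits by the
    followed editors, not to each editor-article link separately.
    """
    totals = Counter()
    for (_editor, article), count in link_table.items():
        totals[article] += count
    return {
        (editor, article): count
        for (editor, article), count in link_table.items()
        if totals[article] >= min_edits
    }


def summarize(link_table):
    """Return the number of distinct editors, distinct articles and links."""
    editors = {editor for editor, _article in link_table}
    articles = {article for _editor, article in link_table}
    return {"editors": len(editors), "articles": len(articles), "links": len(link_table)}


# Hypothetical toy edit log: one (editor, article) record per edit.
edits = (
    [("ann", "Mathematics")] * 4
    + [("bob", "Mathematics")] * 3
    + [("ann", "Philosophy")] * 2
    + [("cat", "Wiki")] * 6
)

table = build_link_table(edits)
filtered = filter_by_min_edits(table, min_edits=5)
summary = summarize(filtered)
```

In this toy log, ‘Philosophy’ has only two edits in total and is dropped by the filter, while ‘Mathematics’ and ‘Wiki’ are kept; raising `min_edits` shrinks the network further, in the same way that the filters of 5, 10 and 20 edits yield progressively smaller data sets.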
4.3.3 Data Visualization

Two techniques were used to visualize the data. To see the networks, Cytoscape was used with the nodes color-coded but labels often hidden to keep the networks visible. All of the renderings were done, for simplicity and consistency, using the ‘organic’ rendering. The biclique algorithm and visualization using BCFinder were used following Lehmann et al. (2008).

4.4 Results

4.4.1 PD-networks

In Figures 4-1 and 4-2 we can see the networks of the outer context of the PD, for the time period 2001-2004. The outer context is understood here as the work done in surrounding articles, for the same period, by the same editors that were editing the PD.

Figure 4-1: Network of the surrounding articles of the ‘Prisoner’s Dilemma’ in the period 2001-2004 – articles edited by the same editors for the same period. In this picture the filter was set such that only articles with 10 or more edits are represented. Blue nodes are editors and pink nodes are articles.

Figure 4-2: The same data as the previous figure with a coarser filter, set such that articles with 5 or more edits are represented. This yields more data and makes it possible to see a larger pattern. The isolated pink node is still the ‘Prisoner’s Dilemma’ article.

In Figure 4-1, the filtered data yielded 275 edges, 21 articles (footnote 36) and 42 editors. In Figure 4-2, there are 3330 edges, 42 editors and 557 articles. This kind of data yields networks of a nucleus of editors surrounded by a crown of articles. In Figure 4-1, it is notable that the articles that were most edited (and therefore appear in the figure) are meta-articles such as “Talk: Main Page” or “Wikipedia: Votes for Deletion”. Figure 4-2, which includes more data (but no labels, to keep the image legible), is also clearer about the structure: there are a few very engaged editors and many articles around them. All but 5 editors have edited more than just the ‘Prisoner’s Dilemma’ page.

Footnote 36: To best see the ‘context’, here is the list of the articles: 1 (number), Artificial intelligence, Deaths in 2003, Gödel's incompleteness theorems, Mathematics, Philosophy, Prisoner's dilemma, Sorting algorithm, Wiki, Talk:Main Page, User talk:Mav, Wikipedia:Copyright problems, Wikipedia:Featured article candidates, Wikipedia:Find or fix a stub, Wikipedia:Pages needing attention, Wikipedia:Reference desk/Miscellaneous, Wikipedia:Requested articles, Wikipedia:Requests for adminship, Wikipedia:Requests for investigation, Wikipedia:Village pump archive 2004-09-26, Wikipedia:Votes for deletion archive May 2004.

4.4.2 PD-bicliques

In order to understand whether there are patterns of editing, the biclique algorithm and visualization were applied to the outer network data for the PD. Two illustrative results are presented in Figures 4-3 and 4-4.

Figure 4-3: Biclique group (k=3,9) or greater from the network of surrounding articles (with 5 or more edits) of the article ‘Prisoner’s Dilemma’ showing a great quantity of mathematics articles, such as ‘Trigonometric functions’ or ‘Gödel’s incompleteness theorems’.

Figure 4-4: Biclique group (k=5,6) from the network of surrounding articles (with 5 or more edits) of the article ‘Prisoner’s Dilemma’ showing a number of ‘normal pages’ and a large number of meta pages, such as ‘Wikipedia: Welcome newcomers’ and ‘Wikipedia: Pages needing attention’.

The first group of bicliques, in Figure 4-3, shows a common interest in mathematics, while the second, in Figure 4-4, also contains many meta pages, showing that work between coordination and content articles was done by several of the editors. As an observation, for greater k’s, which means for denser areas, there are more and more meta-articles.

4.4.3 NPOV-networks

Below, in Figures 4-5 and 4-6, is the network of the context articles of the ‘Neutral Point of View’, for the year 2009.
Figure 4-5: Network of the surrounding articles of the article ‘The Neutral Point of View’ – articles that have been edited by the same editors in the year 2009. This picture has the filter set such that only articles with 20 or more edits are represented. Blue nodes are editors, while pink nodes are articles.

Figure 4-6: This picture has the same data as the one above, with a coarser filter. The filter was set such that articles with 10 or more edits are represented. The isolated pink node is the ‘Neutral Point of View’.

In Figure 4-5, there are 2528 edges, 77 articles (footnote 37) and 183 people, while in Figure 4-6 there are 12340 edges, 891 articles and 182 editors. These networks show a much greater number of editors who only edited the central article (in this case, the NPOV) and nothing else (possibly because, with the increasing size of Wikipedia, the number of one-time editors also increased). Otherwise, these networks follow the structure of the networks above: a ‘core’ of editors edits many other articles.

4.5 Discussion

When comparing Figures 4-3 and 4-4 with Figures 4-1 and 4-2, it is possible to see how the bicliques capture the two directions in which the editors of the ‘Prisoner’s Dilemma’ were also editing: towards mathematics and towards meta-articles. These networks of outer articles reveal several clusters that would be interesting to analyze further, although only a small taste was shown here: one mathematics cluster (Figure 4-3) and one with many meta-pages (Figure 4-4) surrounding the PD. It wasn’t possible to apply the biclique algorithm to the NPOV data because the data set was too big. The visualization technique, using the ‘organic’ mode of Cytoscape, helped to construct a quick image of what was happening in the ‘context’ articles of the PD and NPOV. While, for the PD, there were both content articles (related) and meta-pages, for the NPOV there were policy pages (related), many meta-pages and also many user pages.
This shows some evidence for the intertwined nature of collaboration, which includes activity on all fronts: content articles, policy pages, discussion pages of the policy pages and user pages. In this paper, the ‘context’ of an article is operationalized as the articles edited by the same editors during the same time window. This makes the notion of ‘context’ more precise than a more general notion that, for lack of a defined boundary, would have to take all of Wikipedia into account. This is another instance of cooperation, which only to some extent is collaboration: a reasonable guess would be that editors are not all aware of being part of the same endeavor, although many recognize again and again the most prolific editors, along with their stances/personalities/styles.

4.6 Acknowledgments

I’d like to thank Hans Mathews for inspiration and discussion about the formation of networks; Kiran Jonnalagadda for extracting the context data for the PD and the NPOV articles so promptly; and Jay Armas, Kragen Sitaker, and Vasco Jesus for programming help in dealing with large sets of data.

Footnote 37: These articles are: 2009, 2009 flu pandemic, Barack Obama, Human, Michael Jackson, Wikipedia, Talk:Barack Obama, Talk:Main Page, Template talk:Did you know, User talks: Alansohn, ChildofMidnight, Cirt, DGG, Durova, Gwen Gale, J.delanoy, Jehochman, Jimbo Wales, Juliancolton, MastCell, MZMcBride, NuclearWarfare, Rlevse, SlimVirgin, William M.
Connolley, Xeno, Wikipedia namespace: Administrator intervention against vandalism, Administrators' noticeboard, Administrators' noticeboard/Edit warring, Administrators' noticeboard/Incidents, Arbitration/Requests, Arbitration/Requests/Case, Arbitration/Requests/Enforcement, Biographies of living persons, Biographies of living persons/Noticeboard, Conflict of interest/Noticeboard, Consensus, Editor assistance/Requests, Fringe theories/Noticeboard, Good article nominations, Help desk, Huggle/Users, Huggle/Whitelist, Manual of Style, Miscellany for deletion, Neutral point of view, Neutral point of view/Noticeboard, No original research, No original research/Noticeboard, Policies and guidelines, Reliable sources, Reliable sources/Noticeboard, Requested moves, Requests for page protection, Requests for permissions/Rollback, Sandbox, Talk page guidelines, Third opinion, Usernames for administrator attention, Verifiability, Village pump (miscellaneous), Village pump (policy), Village pump (proposals), Village pump (technical), What Wikipedia is not, Wikiquette alerts, Wikipedia talk:Arbitration Committee/Noticeboard, Arbitration/Requests, Biographies of living persons, Criteria for speedy deletion, Flagged revisions/Trial, Neutral point of view, No original research, Requests for adminship, Verifiability, What Wikipedia is not.

5 HISTORY OF THE ‘PRISONER’S DILEMMA’

In which the history of the article ‘Prisoner’s Dilemma’ is told and meanwhile different tools to study Wiki-articles are presented and it is shown what to see with them.

It all started with a big theme, the quest to understand cooperation in emergent and distributed cognition of socio-technological networks. It continued by picking a specific case, a choice, that turned out to be about Wikis, those software tools for cooperation. And in Wikis, cooperation happens at different levels, one of which is the article-level. There, around an article, a group gathers and constructs a feasible article.
There, engaged in the production of a page (in the case of Wikipedia, normally an encyclopedic article), a group of people meets, adds, deletes, discusses and negotiates. The article can be seen as one of the natural units in the construction of a Wiki.

To proceed, it is important to choose ‘how to look’ and ‘what to look at’. To investigate ‘how to look’, the research process is extended to include research on what tools exist to look at an article’s history and particularities. As Wikis are a relatively new phenomenon, so are the diverse tools that have popped up around them. There are tools to investigate Wiki(pedia) articles that have evolved alongside the project, in order to provide a visualization of what is happening below the surface layer. Wikis have the particularity that the history pages, or logs, are saved alongside the product of the collaboration. The history is kept, and it is worthwhile to look at those footprints of collaboration. When data becomes immense, ‘looking’ also has other needs: no longer is it enough to look with ‘bare eyes’, but it becomes more and more necessary to use visualization techniques to emphasize particular aspects. The different ways to focus, what data is extracted and how it is presented, determine what can be seen. In this section we will look at some of these tools and see what can be ‘seen’ with them.

The choice taken here of ‘what to look at’ is to look at one article in depth, an article from the three-million-strong pool of content articles in the English Wikipedia. The choice (somewhat constrained by the biases of the human-actor) falls on the article on the “Prisoner’s dilemma”. Besides being of meta-interest because it is a known case when investigating cooperation, the prisoner’s dilemma article is one of the early ones in Wikipedia and gained featured status quite early, in 2004, which makes it a good case for studying the origins of an article and the practices of early Wikipedia.
Furthermore, it is also interesting as a noncontroversial article of high quality (the focus is on collaboration, not its breakdowns), where facts about a more or less settled topic drive the writing (and not heated discussions, or recent world events). This article was indeed written by several editors and chosen early on as one of the articles that the community is proud of producing.

Since then, this article has had its own history: incidents of vandalism, ups and downs in quality and in the definition of quality. The article is not featured nowadays, as the criteria for featuring have changed over time:

Figure 5-1: Tag from September 2008 that informs of the prisoner’s dilemma article’s former featured status. It was featured on the main page on March 16, 2004.

So, a part of the early history of this article will be presented, along with some of the tools that can help unveil the different sets of information in a Wiki article. In addition, the information that can be gathered in the search for distributed cognition and cooperation will be characterized and evaluated. Recording the activity in Wikis or in virtual communities seems to be a feature that is here to stay, so this study can offer information on the tools to look at such data, what they can yield, what can be found and how to best use them.

5.1 Pre-history

The first finding is the first post in the article. It shows that the ‘Prisoner’s dilemma’ article was merged from the ‘Prisoners dilemma’ article (different spelling). The unique challenge of looking at an article from the beginning of Wikipedia is that it requires a Sherlockholmsian approach in order to understand the first stages of the text that was to become the widely read article on the prisoner’s dilemma. This approach is possible because (almost) all of the footprints left in the construction of the article are still present. The challenge is to reconstruct what happened, with clues from here and there.
In more recent articles, the technology and nomenclature were more in place, making it possible to investigate the history just by looking at the article’s history. The article on the ‘Prisoner's dilemma’ starts at 01:57 on 11 March 2002, when ‘The Anome’ moves the content from the “prisoners_dilemma” page. To properly dig through history, we then must look at the ‘prisoners_dilemma’ page to find out how that started. The ‘Prisoners_dilemma’ page was started at 21:25, 21 August 2001 (almost 7 months earlier). The anonymous user ‘129.186.19.xxx’ apparently added a lot of text, and three days later ‘217.98.151.xxx’ added a whole discussion page. These two additions are extended texts, too complete to have been the first footprints. They could, in principle, have been written by a single contributor (although this is unlikely), but a clue left in the talk page suggests that the page was started somewhere else. The talk page includes a comment with a time stamp from 2001-06-19 by ‘Hornlo’, two months before the start of the article, suggesting that the article’s germ was started somewhere else. Studying the contributions by user ‘Hornlo’ on that date, using http://en.Wikipedia.org/Wiki/Special:Contributions/, doesn’t yield any clues.

So, to try to understand where the text comes from, the link ‘what links here from the prisoner's_dilemma’, which shows all the pages that redirect to it (pages which, when searched for, would redirect the searcher to the page), was used. The hope is to find the original article that led to the ‘prisoners dilemma’ (wrong spelling, first article) and eventually to the ‘prisoner’s dilemma.’ These redirects show a lot of care concerning capitalizations [38], misspellings (prisoner’s delimma, prisoner’s dillema), and related issues (PD scenario, iterated prisoner’s dilemma).
A possible predecessor would be ‘Payoff Matrix’, as the user ‘129.186.19.xxx’ started the article at 21:25, 21 August 2001, stating “free linked "payoff matrix"”, while user ‘217.98.151.xxx’ started the talk page at 10:08, 24 August 2001 with the comment “game matrix”. Maybe it all comes from a previous article called game_matrix? But these avenues didn’t show the article’s origins. Another clue was found by copying a phrase from the talk page of the ‘prisoners dilemma’ and searching for it in Google, which showed a result from Nostalgia Wikipedia.

Figure 5-2: Screenshots of the original Prisoners dilemma article and talk page from http://nostalgia.Wikipedia.org/Wiki/Prisoners_dilemma/Talk.

[38] Concerning capitalizations, and other kinds of standardizations, it is interesting to note how many standards were implemented with the growth of Wikipedia. To show an example from this case study: ‘Prisoner's_Dilemma’ is just a redirect from other capitalizations. The redirect was created at 16:29, 24 February 2003, while at 15:28, 11 December 2007 the following was added: “This is a redirect from a title with another method of capitalisation. It leads to the title in accordance with the Wikipedia naming conventions for one of the below templates to redirects created for this purpose. Other variants should use one of the other redirect templates such as from alternative spelling or from alternative name. Pages linking to any of these redirects may be updated to link directly to the target page. However, do not replace these redirected links with a piped link unless the page is updated for another reason. For more information, see Category:Redirects from other capitalisations.”

One more reference, before the August 21, 2001 start date, was a mention in what was then called Wikipedia NEWS [39], from June 13-19, 2001: “Prisoner's Dilemma Posted June 18 Sjc posts an article on one of my favorite Game Theory strategies – the "Prisoner's Delimma".
I've often wondered at the way this theory could help explain the successfulness of online projects like Wikipedia. ...the likelihood of cooperation and trust developing between two partners is related to the likelihood of there being other encounters in the future.”

The mention in Wikipedia News, stressing the possible relation between the ‘prisoner’s dilemma’ and discussions of cooperation and trust related to Wikipedia, confirms the nature of the article as an important article about cooperation, making this research slightly meta-driven. After several dark alleys, without finding out where the text comes from, the researcher wrote a letter to user ‘Sjc’, named in the Wikipedia News article as the starter of the ‘prisoners dilemma’ article, to uncover what happened during the two months for which there is evidence of an article prior to August 2001, both in a time stamp from the talk page and in a mention in Wikipedia News. His reply points to the early changes of Wikipedia: “At some point in time around then Wikipedia moved from a very small database cluster to a much more impressive structure. The only real casualties of this move were (AFAIK all) the revision histories prior to the move and a handful of articles which got trashed/corrupted in the move.”

So, that was it for the use of revision histories. We didn’t get to know exactly what happened in the pre-history of the Prisoner’s Dilemma article but heard about Nostalgia Wikipedia, Wikipedia News, and the move and loss of the revision histories. Still before the ‘prisoner’s dilemma’ got its current spelling, there was a mention of it in “Brilliant prose”, which ‘Stephen Gilbert’ started at 19:50, 14 November 2001, where 'prisoners dilemma' figured under economics.

So, we have an article ‘Prisoner’s dilemma’ which comes from a previous one, ‘Prisoners dilemma’. That one started at 21:25, 21 August 2001, or rather was re-started then, after a server migration in which the previous history was lost.
On the 18th of June 2001 someone already commented on the existence of the prisoners dilemma article. More information could be uncovered thanks to Wikipedia NEWS, an example of a community tool, through which it was possible to identify the creator of the article even after that data was gone.

[39] This concerns internal news of Wikipedia, not news by Wikipedia such as “WikiNews”. It is now called Wikipedia SignPost.

5.2 History of the article

5.2.1 Prisoners dilemma’s history in 11 edits

The history previous to August 21st, 2001, was, as we saw previously, lost in the server change. From there, a link is added at the end of the article; an edit fixes a spelling, makes a small change in the paragraph about the game of chicken, and deletes one line and one paragraph. The next edit links the article as a ‘stub’, and adds a line space at the end, for visual purposes. The next edit adds information, 2 full paragraphs after the information about the ‘tit for tat’ strategy, relating to the iterated tit for tat, the Nash equilibrium, and the chicken game. Then a bot went by and cleaned up the empty lines. Another edit then added information to the recently added paragraph on the Nash equilibrium. Then one word was added in the paragraph about the chicken game. After that, a link to the Portuguese version of the article was added at the top of the page, and the next edit fixed a mistake in that link. And the final edit of the short history of the alternatively spelled article ‘Prisoners Dilemma’ came by, redirecting it to the better spelled ‘Prisoner’s Dilemma’.

Reading a short description such as this one is not very appealing. It also exemplifies the difficulty of ‘following’ what happened, which calls for other tools to better understand it.

5.2.2 Parts of an article

As argued earlier, many parts constitute an article: accompanying each article, there is a discussion page, and a log of the changes (the history page).
The discussion started after the article, and, as this is an early article, the time gaps are of the order of six months (in more recent articles, because there are more people involved in the project, there is more activity). After 22 revisions of the article from 11th of March 2002 to 12th of December 2002, the first edit in the talk page appears: a conversation starts on 17th of December 2002 when someone (24.207.234.62) who hasn't contributed yet (at least not from that IP address) asks: "The article makes a mistake, in that the prisoners do not have to be in contact to reach the mutual best decision. If the prisoners assume that both of them are rational, and bound to make the best rational decision, they will choose to cooperate." And so 209.107.95.230 answers on the 22nd of December 2002, 5 minutes after having fixed a wrong payoff matrix: "Actually, the whole point of the problem is "What is the best rational decision?"" The next conversation happens on the 27th of August 2003, and after that on the 17th of December 2003, showing that in the beginning of Wikipedia it took a while for the discussion page to pick up: in more than one year, there are only 3 entries in the talk page.

5.2.3 History of the beginning: 87 edits until featuring

From the official beginning of the ‘prisoner’s dilemma’ article on 11th March 2002 until it was featured on the main page on 16th March 2004, there were 87 edits. Since descriptions like the previous one of the edits to the ‘prisoners dilemma’, and small observations about the edits on the talk page, don’t yield much, we will study the article first with some of the tools available (and consequently study those tools), and then use a new method, mixing qualitative and quantitative data yielding networks and bicliques, which will be used to analyze these 87 changes as part of a tripartite network of editors, types of edits and paragraphs.
5.2.3.1 History Flow

One of the tools developed to see the editors and their edits, and how the text of each article changes, is History Flow (Viégas et al. 2004). It compares different versions of a file, allowing its history to be ‘seen’: a way to visualize dynamic, evolving documents and the interactions of multiple authors. It produces a visualization of the text colored by editor. Such a tool reveals complex patterns of interaction, cooperation and conflict. It aims at making broad trends in editing more visible. Using it, it is possible to see vandalism, prolific editors and edit wars, to name a few applications.

Figure 5-3: Screenshot of History Flow for the article ‘prisoner’s dilemma’ for the period of interest here, spanning from the beginning until it became a featured article. See main text for more details.

5.2.3.1.1 Adding text

The first red ellipse identifies a place where a new editor added new text. Looking into the article reveals that on the 29th of November 2002, an anonymous editor added the section ‘Friend or Foe’. This section concerns a game show that uses the prisoner’s dilemma game.

5.2.3.1.2 Change of author

The second red circle denotes a place where the color of the text changes, which means that the authorship changes. This edit from the 28th of May 2003 consists of only a few changes in words and some tidying, what could be called ‘clarify info’, but the whole paragraph was cut, changed and then pasted back, changing the authorship of that text. Cases like this, where the visualization is misleading as to what really happened, are, as we will later argue again, a reason why it is important to include qualitative observations in analyzing Wikipedia, and not just trust quantitative measures.

5.2.3.1.3 Moving of text

The third red ellipse denotes a place where text was moved.
Further down in the text there was a definition of the prisoner’s dilemma in political science, which was moved to the beginning along with the other definitions.

5.2.3.1.4 Repeated vandalism

The last red ellipse, which draws attention to several black columns, shows the vandalism that happened the night after the Prisoner’s dilemma article was featured on the main page. On the 17th of March 2004 at 00:13, an anonymous user (a) deletes all the content from the page and writes ‘fuck your mom...’; at 00:16, 3 minutes after the attack, user 1 writes ‘this needs reverting’; 1 minute later, at 00:17, the same anonymous user a repeats the attack; one minute later, at 00:18, user 2 reverts; one minute later, at 00:19, anonymous a perpetrates the same attack for the third time; and one minute later, at 00:19, user 3 reverts it back to the article with content.

5.2.3.2 Statistics

There are some useful pages that allow for a statistical overview of a page. On 12th of August 2008, we retrieved this information from vs.aka-online.de for the ‘prisoner’s dilemma’ page:

Figure 5-4: Statistics tables for the ‘Prisoner’s Dilemma’ article: overall statistics, edits per year and user statistics.

Statistics can be informative of, well, statistics. From these tables one can learn that the ‘Prisoner’s dilemma’ article is not heavily edited, with less than an edit a day (on average 2 days between two edits), that it has been edited at a similar rate over the years, that the average number of edits per user is quite low (expected since, given the long tail effect, many only edit once), and that the most active editors (the ten with more than ten edits) have only a fair number of edits (10-41). These facts support the choice of the ‘Prisoner’s dilemma’ as an uncontroversial article not dependent on current events, or based on controversies.
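The kind of overview given by such statistics pages can be approximated from any list of revisions. The following is a minimal sketch in Python, assuming each revision is available as a (timestamp, user) pair; the function name `revision_statistics`, the timestamp format and the sample revisions are hypothetical placeholders, not the article's real history nor the method used by the statistics page cited above.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample of (timestamp, user) revisions; not real article data.
REVISIONS = [
    ("2002-03-11T01:57", "editor_a"),
    ("2002-03-13T01:57", "editor_b"),
    ("2002-03-15T01:57", "editor_a"),
    ("2003-05-28T10:00", "editor_c"),
]


def revision_statistics(revisions):
    """Return edits per year, edits per user and the mean gap between edits."""
    if not revisions:
        raise ValueError("at least one revision is needed")
    # Parse and sort the timestamps so that gaps are measured chronologically.
    times = sorted(datetime.strptime(ts, "%Y-%m-%dT%H:%M") for ts, _ in revisions)
    per_year = Counter(t.year for t in times)
    per_user = Counter(user for _, user in revisions)
    # Gap in days between each pair of consecutive edits.
    gaps = [
        (later - earlier).total_seconds() / 86400
        for earlier, later in zip(times, times[1:])
    ]
    return {
        "edits_per_year": dict(per_year),
        "edits_per_user": dict(per_user),
        "mean_edits_per_user": len(revisions) / len(per_user),
        "mean_gap_days": sum(gaps) / len(gaps) if gaps else 0.0,
    }


if __name__ == "__main__":
    for name, value in revision_statistics(REVISIONS).items():
        print(name, value)
```

A low mean number of edits per user together with a long mean gap between edits would correspond to the 'quiet article' reading given above; the interpretation itself still depends on looking at what the edits actually contain.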
From the list of editors, 8 of the 31 presented were contributing during the period of interest, until March 2004 when the article was featured. Knowledge of the workings of the article also allows for the reading that ‘Michael Hardy’ was attracted to the article via the featuring process or from the main page, as the first edit by this user is from the 16th of March 2004, at 20:51, the day the article appeared on the main page. Another possible inference is that ‘The Anome’ was engaged in the featuring process, working on the article until the 17th of March 2004, and then left the process.

5.2.3.3 WikiChanges

To see a graph of the number of contributions over time, there is the tool WikiChanges [40]. WikiChanges is a web-based tool that exposes the revision history of Wikipedia articles using an interactive graphical timeline. It has been used to show activity, especially of time-dependent articles, such as ‘London bombings’ or political campaigns, where information is added following a major event or controversy. In the case of ‘Prisoner’s dilemma’, in Figure 5-5 below, one can analyze the peaks, and then use that information to look back at the history of the article and infer why there was so much activity.

5.2.3.3.5 Peaks

The first peak in 2003/05 reveals 24 updates in a month. A way to summarize what happened would be to say that two people picked the article up, tidying and clarifying, and because they did that, other contributors (some anonymous, others not) contributed 1-2 times. Another interesting peak is the one in 2006/01, where there are 61 updates, accompanied by 7 edits in the discussion: there was a proposal to remove the article from the list of featured articles, and so the article was generally cleaned up. Because the article’s continuation as featured was being discussed, other tasks were performed such as fixing typos, capitalized letters, and adding clarifying sentences.
In that month, on the 18th of January 2006, there was a vandalism attack that read “WHY WOULD YOU WANT TO READ SOMETHING LIKE THIS THAT IS SOOOOO BORING”, which was removed after one minute.

[40] http://sergionunes.com/p/Wikichanges

Figure 5-5: The Wikichanges graph reveals the number of edits per month since the beginning of the article ‘Prisoner’s dilemma’.

5.2.3.3.6 Around the featuring

There were 0 edits in the month of February 2004, 55 in March 2004, 13 in April 2004, and one edit in May. This reveals that a lot more work is done around the featuring, but that this engagement is neither preceded by earlier work nor sustained by later work.

5.2.3.4 Wikipedia Dashboard

Wikipedia DashBoard [41] is a tool made to provide social transparency to Wikipedia by tracking the quality of the contributions and attributing the work to individual users; it hopes to eventually increase the credibility and trust of Wikipedia. For a given article it shows contributions and the contributors color-coded, as well as small statistics on contributor patterns.

[41] http://Wikidashboard.parc.com

Figure 5-6: Wikipedia Dashboard is displayed at the beginning of an article, providing information about the activity per user per month.

The previously seen higher engagement during the month of March 2004 is seen again in this display, where the first red ellipse denotes the vertical line corresponding to the month of March 2004, when many edits were performed. It is easy to capture the most engaged editors. After that peak, editor Psb777 was very active for two months, leaving comments in the discussion page such as: “the caption doesn’t seem appropriate”; and soon after: “I’ve changed it”.

5.2.3.5 Other Tools

There are other tools available; for example, more statistics can be found at http://stats.grok.se/. Some of these other tools are described below, even if they are less relevant to the study of the PD article.
Wikitrust (Adler & de Alfaro 2007b), shown in Figure 5-7, was developed to make it possible to visualize the trust assigned to the content of an article. Depending on the contributor’s history and the length of time the content has survived change, the marking of the text changes from dark orange (highly untrustworthy) to white (trustworthy). In this manner it is possible to identify pieces that reveal vandalism or controversies. In Figure 5-7 one can see the application of the algorithm to the ‘Prisoner’s dilemma’ article, though Wikitrust is less useful on the quiet ‘Prisoner’s dilemma’ page (merely revealing it is less trustworthy to think of cash machines in terms of trust and cooperation than newspaper vending machines) than on more controversial pages.

Figure 5-7: WikiTrust applied to the article Prisoner’s Dilemma. As this article is not highly controversial or prone to much vandalism, the only ‘orange’ (less trustable) content regards very new content at the end of two paragraphs.

For more political endeavors, to reveal hidden players behind anonymous IP addresses that may well have stakes in the editing of Wikipedia pages, there is “Wikiscanner” [42], which tries to match edits with companies, parties and governments.

One of the possibly most useful tools for this research is inaccessible due to corporate limitations and a sparse notion of contribution to the research community (requests made by email and in person haven’t helped): the team at IBM cannot make available (neither for use, nor the code) the tool “chromograms” (Wattenberg et al. 2007), in which they have constructed a way to color code the titles of articles and the comments left when they were edited, making it possible to visualize edit histories by user.

Traffic statistics, in Figure 5-8, reveal how many people have been accessing the article, and ultimately this is an important factor for interpreting Wikitrust, as Wikitrust computes how long text has survived, which is only meaningful if it has been seen.
Figure 5-8: Traffic statistics for the article ‘Prisoner’s dilemma’ until October 2008.

[42] To see the (lack of) political turmoil behind the ‘Prisoner’s dilemma’, please check: http://Wikiscanner.virgil.gr/f.php?pagetitle=Prisoner%27s+Dilemma.

Another tool is WikipediaAnimate [43], a Greasemonkey script that constructs a movie of the different versions. While with History Flow there is a focus on the people who have edited, in WikipediaAnimate there is a focus on the text change. Together they can reveal patterns in editing, also from the point of view of the text. For example, a piece of text can trigger several edits because it was a concept badly defined or difficult to agree upon. The first movie of Wikipedia changes was a videocast about the Heavy Metal Umlaut [44], which actually triggered a competition for scripts to do it cleanly, which was won by ‘Wikipedia Animate’. Another film of this kind follows the ‘London bombings’ article [45]. In the Wikipedia Animate version for the ‘Prisoner’s dilemma’ article [46], the first 100 changes to the article can be seen in video format. One can notice that there were more structural changes in the beginning and that the article became more organized later, with a table of contents, a clear introduction and a longer text. This is a very compelling visual form to motivate the research, as it is possible to get a feel for text addition, alteration and subtraction, but a deeper look is needed in order to extract more information about what happens in the process.

5.2.3.5.7 87 edits

In order to extract qualitative data from the series of edits, a good idea is to follow the categories developed in Pfeil et al. (2006), which separate the possible actions in an edit, shown in the following table:

Table 5-1: Summary of categorization used in the edits of the beginning of the ‘Prisoner’s dilemma’, following Pfeil et al. (2006).
[43] http://phiffer.org/projects/Wikipedia-animate
[44] http://weblog.infoworld.com/udell/gems/umlaut.html
[45] http://youtube.com/watch?v=s8O-hv3w-MU&feature=related
[46] http://www.nbi.dk/~vulpeto/pd100

The following three tables show the results of this categorization, where 53 editors, 13 actions, and 11 parts of the article (paragraphs or the whole article) have been involved in 173 changes.

Tables 5-2, 5-3, 5-4: These three tables show the statistics of the data collected following the categorization in Pfeil et al. (2006). The first table concerns the number of editors that have made how many changes (providing all the names would produce too long a list), the second has the categorizations of ‘paragraphs’, including the general ‘whole article’, ordered by number of changes made to them, while the third table has the actions performed, ordered by the number of times each of them was ‘used’.

Greater amounts of data could be gathered if one could use a program to retrieve the information. This would have its caveats, as data retrieval (construction?) such as classifying an action is better performed with the help of human expertise rather than a purely automated method. This said, humans are also biased, but are able to discern different situations and be consistent with their own classifications.

This kind of categorization and statistics can be made more revealing if the network is studied in more depth, putting emphasis on the way editors, actions and paragraphs are connected. That is the purpose of the network views explored in the next section, and of the biclique methodology already used to see dense clusters in groups of articles.

5.3 Discussion

This study used different tools constructed specifically for analyzing Wikis. These tools were constructed in parallel with the appearance and increasing popularity of Wikis, to serve different purposes, from research to visualization.
The growth of analysis tools in parallel with the Wiki technology is an important point for reflection, as there is a need for methodology to accompany the objects of study. As it is often said, “Wikipedia doesn’t work in theory, but it works in practice”, and this could be an analogy to the process of research about Wikipedia. While there isn’t much theory developed, there are tools that can be used ‘in practice’ in order to look at specific issues. Most of these tools evolved with the project, creating new ways of ‘looking’. A deep qualitative analysis is always going to work, but the interesting challenge is to know how to combine the data and take information out in a way that is meaningful. Visualizations point to specific places in the archived information and to particular events, which then allow for qualitative in-depth analysis. As for the tools used here, a rough evaluation favors those that are open source and freely available, versatile, web-based, and possible to use with any article.

5.4 Present/Future Research

Anne Goldenberg (a PhD student in Montreal who studied negotiation in Wikis) and I presented “WikiWriting Collaborative Patterns” [47] (2008) at the conference WikiSym 2008, where we further developed the idea of constructing an application that would help study the conversations around a Wikipage and their relation to the content of the article. It is a mixed method, in the spirit of collaboration between human and non-human actors (like the symbiosis between vandal-fighter-bots and editors). We decided to consider the discussions between pairs of editors as the measure of acquaintance. Our purpose was to capture the important actions that determine the evolution of an article, through a semi-automated method to study qualitatively the contributions and the discussions.
We presented this proposal, but realized that, although at the forefront of research possibilities, it needed an infrastructure (and resources) not usually associated with individual PhD projects. We constructed a list of specifications, which allowed us to make a preliminary typology of the forms of intervention and of what needed to be extracted for this analysis. With it, we would be able to analyze greater quantities of data: we would be able to extract the comments that surround a negotiation, and test the hypothesis that discussions in the filligrame form create a stronger epistemic community. In particular, it would be interesting to cross, match and relate the typology of intervention forms with the typology of resolutions in a discussion.

[47] In appendix W.

6 NETWORKS OF WIKIPEDIA ARTICLE: INSIDES OF ‘THE PRISONER’S DILEMMA’

Rut Jesus
Center for the Philosophy of Nature and Science Studies, University of Copenhagen, Denmark

6.1 Abstract

In this study, Wiki edits are coded and the resulting data is used to build networks of Wikipedia articles in the inner process of accretion of edits. The article studied is ‘The Prisoner’s Dilemma’ (PD), a content article, which is studied in the early Wikipedia (2001-2004) following a featuring process. The mixing of methodologies and the network visualizations are useful for providing a quicker overview of the most important themes, although diagrams on paper are not the best way to interact with and learn from networks.

6.2 Introduction

Wikipedia has been studied in many ways, among others as a unique source of data to understand collaboration. Both bottom-up and top-down structures have emerged, and their understanding requires tools that help with the visualization of the processes involved. If, on one hand, quantitative studies of networks give insights into the evolution of the site in general (Capocci et al.
2006), on the other hand, qualitative studies have focused on several processes, such as the understanding of negotiation processes (Goldenberg 2010). On the third hand, there have been efforts to create tools to see collaboration at the article level, where the earliest, and still one of the best tools, is History Flow (Viégas et al. 2004). Several studies have identified work patterns and roles in contributions to Wikipedia (Kittur & Kraut 2008; Anthony et al. 2007) while others have modeled accretion of edits (Wilkinson & Huberman 2007) and its relation to quality. In the present paper, the focus is on seeing the networks of collaboration at the article level. These data are, therefore, important for the development of theory to understand the workings of Wikipedia.

6.3 Method

6.3.1 Data Sets

The data is from the ‘Prisoner’s Dilemma’ (PD) history page. The PD is a ‘standard’ content article of the English Wikipedia, which was featured rather early (and has lost featured status since). The data used from the ‘Prisoner’s dilemma’ is from its inception until featuring in March of 2004, with the intent of studying a beginning to reveal the particular features of the initial construction of an article, and its process of becoming a featured article.

6.3.2 Data Extraction

Data was extracted pertaining to the editors, paragraphs and actions performed by editors (coded by hand using the list in the previous study). For the ‘Prisoner’s Dilemma’ article, the network comprised 56 editors, 11 paragraphs, and 13 types of action, for 174 edits. The borrowed data from Niels Christensen’s master thesis about the ‘Punk Rock’ article had 471 editors, 45 paragraphs, and 793 edits.

6.3.3 Tripartite network

In the previous studies the data was constituted of editors and articles that they had edited. To adapt the approach to understand the inner workings of an article, the ‘articles’ were replaced by ‘paragraphs’.
Paragraphs are also a natural unit within an article: on one hand because it is of a different nature to edit the ‘introduction’ or the ‘references’, and on the other hand because MediaWiki pages are constructed such that it is rather easy to click on the top-right corner of a section and edit just that section (which is another example of architecture influencing behavior, or of the distributed process of tools and humans).

One way to fulfill the goal of integrating data from quantitative and qualitative analysis is to construct a third part in the network, namely characterizing the link between the editors and the paragraphs. Characterizing the links in a computerized way partially cannot [subjective information], partially should not [many possible mistakes] and partially would not [the impossible need for a fulltime programmer] be done. Therefore, one human agent (to be consistent) recorded the data, categorizing the kinds of actions performed by the editors. This is a simple way to add ‘qualitative’ data, as it is ‘processable’ data, in the form of tags and links to the other categories. Certainly much data is lost in this process: all the notes taken regarding what happened are not accounted for. Nonetheless, processing information in this way makes it possible to extract ‘cliques’ or highly engaged zones, and then complement them with selected richer data.

In the analysis of the ‘Prisoner’s Dilemma’ article, a tripartite network is constructed with a being an editor, b being an action and c being a paragraph. Three bipartite networks are also constructed from this data, namely the pairs (a,b), (b,c) and (a,c).

6.3.4 Data Visualization

Two techniques were used to visualize the data. To see the networks, Cytoscape was used (in ‘organic’ mode), with the nodes color-coded. BCFinder was used to find and visualize bicliques following Lehmann et al. (2008) and Jesus et al. (2009).
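The tripartite construction described above, with a as editor, b as action and c as paragraph, can be sketched in a few lines. This is a minimal illustration in Python under stated assumptions: the triples, the editor names and the function `bipartite_projections` are hypothetical, and the sketch only shows how hand-coded edits could be turned into the three bipartite edge lists (a,b), (b,c) and (a,c); it is not the tooling used for the figures in this study.

```python
from collections import Counter

# Hypothetical hand-coded edits as (editor, action, paragraph) triples.
CODED_EDITS = [
    ("editor_a", "add information", "introduction"),
    ("editor_a", "spelling", "introduction"),
    ("editor_b", "add link", "references"),
    ("editor_c", "vandalism", "whole article"),
    ("editor_b", "add information", "introduction"),
]


def bipartite_projections(triples):
    """Project (a, b, c) triples onto weighted edge lists (a,b), (b,c), (a,c).

    Each returned Counter maps an edge to the number of coded edits
    that contain that pair.
    """
    ab, bc, ac = Counter(), Counter(), Counter()
    for editor, action, paragraph in triples:
        ab[(editor, action)] += 1
        bc[(action, paragraph)] += 1
        ac[(editor, paragraph)] += 1
    return ab, bc, ac


if __name__ == "__main__":
    editor_action, action_paragraph, editor_paragraph = bipartite_projections(CODED_EDITS)
    for (editor, paragraph), weight in sorted(editor_paragraph.items()):
        print(editor, paragraph, weight)
```

Keeping the edge weights (how many coded edits share a pair) is a design choice of this sketch; an unweighted network would simply use the keys of each counter. Each edge list could then be exported as a delimited text file for a network viewer.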
The figures below were created to aid the visualization of the type of networks; labels of editor nodes have been excluded to avoid clutter that could make the reading of the pictures nearly impossible.

6.4 Results

The creation of network visualizations is the first step in dealing with a large quantity of network data. Co-editor networks in Wikipedia, with editors and articles as nodes (Jesus et al. 2009), lead to the first possible kind of network inside articles: substituting paragraphs for articles, one can construct the networks of co-editorship inside one article. Figure 1 shows such a network for the Prisoner’s Dilemma.

6.4.1 Networks

6.4.1.1 Editor-paragraph

Figure 6-1: Network between editors and paragraphs of the ‘Prisoner’s dilemma’, in the initial years of existence. Blue nodes are editors and grey nodes are paragraphs.

There are a number of editors who only edited once, or a few times, which creates a structure of ‘umbrellas’ around the paragraph nodes. Inside those paragraph nodes, in the most central place of the graph, are the editors who were most engaged, and who are connected to many of the paragraphs. To see whether the same structures would appear for another article, data collected by Niels Christensen and used for his master thesis (Christensen 2009) were borrowed to construct Figure 2, comprising the editors and paragraphs of the article ‘Punk Rock’ during the year 2007.

Figure 6-2: For comparison, the network between editors and paragraphs of the article ‘Punk Rock’ during the year 2007. The same structure can be seen, with umbrellas around an external circle of paragraphs and then an internal circle of highly engaged editors right in the center.

The ‘umbrella’ structures point once again to the high number of one-time editors – which suggests these are power-law networks – many people doing few edits, and few people doing many edits.
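The ‘many editors with few edits, few editors with many’ pattern can be checked with a simple degree count on the editor-paragraph edge list. The following is a minimal sketch with invented edges and hypothetical helper names (`editor_degrees`, `degree_histogram`). It only tabulates degrees; a heavy-tailed histogram is suggestive at most, and establishing an actual power law would require fitting exponents, which is not attempted here.

```python
from collections import Counter


def editor_degrees(editor_paragraph_edges):
    """Count the distinct paragraphs each editor is linked to."""
    unique_edges = set(editor_paragraph_edges)  # ignore repeated edges
    return Counter(editor for editor, _ in unique_edges)


def degree_histogram(degrees):
    """Map each degree to the number of editors having that degree."""
    return Counter(degrees.values())


# Invented sample edges (editor, paragraph), for illustration only.
edges = [
    ("e1", "introduction"), ("e1", "strategy"), ("e1", "references"),
    ("e2", "introduction"),
    ("e3", "introduction"),
    ("e4", "references"),
    ("e1", "introduction"),  # repeated edge, counted once
]

degrees = editor_degrees(edges)
histogram = degree_histogram(degrees)

# Share of editors linked to exactly one paragraph (the 'umbrella' editors).
one_time_share = histogram[1] / len(degrees)
```

In this toy example three of the four editors sit in an ‘umbrella’ (degree 1), while one editor is linked to three paragraphs and would appear in the center of the graph.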
The center of the figure, in turn, maps the editors who were most engaged and who edited in many of the paragraphs.

6.4.1.2 Editor-action

In order to understand what is going on inside an article, it is even more interesting to look at what ‘actions’ people were performing: what happened, and not just where it happened. All the edits and ‘diffs’ (comparisons between versions) were looked at and coded for ‘the type of action’, spanning from ‘delete link’ to ‘spelling’. Figure 3 shows the network between the editors and the actions they performed in the Prisoner’s Dilemma article until it was featured.

Figure 6-3: Network between editors and ‘actions’ such as ‘clarify information’, ‘reversion’ or ‘grammar’ in the article ‘Prisoner’s dilemma’ in its first stages, 2001-2004. Blue nodes are editors, while purple ones are actions performed.

Of note is the disconnected part of the network, in the bottom left corner, connecting the two vandalism instances that, unsurprisingly, were not performed by any of those involved in the writing of the article. It is also noticeable that there are again structures like ‘umbrellas’ and that there is a core of editors in the middle. ‘Format’ and ‘Mark-up language’ figure more or less in the middle, because there are many small edits that are formatting, such as adding a line for better readability.

6.4.1.3 Action-paragraph

For completeness, although relatively less interesting, there is the network between actions and paragraphs.

Figure 6-4: Network between the actions and the paragraphs in the ‘Prisoner’s dilemma’ article. The lack of clear patterns points to the fact that none of the actions is particularly relevant to specific paragraphs.

The most striking feature of this network is how much it doesn’t resemble the other ones.
That is mainly because the numbers of actions and paragraphs are very low, but also because these two categories do not seem to relate in any specific way: all kinds of actions are performed on different parts of the article.

6.4.1.4 All

It is also possible to make a network visualization (figure 5) with the above data from the three kinds of nodes: editors, actions and paragraphs.

Figure 6-5: Network with three kinds of data: editors in blue, paragraphs in grey and different colored links depending on the actions.

The utility of this kind of network is to be able to see different kinds of actions performed by the same editor on the same paragraph (which otherwise would be masked as only one edge), which now appear as an onion shape. In the middle, we can see a prolific editor who edited with many kinds of actions. For example, light blue edges represent ‘add information’, rose ‘clarify information’ and grey represent ‘format’. It is then easier to see actions of ‘add information’ being performed on different paragraphs both by ‘middle editors’ and by ‘umbrella ones’. In figure 6, there is another rendering of this network with the three kinds of nodes.

Figure 6-6: Another way to render a network made of three different kinds of nodes: in the middle the editors, connecting to the actions in purple, connecting to the paragraphs in grey. What this one shows is that the editors are clearly the central part of the work.

6.4.2 Bicliques

As used previously, groups can be formed with the same network data and help to see the ‘densest’ parts of the graph.

6.4.2.1 Editor-Paragraph – PD

In figure 7, we can see one biclique for the data concerning editors and paragraphs in the Prisoner’s Dilemma article. The first two sections, the introduction and ‘the iterated prisoner’s dilemma’, are edited.
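The basic idea of a biclique in an editor-paragraph network (a set of editors who are all linked to the same set of paragraphs) can be illustrated with a small sketch. This is not BCFinder’s algorithm, which detects overlapping communities of bicliques following Lehmann et al. (2008); it only shows the building block, using invented edges and hypothetical function names.

```python
from collections import defaultdict


def editors_by_paragraph(edges):
    """Map each paragraph to the set of editors linked to it."""
    by_paragraph = defaultdict(set)
    for editor, paragraph in edges:
        by_paragraph[paragraph].add(editor)
    return by_paragraph


def biclique_for(edges, paragraphs):
    """Return (editors, paragraphs) where every editor edited every paragraph.

    Given a chosen set of paragraphs, the editors of the largest biclique
    containing those paragraphs are the intersection of their editor sets.
    """
    by_paragraph = editors_by_paragraph(edges)
    editor_sets = [by_paragraph.get(p, set()) for p in paragraphs]
    shared = set.intersection(*editor_sets) if editor_sets else set()
    return shared, set(paragraphs)


# Invented sample edges (editor, paragraph), for illustration only.
edges = [
    ("e1", "introduction"), ("e1", "iterated"),
    ("e2", "introduction"), ("e2", "iterated"),
    ("e3", "introduction"),
    ("e4", "iterated"), ("e4", "references"),
]

shared_editors, chosen = biclique_for(edges, ["introduction", "iterated"])
```

Here two editors are linked to both chosen paragraphs, giving a biclique of two editors and two paragraphs; in the thesis’s notation this would correspond to k=2,2, assuming the first number counts editors and the second counts paragraphs.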
Figure 6-7: A k=11,2 biclique group made of bicliques k=22,2, k=17,2 and k=11,2, where two of the paragraphs are edited, the ‘introduction’ and ‘the iterated prisoner’s dilemma’, along with the whole article.

6.4.2.2 Editor-Action

In Figure 8, we can see one of the relevant bicliques for the PD article, where ‘format’ and ‘add information’ play the central role, being used as editing actions by 23 editors.

Figure 6-8: Biclique group (k=11,1) from the ‘Prisoner’s Dilemma’ network between editors and actions, revealing the two most ‘clustered’ actions: ‘add information’ and ‘format’.

6.5 Discussion

Visualization and clustering of the networks of co-authorship of Wiki-articles give insight into what kinds of collaboration patterns are taking place. Many of the patterns seen for the PD, an early content article with fewer edits, are also present in the edits of the ‘Punk Rock’ article in 2007, showing the prevalence of some structures of editing according to the actions.

Although the ‘organic’ mode of visualization seems to yield the most meaning, with easy-to-access information (which nodes are in the center, which nodes are on the outside, which nodes make umbrella shapes), figures 5 and 6 show that other ways of conceiving the same data can also be useful. In figure 6, for example, the tripartite structure of the data becomes apparent in the concentric circles.

In this study, the three pairs of information were studied. The one that provides most information is (editors, actions), because we can see some patterns of editing ‘styles’. The biclique in figure 8 shows how two of the most central actions are ‘add information’ and ‘format’, which points to an intertwining of kinds of activity – people are both contributing new paragraphs and text, and fixing the general appearance and formatting of the article.
The pair (action, paragraph) serves almost as a neutral control – it shows that the structures present in the other two pairs do not always appear, and that they are rather the result of specific patterns of collaboration. As expected, there is no ‘correlation’ between actions and paragraphs (the only one expected would be between ‘add link’ and ‘references’). These networks allow us to see inside one article and understand its process of creation, where both the paragraphs and the actions were important.

6.6 Limitations and Recommendations

The greatest limitation of these studies is the medium of paper. Seeing a static picture of these networks reveals far less than an interactive version where it would be possible to click on nodes and edges to know a bit more of their story. The same is true for the renderings of biclique communities – it is only possible to show a little glimpse of a far vaster amount of data.

Another limitation is that this is a preliminary visualization study that would profit highly from an interdisciplinary team – to write the ethnographies, to calculate the exponents and other network factors, to conduct interviews. The understanding of the internals of the article would be more complete with an analysis of the discussion pages, which are crucial in the process of negotiation of content.

6.7 Conclusion

This study showed possible avenues to integrate qualitative and quantitative data when researching collaboration on Wiki-pages, which is a reply to an imperative of interdisciplinarity in this age of large amounts of data, and large amounts of (sociological) questions.

6.8 Acknowledgments

I would like to thank Max Schich and Sune Lehmann for inspiration and discussion about the formation of networks and Niels Christensen for the acquisition of qualitative data of the ‘Punk Rock’ article.

7 INSIDE THE POLICY ARTICLE ‘THE NEUTRAL POINT OF VIEW’

Rut Jesus
Center for the Philosophy of Nature and Science Studies, University of Copenhagen, Denmark.
7.1 Abstract

The changes made to the policy article NPOV during the year 2009 are visualized. The structures found (umbrella shapes, discussion lines) are compared to the other networks previously studied. We find a higher prevalence of vandalism and reversions in late Wikipedia, done by editors not engaged in other editing activity in the article.

7.2 Introduction

The big secret of course is that Wikipedia is not really about an encyclopedia, it's just a big game of nomic.
Jimbo Wales

It is said that the ‘Neutral Point of View’ is the most central of all policies in Wikipedia, and it has been there since the beginning in 2001. How, then, are 500 changes per year still being made to the NPOV policy page in 2009?

Since all of Wikipedia is a Wiki, the structure of the coordination pages (which have a different namespace than the content articles) is very similar to that of the pages of ‘encyclopedic articles’. While discussion pages have a special culture more similar to discussion forums, policy pages such as ‘Neutral Point of View’ have been edited in the same fashion as ‘content’ articles.

The 'Neutral Point of View' is one of the three basic values in Wikipedia, alongside 'No Original Research' and 'Verifiability'. It has been argued (Niesyto 2010) that because these three principles, upon which the informative nature of Wikipedia is built, are not hierarchical, conflicts can be created. For example, in the absence of a neutral name for a region in conflict, a solution would be to come up with a new neutral option that would be fair to all sides. That is, however, not possible, as Wikipedia rejects 'original research'. Still, the NPOV is the central article because it is what ties this community together in being about ‘writing an encyclopedia’. In “Wikipedia Revolution”, Lih (2009) states: "NPOV is the only nonnegotiable policy in Wikipedia, according to Jimmy Wales. It's what makes people work together: converging while collaborating."

This neutrality principle helps the construction of an encyclopedia, and tightens the community. In the early days, the clear idea of an encyclopedia helped to bring together many people engaged in the project. Reagle (2007c), in “Is Wikipedia Neutral?”, investigates in depth the usage of neutrality in the context of Wikipedia. He concludes that the usage is appropriate because neutrality is understood as a process, and not an end: a process that does not necessarily consider neutrality possible, but considers the act of writing as being fair to all the parties. In this small study, the process of writing the article NPOV is investigated.

7.3 Methodology

7.3.1 Data set

The data gathered consists of the changes made to the article “Wikipedia:Neutral Point of View” during the year 2009, making it a study about the maturation process. While the “Prisoner’s Dilemma” article discussed in the previous paper is a classical encyclopedia article, the NPOV is a policy article. Because both are written in a Wiki, by editors following the same guidelines, their structure is similar, and they can be studied similarly (the networks of editors, the ‘article’ page, the discussion page, the history page).

The time frame chosen to study the NPOV was the year 2009. Even though it is quite a central policy article, started back in 2001, there are still almost 500 edits in 2009. Studying the year 2009 is a way to look at an article that is already ‘mature’ but still undergoes many changes and discussions.

7.3.2 Data Extraction

The data extracted was the list of changes in 2009. These changes were categorized according to paragraph and action, as previously done for the PD article. The data comprised 188 editors and 14 actions, in a total of 470 edges. The data from the discussion page was also extracted, comprising 1626 entries (edits), 78 “discussion topics” (paragraphs) and 279 editors.
7.3.3 Tripartite networks

For the ‘Neutral Point of View’ article, two networks are constructed, one for the editors and types of action done to the article, and one for the editors and discussion topics in the discussion page.

7.3.4 Data Visualization

Two techniques were used to visualize the data. To see the networks, Cytoscape was used (in ‘organic’ mode), with the nodes color-coded. The biclique algorithm and visualization in BCFinder were used following Lehmann et al. (2008) and Jesus et al. (2009). The figures below were created to aid the visualization of the type of networks; labels of editor nodes have been excluded to avoid clutter that could make the reading of the pictures nearly impossible.

7.4 Results

7.4.1 Networks

Figure 1 shows the network for the article NPOV during the year 2009. Compared to the PD network, it comprises more edits, and some categories are much more prominent.

Figure 7-1: Network between editors and actions in the meta-article ‘Neutral Point of View’ during the year 2009. The most striking differences are the strong presence of vandalism and reversion, probably due to the ‘stage of Wikipedia’, the prominence of the article and the greater number of changes due to more activity in Wikipedia in general in the year 2009.

In this figure, the category ‘add image’ is the one that appears isolated, which could point to a division of work in which images are added independently from the editing of the article. The most striking feature is that vandalism and reversions have so many edits, making dense umbrellas. Vandalism is only linked to the rest of the network by two edges, which means that it is otherwise ‘an independent activity’. Investigating those two edges shows that the one connected to reversion came from someone who wrote about the Manchester Derby and right after ‘reverted their own vandalism’. It could have been a phrase meant for another page, but the editor’s contributions list (‘contribs’) doesn’t yield more results for that date and editor.
The other edge, connected to ‘add link’, was ‘just’ the addition of a link to Wikipedia’s own page, used to support the claim ‘Sometimes, a potentially biased statement can be reframed into a neutral statement by ''attributing'' or ''substantiating'' it.’ It could be classified as vandalism, given also that it was reverted. Many of the reversion edges are also unrelated to the rest of the editing process, probably showing the activity of vandalism fighting, well described in Geiger & Ribes (2010), where specific bots help vandal-fighters.

Figure 7-2: Network of editors and discussion topics in the discussion page of the ‘Neutral Point of View’ during the year 2009. It is more mixed – about as many topics as editors, …

To contrast with a very different structure, figure 2 shows the network of editors and topics in the discussion page for 2009. Here it is clear that some discussions are threads, as there are ‘lines’ in the network. The editors are more scattered among the topics.

7.4.2 Biclique

Figure 3 shows one of the bicliques from the data above. Before, the umbrella shape of reversion included editors that didn’t perform any other activity in the article (probably vandal-fighters), but also some editors that were connected to other parts of the activity. To understand what kinds of activities these editors were also engaged in, the biclique below is helpful.

Figure 7-3: Biclique group (k=9,2) from the ‘Neutral Point of View’ article’s network between editors and actions.

Editors who were engaged in reversion and also engaged in more of the article were adding information and clarifying information, two main activities of ‘construction’ and betterment. This probably points to editors who are quite engaged in the re-writing of a certain aspect, but who also react to vandalism once it happens.

7.5 Discussion

Liu & Ram (2009) have clustered editors into all-round, watchdogs, starters, content justifiers, copy editors and cleaners.
In the previous figures, it is easy to identify all-round editors: editors who perform several kinds of actions, and who usually appear in the center of the networks. Watchdogs are also visible, through the focused work on reverting by people who didn’t edit the article anywhere else. These watchdogs, or the emergence of a policing activity, have also been described by Shirky (2008b) in a distributed cognition approach between vandal-fighter bots and editors.

The visualization of the changes made to the NPOV article points towards an increase in maturity in Wikipedia, where some structures are augmented in comparison to the study of the PD article: the existence of core editors (all-round editors), many one-time editors, and editing that happens by topic and by type of action. In the NPOV article’s edits, reverts are also made by people engaged in writing the article – people who probably see the vandalism and correct it straight away. So that is one of the answers to the question of what 500 edits are doing in a mature article: many of them are the back and forth of vandal wars.

A metaphor used by Clay Shirky (2008b) in “Here Comes Everybody” helps to shed light on what is happening. He compares Wikipedia to a shrine in Japan that is rebuilt every couple of decades with wood from the same forest; UNESCO refuses to list the shrine as a historical place, even though it has been there for over thirteen hundred years. This is Wikipedia’s strength: process, not product. If people decided to leave the site, article quality would decline, and vandalism would take over.

7.6 Conclusion

The happenings at the NPOV article show us a kind of dynamic stability. These visualizations point towards the same dynamic processes discussed previously – Wikipedia is not static, and neither are its articles, its structure or its principles.
7.7 Acknowledgments

I would like to thank Brendan Cooney for the acquisition of qualitative data, by categorizing the types of actions and making notes regarding them.

IV. PHILOSOPHICAL/COGNITIVE

1 Introduction to Cognition Studies
1.1 How does Wikipedia work in practice
1.2 Wikis and Cognition
2 What Cognition Does for Wikis
2.1 Cognition for Planning vs. Cognition for Improvising
2.1.1 Relations Between CfP and CfI
2.1.2 Relations
2.2 Cognition for Improvising Surplus
2.2.1 Cognitive Overload
2.2.2 Benkler’s Modularity and Granularity
2.2.3 Wikipedia as an encyclopedia and text
2.2.4 Anderson’s Long-Tail and Margin Advantage
2.2.5 ICT & CfP & CfI
2.2.6 Kindness-Trust Surplus
2.2.7 Wiki Case and Wikipedia
3 What Wikis Do For Cognition
3.1 Cognitive milestones
3.1.1 Comparing with Print
3.2 Information’s support
3.2.1 New organization of knowledge
3.2.2 Add, then filter
3.3 Information production’s consequences
4 Implications for Cognition
4.1 ESD Cognition Umbrella
4.2 Cognition vs. Cognizing
4.3 Cognition is Thinking and/or Problem-Solving and/or Information-Processing?
4.4 Extension
4.5 Ecological
4.6 Understanding Cognition
4.6.1 Dynamic, Complementary, Transversal
4.6.2 Bridging
4.7 Transience Consequences

Section-map

First, I will explain the two relations (weak and strong) between Wikis and Cognition (what cognition does for Wikis and what Wikis do for cognition). In the following sections, first the “weak” hypothesis is investigated and then the “strong” hypothesis is investigated.
This is followed by “What Cognition Does for Wikis”, where the weak hypothesis is defended and a conceptual distinction between Cognition for Planning (CfP) and Cognition for Improvising (CfI) is proposed. Then, I argue that Wikipedia’s success depends on the Cognition for Improvising surplus, a mode of great use in a project that grows incrementally. Subsequently, the strong hypothesis is brought up in “What Wikis Do For Cognition”, by studying the role of Wikis in the major change in information production. After that, the possible implications for the understanding of cognition are presented, using the distinction between Cognition for Planning and Cognition for Improvising and showing how this distinction can help to clarify the differences between Cognitivist, Embodied, Situated and Distributed approaches to cognition.

Publication status: the paper “What Cognition Does For Wikis” in the appendix Y was accepted and published in the Proceedings of WikiSym 2010. The paper is incorporated in parts 1 “Introduction to Cognition Studies”, “What Cognition Does for Wikis” and “Implications for Cognition” and it argues for the point in 2 “What Cognition Does for Wikis”.

1 INTRODUCTION TO COGNITION STUDIES

Two of the major themes of this thesis, Wikis and Cognition, are addressed in this section. The research conducted for the previous, data-driven section was crucial for the insights developed in this ‘cognitive studies’ section. Nonetheless, the hope is that the theoretical framework, including the main distinction between Cognition for Planning and Cognition for Improvising, can stand on its own feet.

1.1 How does Wikipedia work in practice

It has been repeated that “The problem with Wikipedia is that it only works in practice. In theory, it can never work.” This claim has been made into the zeroeth law of Wikipedia (Raul654 2005). This phrase can be understood in several ways, and therefore appeals to people from different areas.
The phrase appeals to those who would argue from a moral point of view and count the number of good-doers and bad-doers in the world, and who are surprised that the openness of Wikipedia attracts more people who contribute positively to the project than people who would destroy its viability. The phrase appeals to pragmatists, who like to point out that the success is visible, and that practice is what matters, not theories about whether utopias are or are not possible. As put by Clay Shirky: Wikipedia’s “utility is settled, interesting questions lie elsewhere" (Shirky 2008b). Even if accuracy is still being studied, and it is important to develop tools to help navigate the trustworthiness of the content, it is also a fact that Wikipedia is a Top 10 website, and is widely used, cited or not.

The phrase that Wikipedia ‘does not work in theory’ can also be understood as ‘Wikipedia is not supposed to work according to our theories’. Its success runs contrary to our theoretical expectations. We can see the results of its success but lack accompanying theories to understand why and how Wikipedia works (game theory, for example, accounts for people behaving only out of direct self-interest). The thread that will be pursued here is the development of a cognitive distinction to account for the phenomenon of the use of Wikis, and, specifically, Wikipedia.

Substantial research on Wikipedia has been done in the last few years, and presented in conferences such as WikiSym, but there is a clear lack of philosophical approaches (one issue of Episteme dealt with the epistemology of Wikipedia (Fallis 2009)), and, with very few exceptions [such as the description of bot use for vandal fighting using distributed cognition (Shirky 2008a)], cognitive theory has not been involved in Wikipedia research. A paper discussed the “Wisdom of the Crowds vs. the Rise of the Bourgeoisie” (Kittur et al. 2007).
In the present paper, the focus is not on who writes Wikipedia but on the construction of a cognitive distinction to think about how Wikipedia is written – how to account for the many tinkering edits and the fewer substantial additions of content.

1.2 Wikis and Cognition

Wikis and cognition can be connected in two ways: what cognition does for Wikis, and what Wikis do for cognition. Another way to understand the two directions of the implication between Wikis and cognition is to consider two hypotheses, called weak and strong in relation to how much Wikis are taken to alter our brain:

The Weak Hypothesis: Wikis work because, through them as a tool, particular aspects of human cognition are used. Cognition for Improvising was always “there”, and Wikis profit from tapping into the Cognition for Improvising.

The Strong Hypothesis: Not only do Wikis harness this surplus in Cognition for Improvising, but they also “shape” it. In this hypothesis, human cognition is changed/enhanced/extended by the use of Wikis.

In the following section, ‘what cognition does for Wikis’, the weak hypothesis is defended and elaborated, while in the section after, ‘what Wikis do for cognition’, the strong hypothesis is investigated, although I remain agnostic about its validity.

2 WHAT COGNITION DOES FOR WIKIS

To best understand cognition’s role in making Wikis thrive, a distinction between Cognition for Planning and Cognition for Improvising is suggested below.

2.1 Cognition for Planning vs. Cognition for Improvising

Cognition for Planning (CfP) is the kind of cognition that we use when we sit down to reflect on an issue and make a decision. Cognition for Improvising (CfI) is the kind of cognition that we use when reaching for a glass of water, where the body ‘knows’ how to make the movements, one after the other, to reach the glass of water.

Goal Level: At the extremes, higher-level goals can look very different from lower-level goals.
Many smaller cognitive processes can constitute a bigger goal, in a modular way. Writing an encyclopedic article is a higher goal than correcting a typo. Cognition for Planning is present when there are higher-level, very well defined goals, while Cognition for Improvising is present when lower-level, even very low-level goals are the ones at stake. For example, when making a calculation there is the clear goal of getting a result in the end. It involves making a computation in the mind, or using the help of pencil and paper, where several processes are applied (some of which we may not know how we do). These processes constitute the ‘problem-solving’ process. Other cognitive processes can have much lower-level goals, so low-level that they may not even be called ‘goals’, such as saying one word, or moving an arm.

Units: The minimum unit of analysis and of processing for Cognition for Planning is bigger than the minimum unit of analysis and of processing for Cognition for Improvising, in terms of time, decision and work. Cognition for Improvising is constituted by many small decisions, as in an improvisational dance, where each small decision brings the opportunity for the next. Cognition for Planning is constituted by greater decisions, like a rehearsed dance that encompasses decisions about the whole structure.

Action vs. Reaction: While Cognition for Planning is what we use in a coordinated effort to produce a specific result, acting upon the world (for example, saving food for the winter), Cognition for Improvising is what we use in replying to an immediate disturbance or interaction (for example, ducking if someone shoots), reacting to the world. Cognition for Planning allows us to construct futures, and remember pasts, while Cognition for Improvising allows us to deal with the here-and-now challenges.
2.1.1 Relations Between CfP and CfI

Having described the distinction between Cognition for Planning and Cognition for Improvising, it is important to stress that these ‘types’ of cognition can happen in parallel. There may be activities where we use one of these types of cognition, and other activities for which we use the other. Research activity, for example, comprises paper writing, which uses Cognition for Planning, but many of the sources of inspiration come from conversation, which usually uses Cognition for Improvising, as it is a quick exchange of small units of thought, quite reactive to what is going on. Cognition for Planning is a more complex category that includes Cognition for Improvising, thus these two types of cognition are often present simultaneously.

2.1.1.1 What kind of distinction is this?

Before we start using the distinction between CfP and CfI, it is important to lay out some thoughts about ‘what kind of distinction’ this is and ‘what a distinction is’, within the metaphors of a continuum, a joint of nature, and a phase-transition.

Continuum: Cognition for Planning and Cognition for Improvising are just names for different parts of the continuum, and therefore, there is an element of artificial categorization as to where to set the border between the two. The first reason to use this distinction is to be able to speak of different extremes of a spectrum. At one extreme of the spectrum of the dimension “size of goal” there are very small units of processing, and at the other extreme these units are much bigger. As size of goal is an important difference between CfP and CfI, these cognitions represent ‘ideal types’ at the ends of the spectrum. Distinguishing them is useful as a construct to understand the world.
Moreover, not only can the bigger units be comprised of smaller ones, but there are also many mid-sized units of goal, time or work that don’t get the honor (or the curse) of a label in this dual-sided distinction between Cognition for Planning and Cognition for Improvising.
Nature’s joints (see box): A small note must be made about the nature of categorizations, as categorizations are essential for the thinking process, but also attempt to ‘carve nature at its joints’. It is, though, a construction to think that nature does have all the joints we claim it has, because there are many ways to categorize and make joints to serve different purposes of understanding. Nonetheless, as Umberto Eco said, there are many possible cuts, but it is hard to imagine one that puts the trunk attached to the tail of the elephant. The distinction between Cognition for Planning and Cognition for Improvising aims to be at least useful for the understanding of the processes happening in Wikipedia, and possibly useful to the greater picture of cognition. The distinction between CfP and CfI also highlights a difference in degree and, as argued below, a difference in kind. Again, without wanting to argue for the existence of natural joints, it nonetheless aims to capture an important distinction.
Box: Cutting Up an Ox
Prince Wen Hui's cook speaks:
"A good cook needs a new chopper
Once a year--he cuts.
A poor cook needs a new one
Every month--he hacks!
"I have used this same cleaver
Nineteen years.
It has cut up
A thousand oxen.
Its edge is as keen
As if newly sharpened.
"There are spaces in the joints;
The blade is thin and keen:
When this thinness
Finds that space
There is all the room you need!
It goes like a breeze!
Hence I have this cleaver nineteen years
As if newly sharpened!
"True, there are sometimes
Tough joints. I feel them coming,
I slow down, I watch closely,
Hold back, barely move the blade,
And whump! the part falls away
Landing like a clod of earth.
"Then I withdraw the blade,
I stand still
And let the joy of the work
Sink in.
I clean the blade
And put it away."
[Excerpted from "Cutting Up An Ox", from "The Way of Chuang Tzu" (Merton 1969)]
Degree and Kind: One can make a distinction between a difference in degree – more of the same – and a difference in kind – something new. In the case of CfP and CfI, it could be said that CfI is just a smaller unit, and that much CfI together makes CfP. It remains to be argued whether this difference of degree can become a difference in kind. Clay Shirky (2008b), in “Here Comes Everybody”, postulates that the difference in degree of sharing information on the Internet is becoming a difference in kind, even though the social urge to share information isn’t new. Emergence studies have also put much effort into trying to identify what the factors are that transform a difference in degree into a difference in kind. Those studies ask what emergent properties are and what is new, to which a possible answer is that something can only be new in relation to a frame of reference (Bedau & Humphreys 2008). In the case of the internet, the sharing of information and the case of Wikipedia, there seems to be something new in relation to previous patterns of knowledge production.
Phase-transitions: A way to understand this transition from a difference in degree to a difference in kind is to think of phase transitions, well known in physics. The simplest example of different phases is how water can be ice, liquid water and vapor. A phase transition is the moment between the different phases. While it is the same substance, a change in temperature and/or pressure, which really is a change in the velocity of the particles, leads to different phases that have different properties.
When the water molecules are slowed down, particular bonds are possible between the molecules, and the water gains a particular set of properties that we call ice, because it is sufficiently different from liquid water, for example in its greater hardness and lower density.
In a market, just as in physics, it is not difficult to see, in simple examples, how quantity can change quality (“how more is different”, or how a difference in degree can lead to a difference in kind). If we are to pay several people to do some work, which has some costs of transaction (processing the check, or the account), it pays off as long as there aren't too many people. Say there are 10 people working 10 hours each, and the cost of transaction for each person is 1/10 of an hour. It pays off because there are 10 × 10 = 100 hours of work and 1/10 × 10 = 1 hour of transactions. You get 100 − 1 = 99 hours of work done. But if there are 100 people doing 1 hour of work each, it pays off less: even though there are the same 100 × 1 = 100 hours of work, there are now 1/10 × 100 = 10 hours of transactions, more than previously. There are, therefore, 100 − 10 = 90 hours of work done. If there are 1000 people doing 1/10 of an hour of work each, there are the same 1000 × 1/10 = 100 hours of work, but now 1/10 × 1000 = 100 hours of transactions. And the result is 100 − 100 = 0 hours of work. This example shows how there is a clear difference between 10 and 1000 people working for the same goal, which is not just a difference in degree but also a difference in kind, when one considers the transaction costs. For 10 people working, there is a model that can work with paying the people, which doesn't work with 1000 people doing the same 100 hours of work. The two extremes of the dimension “number of people vs. transaction costs” are different phases and, even though the work done is the same, there will be very different properties of the economies in one extreme and in the other.
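The arithmetic of this transaction-cost example can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `net_hours` and its parameters are hypothetical and introduced for illustration, and the fixed cost of 1/10 of an hour per person is the assumption stated in the example, not an empirical figure.

```python
from fractions import Fraction

def net_hours(people, hours_each, transaction_cost=Fraction(1, 10)):
    """Return (work, transactions, net) in hours for one scenario.

    Assumes, as in the text, a fixed transaction cost per person,
    independent of how much that person works.
    """
    work = people * hours_each                 # total hours contributed
    transactions = people * transaction_cost   # total hours lost to paying people
    return work, transactions, work - transactions

# The three scenarios from the text: always 100 hours of work in total.
scenarios = [(10, Fraction(10)), (100, Fraction(1)), (1000, Fraction(1, 10))]
for people, hours_each in scenarios:
    work, cost, net = net_hours(people, hours_each)
    print(f"{people} people: {work} h work, {cost} h transactions, {net} h net")
```

Exact fractions are used instead of floating-point numbers so that the 1/10-hour contributions add up to exactly 100 hours; the net result falls from 99 to 90 to 0 hours as the same work is spread over more people.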
In one extreme, a proprietary market works, while in the other extreme, a non-proprietary economy can arise. As with these two market phases, the same may be happening between Cognition for Planning and Cognition for Improvising, where the difference in the size of unit can be producing different ‘phases’ of cognition.
2.1.2 Relations
Although acknowledging that categorizations can be abstract, heuristic tools for thought, the distinction between Cognition for Planning and Cognition for Improvising can be useful in understanding the shift between modes of production of knowledge. Below, these two modes of cognition are compared with other concepts that have been inspirational for the development of this distinction, namely the distinction between abstract and concrete, the distinction between planners and bricoleurs from Turkle & Papert (1990), the difference between know-what and know-how studied by Varela (1999), and finally the modes of reasoning: deduction, induction, and abduction.
2.1.2.1 Abstract, Concrete and CfP and CfI
While Cognition for Improvising deals with the immediate, Cognition for Planning deals with larger time-spans. Immediacy is intimately related to the concrete: an immediate situation is a concrete situation. Concrete situations, such as a specific reply on a discussion list that triggers another reply, are well catered for by the Cognition for Improvising mode. On the other hand, abstraction is mostly achieved by a Cognition for Planning mode.
2.1.2.2 Bricoleurs, Planners and CfP and CfI
Turkle & Papert (1990) develop a brilliant distinction between planners, who use mostly abstract thinking, and bricoleurs, who use mostly concrete thinking, in a paper about gender approaches to technology.
In the following quote they evaluate how bricoleurs’ ways of thinking can be stronger than it would seem:
“It provides examples of the validity and power of concrete thinking in situations that are traditionally assumed to demand the abstract. It supports a perspective that encourages looking for psychological and intellectual development within, rather than beyond, the concrete and suggests the need for closer investigation of the diversity of ways in which the mind can use objects rather than the rules of logic to think with.”
Cognition for Improvising is certainly more present in bricoleurs’ activities, which also deal more with the concrete, while Cognition for Planning is used in the abstract thinking of planners. Nonetheless, to distinguish which cognitions are at play in the writing of Wikipedia, a temporal distinction of cognition is more appropriate than a personal style of dealing with the world. It is not possible to divide people into users of one or the other type of cognition, because often both cognitions are used. In this sense, the distinction is more useful for understanding contributions than contributors.
2.1.2.3 Know-how and CfP and CfI
Varela (1999), in ‘Ethical Know-How’, explains that we have an ethical stance that may be put into words and analyzed in abstract terms, such as ‘would it be ethical to go home or to help my brother’, but that we also have an ‘ethical know-how’ by which, were we to walk down the street and see someone in need, we (some of us) would know immediately that it would be more ethical to help this person than to rush to work. This ethical know-how points to a kind of moral cognition that best deals with the work in immediate, concrete situations. Know-how is the word used for a kind of implicit knowledge that helps us navigate in the world. When speaking of apprenticeships, the apprentice is really learning the know-how of a profession by being immersed in the context (Lave & Wenger 1991).
A Martian would not only fail to understand what we ‘are doing’ most of the time, but would also send a report back to Mars with all kinds of descriptions of things we aren’t even aware of, because we either learned them in implicit form and have never been aware of them, or we did them so often that our know-what got transformed into a know-how, and we have now forgotten that we know. There are many examples from our daily life of things we needed to expend energy to learn explicitly, such as riding a bicycle or learning a language, but which become things ‘we know how to do’ – and on which we don’t need to use extra ‘attention brain power’ – allowing us to talk on the phone and ride a bicycle at the same time, or to argue and conjugate verbs correctly simultaneously. When Cognition for Planning can be broken down into pieces, such that some pieces are Cognition for Improvising, we are somehow following the jump between ‘knowing-what’ and ‘knowing-how’. One of the factors for this transition is the size of the unit, as explained earlier, while the other is that Cognition for Improvising uses feedback loops and possibly the environment to a higher degree, and can therefore transform the activities into something easier to tackle. That is what Varela’s ethical ‘know-how’ is proposing – that it is easier (and we know better) to decide an ethical question in situ than abstractly beforehand (also because it is hard to describe a situation with enough context to know what position to take: how is the weather, did you just have an argument at home, are there others to help, do you know the person, are you at war…).
2.1.2.4 Modes of reasoning and CfP and CfI
The three modes of reasoning identified by Peirce (deduction, induction and abduction) are an important categorization of the methods to acquire knowledge about the world, and it is highly relevant to understand where the conceptual distinction between Cognition for Planning and Cognition for Improvising stands in relation to them. Deduction concludes something necessary from a number of premises, and it could be called the rationalistic way. Cognition for Planning involves deductive methods of reasoning. In deduction, a number of premises are set up and then, using a set of logical rules, the conclusions are drawn in a fashion that takes planning (it could even be said that there is no new knowledge being produced when pure deduction is taking place, because it is knowledge that is already implicit in the premises and the use of the logical rules). Induction draws new knowledge from the world, deciding for example that swans are white after several instances of white swans (this knowledge can be wrong, but can be adapted at the first instance of a black swan). Cognition for Planning uses both deductive and inductive modes of reasoning, while Cognition for Improvising uses more abductive methods of reasoning. Abduction is a nonmechanical mode of reasoning, which seems to be at play in instances where Cognition for Improvising is also at play: namely when an immediate, concrete situation makes one ‘act’, then ‘think’. In other words, abduction is an important way to acquire knowledge, by developing an explanatory hypothesis. Peirce relates abduction to a social character of enquiry, and suggests that drawing on more individuals might make abduction easier and creativity simpler.
2.2 Cognition for Improvising Surplus
I propose that the Cognition for Improvising mode is responsible for the possibility of projects such as Wikipedia and other Wikis.
As seen above, Cognition for Improvising is related to a concrete, know-how, and abductive way of knowing. Below it is argued that the success of Wikipedia and other Wikis is partially a result of harnessing a surplus of Cognition for Improvising. Cognition for Improvising is used in very immediate, concrete surroundings, quite often embodied, or in interaction (in conversation). Encyclopedias were still being written with great amounts of Cognition for Planning, as someone would plan the distribution of work, and once given an assignment, a scholar would plan the writing of an encyclopedic article. This work wasn’t absolutely individual: the article would be sent to the editors, and comments and corrections would be added. In the end, the editors would also check for style. Nonetheless, most of the cognitive work was being done with Cognition for Planning. Cognition for Improvising is a mode that can be used for incremental writing and therefore contributes to the success of writing an encyclopedia with Wikipedia’s process. But this possibility is independent of the ethical and motivational reasons that stimulate people to contribute to Wikipedia. These motivations are the interest in belonging to a greater project, the security of the copyleft license and the interest in doing good, which are all crucial for Wikipedia’s success. Also important are the many architectural features of Wikipedia and Wikis, which allow for discussion and negotiation, and the possibility of shaping the metalevel of Wikipedia. Wikipedia is possible because Cognition for Improvising can be used. Cognition for Improvising can be used because there is a surplus, because we don’t use all of our cognitive capacity.
Clay Shirky speaks of the “cognitive surplus” (Shirky 2008a), in anecdotal form, when in a lecture he tells the story of explaining to a TV producer the intricacies of making a Wikipedia article, to which he gets the question “But where do people find the time?” His witty answer is, "No one who works in TV gets to ask that question. You know where the time comes from. It comes from the cognitive surplus you've been masking for 50 years." Yochai Benkler, who has analyzed what he calls “commons-based peer production” from an economic perspective in the book The Wealth of Networks (Benkler 2006), speaks of the difference between market and nonmarket production and describes some of the characteristics that peer production needs in order to harness the excess capacity of time and interest in human beings. The processing, storage, and communications capacity in computers are available to be used for activities whose rewards are not monetary or monetizable, directly or indirectly. Benkler describes extremely succinctly the processes by which the harnessing of this excess capacity can be effective:
"For this excess capacity to be harnessed and become effective, the information production process must effectively integrate widely dispersed contributions, from many individual human beings and machines. These contributions are diverse in their quality, quantity, and focus, in their timing and geographic location. The great success of the Internet generally, and peer-production processes in particular, has been the adoption of technical and organizational architectures that have allowed them to pool such diverse efforts effectively. The core characteristics underlying the success of these enterprises are their modularity and their capacity to integrate many fine-grained contributions." [in The Wealth of Networks, (Benkler 2006)]
2.2.1 Cognitive Overload
The notion of a surplus entails that there is cognitive power that is ‘not being used’.
Cognitive power is a capacity that can be used in different ways. One mechanism behind the possibility of a greater use of Cognition for Improvising is the reduction of cognitive overload. The notion of cognitive overload goes back at least to Simon (1978), in writing “On how to decide what to do”. For example, some tasks take immense cognitive power, such as writing a thesis, where I need to decide what to write about, decide to sit down at this precise moment, and also decide what to write (and much more…). If the tasks can be broken down into parts that are already defined, then, instead of using so much Cognition for Planning, one can use more Cognition for Improvising. A story from Zen and the Art of Motorcycle Maintenance (Pirsig 2008) may elucidate this: the son is stuck, wanting to write a letter to the mother and not knowing where to start. The father suggests that he keep it simple – he should first write a list of the things he wants to say and then make the decision of which one to say first. To decide both what to write and what to write first can be too big a cognitive task, and therefore there is a cognitive overload.
Box: One of the tricks of thesis-writing is to balance well between the greater goals and the small goals. Forgetting the greater ones as often as possible usually works, except when inspiration is lacking and it is important to re-understand how it all fits together. Please see the MetaPhd installation and performance, September 2010, for more thoughts on applying these theories to the act of writing a thesis.
2.2.2 Benkler’s Modularity and Granularity
Breaking a cognitive task into smaller parts relates to the modularity and granularity concepts proposed by Benkler (2006). “Modularity” describes the extent to which a project can be broken down into smaller components. These components can be produced independently and can later be assembled into a whole.
“Granularity” describes the size of the components, in terms of the time and effort that an individual must invest in producing them. When a project has modules of small size, it more easily harnesses Cognition for Improvising.
2.2.3 Wikipedia as an encyclopedia and text
The nature of Wikipedia as an encyclopedia, and its support as text (in contrast with many FLOSS (Free Libre and Open Source Software) projects, which are programs written in code), increases the ways in which Cognition for Improvising can be used, as Wikipedia is quite modular and very fine-grained: it can be built from many small contributions. An encyclopedia is really a collection of articles; an article is a small module of cognitive ‘coherence’ (smaller than a book, for example). Moreover, text has a very small granularity, allowing contributions as small as the fixing of a comma or the addition of a reference.
2.2.4 Anderson’s Long-Tail and Margin Advantage
Chris Anderson (2006), in the book “The Long Tail”, explains that a long tail is a simple property of networks with a power-law distribution. In Amazon and book selling, it means that few books get bought by many people, while many books have very few buyers, because many people have uniquely odd preferences. One of the advantages of Amazon (conferred by its not having a physical store limited by stock and space) is that it caters more easily to all those people whose choices are less mainstream and are part of the long tail. Catering to this long tail is also a margin advantage, making a difference ‘that makes a difference’. The analogy with Wikipedia and the Cognition for Improvising Surplus is that there are people out there with 5-minute slots on their hands, and these constitute a long tail of contributions, which make Wikipedia grow and grow. In this way, Wikipedia also plays on this margin advantage of pulling in work in small increments.
Its sheer size makes us ponder and end up writing theses like this one to understand these phenomena better.
2.2.5 ICT & CfP & CfI
The distinction proposed can also be used to understand greater patterns of information and communication technologies. Lawrence Lessig, the scholar who started Creative Commons and who is the greatest advocate for a review of copyright to increase freedom, describes in the book “Remix” (Lessig 2008) two cultures that are present on the Internet. RO, which stands for Read-Only, applies to sites where one can only consume the information, such as newspapers; and RW, which stands for Read-Write, applies to sites where one can directly intervene, by commenting, changing and engaging, such as blogs and Wikis. These two cultures, in respect to the same issue (such as publishing), are examples of the two economies that are present: the commercial economy and the sharing economy. These have run in parallel for a long time. Lessig advocates that a change of the law is necessary in order not to criminalize the sharing economy, and shows that there are many possible hybrid models in which both economies are present, such as Free and Open Source Software. The appeal of the RW culture is derived from the possibility of using the Cognition for Improvising surplus, which allows for a whole segment of remixes to exist and thrive.
2.2.6 Kindness-Trust Surplus
Cognition for Improvising is the ‘how’ by which Wikipedia thrives. The surplus results from the fact that Cognition for Improvising is a mode for immediate, concrete situations. These immediate, concrete situations didn’t use to include the writing of encyclopedias. Similarly, as I speculate here, in what concerns the motivations behind contributing to Wikipedia, there is a surplus, where people have time, effort and kindness available to do things outside the markets and the quest for survival.
Although lives are complex, in the normal lives we lead we usually act out of kindness and trust towards those close to us, and we do fewer acts of kindness and trust towards those farther away. We may, though, have a greater potential for these acts of kindness and of building trust than is necessary for building close relationships, and therefore there is a surplus that can be exploited. It is possible to harness this potential because it responds to the human motivation of following ‘higher’ values, being part of something ‘greater than themselves’, contributing to the common good, altruism, and engaging in community. This tapping of the ‘kindness surplus’ is possible because there is an environment that feels trustworthy, safe and useful, and therefore the kindness-trust can be expressed. We were used to relying upon trust and kindness in a small immediate environment; now, with the right values, technologies and affordances, we can harness those capacities to produce something no longer at the small immediate scale, but at a greater scale. Some Internet projects have been more equalitarian, providing a space for trust at a distance, despite their rich-white-western biases. These new peer-production models somehow ‘short circuited’ these distances, and trust and kindness became visible. The other main reason for Wikipedia’s success, namely the architectural features that support discussion, negotiation and governance, has been discussed in the theoretical perspectives chapter.
2.2.7 Wiki Case and Wikipedia
2.2.7.1 Particular Wiki characteristics
Wiki characteristics such as watch this page, recent changes (especially when Wikis are smaller), and discussion pages all support an immediate, reactive, and concrete mode of interaction and contribution, which uses Cognition for Improvising. Just replying to a point in a discussion or fixing a typo in someone’s just-added paragraph are behaviors that contribute to the whole.
Watch this page is an attention-grabber, whereby it is easier to reply to a change that was made, by correcting, improving, or reverting it, depending on whether it was a small mistake, a good addition or an act of vandalism. Recent changes was the most important feature of Wikis. Ward Cunningham, their inventor, supported that by saying, “we knew where the action was taking place”⁴⁸. It also directed attention to where something was happening. A loose comparison would be to say that there is less need for Cognition for Planning if one were to walk by the main square of one’s village and suddenly see a group of people gathered. It would only be natural to join them and improvise a conversation with a friend or an acquaintance, using Cognition for Improvising. Both watch this page and recent changes (and similar functions) also play on the stigmergic effect (Heylighen 2006), whereby a change (an edit) left in the environment (an article) is a communication device about the possible next change to make. As for discussion pages, the implication of Cognition for Improvising is even more direct – engaging in a discussion is interacting back-and-forth, using more Cognition for Improvising than Cognition for Planning.
2.2.7.2 Division of work
These two cognitions also reflect some of the spontaneous division of work that has been seen in Wikipedia. While the addition of a substantive piece of text is something that happens mostly using Cognition for Planning, the small tinkerings are done with Cognition for Improvising. In terms of number of edits, there is a clear split, where few edits add much previously thought-out, carefully structured content, while many edits add a small change that is a quick, reactive contribution. The division between these two groups of edits follows the division between Cognition for Planning and Cognition for Improvising.
The work done by bots (small programs that edit systematically) falls outside this distinction, as their ‘behavior’ is mostly syntactic (example: “find ‘tpyo’; replace by ‘typo’”) and bots do not use the more intricate, semantically rich notions of planning and improvising.
2.2.7.3 Self-reported motivations & CfP & CfI
In relation to Wikipedia, it is obvious that both types of cognition are at play. This is true for the writing of Wikipedia articles, where some edits are plainly ‘adding information’, making more use of Cognition for Planning, while others are ‘clarify info’ and ‘fix typo’, making more use of Cognition for Improvising. The massive use of Cognition for Improvising accounts for the many actions in Wikipedia that are ‘bottom-up’, such as the division of work. There is, though, also a hierarchy and a structure of policy, with norms that are top-down (even if they mostly arose bottom-up). But these modes of editing are also present in the self-reported motivations of contributors, where the separation of big/small goals, planning/improvising and bottom-up/top-down shows contributors’ self-awareness of why they contribute. In the Wikipedia-wide survey⁴⁹, the top two self-reported motivations were:
72% I like the idea of sharing knowledge and want to contribute to it
69% I saw an error I wanted to fix
These self-reported motivations show the inclination to the greater utopian hope, which includes the use of the Kindness-Trust Surplus. This utopian hope is represented by the motto “The Free Encyclopedia That Anyone Can Edit” and by the decisions to be non-profit, to license early under the GFDL and later under CC-BY-SA. But these self-reported motivations also show that Wikipedia taps into the resources of the Cognition for Improvising Surplus through contributors’ motivation of simply fixing an error and contributing incrementally.
⁴⁸ Cunningham, Open Space, WikiSym’09, personal communication.
⁴⁹ Philipp Schmidt, talk at WikiMania’09.
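The purely syntactic bot behavior mentioned in 2.2.7.2 (“find ‘tpyo’; replace by ‘typo’”) can be illustrated with a short Python sketch. It is not the code of any actual Wikipedia bot; the function name `fix_typo` and its parameters are hypothetical, chosen only to show that such an edit needs neither planning nor improvising, just pattern matching.

```python
import re

def fix_typo(text, wrong="tpyo", right="typo"):
    """Replace every whole-word occurrence of `wrong` with `right`.

    The rule is purely syntactic: it matches a character pattern and
    knows nothing about the meaning of the surrounding text.
    """
    pattern = re.compile(rf"\b{re.escape(wrong)}\b")
    return pattern.sub(right, text)

print(fix_typo("There is a tpyo in this sentence."))
```

The word-boundary anchors (`\b`) are a design choice of this sketch, so that longer words which merely contain the misspelled string are left untouched.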
3 WHAT WIKIS DO FOR COGNITION
In the previous section I proposed a distinction between Cognition for Planning and Cognition for Improvising, which was a way to discuss how distinct types of cognitive processes make ‘Wikis’, and specifically Wikipedia, work. In this section I will discuss the other direction of possible implication – if before we looked at the role of cognition in Wikis’ success, we will now look at the role of Wikis (and to some extent their general context of the internet and digital age) for human cognition, individual and collective. This section is about a much stronger claim than the previous one, and one that is, as far as I can see, very far from being settled, or even properly defined. I will sketch some relationships that may eventually lead to a better understanding of the role of technological artifacts such as Wikis in the way we think. Nonetheless, I remain agnostic about this issue. Moreover, as it is very difficult to define the precise statement⁵⁰ “Human cognition is changed/enhanced/extended by the use of Wikis”, the most that can be accomplished here is to draw some thoughts on very general claims, which include the nature and production of information, the Internet’s role and the positioning of cognitive tools, while ‘Wikis’, in their specificity, end up belonging to these more general categories.
3.1 Cognitive milestones
One can speak of different media milestones in the evolution of media. Some of these have been accompanied by milestones in cognition. There is a theory according to which we are witnessing what could be the very beginning of the 4th cognitive milestone (Harnad 1991). If this cognitive milestone is present, Wikis are a part of it. Wikis’ particular role in this (beginning of a) milestone would be that they may be unfolding the potential for Cognition for Improvising to be expressed. It could be that, besides the human capacity to “improvise” being harnessed, this capacity is itself changing and being augmented.
A possible way to test this issue would be to study ‘Wiki-users’ against ‘non-Wiki-users’ and investigate their differences with respect to their capacity to cooperate and to solve immediate issues. Nonetheless, such a study would still be biased, as ‘Wiki-users’ may have started to use Wikis precisely because of their greater capacity and/or willingness to collaborate, solve immediate issues and the like.
⁵⁰ Defining cognition is as difficult as defining intelligence, which keeps being redefined as ‘what robots don’t do’. What is it to use a technology? Does it change the individual’s cognition, or the cognitive capacities of the species? What is ‘change’? Influence? How? How not? How to measure? Or do Wikis and the internet change our conception of cognition, either by defining it as something independent from the environment, or as something present everywhere?
Several changes in the development of our species and its cultural history have contributed to cognitive changes. The first, language, which may be the leap that humans took from animals, has even changed the way our brains work. Others, such as writing, may have changed our brains but have certainly changed our cultural heritage and our ways of thinking. The cognitive milestone that is most interesting to use for a comparison here is that of print, the invention of movable type. Two of the greatest innovations of the digital era are the size and cost of digital information, which alter the way people can access knowledge, as copying becomes so easy, and change the organization of knowledge. Another is the way knowledge is getting produced, under quite different parameters than before.
3.1.1 Comparing with Print
Print made it possible for written text to be accessible to many more people. For purists, it doesn’t count as a cognitive milestone, while for others it is an example of a difference in degree that became a difference in kind.
What is fascinating about print is that it had a special role in the reformation of Europe. While on the one hand it is far-fetched to say that the European religious and political landscapes changed because of the invention of movable type, on the other hand those reformations wouldn’t have been possible without it (Shirky 2008b). In a sense, print did not cause the reformation in Europe, and yet the reformation wasn’t possible without it. It was a necessary but not a sufficient condition [51].

[51] Of course the meaning of necessary should be taken lightly, as in a contingency. In a counterfactual world, the reformation of Europe would certainly have been possible with some invention other than movable type.

3.2 Information’s support

Information occupies space, but nowadays it is encoded at a scale much smaller than that of paper. While a book in paper format needs about 15 cm x 20 cm x 2 cm of storage space, the text of a book takes about a 77th of a square millimeter in today’s digital format (1 MB). This is another example of a difference in size, or in degree, that leads to a difference in kind. Because information stored with today’s technology occupies so little space, the cost of copying has fallen to about zero. This changes the landscape of sharing: giving you a copy of my e-book costs me close to nothing and does not prevent me from reading it.

3.2.1 New organization of knowledge

David Weinberger (2007) devoted a book, “Everything is Miscellaneous”, to the new digital order, which rests on the fact that information today stands in a quite different relation to materiality than at a time when each item had one single place in a taxonomy or library classification system. This provides new possibilities of organization that may also better fit the way minds organize: not by a stiff hierarchical organization, but by ‘pulling out’ what is needed at the moment. It has also, as Weinberger argues, made it much easier to have second order metadata (what book is it, title, author, etc.). While before this metadata was coded on a library card, which needed to be physically stored somewhere, limiting the organization of knowledge (alphabetical, historical), now this information is accessible in much easier ways. And, even more important, the nature of information in its digital support allows for a third order of organization, an order we see in search engines, Amazon ‘suggestions’ and tagging. This third order, Weinberger argues, fits knowledge better, as other attempts at organizing knowledge (libraries for books, kingdoms for species) are always biased by the time and the organizer. The third order allows people to fulfill their own need in the moment, be it gathering photos of Copenhagen’s Pride Parade or collecting papers on Wikipedia.

3.2.2 Add, then filter

Similar to Weinberger and his emphasis on the new miscellaneous order made possible by the way information is stored, Shirky (2008b) in “Here Comes Everybody” addresses a consequence of the way information is stored: there is no cost for failure in adding content, because one can always filter later. This has changed how people go about the production and retrieval of information: anything can be put out there, because the cost is almost zero. The third order of organization of knowledge, in which it is possible to filter or search for what one is looking for, has consequences for the way we interact with information, as a metaphor by Weinberger (2007) suggests:

"The difference in the digital order is the difference between the annoying interactions you have on a product support line – "Press 1 if you're calling about a medical emergency. Press 2 if you're calling about billing" – and the conversations you have with real people."
3.3 Information production’s consequences

Although the cognitive case is a hard one to settle, information production at least has a great impact on our lives and freedoms. Benkler (2006) argues for its impact on the shape of freedom and for its political consequences:

"How we make information, how we get it, how we speak to others, and how others speak to us are core components of the shape of freedom in any society. (…) Market-based, proprietary production has often seemed simply too productive to tinker with. The emergence of the networked information economy promises to expand the horizons of the feasible in political imagination."

He also argues, more specifically, about the ways in which the networked information economy can increase our autonomy:

"The networked information economy improves the practical capacities of individuals along three dimensions: (1) it improves their capacity to do more and by themselves; (2) it enhances their capacity to do more in loose commonality with others, without being constrained to organize their relationship through a price system or in traditional hierarchical models of social and economic organization; and (3) it improves the capacity of individuals to do more in formal organizations that operate outside the market sphere."

4 IMPLICATIONS FOR COGNITION

4.1 ESD Cognition Umbrella

The distinction put forth between Cognition for Planning and Cognition for Improvising builds upon the dynamic and ecological views of cognition, which encompass Embodied, Situated and Distributed Cognition (ESDC), a major trend in cognitive science (Robbins & Aydede 2008; Payette & Hardy-Vallee 2008; Dror & Harnad 2008b). In the last 15-20 years, these cognitive theories have set the focus on the embodiedness and embeddedness of cognitive processes.
In other words, these theories are not satisfied with the computational and brain-limited cognitivist theories and hold, to a greater or lesser degree, that the environment, artifacts, and the body are important parts of cognitive processes. The Extended Mind hypothesis, put forth by Clark & Chalmers (1998), also plays a major role in these discussions, because it questions the philosophical place of the mind when confronted with the claim that the mind might do more than just sit in the brain and compute purely abstract issues; it supports a stronger ontological claim than the weaker versions of ESDC. In this context, even a theory of cognition as ‘coordinated non-cognition’ (Barsalou et al. 2007) has been put forth. There has been a long discussion of what cognition really means: is thinking purely mental symbol processing, is it problem-solving, or is it information-processing involving body and environment?

Although it is beyond the scope of this section to resolve the issue of what cognition really is, taking into account the notion of cognitive artifacts is useful when speaking about Wikis. Cognitive artifacts can range from physical objects, to behaviors, to processes that are used to aid, enhance or improve cognition. Some examples are a calendar, a shopping list or a computer. Wikis play a role in the cognitive processes of collaboration, and in the case of Wikipedia, a Wiki is the mediating technology in the writing of the biggest encyclopedia. The distinction made in this chapter, between Cognition for Planning and Cognition for Improvising, is inspired by ESDC approaches but is also transversal and complementary to those theories, as the ESDC theories are mostly concerned with the spatial position of cognition, while this distinction is mostly concerned with the temporal position of cognition.
There are different ways to use the terms, including the idea that Situated Cognition is the umbrella concept, under which fit theories of embodiment, embeddedness, distribution and extension (Robbins & Aydede 2008). More relevant, though, is their analysis of the three central ideas: that cognition depends also on the body; that cognition exploits structure in the natural and social environment; and that the boundaries of cognition extend beyond the boundaries of individual organisms. Michael Anderson (2008), in “Embodiment and the Nature of the Mind” (Payette & Hardy-Vallee 2008), defends that

"We ought by now to be in a position where the embodied, situated and distributed approach(es) to the study of mind are seen not primarily as criticisms of the prevailing paradigm, but as established, vibrant and fruitful research programs in their own right, needing no justification other than their own success."

ESDC approaches have come into their own, but still, by their mere existence, these programs re-question what the mind is and what cognition is, and this is a good opportunity to discuss their relation to meaning.

4.2 Cognition vs. Cognizing

Dror & Harnad (2008b) define more finely what is distributed cognition and what is not. They argue that mental states are not distributed beyond the individual brain and that cognitive states are mental states, and are therefore also not distributed beyond the individual brain. One way to ‘see’ this is to ask ‘who is the cognizer?’ or ‘who is cognizing?’, questions to which our intuition is not to answer ‘the whole system’, or “‘the whole world’ was thinking just so this one thought is here” [52]. Even though cognitive states stay within the individual brain, cognitive technologies can enhance the cognitive performance of their users and even transform our cerebral lives, they argue. But what is cerebral life? The claim that stimuli change brain states? Hardly surprising.
4.3 Cognition is Thinking and/or Problem-Solving and/or Information-Processing?

Views of what cognition is or is not depend highly on the quick definition that is referred to. If cognition is thinking (following the etymology), and if we know what thinking is, then cognition is closely related to mental states, to meaning and to consciousness. Cognition is then reasoning, the use of cognitive faculties (like logic, argumentation, structured knowledge) to reason about problems, etc. If it is thinking, we can use the strategy of Dror and Harnad, in which one asks with the verb: can I say that I cognize? If yes, then I have cognition. We use this strategy to better distinguish what is cognition from what is not: a thinking, conscious-of-thinking creature cognizes, while the paper used to write down a phone number doesn’t. If cognition is ‘just’ problem-solving, then the paper with the phone number can be a part of the problem-solving process: the solving of the problem “what is my friend’s phone number?” consists of a look at the paper and the retrieval of the phone number. If cognition is even less human-centric and is just ‘information processing’, then the paper is certainly the keeper of the information of the phone number and must therefore be part of the process of ‘information processing’.

[52] Appealing to intuition is appealing to a subjective common sense, which most researchers have, of separation from the world, including the other-minds problem (this subjectivity wouldn’t be appealed to in, say, Buddhist circles).

4.4 Extension

ESDC approaches have done a wonderful job in increasing the importance of the study of body, situation and cognitive technologies, but there are still debates going on about their nature, specifically about the necessity or not of the Hypothesis of Extended Cognition (HEC), a vision of cognitive processing itself as sometimes quite literally extended into the organism’s environment.
For example, going down a supermarket aisle, I use the variety of visible commodities in my immediate environment to extend my planning of what to buy for dinner. Now, thinking, here, involves processes like (a) recognizing this box as a box of cream; (b) associating cream with sauce; (c) associating sauce with my squash (remembering earlier instances of... etc. etc.). Having the thought “cream sauce” is an “inner” mental state, yet, as I could not, in that situation, have evoked that state without that cream box (it was maybe a necessary condition for my thought of squash sauce), this thing in my environment, external to me, is at the same time a sign that is part of the cognitive system that I am a part of: my embodied cognition + my supermarket. Alternatively, if it is enough to stay with the Hypothesis of Embedded Cognition (HEMC), in which “cognitive processes depend very heavily, in hitherto unexpected ways, on organismically external props and devices and on the structure of the external environment in which cognition takes place” (Rupert 2004), then we need not claim that the cognizing system extends to the whole supermarket, but only that it is strongly interdependent with inputs from and exchanges with that external system.

4.5 Ecological

The ESDC approaches have an ecological stance because the relationship between the object of study and the environment is important. Hutchins (1996) in “Cognition in the Wild” says this explicitly: “I hope to evoke with this metaphor [cognition in the wild] a sense of an ecology of thinking in which human cognition interacts with an environment rich in organizing resources.” Bateson (1972) had already been a precursor of this stance in his “Steps to an Ecology of Mind”, where ecology is taken as the comprehensive science of the relationship of the organism (the cognitive process, the writing of a Wiki-article) to the environment (the context, Wikipedia and the distributed collective writing of the article).
There are many possible stances regarding the position, the place and the meaning of cognition. As for the possible metaphysical implications of the place of mind, although the distinction between HEMC (Hypothesis of Embedded Cognition) and HEC (Hypothesis of Extended Cognition) is put forward here, it is not directly related to the first claim that will follow: that Wikis support the showing of the Cognition for Improvising. As for the second claim, that Wikis do play a role in changing cognition, it is possible to circumvent the discussion between HEMC and HEC, in the same way that no ‘extended mind’ is needed. The only notion needed is that cognitive technologies play a role in the cognitive process (in the moment, by participating, where coupling is enough and there is no need for constituency; and in the long run, by changing the way we think and what we think). In other words, the work done here is dependent on the dynamic and ecological views of the process of thinking, but not on the new metaphysics of mind.

Overall, cognition can be a broad-shouldered word (or a concept that always floats – see box), and we can also proceed by drawing attention to a core of what we mean by cognition, as long as we don’t engage in drawing (sometimes thought of as revolutionary) metaphysics from those claims. At the very least, cognitive technologies are there, and have a major impact in our world.

Box: The Floating Hypothesis

Please see the TANK installation by Di Ponti, September 2010. In it, she explains that words such as love, intelligence and cognition are names we all praise highly, and we want to keep them afloat, no matter what gets to be found or discussed. For some people, one should call ‘love’ the most unconditional feeling; to other people, ‘love’ should be the act of conditionality (“If I love you, I set you free” vs. “I am jealous as a proof of love to you”), showing that people want to preserve the word ‘love’ so that, no matter what, it still means ‘the highest feeling I know of’. Intelligence, which used to mean a capacity to process rationally, was found not to be enough to tackle the world, and was therefore subdivided into emotional intelligence, social intelligence and some others of the kind. As everyone wants to be intelligent, intelligence adapts its meaning to be encompassing. But not only that: intelligence wants to float and be kept human, which also speaks to the typical AI curse: we keep redefining intelligence so that what was ‘just achieved’ doesn’t count as intelligence. As for cognition, the word wants to be kept by those advocating that it is also in the world, in thermostats and in bodies, keeping the word afloat as well. The installation consists of a tank (in Danish meaning both ‘thought’ and ‘tank’) where several concepts are pieces of wood and other materials, which can be linked (also) by the participants. No matter what they are coupled with, the pieces representing ‘love’, ‘intelligence’ and ‘cognition’ stay afloat.

4.6 Understanding Cognition

The distinction between Cognition for Planning and Cognition for Improvising has helped us make sense of the different kinds of contributions to Wikis and Wikipedia. Here I will address how this distinction can be useful for the greater panorama of cognition studies, especially those that focus on ‘beyond the brain’.

4.6.1 Dynamic, Complementary, Transversal

While cognitivist approaches to cognition place the limit at the brain, and embodied, situated and distributed approaches speak of embeddedness (at least) or of extension (at most), the distinction put forth in the previous section is transversal to all of these approaches.
The ESDC approaches are dynamic-systems based, ecologically oriented models of the mind, which focus on the study of the dynamic interactions among mind, body and world. One can therefore speak both of a boundary (a spatial change), well covered by the arguments among ESDC approaches, and of a temporal change, a dynamic, which is marked by a change in speed and scale. The distinction between Cognition for Planning and Cognition for Improvising is thus both a complementary and a transversal approach to studying cognition.

Cognition that happens only in the brain, and that may be fully abstract, can be both Cognition for Planning and Cognition for Improvising. Cognition for Planning would be a chain of thought with a very clear goal, and a fair amount of weighing, processing, cognizing. For example, a big decision, such as whether or not to move to a new city, can be made by weighing the pros and cons. On the other hand, going from one goal to the next is using Cognition for Improvising. A cognitive process seen as embodied (one doesn’t need to postulate that the body is a cognizer to have the body be an element in a cognitive process) can also involve mostly Cognition for Planning or Cognition for Improvising. Cognition for Planning is the cognition used in acrobatics, with its mechanisms worked out and trained repeatedly, similarly to learning how to solve mathematical equations. Cognition for Improvising in its embodied manifestation is the one used, well, to improvise dancing. In the movement world, this is known as the conflict between technique and improvisation. In the embedded focus of placing cognition, the simplest example of great use of Cognition for Improvising would be a conversation, which can also be called interactive cognition, while an interview relies rather more on Cognition for Planning.
4.6.2 Bridging

In “Relating Embodied and situated approaches to cognition”, Reichelt & Rossmanith (2008) acknowledge that the approaches of embodied and situated cognition are mostly connected as criticisms of the cognitivist approach, which in its too great focus on abstract problem-solving has forgotten, or mostly avoided dealing with, real-world problem-solving. They argue that although the two approaches unite in wanting a change of focus, they are hardly related to each other, as they even come from different traditions: embodied cognition studies are biology-driven, while situated cognition studies come from socio-cultural studies. These different origins also bring different methodologies, from physiology to ethnography. The distinction made in this study, between Cognition for Planning and Cognition for Improvising, albeit transversal to all of the approaches, is useful for seeing better the similarities between these research directions. While the focus of the cognitivist studies has been mostly on Cognition for Planning, which is easily studied abstractly, both embodied and situated cognition approaches remind us that real-world problem-solving is of foremost importance, and in it Cognition for Improvising plays a necessary role.

Cognition for Improvising is crucial in understanding cognition, especially at the interface, the place where the interactions with the world are the main subject of the research. When a Wikipedian looks at the computer, they may look at a discussion on a Wikipedia page; that interaction with the world triggers a reaction, a wish to respond, which is really Cognition for Improvising at play, and then fingers type, reacting to what has been said.
Embodied and Situated Cognition redirect cognitive studies to the fact that cognition is embodied and embedded, but also, by being much about cognition in the wild, in the real world, they redirect cognitive studies (including this one) to pay closer attention to cognition that is not only of the well-planned, well-abstracted kind, but that is also interactive, reactive, quick and improvised.

4.7 Transience Consequences

Wilson & Clark (2009) propose Transient Extended Cognitive Systems (TECS), which are

“A soft-assembled whole that meshes the problem-solving contributions of the human brain and central nervous system with those of the (rest of the) body and various elements of local cognitive scaffolding.”

They also divide these TECS into some that are one-off, new assemblages, as when solving a new brainteaser, and others that, although temporarily assembled, are regularly repeated, such as the TECS in use by a practiced crossword solver. The division between Cognition for Planning and Cognition for Improvising can have consequences for this argument as well. Mental states hang on to the continuity of our minds, even if we don’t know precisely what that means, or if the “I” is really just an illusion, following Hofstadter; that is also where we hang the intuitions about “cognizing”. Cognition for Planning, as it involves a longer goal and a longer cognitive process, relies more on this continuity, and therefore we often associate it with mental-state processes. On the other hand, Cognition for Improvising, using its much smaller units of thought, can more easily be constituted, if there is any such thing, by systems that cognize ad hoc, in their transiently assembled form.
Nonetheless, studying Wikis, as one more instance where cognition is involved, can inform us immensely about cognition: it helps us better understand the role of cognitive technologies, of externalizing the process of producing information, of interactive minds, of sizes of goal modularity, and of the effects of harnessing on a large scale a surplus of Cognition for Improvising, usually seen in, but not restricted to, real-world, close-by, reactive cognitive processes.

V. EXPLORATION: ARTISTIC/EXPERIMENTAL/EXPERIENTIAL

1 Motivation and Explanation, serving as “The Introduction”
  1.1 Art and Science
  1.2 WikiWay
  1.3 Action/Design-based research
2 Our Coll/nn/ective Mind Proposal & Summary
  2.1 Summary
  2.2 The Theoretical Framework
  2.3 Methodology
    2.3.1 The installation
    2.3.2 The Practice
  2.4 The Scope, The Hypothesis, The Questions
3 OCM Report “The Results”
  3.1 WikiWars
    3.1.1 Edits
    3.1.2 Pictures
  3.2 WikiSym 2010
    3.2.1 Edits
    3.2.2 Pictures
  3.3 Wikimania 2010
    3.3.1 Edits
    3.3.2 Pictures
  3.4 Comparison of the reception at WikiWars, WikiSym and WikiMania
    3.4.1 Documentation
    3.4.2 Thoughts
4 OCM Thoughts “The Discussion & Evaluation”
  4.1 Revisiting visualization-image-mindmap-inspiration
  4.2 Revisiting Open Space Technology inspiration
  4.3 Revisiting it as a Live Wiki
  4.4 Thoughts on
    4.4.1 Research itself
    4.4.2 Physicality/Presence
    4.4.3 WikiWay
    4.4.4 Abduction & Catalyzing Creativity
    4.4.5 Harnessing CfI
5 OCM Future
  5.1 Visualization/Communication
  5.2 Discussion time
  5.3 Game
  5.4 How-to guide?

This section presents work that is perhaps at the boundary of scholarly work, as it consists of an installation to make a ‘living-Wiki’. It starts with a motivation and explanation that serves as the “introduction”. It continues with the description of the project, which was accepted at WikiWars (conference, Bangalore, 2010) and written by Anne Goldenberg and me, and which includes a “theoretical framework”, a “methodology” part and a number of “research questions”. It is followed by a small report of what happened at the three conferences where it was presented, which gives the “results”. Finally, the “discussion” is based on “OCM Thoughts”, my talk presented at the end of the WikiWars conference.

This project was a joint creation by Anne Goldenberg and me, resulting in part from a previous collaboration related to Wiki-writing (see appendix W), and in part from our independent (re)searches with interactive events.
The conception of it as a living-Wiki was produced together, although Anne presented the first thematic “Our Coll/nn/ective Mind” at Artivistic (Art+Activism Conference, Montreal, 2009) before WikiWars in Bangalore. The report and the thoughts are influenced by our collaboration, but written by me.

Publication status:
1: “Motivation and Explanation, serving as ‘The Introduction’” is the introduction to the chapter.
2: “Our Coll/nn/ective Mind Proposal and Summary” was accepted at WikiWars, Bangalore 2010. They financed our trip to Bangalore, where we made the ‘Our Coll/nn/ective Mind’. It is, at least, equivalent to a conference paper. It was written and thought out in collaboration with Anne Goldenberg.
3: “OCM Report ‘The Results’” is a short report about what happened (pictures and video by Nóemie Nicolas).
4: “OCM Thoughts ‘The Discussion’” is based on the talk presented at the end of the WikiWars conference, Bangalore 2010: “Just Ask, my Elephant”.
5: “OCM Future” are, well, future ideas.

1 MOTIVATION AND EXPLANATION, SERVING AS “THE INTRODUCTION”

Research is, after all, questioning; it is searching for understanding. There are many different methods to do so, some quite hands-on, making experiments, some just using rationality and logical consequence. Now, “to experiment” has two senses: one in which it means “a test, especially a scientific one, carried out in order to discover whether a theory is correct or what the results of a particular course of action would be”, and another in which it means “an attempt to do something new, or a trying out of something to see what will happen”. In its initial days, science was just following the idea of trying something new to increase knowledge. With time, a whole enterprise was constructed, which defined a scientific method and a course of action, and which now defines an experiment as something quite bounded.
But experiments in the second, more open, sense are still there, reachable by hands and by ideas, and experiments in this general sense can still, or so I’d argue, contribute to research and to the act of learning. Experience is a way to engage more than one sense, and a way to go beyond the intellectual capacities; experiences also engage the emotional, the visual, and presence. To explore, by experimenting and experiencing, should still be part of learning. An education, such as a Ph.D. project, should really still include a process of learning that matters to the individual person. The following pages report the result of this more experiential part of the project, carried out during the research process of the last years. It has also served as a way to take the educational process of a PhD into my own hands, following “WikiWorld” (Suoranta & Vaden 2010):

"A good educational arrangement as a convivial system would, then, provide everyone who wants to learn at any time in their life with access to available resources; it would empower people to share their knowledge; and it would give an opportunity to people to present an issue to the public whenever it is necessary."

1.1 Art [53] and Science

Art and science have been holding hands for as long as the authorities weren’t looking, or so I claim. Sometimes their relationship is reduced to the idea that art is only helping science to communicate, that art is only a means of distribution, a means of outreach or a means of illustration. But the aim is usually bigger, because it is possible to use other means, such as artistic ones, to understand, criticize and comprehend science. ‘Science and art’ is most often taken to be that boundary where the visual arts contribute to or grasp the beauty of science. But there are other senses out there, which can also be a beautiful way to play with all the science and research concepts.
The contours of what is being done are also fuzzier, for this project is a way to educate (what is a Wiki?); a way to learn (what does openness do, what boundaries are, or are not, necessary?); a way to interact (let’s add a bot on top of the ‘sock puppet’); a way to cooperate (together we can construct the cube); a way to explore; a way to play. This relates also to the greater relationship between the ‘how’ and the ‘what’. Although at times it is possible to separate them, usually they have an intricate relationship. What is a Wiki, but a how, a tool, a form? ‘Just’ translating concepts between different media and expressions allows new avenues to arise and be thought.

[53] I refuse to start a discussion about what art really is. In this context, “artistic installation” was the term used for lack of better options; not knowing what else to call this activity, ‘art’ seemed innocuous.

1.2 WikiWay

To know the world, one must construct it. – Giovanni Battista Vico

The interest in trying something else arose from the engagement with Wiki culture, with the WikiWay. The WikiWay is an informal term that captures several of the attitudes and methods that have been fuelling hacker culture, free software and open content movements. A ‘Wiki’ is such a paradigmatic example of openness and inclusion (at least as a starting point) that most people understand the principles behind the ‘WikiWay’: do first, ask after; Fix it Yourself, Do it Yourself; Provide Freedoms; be proactive, Daring (be bold). It has been this set of values and approaches that first drew my interest to Wikis. In order to engage with the material, it is also necessary to experiment with it. The WikiWay is also a way that is simple, which puts emphasis on doing and on interaction. The Wiki culture has origins in the hacker culture and in the FLOSS movement (Free Libre and Open Source Software) and therefore begs for participation.
The quest here was to ‘walk the talk’, to step out of the supposedly wise researcher role in order to engage with the concepts and the people. Not participating in any way would be similar to talking about color without using a picture. Another direction of this exploration is the ethical stance of bringing Wikis2people[54]. The idea is to make concepts more accessible in two directions: making ‘Wikis’ less scary, by showing that they are a continuation of practices of participation and discussion already present (or created) in other fora; and showing those already expert in Wiki-style communication that it can be brought to other dimensions of life.

[54] This chapter is the most relevant of several interactive creations made alongside this thesis. Please see the introduction for the Wikipedia Conversation Game, and the Appendix for the Ontology Games, the ‘live-Wikigame’. And see the meta-phd offence, to be performed after the phd defense.

1.3 Action/Design-based research

“if we do what we always did we will get the results we always got”

Some of the aims of this exploration are in line with action research and design-based research, insofar as there is the goal of transforming a reflective process into an interactive inquiry process. It is also inspired by Freire’s Participatory Action Research, wanting to intervene in, develop and change a community, in this case a conference community, but with hopes of extending to other academic practices.

2 OUR COLL/NN/ECTIVE MIND PROPOSAL & SUMMARY

Theme: Critics and the WikiWay
Anne Goldenberg and Rut Jesus
WikiWars, Bangalore, January 2010

2.1 Summary

We propose a critical thinking game that will result in a collective artistic installation. It is inspired by Wiki-like socio-cognitive structures such as the Glass Plate Game, Mind Maps, and Open Space (occasionally presented as ‘live Wikis’).
Our Coll(nn)ective Mind is a participative, low tech and immersive installation through which we will invite the participants to play with and discuss concepts and the relations between them throughout the two days of the conference. In particular, the installation will deal with «critics in the WikiWay» (Critical Point of the WikiWay: CPOWW), as the participants of the conference will be asked to write, draw and model concepts and their connections within this theme.

2.2 The Theoretical Framework

The Wiki was the first artifact to implement both the principles of a connected organization and a public intervention in an online and therefore collective space. In order to understand better what these devices do, we would like to explore a way to make a Wiki ‘live’, following some of the same principles. By asking what the features of a Wiki are and re-creating them live, we can get closer to the core of what it is that Wikis do, and poke into how they are coll(nn)ective minds. Here we would like to present some of the lineage of Wikis, as well as three examples that inspired us in developing an installation that will serve as a ‘live Wiki’: Mind Maps, the Glass Plate Game and Open Space.

Mind Maps are diagrams that represent ideas and concepts and their relationships. They can be used to visualize ideas, to structure thought, or to help understand relationships and links. The Glass Plate Game is a ‘real version’ of Hermann Hesse’s (1947) Glass Bead Game, and it is basically a conversation, connecting ideas and concepts, while using a board and inspiring cards. Open Space is a meeting technique where the participants build the agenda. Just as the Open Space approach and Glass Plate Games have been presented as Wikis in everyday life, this installation explores these principles by inviting the public to experiment with collective and discursive thinking within a low tech, live, three-dimensional artifact.
Wikis’ lineage is related to hypertext, the emergence of the Internet and the movement of Free Software & Open Source. In an interview with Ward Cunningham, the inventor of the Wiki, we learned that he was inspired by the hypertext principles, themselves influenced by a theory of a connective mind, which suggests that the way we think is by making connections between concepts and categories (Bush 1945). The hypertext theorists also stressed that this way of organizing knowledge should allow public intervention in the making of links. The emergence of the Internet, and the hype around the web 2.0, has influenced Wikis, as they are webpages that allow for interaction. Actually, the first web was open to both inputting and outputting (reading), but the ability to write directly was suppressed with the widespread use of the browser, which was limited to reading[55]. The FLOSS movement has also inspired Wikis with its values and practices (some of them themselves inspired by the Hacker Manifesto).

Wikis are also particular in that they provide much more transparency, and it is more possible to trace the activity. Documentation innovation inspired by hypertext is also keen on authorizing the addition of traces, trails, and annotations. The information on many (if not all) steps in the editorial changes from one version to the next is available, in addition to some, but not all, of the motivational and contextual grounds for the changes, which can be seen in the discussion pages. Not only do Wikis record all the contributions over time, but most of the interactions also happen online (and are saved), either by postings in discussion pages, talk pages or open-access mailing lists. This new mode of communication leaves traces of what is happening. In this project we would like to capture this accessibility of the process, thus allowing its development to be investigated.
The way Wikis evolved supported a clear trend in which cognition became explicitly social, and therefore its political dimensions more visible. The hypertext principles hold that information is organized according to a principle of association of meaning, implying that a document can be linked to, and can point to, several other documents. The documents can be multi-media and will tend to appear in various windows, opened simultaneously and progressively during navigation, rather than through a linear reading. The use of the Wiki artifact brought another characteristic. Aggregation is not sufficient to bring about collective cognition. When the collaboration process and the content production are not set in advance, there is an important need for discussion. And in a free and open environment, enacted by heterogeneous users and participants, discussions may easily turn into important and critical negotiations. In order to create uniform and organized knowledge, which is typical of epistemic communities, Wikis’ users have to discuss and negotiate content and rules. In other words, in the Wiki context, the users are not only readers. They are invited to intervene, as commentators, collaborators or potential authors of the organization and the drafting of the knowledge. They work, then, from the perspective of a joint project in which the idea of cognition functioning by association is coupled with giving users the opportunity to intervene on their environment, by annotation, organization, augmentation or interrelation of the corpus.

[55] Ward Cunningham, personal communication, WikiSym 2009.

2.3 Methodology

2.3.1 The installation

The installation consists of a giant cube, made of bamboo pieces, allowing participants to hang concepts (written or drawn) on recycled CDs, plasticine, Polaroids taken there, origami and other hanging pieces.
They will be invited to place their « concept » in the cube and create links with other concepts, and by doing so to participate in the creation of a collective mind map. Participants will be able to hang translations or representations of the same concept on the same vertical chain of concepts. Participants justify the relations they make between concepts by placing a note on the link. In this way, the space will become the occasion for mediated discussion and justification (through the notes) but also for physical and face-to-face ones, as several participants have a good chance of meeting within the cubic installation. The cube will be available throughout the conference, and the participants are encouraged to create concrete activities to complete, discuss and modify the cubic map of concepts.

The main elements of the installation are:

Participation
- All can participate; all interact in doing so;
- Users can make new concepts – by writing, by modeling, by drawing, by shooting;

Editability
- Content edition is facilitated by the use of plasticine for modeling concepts;
- Content addition is allowed on the textual media;
- Content deletion is possible, though not encouraged, as there is no history of change;
- Users can always reorganize and create new links between concepts.

Linking
- Links between concepts will be made possible with different materials such as small wood sticks, pieces of string, glue.

Discussion
- The main medium for discussion will be the voice;
- Recording devices will be put within the discussion space and participants will be asked to think aloud;
- Participants will also be invited to justify their links, discuss connections and relate topics coming out of discussion on a dedicated medium, which they will be able to hang in context or in a meta space.

Tracing
- There will be recorded images and video;
- There will be recorded audio, and voices.
2.3.2 The Practice

In practical terms, we need:
- 20 min at the beginning of the conference to introduce and motivate the concept we want to explore, as well as to invite people to the cube, explain how it works and encourage massive participation;
- 20 min at the end of the conference, to describe the results and the process, and to reflect upon the initial research questions and how they have been addressed in the process;
- Space (5 x 5 m) in a main hall to build up the cube, central enough for the participants to visit it, or near the coffee break.

The recording device also implies that we need a quiet enough environment to be able to listen to and capture discussions. We will be in charge of the whole installation. This proposal will help to tie together the conference and the participation in a creative, explorative WikiWay. Many of these ideas could be developed further, and it would be particularly interesting to discuss and work out the specifics together with the conference organizers.

2.4 The Scope, The Hypothesis, The Questions

Exploring these principles, Our Coll(nn)ective Mind is aimed at bringing the participants to discuss concepts and connections in a cubic collective space. We are interested in discussing the following questions in the realm of the panel:

1. How do game-concept-discussions embody the collaborative spirit and illustrate abductive methods of reasoning? We want to understand, compare, play with and extend the “WikiWay” and its inherent principles to other devices.

2. What is distributed cognition and how is it expressed in Wikis and Wiki-like activities? Our hypothesis is that there are different ways of presenting this extended way of thinking.
Some devices and discourses would represent a silently consensual practice (reinforcing the determination of the structures or of those who talk a lot) whereas others do the contrary, pushing forward argumentative thinking and a critical mind (including justification, explanation and negotiation).

3. The theme, Critical Point of the WikiWay: Critics and the WikiWay, will be put at stake: what is the WikiWay, and how do critics relate to, use, and abuse it? What is the critical point of the WikiWay, as a style and a practice of editors and researchers?

4. We will also be discussing the general question of what cognitive tools do to cognitive processes and what the difference is between process and product.

5. And a very self-reflective question, on the ability to use the “WikiWay” to question itself. Or, even, the necessity of doing so. Wikis carry a culture, and a world of change, of daring, of mixing, of remixing, of volunteering, and so it is only natural to also change, dare, mix, remix, and volunteer. So, how can this type of ‘experiment’ be considered research within new formats (format/content)?

This is relevant to WikiWars in several aspects: both the content (theme: critics and the WikiWay, which is motivated beforehand and partially user-generated on the spot) and the form (people play together with cards and concepts that have relations, and argue ‘consensually’ or, as in Wikipedia, ‘adhocratically’ to put them forward) are Wiki-important, have Wiki-style and contribute to Wiki-criticism. By bringing a highly interactive aspect to the conference we would like to foment the spirit of community and critique, with interplay between art, academia and activism.

3 OCM REPORT “THE RESULTS”

3.1 WikiWars

The bamboo cube was built, and several materials were acquired, in the days before the conference. We had the chance to introduce the cube on the first morning, after the official openings and before the first set of talks.
We presented the idea, brought people to the place, explained the principles of open space and of the editability of the cube, and invited people to participate. People kept adding things here and there, or finishing their edits, during the talks (if one is worried about whether people were paying attention, it is no worse than twittering allllllll the time). The next day, a cancelled talk allowed us to invite people to the cube again, where, for half an hour, they actively contributed. At the end of the second day, we presented our talk, with examples from the cube, thoughts for the future and general research inspiration.

In terms of content, many edits not directly pertaining to the theme ‘critics and the WikiWay’ were made on the first day, trying to convey a Wiki and some of its features (revision histories, bots, sock puppets, etc.), while on the second day some of the conversations from the conference started to appear at the cube, such as the elephants and the red tape, so it could be said that a level of criticism started to appear. Editing as a broad category was visible – people created, changed, connected, and destroyed ‘pages’. Below are some anecdotes as well as some visual information.

3.1.1 Edits

One of the first edits added a user – it had a head made of clay, and arms and legs made of ice-cream sticks. To this user was added a computer screen, one of the old ones, made of grey and black clay. Still during the first morning, an edit transformed it into the super user: the computer screen became its head (Figure 3-2). Later that day, the super user had arms added to it. Most of what was happening was still about creating individual ‘pages’ and small links, until, on the morning of the second day, many ‘red links’ appeared: red thread going from one side of the cube to the other, all over.
In conversation, someone mentioned that it didn’t have the meaning of ‘red links’ in a Wiki (incomplete links), but rather was a mark of the pervading red tape of bureaucracy that was all over (Figure 3-3). A cute little edit was placed outside of the cube. What did it read? “Authority.”

Most new ‘pages’ followed the pattern of being hung at head height, as most others had been until that moment. This was complemented by an edit right on the ground, with a stick pointing up, which was the ‘foundations’. After that ground edit, another edit followed on the ground: right in the middle of the ground square, someone posted “Jimmy Wales”. It was heard (from the same editor?): “That’s because he thinks he is in the center”. Later, someone rescued Jimmy from the ground, not finding it an appropriate place.

An edit with a bot should also be mentioned, which was done with care as a little figure with a broom. It got connected to edit wars, to vandalism, and to the sock puppet. The link between “Critical Point of View” and “Neutral Point of View” was complemented by a note referring to a quote from the introduction of the conference: “Is this ironic?”

On the second day, some people said the installation was missing an elephant. Elephants had been referred to several times in discussion, alluding to the metaphor in the poem below[56], for which there are several versions[57]. Do we have the same elephant? Or not even? Soon after, there were different elephants: made of origami, drawn, just trunks… Later, a sugar cane appeared with the subtitle “I am a snake”, referring back to the discussion and adding local materials. For the talk, I took the yellow elephant, and made the elephant interview me about the research questions of such a project (Figure 3-4 – “Just Ask, My Elephant”).
[56] From http://www.noogenesis.com/pineapple/blind_men_elephant.htm
[57] See http://en.Wikipedia.org/Wiki/Blind_men_and_an_elephant

John Godfrey Saxe's (1816-1887) version of the famous Indian legend:

It was six men of Indostan
To learning much inclined,
Who went to see the Elephant
(Though all of them were blind),
That each by observation
Might satisfy his mind.

The First approach'd the Elephant,
And happening to fall
Against his broad and sturdy side,
At once began to bawl:
"God bless me! but the Elephant
Is very like a wall!"

The Second, feeling of the tusk,
Cried, -"Ho! what have we here
So very round and smooth and sharp?
To me 'tis mighty clear
This wonder of an Elephant
Is very like a spear!"

The Third approached the animal,
And happening to take
The squirming trunk within his hands,
Thus boldly up and spake:
"I see," quoth he, "the Elephant
Is very like a snake!"

The Fourth reached out his eager hand,
And felt about the knee.
"What most this wondrous beast is like
Is mighty plain," quoth he,
"'Tis clear enough the Elephant
Is very like a tree!"

The Fifth, who chanced to touch the ear,
Said: "E'en the blindest man
Can tell what this resembles most;
Deny the fact who can,
This marvel of an Elephant
Is very like a fan!"

The Sixth no sooner had begun
About the beast to grope,
Then, seizing on the swinging tail
That fell within his scope,
"I see," quoth he, "the Elephant
Is very like a rope!"

And so these men of Indostan
Disputed loud and long,
Each in his own opinion
Exceeding stiff and strong,
Though each was partly in the right,
And all were in the wrong!

MORAL.
So oft in theologic wars,
The disputants, I ween,
Rail on in utter ignorance
Of what each other mean,
And prate about an Elephant
Not one of them has seen!

3.1.2 Pictures

Figure 3-1: view of part of the installation, showing some edits and links, including a nest, acquired locally.

Figure 3-2: The super-user. Started as a person sitting in front of a computer to become a computer with a body.
Later it would acquire more arms.

Figure 3-3: A bot, a troll and a magician actively editing, after the red tape of bureaucracy had already been added all over.

Figure 3-4: The double personality Rut/Di being interviewed by the yellow elephant on her left hand, while pictures of the making of it were being projected on the back.

3.2 WikiSym 2010

For WikiSym 2010 we proposed a tetrahedron, as we wanted to experiment with different shapes to create 3-D structures. Unlike a cube, a tetrahedron is a very stable structure which can also stand by itself. We proposed the tetrahedron as a better metaphor for a wiki, as a ‘space’ which provides space by proposing very stable features. The installation was located near the coffee break, and introduced at the first Open Space session. On the second day, we held an Open Space session for thinking about, discussing and gardening the structure. The theme was “What are Wikis Made of?”, which was meant to aggregate contributions from both academics and practitioners.

3.2.1 Edits

Posts were added and links created. People picked up on our initial posts in two of the lower corners, suggesting ‘community’ and ‘software’ as two major parts of ‘what wikis are made of’, although actions and values were also essential elements from our point of view. Someone added conflict. Next to it, a heart, representing love, was added. There are no wikis without love, we were told. Someone came and said “I will look at what there is and consider what could be missing”. “Purpose! That is missing. Purpose is the seed. Purpose should be in the ground, as a grounding.” Later, the ‘purpose’ ‘grew’ into a flower. Later still, the ‘purpose-seed’ explanation was added to the reflections poster: "purpose is the seed. Everyone has a reason for their actions, an objective they wish to reach. This is often not commercial in nature.
For collaborative efforts the purpose can be the betterment of mankind, but also to promote an ideology or view, attribution, mastery, increased knowledge on a subject within an organisation etc. For the effort to flourish, one must be aware of these motivations and cater to them."

One discussion, about whether ‘nodes’ exist or whether all are just processes, gave rise to some hairy figures, “the processes”, that meet each other without nodes. These were later called the “Squid of POV”.

3.2.2 Pictures

Figure 3-5: people observe and read about the installation, in its earlier stages.

Figure 3-6: the post about ‘purpose’. Initially put on the ground to represent the seed, the beginning, and later having grown into a flower, blossoming.

Figure 3-7: posters around the installation. The one on the left was the explanation of the device and the invitation for people to participate. The one on the right was a space for discussion and reflection, which was used to justify some of the contributions (e.g. the seed-purpose explanation), and to make schemes that resulted in other contributions (the processes-to-processes scheme giving rise to the ‘POV Squid’).

3.3 Wikimania 2010

For Wikimania 2010, a big conference for the Wikimedia community, we made an installation that already had more meaning in itself. A small sphere, which could be edited from the outside, was to represent the past, while a dome, following Buckminster Fuller’s design, represented the present, and ropes going into the future allowed for posts about the future of Wikimedia.

3.3.1 Edits

Building the structure was a very collaborative process, as we needed more and more volunteers to help the structure hold while new nodes were added. This was done in a very wiki-like way, enjoying the helpfulness of Wikipedians and their readiness to engage.
And not just the process of building: the whole process of bringing it to Wikimania, talking to the organization, and trading help to organize the wiki wall and some open space ‘time’ was done in a WikiWay. A Brazilian Wikipedian, frustrated by their work not being considered a chapter (already addressed in a data paper), made three posts – one in the past, one in the present, one in the future – each with a crown, for “content is king”, “organization is king” (with a sad smiley), and “people are kings”. In the end, as the structure started to fall apart (in the last hours, melting from the sun), people made comments and posts about it being ‘alive’, and how the future was still there.

3.3.2 Pictures

Figure 3-8: the collaborative endeavor of erecting the dome.

Figure 3-9: view of the structure and its three parts: the past pentagon-based sphere on the left, the present-dome in the center, and the future-ropes in the back, to the right.

Figure 3-10: details of two of the posts – one complex, explaining the different directions of Wikimedia and communication, and one meta-interesting, reading: broken-link, near a broken piece of the structure.

3.4 Comparison of the reception at WikiWars, WikiSym and WikiMania

3.4.1 Documentation

3.4.1.1 Website

A simple website at www.ourcollnnectiveminds.blogspot.com can be consulted for more information related to the installations. The website serves as a place to gather information, both for documentation purposes and to promote the work.

3.4.1.2 Video

The best documentation of the first installation, at WikiWars, was done by Noémie Nicolas and can be accessed at http://vimeo.com/10829778. The video is important as documentation and visualization, and it also captures different parts and moments that were impossible to see with ‘the naked eye’.

3.4.2 Thoughts

The three installations were very different. They were also differently conceived, perceived and received.
The conferences themselves had very different purposes:
- WikiWars – 50 people – critical of Wikipedia;
- WikiSym – 100 people – place for wiki researchers and practitioners;
- Wikimania – 400 people – Wikimedia community.

At the first, we were officially well received (our proposal was accepted, tickets paid, time slots offered), but unofficially we did not find the organizers willing to make the installation a greater part of the conference. Having time slots for editing and a smaller crowd was positive, as it was easier to explain the purpose. Also, it was a conference with a lesser degree of community, and therefore people had more time to do other things compared, for example, with Wikimania, where people were very busy in their group discussions. At WikiSym, the position near the coffee space helped, as did a general interest in meta-thinking and in drawing concepts. At Wikimania the structure became more important, having a meaning in itself. The heavier presence of the structure, the heat, the position further from the coffee space (outside), and the whole conference being scattered contributed to a smaller number of edits. On the other hand, the organizers at both WikiSym and Wikimania were extremely supportive, in paying for the materials, but mostly in finding helping hands, praising us for the initiative, and being enthusiastic about the idea.

4 OCM THOUGHTS “THE DISCUSSION & EVALUATION”

What is a cube of bamboo doing in the garden, the hall through which the participants pass on their way in and out of the auditorium at the conference WikiWars, Critical Point of View? What are pieces of yellow and orange thick paper flowing in the wind? What are the messages there, what is the purpose, the engagement, and the vision?
Bringing “Our Coll/nn/ective Mind: Critics & the WikiWay” to the conference was a way to raise many issues related to Wikis, to collaboration and to creativity, and a way to engage in the discussion by investigating the particularities of Wikis and the interactions between participants, concepts and discussions. The greatest learning and positive conclusion of such a piece ‘by one’s own hands’ was that the lessons and the enthusiasm were much more telling than if we had only presented a paper. Here I will revisit several of the starting points in the light of what happened. I will revisit the inspiration and theoretical framework, specifically the inspiration from mindmaps and Open Space; I will revisit and evaluate the several Wiki characteristics that were attempted with this work; and I will revisit and elaborate on the theoretical questions about what was at stake.

4.1 Revisiting visualization-image-mindmap-inspiration

“Our Coll/nn/ective Mind: Critics and the WikiWay” contributed, at least, to offering the conference a visual layer, where what happened was, to a smaller or greater extent, transformed into a visual, physical device. Pictures of the conference, instead of the usual boring dark image of a speaker with a slide projector on the side, could be enriched by the interplay between colors, materials and contents at the installation. In a sense, the installation provided a live, present documentation place and captured the themes of the conference, functioning as a mindmap.

4.2 Revisiting Open Space Technology inspiration

"What Wikis are for internet, Open Space is for face to face meetings" - Ward Cunningham

The relation with Open Space Technology is twofold. On one hand, it was a way to introduce Open Space to an audience, and play a little bit with some of its principles. On the other hand, unlike Open Space, it didn’t have a transformative quality for the event, as it was only ‘an addition’.
This feature, which limits the possibilities of open designs like these (our installation and Open Space), can also be seen in a different light: as Open Space often requires too big a commitment from organizing committees (having to let go of the structured paper presentations that are so ingrained in our modes of knowledge dissemination), “Our Coll/nn/ective Mind” could be an easier contribution to such stiff structures.

4.3 Revisiting it as a Live Wiki

The most important discussion, because it touches many topics, relates to the extent of the analogy of a ‘live Wiki’. In some senses the analogy was very productive; in other senses, it was a utopian analogy. For example, compared to a Wiki, the installation had, perhaps, too loose a topic, “critics & the WikiWay”, which didn’t allow for an interest in deep collaboration or eventual coordination, because there wasn’t, as in many Wikis, a common goal. Compared to Wikis, it could be said that we tapped into the beginning of a Wiki, when there is so much excitement, most of what is added is valid, and there are a lot of additions. Only later, usually, comes the question “what do we really want”, which leads into organizational discussions and coordination activity.

We were also careful not to put an emphasis on authorship, as Wikis do not either, but to leave it open, without need for signing. This raises an important question: pure anonymity doesn’t allow for the tracking and the kind of engagement seen in Wikis, which rests not on authorship via one’s official persona, but at least on pseudonymity. By allowing all edits to be anonymous in our live Wiki, we didn’t allow for follow-up discussions: “Person X, why is ‘Jimmy Wales’ on the floor?” That said, it would be very difficult to maintain that level of anonymity, or even pseudonymity, in direct interaction (which is a struggle I wouldn’t know how to solve).
It certainly functioned well as a place of participation, and editability was also quite present: there were no entry barriers to participation, and people constructed new artifacts, changed others, and linked them in diverse ways. As pointed out above, the lack of a common goal (accompanied by a different depth of motivation: in Wikis, those who want to, care and can participate, while in “Our Coll/nn/ective Mind” those who were at the conference participated) made focused discussions less present. Tracing was possible thanks to Noémie … who documented well what happened. Otherwise, it is a difficult task to keep track of who did what. Its limited time (2 days) also limited its claim to being a Wiki, which usually lasts longer.

4.4 Thoughts on

4.4.1 Research itself

The first research point is concerned with research itself. If science and scholarship are processes of understanding, the installation – called art for lack of a better category – fits well, as it is also a process of understanding. The methods and the media, unlike most academic work, which is done with text and argumentation, take a more experimental stance, which engages people in a direct sense, by asking them to participate. Making an installation based on specific principles, seeing it develop and evaluating it are ways to complement the understanding process of science, not just in an interdisciplinary way, but also by including approaches and modes of investigation from other realms.

4.4.2 Physicality/Presence

The second topic concerns physicality and presence. There still seems to be a lot of value in meeting face to face; otherwise there wouldn’t be all these people flying across the world to meet and present their ideas to each other face-to-face. We are still humans, bodies, and are capable of getting immense value from direct interaction, from those conversations. Making an offline installation honors our humanness.
An offline, on-site, interactive installation respects and supports the meeting taking place, be it between researchers, ideas or discussions. If we wish for more engaging meetings, and richer participation, we need to do things other than sitting and listening, and occasionally asking questions. Other times, such as lunch, breaks and conference dinners, are also important, but it is necessary to complement those spaces with more creative, fun, new ways of interaction, which can be done by thinking carefully about the making of an event, in order to provide the necessary arenas for the expression of embodied cognition as well as distributed cognition.

4.4.3 WikiWay

This research point is concerned with ‘walking the talk’. If we are to study open, collaborative processes of producing knowledge, if we are to study what came to be called ‘the WikiWay’, a very fair approach is to ‘try it out’. In a sense, it is design-based research, or action research, with some other elements of facilitation and creativity specifically adapted to its nature as ‘an event’. To engage with the ‘WikiWay’ is a way to ask: what is this WikiWay? What of it is new, what is not, what are the changes, and what are the continuities? To answer these questions: in a sense, Wikis are new ways to do some things we always did – discuss, collaborate, be in an open square. But they have been so successful because they record, and bet on openness and the good, rather than safeguarding against all that can go wrong[58]. The WikiWay brings forth much of the ‘know-how’ discussed in the previous chapter – it is by doing that one becomes an expert rather than a beginner, even if only an expert of the moment.

4.4.4 Abduction & Catalyzing Creativity

The third research point is about abduction, a mode of reasoning that is often used, and seldom acknowledged as the one at stake.
By providing the materials, the space (and time) for the installation to take place, we couldn’t have predicted what would happen; we couldn’t even have hypothesized correctly beforehand. Afterwards, it is clearer what the points at stake were (for example, how loose should the topic be?), and we can now construct an a posteriori hypothesis about the different purposes of this work. Creativity is, by definition, an abductive process, as is play – we wanted to bring this dimension to the conference, allowing people to play with the different possibilities, opening the opportunity for people to try out some concepts, links, discussions (while we supported them by holding the space, by providing the tools, by inciting participation). We wanted to contribute to ‘random’ encounters that could lead to collaborations, discussions of topics, and ideas for research. It is hard to prove that this happened, but we hope so, and we saw much evidence that there was a greater sense of ‘community’/‘fun’/‘excitement’ than in traditional academic conferences (except for Wiki conferences). Conversation is done this way, and is something somehow unsurpassable.

[58] Jimmy Wales has spoken about the open design principle whereby one considers the good in people, who won’t use the knives in a restaurant to stab each other, instead of putting them into cages and limiting their potential.

4.4.5 Harnessing CfI
This installation can be seen as an attempt to harness the Cognition for Improvising present in such a conference. In a sense, that is exactly what happened, because people, with no previous notice, preparation or deliberation, contributed to an experiment that had much content and fed back into the conference.
Some Cognition for Planning, by Anne and me, was used in making it happen, and in that sense we didn’t open the design to everyone (it was always open, and free – but a deep culture of academic and artistic authorship prevents people from truly engaging and transforming it into ‘their own’). A conference has a specific structure, which usually doesn’t allow for much interaction with materials or with each other, besides conversation during coffee breaks, whether on the subject or characterizable as small talk. There is a surplus of Cognition for Improvising that this installation harnessed, a surplus of creativity, a surplus of engagement. That said, it only scratched the surface of this potential. A much greater potential is available, but institutions, such as many in education, hinder rather than nurture it.

5 OCM FUTURE
There are a number of ideas that could be used, which would make the ‘Our Coll/nn/ective Mind’ more game-like, more discussion-like or more visualization-like. Some of them are discussed here.

5.1 Visualization/Communication
One possibility would be to do some work beforehand to strengthen the visual aspect. Upon receiving everyone’s papers (or long abstracts), one would use some of the data in them to produce a ‘seed’ and supporting visual material, which would serve as a grounding part of the installation, to be completed at the conference with the participants. Using Wordle (a word cloud visualization application), we would take the tags, the concepts and their relevance from the abstract submissions and make posters of tags; using Cytoscape, we would map these words into a network and make a poster of that network, which can include tags, ideas and concepts, but also fields, titles of papers, and authorship.
In this case, the structure and the posters could be used as a communication tool, as people could leave notes on other people’s nodes, with questions about their work, saying they’d like to talk, making an appointment, etc. If there is Open Space at a conference, the goal could be to ‘capture the conference’, asking people to hang something up after each discussion. We would propose Open Space topics at the end of each day: ‘engage with the installation’; ‘try to capture what happened today’.

5.2 Discussion time
Wikis are, after all, open spaces (in the general sense) where all can add, discuss, and contribute to the content. Discussions and negotiations are essential parts of making a Wiki (reference to Anne’s thesis). It may be the case that the ‘discussion spaces’ in Wikis should be mapped into ‘time spaces’ in reality. To support that part, we would devise small moments of discussion, asking people to form groups to discuss a particular issue, possibly of their choice (choose a link, show it to someone: do you agree? Would you agree? What edits can you make together to improve it?). Throughout the conference, at 4-6 coffee breaks, the participants would receive cards with concepts and tasks which ask them to do something at the installation, while discussing with others (15 minutes each time).

5.3 Game
A possibility would be to use game structures – people pick up cards at different moments in time, and they are given tasks, like being a vandal, needing to poll an issue, or asking for a page to be deleted. It could also mean trying to simulate disruptive events. A Wiki’s life is also related to specific events that may bring people together or set them against each other, and that would involve more negotiation. We could create events during the conference, like acts of vandalism, forking, controversies, and scandals, to then see what would happen. This could be interesting for an event such as Wikimania, as it is more a community event than an academic ‘what are Wikis?’ question-place.
5.4 How-to guide?
A possible next step would be to write a ‘how-to guide’, to offer this idea/experiment/design as an open design, whose title would be: “How to bring it to other conferences, groups, live meetings: here is a guide to follow Our Coll/nn/ective Mind steps and then make your own”. But the essence is not in making installations or living Wikis. The essence is to support people in their own quests and learning paths, so that they ask the questions, do the things and dare the events that are meaningful for them.

VI. DISCUSSION

1 METHOD & METHODOLOGY
Dialog [59]:
- There were three parts of this thesis, which were quite different in approach.
- How do they fit together?
- They construct a triangulation whereby one learns different aspects through the different ways of investigating.
- I see, a kind of multimethodology. But what was really learned?
- To clarify, the multimethodology in this case means that there were different methods being used.
- I understand: drawing networks, creating cognitive concepts and preparing installations. There are different methods and techniques at stake. They are procedural.
- Yes, the methods were different, but they were all part of a greater methodology of learning things with different zooms, glasses and techniques, to inform greater questions about cooperation and cognition. Here we can discuss these greater questions in two ways: the ecology of the article, dealing with the theme of WIKIS, and the relationship between the other two themes, COGNITION and COOPERATION.
- But what “methodology” is this, what is its discourse, what are the particular combinations of research principles and procedures? How is it adequate and appropriate to discuss the Wiki(article)?
- The first essential characteristic of this thesis is that it is case-based. Flyvbjerg (2006) argues for the importance of case studies. This became an essential methodological choice.
Then, the nature of socio-technological networks, constructed by people, technologies and values in new interactions (networks, and not hierarchies), has demanded a methodology that was diversified and looked at the several parts distinctly. Moreover, the phenomenon at stake is part of the real world, it is ‘in the wild’, and accounting for that includes taking different perspectives on the world. As the nature of data is also changing, not least because it is accessible and in greater quantities, the data-driven studies dealt with finding ways to ‘see’ and ‘make sense’ of this data, with visualizations and clusterings. As the phenomenon is really about ‘human crowds’, the cognitive studies dealt with finding ways to understand what kinds of cognitive mechanisms could be at stake, so that the massive behaviors can be comprehended. And last, but not least, both previous methods kept a ‘distance’ from the subject, which needed to be complemented by a more active participation: the exploratory studies dealt with the WikiWay by ‘trying it out’ and asking questions from within, in a participatory way. The different outlooks have also influenced each other considerably, and some of those discussions are taken up in the next parts of this discussion.

[59] A discussion is a conversation. As in Wikipedia, the discussion page is called ‘talk’. This one is missing the ‘edit’ button, but the defense will provide the talking.

- I start to understand.
- Here is another way to think about it, through an analogy with the emergence of life, which may shed light on the processes of the emergence of massive collaboration, and also help to understand better how these studies ‘fit’ together. Life is a fact. We know it when we see it, to some extent. Although that is true, it is still paramount to study the mechanisms that are part of ‘life’, as well as studying its characteristics.
Some of these characteristics may be too specific, such as a dependence on carbon, while some mechanisms seem more universal, such as self-reproduction. Emergence is a difficult topic to study; in the realm of life, it has been studied both by constructing artificial ‘primal soups’ and, more indirectly, by studying some of the characteristics of this phenomenon we call life. Likewise, collaboration is a fact. We know it when we see it, to some extent. Although that is true, it is still paramount to study the mechanisms that are part of ‘collaboration’, as well as studying its characteristics. Some of these characteristics may be too specific, such as a clustering around topics, while some mechanisms seem more universal, such as the use of conversation. Emergence is a difficult topic to study in the realm of collaboration as well. In this thesis, the data-driven studies tried to make visible some of the characteristics: clustering, the mixture of content and meta-content, and the support of implicit collaboration (stigmergic actions done directly at the article) by explicit collaboration (discussions). The cognition studies address part of the mechanisms of collaboration, and the exploratory studies built a kind of ‘primal soup’, perhaps missing the lightning. Below are some more topics of research, which may help elucidate how ‘it all fits together’.

2 ECOLOGY OF THE ARTICLE
In the old days, the ecology of an encyclopedic article was like the ecology of human beings (like peasants) almost self-sustaining their basic needs within a confined space, the farm, the village, with little exchange; this corresponds to a small circuit of one writer and a few editors plus a peer reviewer, for instance.
Nowadays, and for Wikipedia, the analogy would be between the global collective of editors and the global ecosystem, in which many spots on the Earth are represented in a supermarket basket of just a few food commodities, each composed of materials from many places of the world. Below I will use the metaphor of sketching an ‘ecology of the article’ to wrap up the studies in the thesis, that is, to investigate the relationship between the Wiki-article and its environment. This follows the tradition of Hutchins (1996), who declares in Cognition in the Wild: “I hope to evoke with this metaphor [cognition in the wild] a sense of an ecology of thinking in which human cognition interacts with an environment rich in organizing resources.” These studies were performed in a socio-technological setting, the particular world of modern technology. Cooperative phenomena involving human agents and the technologies of their environment are also present in collaborations in the writing of scientific articles, in the history of fairy tales, or even in the process of writing an entry for a classic encyclopedia, with the necessary interplay between co-authors, editors and revisers. But what is new or peculiar about cooperation when we focus on the making of articles for a Wiki such as Wikipedia? The first answer to this question is that for a Wiki-article the data on cooperation is highly accessible. Such data would be difficult to gather about a scientific collaboration, as it is extremely difficult to keep track of who added which idea or which phrase, to monitor all conversations (either in writing or in person), and to have access to all the drafts.
The second answer to the question of why the study of cooperation in a Wiki-article can tell us something new concerns how the new technologies provide new forms of cooperation, interaction and communication, and might create an arena where these instances of distributed cognition happen more often. This is so not only because of a greater interaction in time (more and smaller edits than in a classical collaboration) and space (people connect from everywhere), but also because there is a greater presence of cognitive artifacts; as Vygotsky (1978) claimed, “the more pervasive such artifacts are in our everyday life the more they mediate cognition” (quoted by Rogers et al. 2005).

2.1 Some Sketched Steps for an Ecology of the Article

2.1.1 The Article-Unit
Let us focus on the article. The article can be seen as the basic unit of social cognition production in the context of the whole socio-technological endeavor manifest in Wikipedia. The article is also a product, and through it one can zoom into the footsteps left by this mode of collaborative cognition. Following Bateson (1972), it is important to bound the unit so that things are not left (too) inexplicable. To choose the article as the natural unit of analysis doesn’t mean, though, to carry out all the analysis at the scale of the article, because some flexibility in both scale directions has been of great value. A smaller unit (still part of the article), the paragraph, helped to trace who contributed to which paragraphs within one article. In the other direction (outside rather than inside), the studies also focused on a group of articles on a certain bounded subject, and studied the cooperation of the editors across a theme (and not just inside one article). In these cases the article is still the unit of analysis, on which processes of zooming in and zooming out are performed.
Although the Wiki-article was the concern of the study, its immersion in Wikipedia has certainly been crucial to inform what is going on and how the article is affected by general structures. The article encompasses the minimum set of characteristics of a focused effort: a number of actors, human and non-human, gather around it with the purpose of making it a reliable, consistent, good encyclopedic article. The article can be seen as the basic unit of social cognition production. Although I would like to refrain from distinguishing the context from the content, it can be said that the content bounds the act.

2.1.2 Accessible process information – an illusion or a fact?
On the one hand, and as emphasized many times in this thesis, one of the most salient innovations when studying a socio-technological network like Wikipedia is that the information is available. Not only does Wikipedia record all the contributions over time, but most of the interactions also happen online (and are saved), as postings in discussion pages, talk pages or open-access mailing lists. This new mode of communication leaves real footprints of what is happening. Looking through the lens of Latour’s notion of actor networks, an article should be considered as a holographic part of a whole, including all the actors: editors, Wiki technology, discussion pages, history pages, tags, etc. Analyzing only the finished article wouldn’t allow any deep description, as “it is thus impossible to use the end of the story to explain its beginning or its development” (Latour 1991). Likewise, the access to the history has important consequences for the understanding that knowledge is a process and not just a product.
2.1.3 Article-text as network as participation
A text, as identified by Callon (1991), is also a network of actors, where “Words, ideas, concepts, and the phrases that organize them thus describe a whole population of human and non-human entities.” Reading a Wikipedia article as an old-days encyclopedic article put online is missing the point. Wikipedia is not just an encyclopedia, and a Wikipedia article is not just an encyclopedia article: there is no inevitable and bounded content that exists per se; the content is deeply intertwined with the form and its own dynamism. For example, when in doubt about possible bias in an article, one can read the discussion page and see which sides are arguing and what the arguments are, one can consult an earlier version and see if the worry was already present, etc. And finally, one can change it. Thus, a present readable version of a Wikipedia article is not just a dynamic representation of a network involving the collective of editors and technology; it also invites a more open reading that takes in the context of article production, the contested issues, the history of the production, etc. In this sense, the interplay between content-interpretation of the entry and context-interpretation via talk and history pages invites the reader to join the network of participation.

2.1.4 Different niches, roles, elements
The studies observed several characteristics of how the articles cluster, in what would be niches in the ecology of the article-space. The networks of collaboration on articles on specific subjects, as analysed by the biclique techniques, may exemplify such niches. These clusterings happen more when there is production of knowledge than when the tasks are administrative. Meta-tasks are much less specific, and the administrators don’t specialize as much as the producers of content. The two levels, of producing content and of managing the community, are very intertwined.
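To make the notion of a biclique niche concrete, the following is a minimal sketch and not the code used in the data-driven studies. It assumes a toy mapping from editors to the articles they edited (the editor names and edit lists are invented for illustration, and the function name maximal_bicliques is hypothetical) and enumerates the maximal bicliques by brute-force intersection, which is only practical for very small networks.

```python
def maximal_bicliques(edits, min_editors=2, min_articles=2):
    """Return the maximal bicliques of a small editor-article network.

    edits maps each editor to the set of articles that editor changed.
    A biclique is a group of editors who ALL edited the same group of
    articles; it is maximal when no editor or article can be added.
    """
    # Invert the mapping: article -> set of its editors.
    editors_of = {}
    for editor, articles in edits.items():
        for article in articles:
            editors_of.setdefault(article, set()).add(editor)

    # The editor side of every maximal biclique is an intersection of the
    # editor sets of some articles, so collect all such intersections.
    generators = [frozenset(s) for s in editors_of.values()]
    closed, frontier = set(), set(generators)
    while frontier:
        group = frontier.pop()
        if group in closed:
            continue
        closed.add(group)
        for other in generators:
            common = group & other
            if common and common not in closed:
                frontier.add(common)

    found = set()
    for group in closed:
        # Articles edited by every editor in the group.
        shared = frozenset(set.intersection(*(set(edits[e]) for e in group)))
        if len(group) >= min_editors and len(shared) >= min_articles:
            found.add((group, shared))
    return found


# Invented toy data: editor names and edit lists are purely illustrative.
edits = {
    "editor_a": {"Prisoner's dilemma", "Game theory", "Nash equilibrium"},
    "editor_b": {"Prisoner's dilemma", "Game theory", "Tit for tat"},
    "editor_c": {"Game theory", "Nash equilibrium"},
    "editor_d": {"Neutral point of view"},
}

for editors, articles in sorted(maximal_bicliques(edits), key=lambda b: sorted(b[0])):
    print(sorted(editors), "<->", sorted(articles))
```

With this toy data two niches appear, and they overlap: editor_a and the article "Game theory" belong to both. This is the sense in which bicliques allow overlapping clustering, whereas a partition would force each editor and article into a single cluster.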
Activity is somewhat clustered, and there is a separation of functions, especially that of vandalism-fighting. These data studies were performed so that they would span and represent both the general and the particular, different zooms, and the outsides and insides of the articles. In a new fashion, Wiki-articles and meta-Wiki-articles were treated the same way, to see patterns of work in the whole Wikipedia as an entity, not just “Wikipedia-purpose” (writing an encyclopedia) or “Wikipedia-behind the scenes” (coordinating the writing of an encyclopedia). The different patterns found show that a Wiki article stands in a place between a fully random existence and a highly hierarchized one, and it is therefore the place to be standing when expecting emergent features to happen. We could have found patterns where one person was deciding what was done in an article, taking that power, or that good articles would be written by one person, or that people would be clustering in administrative tasks, or that the types of actions wouldn’t cluster. But we found that the actions do cluster, and that some become specific ‘roles’, such as vandalism-fighting, which emerges as a level of collaboration between basic stigmergic cooperation and highly discussed coordination.

2.1.5 The article alive
I’ve sketched the way in which the Wikipedia article can be considered a fruit of action and an actor itself. It is also alive, evolving, formed by the different actors, but also relating to the norms and values of the project as a whole. In a sense, the article recruits new participants and engages their Cognition for Improvising. The article is dynamically stable, like the shrine whose stability consists in being renewed every so many years. We can, therefore, speak of a continuity in time, not a physical continuity.

3 COOPERATION AND COGNITION
There are three major themes of the thesis: cooperation, cognition and Wikis.
While in the “Theoretical Perspectives” (chapter II) all were addressed in an introductory way, the data studies (chapter III) mostly addressed cooperation and Wikis, and chapter IV on cognition studies focused upon cognition and Wikis. The exploratory chapter V addresses issues of context, exploring the WikiWay. Here, we turn the focus to the relation between cooperation and cognition.

3.1 Reshuffling
“baralha e volta a dar o que tiveres de ideias” – Sérgio Godinho
Reshuffle your ideas, says the Portuguese singer-songwriter. Thinking about the distinction between Cognition for Improvising and Cognition for Planning, an obvious criticism of this contribution would be to say that I am not pointing at any new distinction, or that this distinction has already been pointed out in other research. I don’t claim to be saying something truly new. As the history of ideas shows, many ideas come back again and again, disguised under several names and serving slightly different purposes. While the idea of Cognition for Improvising is related to ideas of immediate coping and of know-how, the specific way in which these ideas can be applied to Wikipedia seems to benefit from having a concept that focuses on the small size of the goals and on reactiveness. The idea of reshuffling can also be applied to Wikipedia’s success. Wikipedia is putting many old elements together in a new way. It is crucial to stress the importance of contingency for Wikipedia’s success, as well as the importance of the values and the technology. When looking at the human factors, one observes that people were already cooperating before, but not in this manner. The right technology, values and moment in history allowed this capacity for cooperation to ‘come out’, to be put together in a new way, and to contribute to the construction of the largest encyclopedia ever written. A possible criticism is that Wikipedia may be exploiting time and effort that would otherwise be used in people's free time.
But the opposite can also be claimed: that Wikipedia is providing an opportunity for people to use their time on something they find meaningful (and saving much of our time by providing an essential resource).

3.2 Cognition supports Cooperation
Cognition needs a goal. This goal can be a collaborative one. Even in the case of distributed cognition, the collaborative achievement of a goal can be seen, for instance in Hutchins’ description of the intricate relationships inside a ship, where all relationships lead to the final movement: will the ship turn to port or to starboard? The Internet, this huge assemblage of pages, somehow lacks the kind of goal needed to be called a cognitive endeavor. Its information, though, organized in different ways, serves as a Cognitive Commons. Wikipedia, with the general goal of an encyclopedia and the goal of writing good “Neutral” articles collaboratively, has been a fantastic place to investigate cognition.

VII. CONCLUSIONS & PERSPECTIVES FOR FUTURE RESEARCH
First, I will present conclusions drawn directly from the data-driven studies, then address the questions posed in 'research questions', and finally touch upon some future directions while acknowledging the limitations of this research. The concrete findings from the data-driven studies point to interesting structures both around an article and within an article. Around an article, clusters reveal that the meso-level of investigation is quite appropriate when thinking about cooperation, because controversies and WikiProjects encompass more than one article. Bicliques showed clustering around knowledge issues, around epistemic communities, revealed from structures of collaboration and not from structures of knowledge relations. The same approach applied to meta-articles showed that the work done in these pages is much less clusterable, and that the community is tighter. Housekeeping is a more transversal activity.
When looking at the contexts of editing of the articles 'Prisoner's Dilemma' and 'Neutral Point of View', it was shown how interwoven Wikipedia work can be. While there are editors who only add a paragraph here and there to a content article and others who are more engaged in the coordination procedures, these two worlds are very mixed. Thus, Wikipedia is not simply the encyclopedic entries, as these are embedded within the efforts of the community, cooperating, collaborating and coordinating the work and the meta-work. Also, new ways of putting together quantitative and qualitative data helped to look inside an article's edits and see patterns: some editors make few edits, while others are 'all-rounders' and are in the middle of all the action. Some activity, such as vandalism and reversion, seemed to have a life of its own, creating a clear separation of tasks. Network visualization, while it would be more useful if more interactive, helped to see the patterns of many tinkering edits and the engagement and tightness of a community. All of these learnings crystallize into two major points:

1) Wikipedians are working together. They work on the same topics together, they discuss, they create rules about the game they are playing. There is a high level of cooperation, even in disagreement. The content of the ‘Prisoner’s Dilemma’ article points in this direction: a good strategy over time is to cooperate with those that cooperate with you. Complementing this, the content of the page ‘Neutral Point of View’ serves as the aggregator, the common goal of Wikipedians. To achieve that goal, Wikipedians communicate a great deal, either indirectly, by editing directly in an article, or directly, both in the article discussions and in the meta discussions, where coordination and governance practices take place.

2) Wikipedia is dynamic, and has a stability in its dynamics.
It is made of a permanent flux of editing, reshaping articles and rules, and fighting vandals, but in this process it is still growing, getting better, and, most of all, defining itself from moment to moment. These insights from the data studies fed the theoretical proposal of a Cognition for Improvising that is used in these processes of cooperation, negotiation and coordination. The way a Wiki is dynamic and contingent was also studied in the attempt to construct a ‘live-Wiki’, which revealed how much a Wiki is a process, and a blend between content and discussion. Cognition is emergent in these socio-technological systems at different levels: it is present from a low cooperative level (accretion of edits) all the way up to the organization of a community. Understanding these phenomena means questioning the 'distribution' of cognition, and whether it happens with or at cognitive tools. The sheer size is also responsible for Wikipedia’s success. Cooperation, as well, seems to be an essential mechanism between the different processes, between actors and cognitive tools. This thesis can be situated in the realm of the ‘high-throughput humanities’, as it proposed to analyze the task of writing articles from the perspective of studies of cognition and collaboration, with data from the writing process. It could also have been interesting to extend the investigation towards the construction of a flow chart of the cognitive processes involved. That said, much more than the available data (and more abstraction) would have been necessary for such an enterprise. In understanding how Wikipedia works, this thesis has revealed some of the mechanisms that support the Wikipedia phenomenon, which develops at different levels of complexity (from stigmergy to foundation), making use of openness, discussion pages, a common goal and common practices. This thesis is really a beginning.
As the field is new, the phenomenon is new and the approach is new, it is clear that much more work can and should be done to build upon these preliminary insights. An obvious direction is to plead for a greater integration of the disciplines, as the understanding of the insides of an article, gained in the humanities through discourse analysis and interviews, should be supplemented with network views and the incorporation of data into the understandings. Pursuing the understanding of Cognition for Planning and Cognition for Improvising would be very revealing if knowledge from more disciplines were brought in. In terms of possible research paths starting here, three suggestions are made:

Cognitive milestones: understanding better the place of Wikis and the digital age in the cognitive milestone framework. Taking Wikis as a medium, what are the consequences of their success for human cognition?

Learning and knowledge: following Dreyfus’ stage theory of competencies and mastery, investigating how the Cognition for Improvising is related to expertise and to the process of learning.

Political and educational consequences: besides the ethical discussion of technology, investigating the toll taken on the educational and political realms, and how Wikis and Wikipedia can inspire new forms of collaboration and of understanding knowledge production, and support the process of self-learning.

VIII. REFERENCES AND BIBLIOGRAPHY
Cytoscape, Available at: http://www.cytoscape.org/.
Abraham, S., 2010. Wiki’s worth, on a different turf. Business of Life - livemint.com. Available at: http://www.livemint.com/2010/01/12210114/Wiki8217s-worth-on-a-diffe.html [Accessed June 13, 2010].
Adler, T. & de Alfaro, L., 2007a. A content-driven reputation system for the wikipedia. In WWW '07: Proceedings of the 16th international conference on World Wide Web. Banff, Alberta, Canada: ACM, pp. 261-270.
Available at: http://dx.doi.org/10.1145/1242572.1242608 [Accessed September 29, 2009].
Adler, T. & de Alfaro, L., 2007b. A content-driven reputation system for the wikipedia. In WWW '07: Proceedings of the 16th international conference on World Wide Web. Banff, Alberta, Canada: ACM, pp. 261-270. Available at: http://dx.doi.org/10.1145/1242572.1242608 [Accessed June 13, 2010].
Ahn, Y., Bagrow, J.P. & Lehmann, S., 2009. Link communities reveal multi-scale complexity in networks. 0903.3178. Available at: http://arxiv.org/abs/0903.3178 [Accessed June 13, 2010].
Anderson, C., 2006. The Long Tail: Why the Future of Business is Selling Less of More, Hyperion.
Anderson, M., 2008. Embodiment and the Nature of the Mind. In Payette, N. & Hardy-Vallee, B. (eds): Beyond the Brain: Embodied, Situated and Distributed Cognition. Cambridge Scholars Publishing.
Anthony, D., Smith, S. & Williamson, T., 2005. Explaining Quality in Internet Collective Goods: Zealots and Good Samaritans in the Case of Wikipedia. In Innovation & Entrepreneurship Seminar. MIT.
Anthony, D., Smith, S. & Williamson, T., 2007. The Quality of Open Source Production: Zealots and Good Samaritans in the Case of Wikipedia. Available at: ftp://ftp.cs.dartmouth.edu/TR/TR2007-606.pdf [Accessed September 29, 2009].
Anthony, D., Smith, S. & Williamson, T., 2009. Reputation and Reliability in Collective Goods: The Case of the Online Encyclopedia Wikipedia. Rationality and Society, 21(3), 283-306.
Axelrod, R., 1997. The Complexity of Cooperation: Agent-Based Models of Competition and Collaboration, 1st ed., Princeton: Princeton University Press.
Barsalou, L., Breazeal, C. & Smith, L., 2007. Cognition as coordinated non-cognition. Cognitive Processing, 8(2), 79-91.
Bateson, G., 1972. Steps to an Ecology of Mind: Collected Essays in Anthropology, Psychiatry, Evolution, and Epistemology, 1st ed., University Of Chicago Press.
Bedau, M.A. & Humphreys, P., 2008.
Emergence: Contemporary Readings in Philosophy and Science, 1st ed., The MIT Press.
Benkler, Y., 2006. The Wealth of Networks: How Social Production Transforms Markets and Freedom, Yale University Press. Available at: http://www.amazon.ca/exec/obidos/redirect?tag=citeulike09-20&path=ASIN/0300110561 [Accessed March 14, 2010].
Beschastnikh, I., Kriplean, T. & McDonald, D., 2008. Wikipedian Self-Governance in Action: Motivating the Policy Lens. In Proceedings of the 2008 AAAI International Conference on Weblogs and Social Media.
Biuk-Aghai, R., 2006. Visualizing Co-Authorship Networks in Online Wikipedia. In Communications and Information Technologies, 2006. ISCIT '06. International Symposium on. pp. 737-742. Available at: http://dx.doi.org/10.1109/ISCIT.2006.339838 [Accessed October 17, 2008].
Borland, J., 2007. See Who's Editing Wikipedia - Diebold, the CIA, a Campaign. Available at: http://www.wired.com/politics/onlinerights/news/2007/08/wiki_tracker [Accessed June 6, 2010].
Bryant, S., Forte, A. & Bruckman, A., 2005. Becoming Wikipedian: transformation of participation in a collaborative online encyclopedia. In GROUP '05: Proceedings of the 2005 international ACM SIGGROUP conference on Supporting group work. ACM Press, pp. 1-10. Available at: http://dx.doi.org/10.1145/1099203.1099205 [Accessed April 2, 2009].
Buriol, L. et al., 2006. Temporal Analysis of the Wikigraph. In WI '06: Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence. IEEE Computer Society, pp. 45-51. Available at: http://dx.doi.org/10.1109/WI.2006.164 [Accessed September 29, 2009].
Bush, V., 1945. AS WE MAY THINK in "ATLANTIC MONTHLY" (July 1945: Volume 176, Number 1), Bound Volume July-December 1945, Fortune Magazine (1945).
Butler, B., Joyce, E. & Pike, J., 2008. Don't look now, but we've created a bureaucracy: the nature and roles of policies and rules in wikipedia.
In CHI '08: Proceeding of the twenty-sixth annual SIGCHI conference on Human factors in computing systems. ACM, pp. 1101-1110. Available at: http://dx.doi.org/10.1145/1357054.1357227 [Accessed September 29, 2009].
Callon, M., 1991. Techno-economic Networks and Irreversibility. In J. Law (ed.): A Sociology of Monsters? Essays on Power, Technology and Domination, Sociological Review Monograph. London: Routledge, pp. 132-161.
Capocci, A., Rao, F. & Caldarelli, G., 2008. Taxonomy and clustering in collaborative systems: The case of the on-line encyclopedia Wikipedia. EPL (Europhysics Letters), 81(2), 28006.
Capocci, A. et al., 2006. Preferential attachment in the growth of social networks: the case of Wikipedia. Available at: http://arxiv.org/abs/physics/0602026 [Accessed September 29, 2009].
Carr, N., 2006. Rough Type: Nicholas Carr's Blog: The death of Wikipedia. Available at: http://www.roughtype.com/archives/2006/05/the_death_of_wi.php [Accessed May 28, 2010].
Chesney, T., 2006. An empirical examination of Wikipedia's credibility. First Monday, 11. Available at: http://www.firstmonday.org/issues/issue11_11/chesney/ [Accessed June 6, 2010].
Christensen, N., 2009. Wiki Culture: En analyse af organisatorisk samarbejde på Wikipedia. IT University of Copenhagen. Available at: http://nmc273.wordpress.com/files/2009/12/speciale.pdf.
Ciffolilli, A., 2003. Phantom authority, self-selective recruitment and retention of members in virtual communities. Available at: http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/viewArticle/1108/1028 [Accessed September 29, 2009].
Clark, A. & Chalmers, D., 1998. The Extended Mind. Analysis, 58(1), 10-23.
Collins, H. et al., 2006. Experiments with interactional expertise. Studies In History and Philosophy of Science Part A, 37(4), 656-674.
Cowan, R., David, P. & Foray, D., 2000. The explicit economics of knowledge codification and tacitness. Industrial and Corporate Change, 9(2), 211-253.
Dalby, A., 2009.
The World and Wikipedia: How we are editing reality, Siduri Books.
Dewey, J., 2010. Human Nature and Conduct: An Introduction to Social Psychology, Nabu Press.
Dror, I. & Harnad, S., 2008a. Offloading Cognition onto Cognitive Technology. Available at: http://arxiv.org/abs/0808.3569 [Accessed June 6, 2010].
Dror, I.E. & Harnad, S., 2008b. Cognition Distributed: How cognitive technology extends our minds, John Benjamins Publishing Company.
Ekstrand, M.D. & Riedl, J.T., 2009. rv you're dumb: identifying discarded work in Wiki article history. In Proceedings of the 5th International Symposium on Wikis and Open Collaboration. Orlando, Florida: ACM, pp. 1-10. Available at: http://portal.acm.org/citation.cfm?id=1641309.1641317 [Accessed June 13, 2010].
Elia, A., 2006. An Analysis of Wikipedia Digital Writing. In EACL. 11th Conference of the European Chapter of the Association for Computational Linguistics. Trento, pp. 16-21.
Fallis, D., 2009. Introduction: The Epistemology of Mass Collaboration. Episteme, 6(1), 1-7.
Flyvbjerg, B., 2006. Five Misunderstandings About Case-Study Research. Qualitative Inquiry, 12(2), 219-245.
Forte, A. & Bruckman, A., 2008. Scaling Consensus: Increasing Decentralization in Wikipedia Governance. In Hawaii International Conference on System Sciences, Proceedings of the 41st Annual. p. 157. Available at: http://dx.doi.org/10.1109/HICSS.2008.383 [Accessed September 29, 2009].
Forte, A. & Bruckman, A., 2005. Why Do People Write for Wikipedia? Incentives to Contribute to Open-Content Publishing. Available at: http://www.cc.gatech.edu/~aforte/ForteBruckmanWhyPeopleWrite.pdf [Accessed September 29, 2009].
Geiger, R.S. & Ribes, D., 2010. The work of sustaining order in wikipedia: the banning of a vandal. In Proceedings of the 2010 ACM conference on Computer supported cooperative work. Savannah, Georgia, USA: ACM, pp. 117-126. Available at: http://portal.acm.org/citation.cfm?id=1718918.1718941 [Accessed March 14, 2010].
Geiger, S.
& Ribes, D., 2010. The work of sustaining order in wikipedia: the banning of a vandal. In CSCW '10: Proceedings of the 2010 ACM conference on Computer supported cooperative work. Savannah, GA, USA: ACM, pp. 117-126. Available at: http://dx.doi.org/10.1145/1718918.1718941 [Accessed June 13, 2010].
Giles, J., 2005. Internet encyclopaedias go head to head. Nature, 438(7070), 900-901.
Goldenberg, A., 2010. La Négociation des Contributions Dans les Wikis Publics: Légitimation et Politisation de la Cognition Collective. Thesis. Université du Québec à Montréal.
Halverson, C., 2002. Activity Theory and Distributed Cognition: Or What Does CSCW Need to DO with Theories? Available at: http://www.ingentaconnect.com/content/klu/cosu/2002/00000011/F0020001/00398228 [Accessed June 6, 2010].
Haraway, D., 1991. A Cyborg Manifesto: Science, Technology, and Socialist-Feminism in the Late Twentieth Century. In Simians, Cyborgs and Women: The Reinvention of Nature. Free Association Books / Routledge. Available at: http://www.amazon.ca/exec/obidos/redirect?tag=citeulike09-20&path=ASIN/1853431397 [Accessed April 2, 2009].
Harnad, S., 2005. Distributed processes, distributed cognizers, and collaborative cognition. Pragmatics & Cognition, 13(3), 501-514.
Harnad, S., 1991. Post-Gutenberg Galaxy: The Fourth Revolution in the Means of Production of Knowledge. Public-Access Computer Systems Review, 2, 39-53.
Hesse, H., 1947. The Glass Bead Game: (Magister Ludi) A Novel, Picador.
Heylighen, F., 2006. Why is Open Access Development so Successful? Stigmergic organization and the economics of information. Available at: http://arxiv.org/abs/cs.CY/0612071 [Accessed March 14, 2010].
Heylighen, F., Heath, M. & Overwalle, F., 2007. The Emergence of Distributed Cognition: a conceptual framework. In Collective Intentionality IV.
Hirschauer, S., 2010. Editorial Judgments: A Praxeology of 'Voting' in Peer Review. Social Studies of Science, 40(1), 71-103.
Hoffmeyer, J., 1996.
Signs of Meaning in the Universe, Indiana University Press.
Hollan, J., Hutchins, E. & Kirsh, D., 2000. Distributed cognition: toward a new foundation for human-computer interaction research. ACM Trans. Comput.-Hum. Interact., 7(2), 174-196.
Hutchins, E., 1995. How a cockpit remembers its speeds. Cognitive Science, 19(3), 265-288.
Hutchins, E., 1996. Cognition in the Wild New ed., MIT Press.
Ipsen, G., 2003. The crisis of cognition in hypermedia. Semiotica, 143, 185-197.
Jesus, R., Schwartz, M. & Lehmann, S., 2009. Bipartite networks of Wikipedia's articles and authors: a meso-level approach. In Proceedings of the 5th International Symposium on Wikis and Open Collaboration. Orlando, Florida: ACM, pp. 1-10. Available at: http://portal.acm.org/citation.cfm?id=1641309.1641318 [Accessed December 9, 2009].
Kittur, A. et al., 2007. Power of the few vs. wisdom of the crowd: Wikipedia and the rise of the bourgeoisie. In 25th Annual ACM Conference on Human Factors in Computing Systems (CHI 2007); 2007 April 28 - May 3; San Jose, CA.
Kittur, A. & Kraut, R., 2008. Harnessing the wisdom of crowds in wikipedia: quality through coordination. In CSCW '08: Proceedings of the ACM 2008 conference on Computer supported cooperative work. San Diego, CA, USA: ACM, pp. 37-46. Available at: http://dx.doi.org/10.1145/1460563.1460572 [Accessed September 29, 2009].
Kittur, A. et al., 2007. He says, she says: conflict and coordination in Wikipedia. In CHI '07: Proceedings of the SIGCHI conference on Human factors in computing systems. ACM Press, pp. 453-462. Available at: http://dx.doi.org/10.1145/1240624.1240698 [Accessed October 17, 2008].
Konieczny, P., 2008. Something wikid this way comes: Wikipedia as a case study of adhocratic governance in the Internet Age. In annual meeting of the American Sociological Association. Boston. Available at: http://www.allacademic.com/meta/p_mla_apa_research_citation/2/3/7/6/4/p237649_index.html [Accessed December 9, 2009].
Kriplean, T., Beschastnikh, I. & Mcdonald, D., 2008. Articulations of wikiwork: uncovering valued work in wikipedia through barnstars. In CSCW '08: Proceedings of the ACM 2008 conference on Computer supported cooperative work. San Diego, CA, USA: ACM, pp. 47-56. Available at: http://dx.doi.org/10.1145/1460563.1460573 [Accessed September 29, 2009].
Kriplean, T. et al., 2007. Community, consensus, coercion, control: cs*w or how policy mediates mass participation. In GROUP '07: Proceedings of the 2007 international ACM conference on Supporting group work. ACM, pp. 167-176. Available at: http://dx.doi.org/10.1145/1316624.1316648 [Accessed October 17, 2008].
Latour, B., 1991. Technology is Society Made Durable. In J. Law (ed.): A sociology of monsters: essays on power, technology and domination. Routledge, pp. 103-113.
Latour, B., 2005. Reassembling the social: an introduction to actor-network-theory, Oxford University Press.
Lave, J. & Wenger, E., 1991. Situated Learning: Legitimate Peripheral Participation 1st ed., Cambridge University Press.
Lehmann, S., 2010. Worlds Colliding. Complexity and Social Networks Blog. Available at: http://www.iq.harvard.edu/blog/netgov/ [Accessed June 13, 2010].
Lehmann, S., Schwartz, M. & Hansen, L., 2008. Biclique communities. Physical Review E (Statistical, Nonlinear, and Soft Matter Physics), 78(1). Available at: http://scitation.aip.org/getabs/servlet/GetabsServlet?prog=normal&id=PLEEE8000078000001016108000001&idtype=cvips&gifs=yes [Accessed September 29, 2009].
Lessig, L., 2005. Free Culture: The Nature and Future of Creativity, Penguin (Non-Classics).
Lessig, L., 2008. Remix: Making Art and Commerce Thrive in the Hybrid Economy, The Penguin Press. Available at: http://www.amazon.ca/exec/obidos/redirect?tag=citeulike0920&path=ASIN/1594201722 [Accessed March 14, 2010].
Leuf, B. & Cunningham, W., 2001. The Wiki Way: Quick Collaboration on the Web, Addison-Wesley Professional.
Lih, A., 2004.
Wikipedia as Participatory journalism: reliable sources? metrics for evaluating collaborative media as a news resource. In Proceedings of the 5th International Symposium on Online Journalism, 16-17.
Lih, A., 2009. The Wikipedia Revolution: How a Bunch of Nobodies Created the World's Greatest Encyclopedia, Hyperion.
Liu, J. & Ram, S., 2009. Who Does What: Collaboration Patterns in the Wikipedia and Their Impact on Data Quality. SSRN eLibrary. Available at: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1565682 [Accessed March 14, 2010].
Magnus, P., 2007. Distributed Cognition and the Task of Science. Social Studies of Science, 37(2), 297-310.
Mainguy, G., 2007. Wikipedia and Science Publishing. Has the Time Come to End the Liaisons Dangereuses? NATO Security through Science Series E: Human and Societal Dynamics, 16, 19-27.
Margulis, L. & Sagan, D., 2001. Marvellous Microbes. Resurgence, 206, 10-12.
Navot, E., 2008. Wikipedia as Real Utopia. Available at: http://www.stuartgeiger.com/wordpress/conference-presentations/conferencenotes/2008/07/20/wikimania-2008-wikipedia-as-real-utopia-by-edo-navot/ [Accessed April 28, 2010].
Nielsen, F., 2007. Scientific citations in Wikipedia. Available at: http://arxiv.org/abs/0705.2106 [Accessed September 29, 2009].
Niesyto, J., 2010. DAY I: Bangalore - Session 2: Global Politics of Exclusion | Wikipedia, Critical Point of View. Network Cultures. Available at: http://networkcultures.org/wpmu/cpov/2010/01/14/day-i-bangalore-session-2-global-politicsof-exclusion/ [Accessed June 14, 2010].
Nunes, S., Ribeiro, C. & David, G., 2008. WikiChanges - Exposing Wikipedia Revision Activity. In WikiSym'08: Proceedings of the 2008 international symposium on Wikis. Porto, Portugal: ACM.
Ortega, F. & Gonzalez-Barahona, J., 2007. Quantitative analysis of the Wikipedia community of users. In WikiSym '07: Proceedings of the 2007 international symposium on Wikis. ACM, pp. 75-86.
Available at: http://dx.doi.org/10.1145/1296951.1296960 [Accessed October 17, 2008].
Ortega, F., Gonzalez-Barahona, J. & Robles, G., 2007. The Top Ten Wikipedias: A Quantitative Analysis Using WikiXRay. In Proceedings of the 2nd International Conference on Software and Data Technologies (ICSOFT 2007). Springer-Verlag. Available at: http://libresoft.es/downloads/C4_159_Ortega.pdf [Accessed October 17, 2008].
Ostrom, E., 1990. Governing the Commons: The Evolution of Institutions for Collective Action, Cambridge University Press.
Palla, G. et al., 2005. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043), 814-818.
Panciera, K., Halfaker, A. & Terveen, L., 2009. Wikipedians are born, not made: a study of power editors on Wikipedia. In GROUP '09: Proceedings of the ACM 2009 international conference on Supporting group work. Sanibel Island, Florida, USA: ACM, pp. 51-60. Available at: http://dx.doi.org/10.1145/1531674.1531682 [Accessed September 29, 2009].
Payette, N. & Hardy-Vallee, B. (eds.), 2008. Beyond the Brain: Embodied, Situated and Distributed Cognition, Cambridge Scholars Publishing.
Pfeil, U., Zaphiris, P. & Ang, C.S., 2006. Cultural Differences in Collaborative Authoring of Wikipedia. Journal of Computer-Mediated Communication, 12(1), 88-113.
Pickering, A., 1995. The mangle of practice: time, agency, and science, University of Chicago Press.
Pirsig, R.M., 2008. Zen and the Art of Motorcycle Maintenance: An Inquiry into Values, Harper Perennial Modern Classics.
Priedhorsky, R. et al., 2007. Creating, destroying, and restoring value in wikipedia. In GROUP '07: Proceedings of the 2007 international ACM conference on Supporting group work. Sanibel Island, Florida, USA: ACM, pp. 259-268. Available at: http://dx.doi.org/10.1145/1316624.1316663 [Accessed September 29, 2009].
Raul654, 2005. Raul's Laws. Wikipedia. Available at: http://en.wikipedia.org/wiki/User:Raul654/Raul%27s_laws.
Reagle, J., 2005. A Case of Mutual Aid: Wikipedia, Politeness, and Perspective Taking.
Available at: http://reagle.org/joseph/2004/agree/wikip-agree.html [Accessed December 9, 2009].
Reagle, J., 2007a. Do as I do: authorial leadership in wikipedia. In WikiSym '07: Proceedings of the 2007 international symposium on Wikis. ACM, pp. 143-156. Available at: http://dx.doi.org/10.1145/1296951.1296967 [Accessed October 17, 2008].
Reagle, J., 2007b. Do as I do: authorial leadership in wikipedia. In WikiSym '07: Proceedings of the 2007 international symposium on Wikis. ACM, pp. 143-156. Available at: http://dx.doi.org/10.1145/1296951.1296967 [Accessed December 9, 2009].
Reagle, J., 2007c. Is the Wikipedia Neutral? Available at: http://reagle.org/joseph/2005/06/neutrality.html [Accessed December 18, 2009].
Reichelt, A. & Rossmanith, N., 2008. Relating Embodied and situated approaches to cognition. In Beyond the Brain: Embodied, Situated and Distributed Cognition. Cambridge Scholars Publishing.
Robbins, P. & Aydede, M., 2008. The Cambridge Handbook of Situated Cognition 1st ed., Cambridge University Press.
Rogers, Y., Scaife, M. & Rizzo, A., 2005. Interdisciplinarity: an Emergent or Engineered Process? In Interdisciplinary Collaboration. Mahwah, New Jersey: LEA.
Rogers, Y., 2004. New Theoretical Approaches for Human-Computer Interaction. Annual Review of Information Science and Technology (ARIST), 38, 87-143.
Roth, C. & Bourgine, P., 2004. Epistemic communities: description and hierarchic categorization. Available at: http://arxiv.org/abs/nlin.AO/0409013 [Accessed June 13, 2010].
Rupert, R.D., 2004. Challenges to the Hypothesis of Extended Cognition. The Journal of Philosophy, 101(8), 389-428.
Sanger, L., 2004. Why Wikipedia Must Jettison Its Anti-Elitism || kuro5hin.org. Available at: http://www.kuro5hin.org/story/2004/12/30/142458/25 [Accessed June 6, 2010].
Schroer, J. & Hertel, G., 2009. Voluntary Engagement in an Open Web-Based Encyclopedia: Wikipedians and Why They Do It. Media Psychology, 12(1), 96-120.
Shirky, C., 2008a. Cognitive Surplus Talk.
Available at: http://www.shirky.com/herecomeseverybody/2008/04/looking-for-the-mouse.html.
Shirky, C., 2008b. Here Comes Everybody: The Power of Organizing Without Organizations, The Penguin Press.
Simon, H.A., 1978. On How to Decide What to Do. Bell Journal of Economics, 9(2), 494-507.
Singel, R., 2009. Wikipedia Bans Church of Scientology | Epicenter. Wired. Available at: http://www.wired.com/epicenter/2009/05/wikipedia-bans-church-of-scientology/ [Accessed June 13, 2010].
Stvilia, B. et al., 2008. Information quality work organization in Wikipedia. Journal of the American Society for Information Science and Technology, 59(6), 983-1001.
Suchman, L., 2006. Human-Machine Reconfigurations: Plans and Situated Actions 2nd ed., Cambridge University Press.
Suh, B. et al., 2008. Lifting the veil: improving accountability and social transparency in Wikipedia with wikidashboard. In CHI '08: Proceeding of the twenty-sixth annual SIGCHI conference on Human factors in computing systems. ACM, pp. 1037-1040. Available at: http://dx.doi.org/10.1145/1357054.1357214 [Accessed September 29, 2009].
Suh, B. et al., 2007. Us vs. Them: Understanding Social Dynamics in Wikipedia with Revert Graph Visualizations. In Visual Analytics Science and Technology, 2007. VAST 2007. IEEE Symposium on. pp. 163-170. Available at: http://dx.doi.org/10.1109/VAST.2007.4389010 [Accessed October 17, 2008].
Suoranta, J. & Vaden, T., 2010. Wikiworld, Pluto Press.
Susi, T. & Ziemke, T., 2001. Social cognition, artefacts, and stigmergy: A comparative analysis of theoretical frameworks for the understanding of artefact-mediated collaborative activity. Cognitive Systems Research, 2(4), 273-290.
Swartz, A., 2006. Who Writes Wikipedia? (Aaron Swartz's Raw Thought). Available at: http://www.aaronsw.com/weblog/whowriteswikipedia [Accessed June 6, 2010].
Turkle, S. & Papert, S., 1990. Epistemological Pluralism: Styles and Voices within the Computer Culture. Signs, 16(1), 128-157.
Varela, F., 1999.
Ethical Know-How: Action, Wisdom, and Cognition 1st ed., Stanford University Press.
Venners, B., 2003. Exploring with Wiki, a conversation with Ward Cunningham. Available at: http://www.artima.com/intv/wiki.html [Accessed June 13, 2010].
Verbeek, P., 2005. What Things Do: Philosophical Reflections on Technology, Agency, And Design illustrated ed., Pennsylvania State University Press.
Viégas, F., Wattenberg, M., Kriss, J. et al., 2007. Talk Before You Type: Coordination in Wikipedia. In System Sciences, 2007. HICSS 2007. 40th Annual Hawaii International Conference on. p. 78. Available at: http://dx.doi.org/10.1109/HICSS.2007.511 [Accessed September 29, 2009].
Viégas, F., Wattenberg, M. & Mckeon, M., 2007. The Hidden Order of Wikipedia. In Online Communities and Social Computing. pp. 445-454. Available at: http://dx.doi.org/10.1007/978-3-540-73257-0_49 [Accessed September 29, 2009].
Viégas, F.B., Wattenberg, M. & Dave, K., 2004. Studying cooperation and conflict between authors with history flow visualizations. In CHI '04: Proceedings of the 2004 conference on Human factors in computing systems. ACM Press, pp. 575-582. Available at: http://portal.acm.org/citation.cfm?id=985765.
Wales, J., 2005. [Wikipedia-l] Wikipedia is an encyclopedia. Available at: http://lists.wikimedia.org/pipermail/wikipedia-l/2005-March/020469.html [Accessed June 6, 2010].
Wattenberg, M., Viégas, F. & Hollenbach, K., 2007. Visualizing Activity on Wikipedia with Chromograms. In Human-Computer Interaction – INTERACT 2007. pp. 272-287. Available at: http://dx.doi.org/10.1007/978-3-540-74800-7_23 [Accessed September 29, 2009].
Weinberger, D., 2007. Everything is Miscellaneous: The Power of the New Digital Disorder, Henry Holt & Company Inc. Available at: http://www.amazon.ca/exec/obidos/redirect?tag=citeulike09-20&path=ASIN/0805080430 [Accessed May 31, 2010].
Wilkinson, D. & Huberman, B., 2007. Assessing the Value of Cooperation in Wikipedia. First Monday.
Available at: http://arxiv.org/abs/cs.DL/0702140 [Accessed September 29, 2009].
Wilson, R. & Clark, A., 2009. How to Situate Cognition: Letting Nature Take its Course. In Robbins, P. & Aydede, M. (eds.): The Cambridge Handbook of Situated Cognition. Cambridge University Press.
Zlatic, V. et al., 2006. Wikipedias: Collaborative web-based encyclopedias as complex networks. Available at: http://arxiv.org/abs/physics/0602149 [Accessed June 13, 2010].

Appendices are parts of the work that did not find a prime spot in the main thesis but nonetheless insist on being reported. This section contains four kinds of material. Appendix W holds the small studies on other Wikis related to the data studies, namely the Nostalgia, Danish and Esperanto Wikipedias. Appendix X holds some reflections on the process of writing a thesis and on methodological concerns. Appendix Y holds the papers in their published form. Appendix Z holds small experiments related to the experiential part of the thesis. The last item presented here, but the beginning of a new cycle, is the accompanying piece to this thesis, called Meta-PhD Offense, whose opening will take place on the afternoon of the PhD defense.

W. APPENDIX W
1 ON WHOLE DATASETS AND PROJECTS THAT FALL THROUGH
A thesis comprises the studies that are more mature: those that yielded something, those that have been presented, discussed and developed. But other studies are part of the process. Other data sets are used as pilot studies, as ‘sandboxes’ in which to try ideas. Below are four of these. The first is a project with Anne Goldenberg. We defined clearly what we would be studying, and we presented it in a conference workshop. We learned that we needed a programmer. We learned that there is no money for programmers. We learned that we had to drop the project. In the second, I show some curiosities from Wikipedia Nostalgia, a version of Wikipedia from back in 2001.
Looking at Wikipedia Nostalgia is looking at the past, at a fossil. I am fascinated by it, but I was not able to use it as a study beyond a presentation, which is reproduced here in simple text form. The third and fourth parts of this Appendix W are the networks of the two studies that served as pilots for the data studies. One is on the Danish Wikipedia (på dansk), and the other is on the Esperanto Vikipedio (en Esperanto).

2 WIKI-WRITING COLLABORATIVE PATTERNS
Anne Goldenberg (Université du Québec à Montréal) and Rut Jesus (University of Copenhagen)

2.1 Goal
This project aims at an in-depth understanding of collaborative patterns in the writing of Wikis, their articles and their discussions. It focuses on the article as the unit around which a small community revolves. In the writing of Wikis, distinct patterns of work division emerge, which are distinguishable both in quantity and in quality. The goal of this project is to unveil these patterns along with possible examples of distributed cognition. To do this, we have developed a semi-automated methodology (still to be implemented), which sits at the border between quantitative and qualitative research. It tries to automate the careful reading of the evolution of an article and to capture the important actions taken in its development.

2.2 Framework
Cooperative phenomena are an emergent, much discussed and poorly understood reality in socio-technological spheres, of which Wikis are a particularly successful case, Wikipedia being the prime example. These endeavors in collaborative work provide insight into distributed cognition, because Wiki log pages supply a layer of transparency that makes it possible to follow what happens.
Moreover, web 2.0 tools create new social and work dynamics, which may provide insight into new aspects of cognition. These aspects are amplified by a multitude of editors and their ping-pong style of editing, by spatial and time flexibility, by unique technology-community fostering features and by a specific free culture. This project is the methodological framework for the collection of data that will be used in an empirically informed sociological study. As there is little methodology available for looking at Wiki articles, we have developed a bottom-up methodology that allows cases of distributed cognition, and collaboration patterns in general, to be seen. This is necessary in order to unwrap the invisible work.

2.3 Genesis
We were inspired by a methodology developed by Conein et al. to analyze the discussion patterns within an archive of the Debian user mailing list. Their goal is to understand how knowledge is distributed and built through cognitive artifacts and within epistemic communities. To understand how knowledge emerges from discussions and, more specifically, how threaded communications act upon knowledge exchange, they set up a method inspired by graph theory to visualize conversation patterns. They were able to study a large corpus covering 10 years (1997-2007) of message exchanges, that is, 160461 messages, 43526 threads, 10029 authors and 111 intensive users. By looking at who was talking to whom, they discovered that knowledge providers chose which knowledge seekers to answer according to the interest they found in the question. Easy questions received very short answers. Answers became a thread when multiple points of view were given on the question, creating a fan-shaped thread. Difficult questions brought more discussion: a thread would develop between two or a few people who replied to each other rather than simply giving an answer. This pattern of discussion created a line-shaped thread.
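The thread shapes described above can be told apart from the reply structure alone. The following Python sketch is only an illustration under our own assumptions, not the procedure used by Conein et al.: the `Message` record, the function name `thread_shape` and the majority rule used to separate fan from line are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Message:
    msg_id: str
    author: str
    parent_id: Optional[str]  # None for the opening question


def thread_shape(messages: List[Message]) -> str:
    """Label a thread 'no thread', 'fan' or 'line' (illustrative heuristic)."""
    parents = {m.msg_id: m.parent_id for m in messages}

    def depth(msg_id: str) -> int:
        # Number of reply hops from this message back to the opening question.
        hops = 0
        while parents[msg_id] is not None:
            msg_id = parents[msg_id]
            hops += 1
        return hops

    replies = [m for m in messages if m.parent_id is not None]
    if len(replies) <= 1:
        return "no thread"  # a question with at most one short answer

    # Assumption: a fan is a thread in which most replies answer the
    # question directly; otherwise the replies chain into a line.
    direct = sum(1 for m in replies if depth(m.msg_id) == 1)
    return "fan" if direct > len(replies) / 2 else "line"
```

A question with three independent answers would be labeled a fan, while a back-and-forth between two people replying to each other would be labeled a line. Real threads mix both patterns, so any such rule would need to be checked against hand-coded examples.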
After Conein presented this research at Minds and Societies 2008, a conference and summer school about social cognition, the social cognition theoretician Stevan Harnad asked him why the method did not go deeper into content analysis, for example by trying to find out what people were talking about and what kind of topic or intervention brought about these kinds of shapes. Conein agreed that this would require a qualitative methodology. Building on this exchange between Conein and Harnad, we decided to develop a qualitative methodology to analyze Wiki contributions, including the discussions about contributions.

2.4 Methodology
Below we present the methodology that can be implemented in order to investigate the properties of Wiki-writing. It comprises many questions and steps, in an attempt to map comprehensively the interactions of the actors in their contributions and discussions. We have set up a list of questions that would allow a multi-level analysis of the interactions occurring in the building of a Wiki page. We are interested in pages that attract a lot of editing as well as a lot of discussion. For now, we have established five questions concerning contributions to the Wiki page itself. Three other questions concern the discussions about the Wiki page. We use the following data sets for each of the articles to be studied: the article, the log of its edits (history page), the discussion page (or discussion list) and the log of the edits in the discussion page. Depending on the Wiki project, the discussions may occur on a related page or on a discussion mailing list. We are aware, though, that upon implementation the codes, links and necessary pieces of information will have to be adjusted to the data and the data-collection process. Moreover, not all of the process can be automated, so ‘human eyes’ will occasionally be used to decide categories, links and the like.
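The four data sets listed above could be held together in one record per article. The Python sketch below is a minimal illustration under our own assumptions: the names `Revision` and `ArticleDataSet` and their fields are hypothetical, the logs are assumed to be sorted from oldest to newest, and discussions held on a mailing list are not covered.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Revision:
    editor: str     # user name or IP address as it appears in the log
    timestamp: str  # kept as plain text, e.g. "2008-07-20T10:15:00Z"
    text: str       # full page text after this edit


@dataclass
class ArticleDataSet:
    title: str
    # History page of the article, oldest revision first (assumption).
    article_log: List[Revision] = field(default_factory=list)
    # History of the discussion page, oldest revision first (assumption).
    discussion_log: List[Revision] = field(default_factory=list)

    def article(self) -> str:
        """Current article text: the text of the latest logged revision."""
        return self.article_log[-1].text if self.article_log else ""

    def discussion(self) -> str:
        """Current discussion page text, empty if nothing was logged."""
        return self.discussion_log[-1].text if self.discussion_log else ""

    def edits_per_editor(self) -> Counter:
        """Number of article edits made by each editor."""
        return Counter(rev.editor for rev in self.article_log)
```

Storing the full text of every revision is the simplest choice for a sketch; for long histories one would probably store differences between revisions instead.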
We intend to use this methodology to sketch the collaboration patterns of Wiki-writing and therefore want to apply it to articles from different Wikis: English Wikipedia, French Wikipedia (Quebec Project), Debian Wiki and Ubuntu Wiki.

2.5 Contributions
2.5.1 Who is editing?
For this question, we have to identify user participation, that is, to attribute a number to each contributor. This question will allow us to analyze which users took part in building the page. In Wikipedia, an archive of each user's contributions is easily available, and it includes quantitative data.

2.5.2 Who is making which kind of edit?
Here, we have to categorize the different kinds of edits occurring in a Wiki. Adding, deleting and modifying information would be the three major kinds of edits. To go further, we could consider the addition of URLs or of Wiki links, and grammatical modifications. Pfeil, Zaphiris and Ang (2006) made an even more extensive categorization. Once this categorization is done, we can qualify each intervention.

2.5.3 Who is editing whose text?
This question is required in order to relate each addition, deletion and modification of text to previous additions, deletions and modifications. For that, each edit has to be attributed to a contributor (see question 2.5.1).

2.5.4 Who changes the structure?
This question concerns major changes that deal with the structure. Such a change may be an edit that creates, deletes or modifies a section title. A major edit could also alter the structure of the text, for example by changing the order of the paragraphs.

2.5.5 Who is editing which section?
This question would allow us to identify with more precision which section each user edited. It could be refined with a frequency indicator. To do so, each edit would be related to a section title.

2.6 Discussions
This second set of questions is aimed at relating discussions to contributions.
We want to understand to what extent discussions act upon knowledge construction and the distribution of work. Do we find the same users contributing and discussing? Or are there, on the contrary, contribution specialists and discussion specialists? Are discussions really affecting contributions? If so, how?

2.6.1 Who is talking to whom?
This question would first allow us to make a list of the users who take part in the discussion. The list of participants in the discussion (with frequency data) should be compared with and linked to the list of contributors. This question replicates the method of Conein et al. for analyzing interactions within the Debian user list. As questions, answers and replies occur in a Wiki discussion page in the same shape as in a discussion list, they can be approached in the same way. The purpose will be to see whether questions lead to short answers (no thread), multiple answers (fan-shaped thread) or a list of replies (line-shaped thread). Conein's theory is that knowledge is being discussed (and produced) when there is a line-shaped thread: knowers disagree on a question because the knowledge about that point has not stabilized. The challenge here would be to analyze to what extent the community considers discussions to be constructive, necessary or a waste of time.

2.6.2 Who is talking to whom about what?
To go further in the discussion analysis, this question will allow us to identify the subject of the discussion. Most of the time, questions have a title, and answers and replies copy the title to identify what they are about. Combining this question with 2.6.1 will allow us to see who is talking to whom about what, and so to analyze, for example, which subjects bring up discussions and what kind of discussions (or thread shapes).

2.6.3 Which section is the thread about?
This last question is another attempt to relate discussion content to contribution content.
Together with question 2.6.2, linking each thread to a section title could also be an interesting way to see what kind of content brings up discussions.

2.7 Conclusion
The presentation will end with a discussion of the purpose of this methodology in relation to other existing analysis tools such as History flow, Graphing Wiki and WikiXRay, and to computer-assisted qualitative data analysis software (CAQDAS).

2.8 Bibliography
Conein, B., Auray, N., Dorat, R. and Latapy, M. (2007). “Multi-level analysis of an interaction network between individuals in a mailing-list”, in Annals of Telecommunications, Vol. 62, No. 3-4, April.
Pfeil, U., Zaphiris, P., and Ang, C. S. (2006). Cultural differences in collaborative authoring of Wikipedia. Journal of Computer-Mediated Communication, 12(1), article 5. http://jcmc.indiana.edu/vol12/issue1/pfeil.html

3 NOSTALGIA
One ‘side study’ concerns the “Nostalgia Wikipedia”. Nostalgia is a copy of Wikipedia from its first year (or even less), when it had far fewer articles and editors and a different logo. It is very interesting to study Nostalgia and its patterns because it gives information about the beginning of Wikipedia. Alongside is the original logo and, below it, how the homepage looked.

Below is a list of the first 100 articles [60].

Interesting to note is the article in 15th position, RidiCulous [61], started on the 28th of January 2001:
“A misspelling, based on a mispronunciation, of "really clueless."----Or is it "really queueless" as in the model Registry of Motor Vehicles office?”

Interesting to note too is the 67th article, MetaphoR [62], started on the 15th of February 2001:
A figurative use of language that is used to paint one concept with the attributes normally associated with another. For example, "ship of state" is a metaphor that likens the government to a ship: just as a ship needs a captain to make decisions, give orders, and control things, so a government needs someone to make decisions, give orders, and control things.
By refering to the "ship of state" one emphasizes this aspect of government. Metaphors are very powerful tools because they allow for the expression of very abstract principles by reference to concretes. They can also be dangerous to understanding, in that people may fail to recognize the figurative nature of a metaphor, and come to take it literally.
[60] From http://nostalgia.Wikipedia.org/Wiki/Special%3AAncientPages
[61] http://nostalgia.Wikipedia.org/Wiki/RidiCulous
[62] http://nostalgia.Wikipedia.org/Wiki/MetaphoR
Interesting to note are the articles in 94th and 95th position: København [63] and Copenhagen [64]. While “København” reads “Danish for Copenhagen”, “Copenhagen” says:
Copenhagen is the capital of Denmark and the largest city in Scandinavia. The Danish name for the city is København. Copenhagen is located on eastern shore of the Sjæland island, across the body of water known as the Øresund, which faces the Swedish town of Malmø on the other side. Since the summer 2000, the cities of Copenhagen and Malmø have been connected by a toll bridge/tunnel which allows both rail and road passengers to cross. As a result, Copenhagen has become the center of a larger metropolitan area which spans both nations.
Places of note in Copenhagen
Tivoli Gardens?
Christiania
From January 15th to December 20th of 2001, the first year of Wikipedia, 19000 articles were started.
Other language editions were also started: Afrikaans, Arabic (Araby), Catalan (Català), Chinese (Hanyu), Danish (Dansk), Dutch (Nederlands), German (Deutsch), Esperanto, French (Français), Hebrew (Ivrit), Hungarian (Magyar), Italian (Italiano), Japanese (Nihongo), Norwegian (Norsk), Portuguese (Português), Russian (Russkiy), Spanish (Castellano), Swedish (Svensk), Polish (Polska), Basque (Euskara).
3.1 Networks
The two networks below, which are quite different because of the different filters (a minimum of 2 edits per editor per page, and a minimum of 10 edits per editor per page), show that the early days of Wikipedia saw an explosion of pages; only later came the explosion of editors.
Figure 3-1: On the left, the network of editors (blue dots) and pages (grey dots) in the Nostalgia Wikipedia. Notice the several cases where one editor created many pages, untouched by other editors (the concentric circles). On the right, with the edit filter raised to 10 (the value used in most of the other networks studied in this thesis), only a few connections are kept, and most are still of the type where one editor is solely responsible for a page.
Only later did it become general practice to edit ‘together’. This is understandable: if one were about to edit an encyclopedia in its earliest stages, one would naturally devote more time to creating necessary new pages than to editing in someone else’s realm, concerned only with improving the work.
[63] http://nostalgia.Wikipedia.org/Wiki/København
[64] http://nostalgia.Wikipedia.org/Wiki/Copenhagen
4 DANISH
In the study of the meta-articles I initially had several groups of data. Besides the Portuguese and Spanish Wikipedias, there were Wikipedias in Esperanto, Danish, Swedish and Norwegian Nynorsk. The original intention was to examine the Scandinavian Wikipedias in detail as well.
In the end, however, only the South American Wikipedias were included, because the results were to be presented in Buenos Aires, where it was interesting to communicate with Portuguese and Spanish users. It was also a question of the project's scope and time. One has to limit one's investigations, after all, so as not to end up writing several PhD theses in one! It has been frustrating not to be able to examine every question and perspective thoroughly, but this is of course necessary in order to finish the project.
4.1 Statistics
105879 content pages
274027 pages
3110107 page edits since Wikipedia was started
66913 registered users
1477 active users (users who have performed at least one action during the last month)
114 bots
36 administrators
2 bureaucrats
2 checkusers
Extracted on 30 March 2009.
Wiki | Content articles | Pages | a) Articles in 4 “namespaces” (wp, wt, h, ht) | b) Users | Users per meta-article (b/a)
DaWiki (4) | 105879 | 274027 | 5332 | 3896 | 0.73
EoWiki (4) | 112434 | 249335 | 3424 | 2006 | 0.59
PtWiki (4) | 468999 | 1825914 | 33690 | 35368 | 1.05
EsWiki (4) | 457836 | 1610623 | 14975 | 77036 | 5.14
MetaWiki (4) | 14306 | 131443 | 28541 | 82345 | 2.89
Table 1: Comparative statistics for the meta-articles of 5 Wikis.
The network picture includes data from the following ‘namespaces’ of the Danish Wikipedia: Wikipedia, Wikipedia_diskussion, help and help_diskussion. The network picture can be seen below, filtered to 10 (at least ten page edits per user per link).
Figure 4-1: The network between meta-articles and users of the Danish Wikipedia. Blue represents people, grey represents pages. The ‘umbrella’ at the bottom is “Wikipedia: Sandkassen” (the sandbox).
5 ESPERANTO
The first study with bicliques was carried out with data from the categories ‘Esperanto’ and ‘String theory’ in the English Wikipedia. These were used because they are smaller, so the procedure could be practiced before it was applied more widely. In this way one can rehearse and anticipate the problems. Both categories had about a hundred articles.
The Esperanto Wikipedia was one of the first. Unlike many other languages, Esperanto had no encyclopedia before. Esperanto is indeed not everyone's language, but it continues to exist and in its own way contributes to the good of the world, probably even to peace. Here I show some of the networks that can be made from the ‘meta’ part of the Esperanto Wikipedia and from the Esperanto articles in the English Wikipedia, which numbered about a hundred two years ago. First, I want to tell a story that Esperantists will appreciate.
5.1 A story
One of the great successes of Wikipedia (of the whole project) is that it exists in so many languages, and that, even though English is the largest, there is a desire and an effort to encourage the existence of other languages. In the summer of 2009, I attended the WikiMania conference in Buenos Aires, an important and interesting conference about Wikipedia and the other projects of the Foundation. In fact, Jimmy Wales's talk was about languages: how large Wikipedia is in the various languages, and how very important it is to have so many languages. The four Esperantists present met there. When we talked with Jimmy, he explained that Esperanto had in fact been very important in Wikipedia's past, also in helping Wikipedia reach many languages. In the beginning a programmer was needed, and Chuck Smith, already a Wikipedia collaborator and the creator of the Esperanto Wikipedia, suggested to Jimmy that one possible person would be Brion Vibber. Brion was also an Esperantist. And when he began working on the programming of Wikipedia, he took great care that it would be possible to write in it in other languages. Thus, not only is Esperanto among the twenty largest Wikipedias, but it also, in a way as a bridge language, served to open the project to a large part of the world.
5.2 Some statistical information
112434 content pages
249335 pages
2192323 page edits since Wikipedia was set up
16372 registered users
483 active users (users who have performed an action in the last 30 days)
93 bots (list of members)
15 administrators
2 bureaucrats
From the statistics page, 30 March 2009.
5.3 Networks
The data for building the network of the meta-articles was taken from the namespaces “Vikipedio:” and “Vikipedia_diskuto:”.
Figure 5-1: Network between the users and the meta-articles in the Esperanto Wikipedia, filtered to ten (to reduce the number of page edits). It is interesting to note how some users are solely responsible for many pages, which shows as a circle of pages surrounding that user. The ‘umbrella’ of nodes at the bottom connects many users to the page “Diskutejo”, which is equivalent to the “Village Pump” in the English Wikipedia and to the “Esplanada” in the Portuguese Wikipedia.
Also, one of the first studies in thinking through this thesis was to capture the articles from the English Wikipedia in the category “Esperanto” and in the category “String theory”. Those networks are shown below. As can be seen, there are many more blue nodes (people), because these networks are not filtered; that is why there are many single users. In both cases, these networks look like communities of active editors concentrated on these topics.
Figure 5-2: Network of articles and editors in the category “Esperanto” in the English Wikipedia. The largest umbrella in the middle surrounds the article “Esperanto”.
Figure 5-3: Network of articles and editors in the category “String theory” in the English Wikipedia. The largest umbrella on the left surrounds the most important article in the category, which is called “String theory”.
X. APPENDIX X
1 REFLECTIONS INSIDE
"everything I write, of course is an extended metaphor for something I never mention" Hugh MacDiarmid
As the old techie saying goes, it's not a bug, it's a feature. Black boxes are everywhere.
I usually carry one around, it has some dark brown hair on the outside, plus some cavities. I have some access to what is going on inside, especially in peak times, headaches and eurekas. I am interested in opening it as much as possible, to research my thinking. It is difficult. It won’t ever happen. But the attempt counts. Even if ever so shallow. It is quite networked thinking, spread across many sides of existence. Not so hierarchical. If it was, I’d be able to tell a story of first doing this and then doing that. But much happened in parallel.
‘your thesis is done when it is finished’ Latour
I liked to use a case study. Deal with the concrete, rather than just ‘being in the air’. Even with the interdisciplinary tricks and the artistic hopes, a PhD did give me the focus of a stable ground, a stable ground to be able to fly from….
And, oh! what beautiful years were these
When our hearts clung each to each;
When life was filled and our senses thrilled
In the first faint dawn of speech.
Thus life by life and love by love
We passed through the cycles strange
And breath by breath and death by death
We followed the chain of change.
Langdon Smith
I certainly used both Cognition for Planning and Cognition for Improvising, and tried to keep my gooey tendencies inside of the prickly demands. Much of what I learned about Wikipedia and cognition, I was applying to getting to know my thought process. For example, reading about Know-How/Know-What gave me insights about my longings to ‘become’ instead of ‘have’. Reading about bricoleurs made me understand some of the struggles with gender, career, handling the concrete, and seeing life as one long conversation.
It provides examples of the validity and power of concrete thinking in situations that are traditionally assumed to demand the abstract.
It supports a perspective that encourages looking for psychological and intellectual development within, rather than beyond, the concrete and suggests the need for closer investigation of the diversity of ways in which the mind can use objects rather than the rules of logic to think with. Turkle & Papert
2 REFLECTIONS ON METHODOLOGY
2.1 About abduction
“How can I know what I think until I see what I say”
The classical accounts of scientific methodology, focusing only on the hypothetico-deductive and/or inductive aspects of hypothesis making and testing, are not enough to account for what goes on in a typical research project. Much of scientific reasoning, of everyday reasoning and of the reasoning used to produce these studies was accomplished by abduction. In most scholarly work, abduction is used, although rarely acknowledged. This said, it has been challenging to formulate the problem in a way that is not hypothesis-making and testing-proving but can simultaneously guide and focus the research. Abduction is a non-mechanical mode of reasoning, as Peirce (1903) so well described:
“Abduction is the process of forming an explanatory hypothesis. It is the only logical operation which introduces any new idea; for induction does nothing but determine a value, and deduction merely evolves the necessary consequences of a pure hypothesis. Deduction proves that something must be; Induction shows that something actually is operative; Abduction merely suggests that something may be.”
With abduction at hand there was space for the information to surface ‘naturally’ without choosing beforehand ‘what to see’. This doesn’t happen ‘per se’, but through many cycles of understanding.
2.2 On the process from idea to ‘idea that works’
The process of getting data to ‘work’ is particularly troublesome. Along the way there are many decisions that one has to take, all with relevance to the final results.
The process included having an idea; discussing it with the colleague at length; explaining it in detail to the programmer (‘ordering a program’); learning how to use the little program (how to run it, where to put parameters); and running a number of samples and seeing, either in the raw data or in the graphs, whether it added up to what was expected. If it did, then a new idea would come that also needed a special program or parameter, and one would go back to the programmer, and so on, iteratively. If it didn’t do what was expected, one would talk to the programmer and the colleague again to try to figure out what could have been misunderstood, or what the program could be doing that wasn’t expected. For each small idea, a number of weeks were needed to get from ‘having idea’ to ‘being able to use idea’. For a bigger idea, a couple of months would pass from ‘having idea’ to ‘using idea’ to ‘seeing idea’. For example, when we first extracted the data of all the philosophy and physics articles from the English Wikipedia, we had to ask how to define ‘all articles’. We decided that depth 3 would do. How to extract depth three? Open up the tree to depth three and compile a list of everything that comes inside. And how to get the file? Download it from Wikipedia, but there are several versions (with history, with text, etc.), and some of them are seriously big, needing storage space and hours to search for anything in them. We downloaded the version without text but with histories; it turned out that only these were needed, because we were only investigating the links between articles and editors, not the precise text they had edited. So we had one list of editors, one of articles, and one of the links between them, but this was still too big to compute on our systems in reasonable time (hours, days). So filtering was the option.
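The depth-three extraction mentioned above (opening the category tree to depth three and compiling a list of everything inside) can be sketched as a breadth-first walk. This is only an illustration, not the program that was actually written for the thesis: the function name `articles_to_depth`, the two dictionaries and the toy category names are hypothetical, and the sketch assumes that depth is counted in steps below the root category, a convention the text does not spell out.

```python
from collections import deque

def articles_to_depth(root, subcategories, articles, max_depth=3):
    """Collect all articles reachable from `root` within `max_depth` category steps."""
    seen = {root}                 # categories can be nested recursively, so avoid revisits
    queue = deque([(root, 0)])
    collected = set()
    while queue:
        category, depth = queue.popleft()
        collected.update(articles.get(category, ()))
        if depth == max_depth:
            continue              # do not open the tree any further
        for child in subcategories.get(category, ()):
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return collected

# Hypothetical toy category tree (not real Wikipedia data).
subcategories = {
    "Physics": ["Mechanics"],
    "Mechanics": ["Classical mechanics"],
    "Classical mechanics": ["Rigid bodies"],
    "Rigid bodies": ["Gyroscopes"],
}
articles = {
    "Physics": ["Physics"],
    "Mechanics": ["Force"],
    "Classical mechanics": ["Newton's laws of motion"],
    "Rigid bodies": ["Moment of inertia"],
    "Gyroscopes": ["Gyrocompass"],
}

physics_articles = articles_to_depth("Physics", subcategories, articles, max_depth=3)
```

In this toy tree the category at the fourth step ("Gyroscopes") is never opened, so its article is left out of the list.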
There was a long discussion on what kind of filtering to use: should we just drop all those who aren’t registered, all the anonymous editors? On what grounds? Should we ignore all those who made only one edit? We made a graph of the distribution of edits by editors and articles, to get an impression of where we could filter without losing too much relevant data. Then we realized that thresholding by number of edits alone was not enough to reduce our database. So I suggested that we threshold by number of edits per article. That would reduce it considerably. How to choose the number, though? Seven seemed to be the lowest that we could compute, but because seven yielded very dense clusters, we also thresholded at ten (getting two filters, and two datasets).
2.3 On improving the theory and the tools while researching
Soon into the data acquisition process it becomes clear that the tools don’t do all one wants them to do. For example, while using BCFinder, we improved it in several ways to cater to my dataset and research needs: it got a right-click function on the nodes, to make it easy to open them directly in Wikipedia (either the page or the user page); it asks for the minimum and maximum thresholds of a and b, and computes only within those two values for each parameter; and the network of communities was introduced and made easy to visualize, to show the relations between the articles and editors in each biclique community. Also, the studies were built upon the know-how gained in the previous ones. That is to say, once one has gone through all those choices and small programs, it is easier to follow a recipe and be faster in each step. But still, each dataset needs its own work. For example, when the data were extracted for the Portuguese, Spanish and MetaWiki (and some others; please see Appendix W), it was easy to try a couple of filters for how many edits per article an author should have made, and settle for ten.
But then it was still a bit too big to be computed, and it was necessary to use the threshold at a and b. So, a new little program was written (a group of them)...
Y. APPENDIX Y
This appendix reproduces already published papers, in the published format, that are part of the main corpus of the thesis. No new information is included.
1 BIPARTITE NETWORKS OF WIKIPEDIA’S ARTICLES AND AUTHORS: A MESO-LEVEL APPROACH
2 WHAT COGNITION DOES FOR WIKIS
Bipartite Networks of Wikipedia’s Articles and Authors: a Meso-level Approach
Rut Jesus, Center for Philosophy of Nature and Science Studies, University of Copenhagen, Blegdamsvej 17, 2100 Copenhagen, Denmark. +4561339903. vulpeto@gmail.com
Martin Schwartz, IT University of Copenhagen, DK-2300 Copenhagen S, and Informatics and Mathematical Modelling, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark. +45 50571799. the1schwartz@gmail.com
Sune Lehmann, Center for Complex Network Research and Department of Physics, Northeastern University, Boston, and Center for Cancer Systems Biology, Dana-Farber Cancer Institute, Harvard University, Boston, MA 02115, USA. +1(617)3738806. sune.lehmann@gmail.com
ABSTRACT
This exploratory study investigates the bipartite network of articles linked by common editors in Wikipedia, ‘The Free Encyclopedia that Anyone Can Edit’. We use the articles in the categories (to depth three) of Physics and Philosophy and extract and focus on significant editors (at least 7 or 10 edits per each article). We construct a bipartite network, and from it, overlapping cliques of densely connected articles and editors. We cluster these densely connected cliques into larger modules to study examples of larger groups that display how volunteer editors flock around articles driven by interest, real-world controversies, or the result of coordination in WikiProjects. Our results confirm that topics aggregate editors; and show that highly coordinated efforts result in dense clusters.
1. INTRODUCTION
Wikipedia is a good example of social production of knowledge. Authors and articles constitute a network, which we study here at the meso-level. Investigations of knowledge-producing agents and their networks are of interest to network and quantitative analysis studies as well as to the social sciences. Moreover, it is particularly interesting to try to understand the network structure and dynamics inferred from low-level information subsequently complemented with higher-level information. Wikipedia’s network of authors and articles is more horizontal than other networks (for example, those of the peer-reviewed scientific literature); e.g., it has more edits per person and per article.
1.1 Related Literature
1.1.1 Network analysis
Network analysis has previously been used to describe Wikipedia’s growth. For instance, Capocci et al. (2006) [1] delineate the properties of the growth of Wikipedia as a network, with topics modeled as vertices and hyperlinks between them represented as edges. This study shows how the growth of Wikipedia can be described with local rules such as preferential attachment, while contributors are still free to act globally in the network. It has also been discovered that many network characteristics are similar between different language versions of Wikipedia; examples are degree distribution, growth, reciprocity and clustering (Buriol et al, 2006 [2]; Zlatic et al, 2006 [3]).
Categories and Subject Descriptors: H.5.3 [Information Interfaces and Presentation]: Group and Organization Interfaces - Computer-supported cooperative work, Web-based interaction; K.4.3 [Computers and Society]: Organizational Impacts - Computer-supported collaborative work; J.4 [Social and Behavioral Sciences]: Miscellaneous.
General Terms: Algorithms, Design, Human Factors.
Keywords: Bicliques, Wikipedia, Collaboration, Meso-level.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. WikiSym '09, October 25-27, 2009, Orlando, Florida, U.S.A. Copyright 2009 ACM 978-1-60558-730-1/09/10.
1.1.2 Quantitative analysis
Wikipedia users have been analyzed quantitatively by Ortega and Gonzalez-Barahona (2007) [4] in a framework where editors were classified by their activity during specific time periods. A comparison between imposed classifications and real clustering was performed by Capocci, Rao and Caldarelli (2008) [5].
1.1.3 Cooperation
The level of cooperation in Wikipedia has been carefully analyzed by Viegas et al. (2007) [6], who stress the need to study Wikipedia’s growth in terms of its clusters and namespaces beyond the articles. These authors emphasize that the fastest growing areas (namespaces) in Wikipedia are devoted to the coordination of article-writing and to conventions. They create a grid of categories and code the contents of discussion pages according to that grid. They find that these pages mostly act as a place for strategic planning of edits and enforcement of standard guidelines. A study by Wilkinson and Huberman (2007) [7] sheds light on the stochastic mechanism by which articles accrete edits. They show that there is a positive correlation between article quality and number of edits, thereby validating Wikipedia as a successful collaborative effort.
A more recent study by Kittur and Kraut (2008) [8] specifies more precisely the impact of adding editors on the quality of articles: adding editors improves the quality of an article in its formative stage, and when the coordination is done directly in the writing of the article, but adding editors can be harmful when the coordination is done explicitly in talk pages.
1.1.4 Visualizations
The visualization of collaboration within Wikipedia is also an active field of research; the tools include: (1) history flow (Viégas et al, 2004 [9]), an application that can be used to visualize the contributions to an article; (2) visualization of whole coauthorship networks (Biuk-Aghai, 2006 [10]); and (3) revert graph visualizations (Suh et al, 2007 [11]).
1.2 Conceptual Framework
1.2.1 Meso-zoom
Most of the work referenced above focuses either on the global statistics of the entire Wikipedia project or on atomic descriptions of individual articles. However, collaboration in Wikipedia occurs at the meso-level, where groups of people collaborate in order to create articles. We here focus on the meso-level, not only in terms of scale, but also in terms of analysis. This is a study where low-level phenomena (i.e., agents and their interactions and behaviors) inform a higher level: that of clusters of articles and editors.
1.2.2 Meso-approach
We stay in the middle. Modules of articles and editors are investigated, rather than whole wikipedias and their statistics or single discussions and their descriptive sociologies. Moreover, we stay in the middle regarding our approach, supported by our interdisciplinary skills in physics and philosophy: we employ network visualizations, but we make neither comprehensive statistical analyses nor detailed ethnographic studies.
Although this interdisciplinary approach may appear lacking from the point of view of either of these ‘pure fields’, we believe that the interdisciplinary nature of this study allows us to integrate mathematical tools and sociological methodologies, and so to see general patterns without the oversimplification that often results from a purely quantitative approach. In the following sections we introduce bipartite networks and present and defend our choices concerning data and visualization. Subsequently we show several case studies and examples of bipartite modules surrounding various controversies, interests and projects. We consider the network formed by overlapping clusters of articles and editors and use it to detect isolated cliques. We also present the clusters that are not bounded by content. Finally, we discuss the results and propose lines for future research.
2. METHOD
2.1 Bipartite Networks
A bipartite network is a graph G = (U, V, E) whose vertices (or ‘nodes’) can be divided into two disjoint sets U and V such that every edge (or ‘link’) in E connects a vertex in U to a vertex in V; that is, U and V are independent sets. When we consider articles in Wikipedia and their editors, a bipartite network is a convenient representation: U is the set of editors and V is the set of articles in Wikipedia. The bipartite network formalism is ideal for studying collaboration, because the network structure encodes knowledge about which articles editors have edited together. By studying the clusters (or ‘modules’) in the bipartite network, we are able to discover clustering of editors and articles and smaller patterns of collaboration. We choose to call dense groups clusters or modules rather than ‘communities’, because the latter is an ill-defined concept across disciplines and may imply structures at the macro-level not present in this meso-level study.
These dense groups could also be called ‘epistemic communities’ as used by Roth (2006) [12], where epistemic communities are understood as a descriptive instance only, not as a coalition of people who have some interest in staying in the community: an epistemic community is a set of agents who participate in building the same knowledge. Bicliques, under their various names (closed sets, closed couples, formal concepts, maximal rectangles, bipartite communities), were initially studied by the mathematicians Birkhoff (US) and Barbut (F), together with Monjardet (F), and by the computer scientist Rudolf Wille (DE). They continue to be explored in formal concept analysis and by mathematical sociologists. We do not review these mathematical formulations of bicliques at length, and focus instead on their use in network research, where they are the building blocks of clusters/communities/groups. One method for detecting modules in bipartite networks, grounded in the physics of networks and expanding the work by Palla et al (2005) [13], was developed by Lehmann et al. (2007) [14]. This method is based on detecting the densest areas of the graph (called maximal bi-cliques) and then agglomerating overlapping bi-cliques into larger modules. More formally, a biclique is a complete subgraph of a bipartite network. A ‘maximal’ biclique is defined as a biclique that is not a subgraph of any larger bi-clique. We use the notation Ku,v to describe a biclique with u nodes in node-set U and v nodes in node-set V. Connecting this to the network of editors and articles in Wikipedia, a K3,5 biclique describes a structure where three editors have all edited the same five articles. Two bi-cliques of size Ka,b are adjacent if they share at least a Ka−1,b−1 clique. A Ka,b module (or ‘cluster’) is the union of all adjacent Ka,b cliques. One important feature of this definition is that nodes can belong to more than one cluster; that is, two distinct modules may overlap.
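To make these definitions concrete, the following is a minimal brute-force sketch on a toy editor-article network. It is not the BCFinder algorithm of Lehmann et al., which starts from maximal bi-cliques; here every Ka,b biclique is simply enumerated, which is only feasible for very small examples. The function names (`find_bicliques`, `adjacent`, `modules`) and the toy editors are hypothetical, and ‘the union of all adjacent Ka,b cliques’ is read here as merging chains of adjacent bicliques, in the spirit of clique percolation.

```python
from itertools import combinations

def find_bicliques(edited, a, b):
    """Return all K_{a,b} bicliques: a editors who have all edited the same b articles."""
    bicliques = []
    for editors in combinations(sorted(edited), a):
        common = set.intersection(*(edited[e] for e in editors))
        for arts in combinations(sorted(common), b):
            bicliques.append((frozenset(editors), frozenset(arts)))
    return bicliques

def adjacent(c1, c2, a, b):
    """Two K_{a,b} bicliques are adjacent if they share at least a K_{a-1,b-1}."""
    return len(c1[0] & c2[0]) >= a - 1 and len(c1[1] & c2[1]) >= b - 1

def modules(edited, a, b):
    """Merge chains of adjacent K_{a,b} bicliques into modules (editor set, article set)."""
    cliques = find_bicliques(edited, a, b)
    unvisited = set(range(len(cliques)))
    result = []
    while unvisited:
        start = unvisited.pop()
        stack = [start]
        editors, arts = set(cliques[start][0]), set(cliques[start][1])
        while stack:
            i = stack.pop()
            for j in list(unvisited):
                if adjacent(cliques[i], cliques[j], a, b):
                    unvisited.remove(j)
                    stack.append(j)
                    editors |= cliques[j][0]
                    arts |= cliques[j][1]
        result.append((editors, arts))
    return result

# Hypothetical toy network: editor -> set of articles edited.
edited = {
    "ann": {"Evolution", "Creationism"},
    "bob": {"Evolution", "Creationism", "Atheism"},
    "cat": {"Creationism", "Atheism"},
    "dan": {"Quark", "Gluon"},
    "eve": {"Quark", "Gluon"},
}

found = modules(edited, 2, 2)  # two K_{2,2} modules for this toy data
```

In the toy data the bicliques (ann, bob | Evolution, Creationism) and (bob, cat | Creationism, Atheism) share one editor and one article, so they merge into a single K2,2 module, while the (dan, eve | Quark, Gluon) biclique stays on its own.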
Furthermore, changing the values of a and b allows for different zooms.
2.2 Data and Visualization
2.2.1 Subset
We analyze a subset of the English language Wikipedia, namely the articles in the categories Philosophy and Physics to depth level three. The choice of a subset is, after all, arbitrary, but our sample was motivated by familiarity with the topics (given our educational background, which is important for making semantic claims about them) and by the size of the disciplines and their representation in Wikipedia. As categories in Wikipedia can be nested recursively, the set of articles includes not only articles inside Physics and Philosophy but also those in different subjects up to three steps of association from the main categories. The decision to include sub-categories and sub-sub-categories is similar to the choice of Halavais and Lackaff (2008) [15]. The authors assume that a ‘core’ of the disciplines can be sampled in this way.
2.2.2 Filtering
Similarly, editors were filtered by the number of edits they had contributed to each article. Editors who had edited an article at least 7 times, or at least 10 times, were included (both thresholds were applied, but for different purposes). This filtering helped to avoid clutter (and to stay within computational capacity), and helps us concentrate on the most engaged editors and articles in dense clusters. Although sporadic edits can be important to Wikipedia as a whole, they are less relevant when considering the cooperation and interaction between editors of a small subset of articles. A few examples indicate that the lack of information regarding sporadic editors does not compromise the analysis of highly engaged clusters of editors and articles.
2.2.3 Anonymity
The ‘real nicknames’ of the editors are kept due to the public nature of their work; as is clear from the examples, their identity is not at stake, no more than it is by creating an account on the Wikipedia website.
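The filtering step of Section 2.2.2 can be illustrated with a short sketch. It assumes the edit history has already been reduced to a list of (editor, article) pairs, one per edit; the function name `filter_edits` and the toy log are hypothetical and do not reproduce the programs actually used for the study.

```python
from collections import Counter

def filter_edits(edit_log, min_edits):
    """Keep the (editor, article) links backed by at least `min_edits` edits."""
    counts = Counter(edit_log)  # number of edits per (editor, article) pair
    return {pair for pair, n in counts.items() if n >= min_edits}

# Hypothetical toy edit log: one (editor, article) tuple per edit.
edit_log = (
    [("ann", "Evolution")] * 12
    + [("bob", "Evolution")] * 7
    + [("cat", "Evolution")] * 2
    + [("ann", "Atheism")] * 3
)

links_7 = filter_edits(edit_log, 7)    # links of a '7-edit network'
links_10 = filter_edits(edit_log, 10)  # links of a '10-edit network'
```

Note that the threshold applies to each editor-article pair separately: in the toy log, "ann" stays linked to "Evolution" under both filters but loses her link to "Atheism".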
2.2.4 Bi-clique visualization
The open source program BCFinder, developed by Lehmann et al (2007) [14], was used to calculate and visualize the modules that arise from combining adjacent bi-cliques. BCFinder allows one to visualize the articles and editors of each cluster (and also to easily access those pages and user pages in Wikipedia). In addition, it makes it possible to visualize the network of modules. Each Ka,b module can be thought of as ‘zooming’ into a relevant area of the network. Moreover, one can view the network of modules, where each cluster is a node and two module-nodes are linked if they share either one or more editors or one or more articles (see example in Fig. 5). A distinct network of modules is created by each zoom and yields insight about which zooms divide the network into meaningful sub-parts. Further, the network of clusters allows one to identify isolated clusters. Using the filter of minimum number of edits-per-article-per-editor set to 7, we worked with 33335 editors and 17643 articles (we call this the ‘7-edit network’); when the minimum number of edits-per-article-per-editor was set to 10 we worked with 19612 editors and 13241 articles (the ‘10-edit network’).
2.2.5 Typology
Once all possible clusters had been obtained, they were grouped in order to identify the specific examples below. Although taken from a specific Ka,b cluster, these examples are fairly robust to changes in a or b, up to a certain point. The kinds shown below span the possible types found in the data.
3. RESULTS
3.1 Controversies
3.1.1 Evolution/Creationism
The first type of collaboration in Wikipedia is the one fueled by deep disagreement. One example of such a cluster is the controversy between evolution and creationism. In Figure 1 we display the major players of this cluster tying together controversial articles. Here, we study the 10-edit network. The module is composed of adjacent bi-cliques of size K6,3 or greater.
The articles present in this cluster show that a debate is taking place. For example, the two articles ‘Evolution’ and ‘Creationism’ are edited by the same group of editors. The controversy here surrounds a religious/non-religious discussion that ultimately questions the validity of science. The presence of ‘Atheism’ and of ‘Pseudoscience’ reflects the debate over religious values in relation to scientific values.
Figure 1: The Evolution/Creationism debate is mirrored in the way the articles ‘Evolution’, ‘Pseudoscience’, ‘Creationism’, ‘Atheism’ and ‘Creation science’ belong to the same cluster. These articles are edited by at least 13 active editors engaged in this controversy.
Figure 2: Zooming in on the Evolution/Creationism debate by including more edits. The vertices are scaled according to number of links. More articles and more editors are involved in this dispute. This cluster gives clues about some of the hidden players, for example ‘Richard Dawkins’ and the ‘Discovery Institute’.
In Figure 2, another cluster around the same topic is shown. In this figure, each vertex is scaled such that nodes with more links are larger; this makes it easier to see that the two major articles are ‘evolution’ and ‘creationism’. Some of the smaller articles yield further insight into other actors participating in this dispute: ‘Richard Dawkins’ “is a British ethologist, evolutionary biologist and popular science writer. In addition to his biological work, Dawkins is well-known for his views on atheism, evolution, creationism, intelligent design, and religion. He is a prominent critic of creationism and intelligent design”, as is stated in the first lines of the Wikipedia article [i].
An important concept in this controversy seems to have been heavily edited as well: ‘Irreducible complexity’, which “is an argument made by proponents of intelligent design that certain biological systems are too complex to have evolved from simpler, or “less complete” predecessors, through natural selection acting upon a series of advantageous naturally occurring chance mutations” [ii]. On the other side of the debate, the major concept at stake is ‘Natural Selection’, which “is the process by which favorable heritable traits become more common in successive generations of a population of reproducing organisms, and unfavorable heritable traits become less common.” [iii] These clusters can also reveal players that would otherwise be hidden to those not involved. For example, the Discovery Institute “is a U.S. think tank based in Seattle, Washington, best known for its advocacy of intelligent design and its Teach the Controversy campaign to teach creationist anti-evolution beliefs in United States public high school science courses.” [iv]

Investigating the other set of nodes (editors) involved in this discussion is also revealing. Their user pages show a range of attitudes. One editor states clearly that he was involved with the article ‘Intelligent Design’, which he started in 2001, but from which he was banned in 2008. Other editors decided to leave Wikipedia; it is not clear whether the controversy discussed here played a role. Still other editors appear to have been highly involved in fighting vandalism; it is well known that controversies are more prone to vandalism (Viégas et al., 2004 [9]).

[i] From Wikipedia, “Richard Dawkins.” Retrieved on May 2nd 2008 from http://en.wikipedia.org/wiki/Richard Dawkins.
[ii] From Wikipedia, “Irreducible complexity.” Retrieved on May 2nd 2008 from http://en.wikipedia.org/wiki/Irreducible complexity.
[iii] From Wikipedia, “Natural selection.” Retrieved on May 2nd 2008 from http://en.wikipedia.org/wiki/Natural selection.
[iv] From Wikipedia, “Discovery Institute.” Retrieved on May 2nd 2008 from http://en.wikipedia.org/wiki/Discovery institute.

3.1.2 Intelligence and Global Warming

Several other controversies can be identified based on the modules in our subsection of Wikipedia. The controversy in Figure 3 is based on the 7-edit network, and displays a module based on K5,4 bi-cliques. This group is engaged in a discussion of intelligence, the validity of intelligence tests, and some claimed correlations. In addition, ‘The Bell Curve’ is a controversial book on how intelligence can be a predictor of social factors. Likewise, ‘IQ and the Wealth of Nations’ is another controversial book, discussing the relation between IQ and the prosperity of nations.

Figure 3: Controversy surrounding intelligence, its measures and correlations, comprising the articles ‘Race and intelligence’, ‘The Bell Curve’, ‘IQ and the Wealth of Nations’, ‘Intelligence quotient’, and ‘Flynn effect’.

Figure 4: Controversy surrounding global warming. It comprises the articles ‘Solar variation’, ‘El Niño-Southern Oscillation’, ‘Carbon dioxide’, ‘Sea level rise’, ‘Global warming controversy’, and ‘Fossil fuel’.

Another characteristic example of a ‘conflict-cluster’ is displayed in Figure 4. Here the controversy concerns global warming and the diverse factors surrounding this subject. The network is based on the 7-edit filter and the module is slightly sparser than the ones considered so far, being constructed from adjacent K4,3 bi-cliques. The central article in this cluster is ‘Global warming controversy’, but the pages ‘Solar variation’, ‘Carbon dioxide’, ‘Sea level rise’, ‘Fossil fuel’ and ‘El Niño-Southern Oscillation’ are all components in the discussion of the human factors involved in global warming.

3.2 Isolated Clusters

In order to understand the significance of the next type of collaboration in Wikipedia, it is useful to first discuss the network of modules.
The network of modules allows one to identify modules in the bipartite network of editors and articles that are not connected to any other modules. Figure 5 is an example of the network of modules in the 10-edit network, with modules constructed from K7,2 bi-cliques. Each module is represented by a pie-chart colored according to its fraction of editors (red) and articles (blue). The modules are connected by red links (overlapping editors) and blue links (overlapping articles); the width of each link is proportional to the number of overlapping nodes.

Figure 5: Network of the clusters made of K7,2 bi-cliques. Circles represent modules, which share articles (blue links) and editors (red links) with each other. The numbers are labels that identify each cluster. The network-of-clusters view helps to understand the relationships between the clusters and to identify isolated clusters that do not share articles or editors with others. The clusters 780, 771, and 779 are displayed in Fig. 6; the clusters labeled 776 and 781 are displayed in Fig. 7.

Figure 5 shows three clusters, 780, 771 and 779 (these numbers are just labels), that do not share links (either articles or editors) with the others. Two other clusters, 781 and 776, are sparsely connected. Let us investigate these modules and begin to understand the causes underlying this network topology.

The three isolated clusters correspond to topics that gather focused and dedicated authors: Mormonism, Zionism and Scientology (Figure 6). It is not entirely surprising that all of those topics are isolated from other clusters, since it could be argued that their practice in the ‘real world’ is similar: organized in sub-cultures, highly active, but isolated from other areas of knowledge and/or society. In Figure 5, two other clusters are connected with each other but not with the remaining modules; these are plotted in Figure 7.
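The construction of the network of modules can be sketched in a few lines. This is only an illustration of the linking rule stated in the text (two modules are linked when they share at least one editor or article), not the code used for Figure 5; the module labels, their memberships and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical modules: label -> (set of editors, set of articles).
modules = {
    "A": ({"ed1", "ed2"}, {"Anarchism", "Mutualism (economic theory)"}),
    "B": ({"ed2", "ed3"}, {"Anarchism", "Socialism", "Capitalism"}),
    "C": ({"ed4", "ed5"}, {"Dianetics", "Church of Scientology"}),
}

def module_network(modules):
    """Link two modules if they share at least one editor or one article.

    Returns {(m, n): (shared editors, shared articles)}; the two counts
    correspond to the widths of the red and blue links in Figure 5.
    """
    links = {}
    for m, n in combinations(sorted(modules), 2):
        shared_editors = modules[m][0] & modules[n][0]
        shared_articles = modules[m][1] & modules[n][1]
        if shared_editors or shared_articles:
            links[(m, n)] = (len(shared_editors), len(shared_articles))
    return links

def isolated_modules(modules, links):
    """Modules that take part in no link at all."""
    linked = {label for pair in links for label in pair}
    return sorted(set(modules) - linked)

links = module_network(modules)
print(links)                             # {('A', 'B'): (1, 1)}
print(isolated_modules(modules, links))  # ['C']
```

In this toy case modules A and B form a connected pair, while C is isolated because it shares neither editors nor articles with the others.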
Both modules are devoted to political ‘isms’ and share one editor and a single article, the one on ‘Anarchism’. One of these clusters is interested in the definition and background of anarchism; its articles are ‘Individualist anarchism’, ‘Mutualism (economic theory)’ (an anarchist school of thought) and ‘Anarchism’. This cluster is then related to another one interested in defining political ‘isms’: ‘Anarchism’, ‘Anarcho-capitalism’, ‘Socialism’ and ‘Capitalism’.

Figure 6: Isolated clusters. The left panel is a module focused on the topic of Mormonism, which comprises paradigmatic articles: ‘First Vision’, ‘Mormonism and Christianity’ and ‘Joseph Smith, Jr.’; the middle panel surrounds the topic of Zionism in all three articles: ‘Anti-Zionism’, ‘Zionist political violence’ and ‘Zionism’; the right panel surrounds the topic of Scientology: ‘Dianetics’, ‘Church of Scientology’ and ‘Fair Game (Scientology)’.

Figure 7: Two connected clusters that are disconnected from the remaining network of modules. (left) Cluster focused on Anarchism. (right) Cluster focused on political ‘isms’: ‘Anarchism’, ‘Anarcho-capitalism’, ‘Socialism’ and ‘Capitalism’.

3.3 Shared Interests

In all the clusters, the editors share the interest (and practice) of editing the same articles. Some of them can be grouped by a shared interest (or a number of related ones). These groups are revealed by the bi-cliques, some of which turn out to be coordinated through a WikiProject.

3.3.1 Mantras

Figure 8 shows an example of a cluster arising from a shared interest and practice, although this one is not coordinated through a WikiProject. This cluster concerns the topics ‘Buddhism’, ‘Yoga’, ‘Tantra’, ‘Mantra’ and ‘Guru’. It reveals common interests between the 5 editors and the 5 articles in this K4,4 bi-clique cluster based on the 7-edit network.
Although ‘Tantra’, ‘Yoga’ and ‘Guru’ are not directly related, they are part of the same vocabulary and interests of yoga practitioners, guru followers and people interested in tantra. This cluster reflects a practice that happens beyond Wikipedia, but one that is mapped onto the way the articles are edited.

Figure 8: Cluster showing a relation between articles about related practices: ‘Buddhism’, ‘Yoga’, ‘Tantra’, ‘Mantra’, and ‘Guru’.

3.3.2 WikiProjects

3.3.2.1 Elements

Figure 9 displays 8 articles and 10 editors, which constitute a K7,3 module in the 10-edit network. This collaboration is a clear example of an orchestrated effort to improve the articles describing the elements of the periodic table. A WikiProject is “a collection of pages devoted to the management of a specific topic or family of topics within Wikipedia; and, simultaneously, a group of editors that use said pages to collaborate on encyclopedic work. It is not a place to write encyclopedia articles directly, but a resource to help coordinate and organize article writing and editing” [v]. The WikiProject about elements presents itself in the following manner: “This WikiProject has managed to standardize the articles on the known chemical elements (see Guidelines page). Now it is aimed at the maintenance of these at an agreed upon format discussed in Wikipedia talk:WikiProject Elements and at the expansion and improvement of each article to featured article quality (check out our Goals below).” [vi] In this cluster the editors are engaged in improving the following articles: ‘Hydrogen’, ‘Oxygen’, ‘Gold’, ‘Mercury’, ‘Magnesium’, ‘Lithium’, ‘Krypton’ and ‘Potassium’. An investigation of their user pages reveals that the editors involved include several administrators, with occupations ranging from working mathematicians to geologists and chemists.
The various editors have different levels of (dis)comfort with anonymity: some use their real name, some keep it hidden but provide extensive information about their activities and at least one copes with anonymity in an interesting manner: “Male, European, and already paranoid about giving away this much information”.

[v] From Wikipedia, “Wiki project.” Retrieved on May 2nd, 2008 from http://en.wikipedia.org/wiki/Wikipedia:WikiProject.
[vi] From Wikipedia, “Wikiproject elements.” Retrieved on May 2nd 2008 from http://en.wikipedia.org/wiki/Wikipedia:WikiProjectElements.

Figure 9: Cluster revealing the coordinated effort to improve Wikipedia articles about the elements of the Periodic Table.

Figure 10: Cluster obtained by decreasing the minimum number of edits required, showing more elements that are part of the WikiProject concerned with improving the articles on the elements of the Periodic Table.

Additional data about this cluster can be obtained by considering the 7-edit network. For the same clique zoom of K7,3, Figure 10 has 16 editors and 20 articles. As it is a coordinated effort, the only additional information gained by including more edits is that there are more people and articles involved in the same topic: we see 20 elements instead of the 8 elements that were visible in the previous cluster, which had fewer editors and articles.

3.3.2.2 Electronics

Another example of a cluster that reveals a WikiProject is displayed in Figure 11. This project surrounds the topic of electronics and several of its concepts (‘Alternating current’, ‘Decibel’) and tools (‘Oscilloscope’, ‘Electric motor’). The K4,4 bi-clique cluster comprises 7 editors and 11 articles in the 7-edit network. The presentation of the WikiProject about Electronics is the following: “The aim of this project is to better organize information in articles related to electronics.
This page contains only suggestions, with the hope to help other Wikipedians writing high-quality articles with the minimum effort” [vii].

[vii] From Wikipedia, “Wikiproject electronics.” Retrieved on May 2nd 2008 from http://en.wikipedia.org/wiki/Wikipedia:WikiProjectElectronics.

Figure 11: Cluster supported by the WikiProject Electronics around the topic of electronics: ‘Electrometer’, ‘Decibel’, ‘Potentiometer’, ‘Alternating current’, ‘Electrical engineering’, ‘Electronics’, ‘Oscilloscope’, ‘Resistor’, ‘Transistor’, ‘Electric motor’ and ‘Capacitor’.

3.4 Non-Content Bounded Clusters

The bipartite network of editors and articles also contains modules in which there is no apparent correlation between the topics. Figure 12 displays such a module with 5 articles and 9 editors around topics as diverse as ‘Joseph Stalin’, ‘Martin Luther King, Jr.’, ‘Tsunami’, ‘Ku Klux Klan’ and ‘Albert Einstein’. It is curious to observe which topics are included in these generalist clusters, which are heavily edited by a small group of editors.

Figure 12: There are also several clusters such as this one, which are not bounded by content but probably by editing style: edits made, for instance, to add links or to fight vandals.

A more extensive list from a module of adjacent K4,1 bi-cliques with 45 articles is: Abortion, Jimmy Wales, Solar energy, Evolution, Fuck, Christianity, Ku Klux Klan, Beauty, Galileo Galilei, Racism, Stupidity, Black hole, Plato, Joseph Stalin, Sun, Volcano, Aristotle, Earthquake, Art, Rosa Parks, Nuclear power, Isaac Newton, Computer, Martin Luther King, Jr., Tsunami, Buddhism, Creationism, Bitch, Vietnam War, Tornado, Pi, Shit, Pope John Paul II, Albert Einstein, Internet, Thomas Jefferson, Vladimir Lenin, Love, Cunt, Renaissance, Islam, Slavery, Mother Teresa, Tropical cyclone, Music.
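A small brute-force sketch may help to show why modules built from K4,1 bi-cliques can gather unrelated articles: with a second index of 1, four editors only need to share a single article. The enumeration below is an assumption-laden illustration (toy data, hypothetical editor labels, a hypothetical `bicliques` helper that lists all, not only maximal, bi-cliques) and is not the algorithm used by BCFinder.

```python
from itertools import combinations

# Hypothetical filtered network: article -> editors with enough edits on it.
editors_of = {
    "Tsunami": {"ed1", "ed2", "ed3", "ed4"},
    "Albert Einstein": {"ed1", "ed2", "ed3", "ed4", "ed5"},
    "Plato": {"ed1", "ed2"},
}

def bicliques(editors_of, a, b):
    """Yield every (editors, articles) pair forming a Ka,b:
    a editors who have all edited the same b articles."""
    for articles in combinations(sorted(editors_of), b):
        common = set.intersection(*(editors_of[art] for art in articles))
        for editors in combinations(sorted(common), a):
            yield editors, articles

# With b = 1, any article with at least 4 active editors yields a bi-clique.
k41 = list(bicliques(editors_of, a=4, b=1))
# With b = 2, the same 4 editors must have co-edited both articles.
k42 = list(bicliques(editors_of, a=4, b=2))

print(len(k41))  # 6
print(k42)       # [(('ed1', 'ed2', 'ed3', 'ed4'), ('Albert Einstein', 'Tsunami'))]
```

Raising b from 1 to 2 already removes most of the candidates in this toy network, which matches the intuition that a higher second index demands more overlap between the articles' editors.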
One possible way to account for the variety of topics in this example of a cluster not bounded by content is that these articles have very general content: they are not highly specialized and are therefore more accessible to different kinds of editors. Another, complementary explanation is that articles are sometimes edited not in terms of topic but rather in terms of kind of edit. An editor who is concerned with making tables or fixing links would not be concerned with the specific topic; such edits are therefore the work of syntax, layout, or spelling editors.

4. DISCUSSION

By applying clustering tools from social network analysis to a subsection of Wikipedia, several interesting insights regarding the meso-level between single articles and global statistics were uncovered. Although we were limited to the articles that were included in the subsections of the categories Physics and Philosophy, and therefore related to these two primary topics, these boundaries gave us a certain familiarity with the topics. This facilitated the extraction of information in a manner that would not have been possible had the research been performed on random or unfamiliar topics.

Controversies give rise to disputes that are not necessarily contained within one article. In fact, controversies typically span multiple articles and form tightly connected modules of editors who edit related topics actively and sometimes in direct opposition to each other. Wikipedia, as expected, mirrors the discussions in society. Clustering tools allow us to probe structures other than the ‘web of knowledge’ that arises from the networks where the nodes are articles and the hyperlinks connect them. The article on ‘Evolution’ links not only to ‘Darwin’ or ‘Wallace’, but also connects to ‘Atheism’, for example.
This modular structure reflects the controversy currently taking place on the scale of the entire North-American society, which is actively engaged in discussing whether creationism should be taught alongside evolution. In this manner, analyzing the modules in Wikipedia provides information about another layer of the construction of knowledge, one which is not necessarily tied to the topics closest in character, but to those that create issues which must be articulated and disputed in relation to each other.

In the case of coordinated efforts, such as the Project Elements, the attempt to achieve Featured Article status seems to aggregate people (Viégas, Wattenberg and McKeon, 2007) [16] and also, as previously shown by Wilkinson and Huberman (2007) [7], the more edits an article has, the more edits it is likely to accrete. Therefore, the creation of WikiProjects is shown to be a good way to mobilize work in one direction, especially by trying to produce Featured Articles. WikiProjects are a good example of the work carried out at the meso-level: they do not rely on massive inputs by the ‘wisdom of the crowds’, nor do they rely solely on the dedication of one single editor. WikiProjects result in clusters of editors with common interests who have found a way to coordinate work successfully, aggregating people and resulting in highly developed articles.

As expected, some clusters reflect the way those same clusters manifest in ‘real life’. If topics or practices aggregate tight and closed clusters, it is not surprising that the articles about those clusters are also edited by a closed cluster of editors. The bipartite clustering tools and the network of modules can be used not only to identify some of those modules in ‘real life’, but also to understand the relations between the modules and the most important players.
For example, in the controversy between ‘Evolution’ and ‘Creationism’, there are people and groups who are quite outspoken (‘Dawkins’, ‘Discovery Institute’) and therefore their articles are edited along with the other articles present in the controversy. Another example is that specific properties of how some articles are edited can be related to some assumed properties of groups in the ‘real world’: the isolated groups on ‘Mormonism’, ‘Scientology’ and ‘Zionism’ may show that these groups in society are also quite isolated and dedicated to their cause. Although it is hard to prove the behavior of these groups in ‘real life’, a recent case of Wikipedia banning the Church of Scientology from editing (Wired, 2009) [17] supports the idea that these editing patterns may reflect that these groups directly edit their own pages and that the discussions about them, some even controversial, are quite isolated.

These clusters can be seen as ‘epistemic communities’ in the weak sense, that of a group of people gathering around a knowledge topic (and not in the strong sense in which Roth and Bourgine (2004) [18] define an epistemic community as the group of people that maximally share a number of concepts). These clusters are not strictly ‘Communities of Practice’ (Lave & Wenger, 1991 [19]) because the authors need not be acquainted or involved in a common practical task. Regardless, a community of practice is certainly a special type of knowledge community. Wikipedia as a whole, although this is a theme to be developed elsewhere, can be said to be a large community of practice where editors interact using shared paradigms, meanings, values and practices and where a lot of the learning is tacit: Wikipedians learn how to edit articles, how to fight vandals, how to use policies to get their points across, how to present themselves in user pages and so on.
Bryant, Forte and Bruckman (2005) [20], in “Becoming Wikipedian: transformation of participation in a collaborative online encyclopedia,” argued that “observations of members’ behavior in Wikipedia reveals that the three characteristics of Communities of Practice identified by Wenger are strongly present on the site: community members are mutually engaged, they actively negotiate the nature of the encyclopedia-building enterprise, and they have collected a repertoire of shared, negotiable resources including the Wikipedia software and content itself.” Clustering allows us to zoom in on this community of practice and detect more specific modules bounded by shared interest. In WikiProjects, in particular, authors are involved in a common task and are, therefore, creating structures that are closer to mini-communities of practice than the other ‘epistemic communities’.

Finally, to complete the typology of the clusters found in the data, it should be noted that some of the clusters contain articles on topics as diverse as ‘Tsunami’ and ‘Albert Einstein’. It is not surprising that this module is diverse. A K4,1 bi-clique consists of 4 editors who have co-edited just one article. If they had co-edited two or more articles, then one would expect more similarity (in general the articles become more homogeneous). A bi-clique with such a low second index means that the articles need not have anything in common. This type of cluster is a hint that Wikipedia is also a product of looser dedication by people who edit broader articles, but also that there are different editing patterns, not all of which are content-driven. Editing to fix typos or to make tables of contents can also group people.

As this is a study with both quantitative and qualitative features, we used semantic categories top-down to describe the kinds of clusters found in the data, harvested bottom-up.
In this way, we qualitatively assessed the nature of collaboration between editors in a subset of the English Wikipedia, grounded in network analysis.

5. LIMITATIONS AND RECOMMENDATIONS FOR FUTURE RESEARCH

A more systematic study could reveal possible ‘network signatures’, i.e. ways to identify controversies or WikiProjects directly from the network structure. It will be interesting to complement the insights discovered from our meso-level modules with a deeper probe into the specific connections through discussion pages, on the level of individual articles and paragraphs, to understand the patterns of distributed work and, perhaps, cognition in greater depth.

Analyzing the network of modules informs us about the individual modules and their structural relation to each other. Further work could provide information to understand the network of Wikipedia in relation to other networks (scientific collaborations, open source projects).

In the future, it will be interesting to expand the bipartite clustering technique and approach to other areas of Wikipedia and other datasets; to organize the algorithm in order to allow for the module surrounding any article to be visualized; and to contextualize the findings in the light of more abstract claims about the power of technology and clusters, specifically in wikis and Wikipedias, to organize knowledge, work together, and ultimately be part of a cognitive system that comprises humans, technologies and values. It will also be interesting to pursue the interdisciplinary meso-level of analysis, as it seems to result in insights which inhabit the area between the quantitative patterns and the qualitative details usually found by other, more traditional disciplines.

6. CONCLUSION

Detecting modules of articles and editors in Wikipedia yields important insights into the nature of collaboration.
The technique used in the present research probes a level where collaboration is surely taking place, because people in fact gather around a number of articles and work intensely on them.

7. ACKNOWLEDGMENTS

Rut Jesus acknowledges support by the Portuguese Foundation for Science and Technology with the grant SFRH/BD/27694/2006 and would like to thank Camille Roth for discussions and bibliography concerning biclique history. Sune Lehmann acknowledges support by the Danish Natural Science Research Council and James S. McDonnell Foundation 21st Century Initiative in Studying Complex Systems, the National Science Foundation within the DDDAS (CNS-0540348), ITR (DMR-0426737) and IIS-0513650 programs, as well as by the U.S. Office of Naval Research Award N0001407-C and the NAP Project sponsored by the National Office for Research and Technology (KCKHA005).

8. REFERENCES

[1] A. Capocci, V. D. P. Servedio, F. Colaiori, L. S. Buriol, D. Donato, S. Leonardi, and G. Caldarelli, “Preferential attachment in the growth of social networks: the case of wikipedia,” arXiv:physics/0602026, Feb 2006.
[2] L. S. Buriol, C. Castillo, D. Donato, S. Leonardi, and S. Millozzi, “Temporal analysis of the wikigraph,” in WI ’06: Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence, (Washington, DC, USA), pp. 45–51, IEEE Computer Society, 2006.
[3] V. Zlatic, M. Bozicevic, H. Stefancic, and M. Domazet, “Wikipedias: Collaborative web-based encyclopedias as complex networks,” arXiv:physics/0602149, Jul 2006.
[4] F. Ortega and J. M. Gonzalez-Barahona, “Quantitative analysis of the wikipedia cluster of users,” in WikiSym ’07: Proceedings of the 2007 international symposium on Wikis, (New York, NY, USA), pp. 75–86, ACM, 2007.
[5] A. Capocci, F. Rao, and G. Caldarelli, “Taxonomy and clustering in collaborative systems: The case of the on-line encyclopedia wikipedia,” EPL (Europhysics Letters), vol. 81, no. 2, pp. 28006+, 2008.
[6] F. B. Viégas, M. Wattenberg, J. Kriss, and F. van Ham, “Talk before you type: Coordination in wikipedia,” in System Sciences, 2007. HICSS 2007. 40th Annual Hawaii International Conference on, pp. 78–78, 2007.
[7] D. M. Wilkinson and B. A. Huberman, “Assessing the value of cooperation in wikipedia,” arXiv:cs/0702140, Feb 2007.
[8] A. Kittur and R. E. Kraut, “Harnessing the Wisdom of Crowds in Wikipedia: Quality Through Coordination,” in CSCW 2008: Proceedings of the ACM Conference on Computer-Supported Cooperative Work, New York: ACM Press, 2008.
[9] F. B. Viégas, M. Wattenberg, and K. Dave, “Studying cooperation and conflict between authors with history flow visualizations,” in CHI ’04: Proceedings of the 2004 conference on Human factors in computing systems, pp. 575–582, ACM Press, 2004.
[10] R. P. Biuk-Aghai, “Visualizing co-authorship networks in online wikipedia,” in Communications and Information Technologies, 2006. ISCIT ’06. International Symposium on, pp. 737–742, 2006.
[11] B. Suh, E. H. Chi, B. A. Pendleton, and A. Kittur, “Us vs. them: Understanding social dynamics in wikipedia with revert graph visualizations,” in Visual Analytics Science and Technology, 2007. VAST 2007. IEEE Symposium on, pp. 163–170, 2007.
[12] C. Roth, “Co-evolution in Epistemic Networks – Reconstructing Social Complex Systems,” Structure and Dynamics: eJournal of Anthropological and Related Sciences, vol. 1, no. 3, article 2, 2006.
[13] G. Palla, I. Derenyi, I. Farkas, and T. Vicsek, “Uncovering the overlapping community structure of complex networks in nature and society,” Nature, vol. 435, pp. 814–818, 2005.
[14] S. Lehmann, M. Schwartz, and L. K. Hansen, “Biclique communities,” arXiv:0710.4867, 2007.
[15] A. Halavais and D. Lackaff, “An analysis of topical coverage of wikipedia,” Journal of Computer-Mediated Communication, vol. 13, no. 2, pp. 429–440, 2008.
[16] F. B. Viégas, M. Wattenberg, and M. McKeon, “The hidden order of wikipedia,” Online Communities and Social Computing, pp. 445–454, 2007.
[17] R. Singel, “Wikipedia Bans Church of Scientology,” Wired.com, May 29, 2009. Retrieved on July 4th, 2009 from http://www.wired.com/epicenter/2009/05/wikipediabans-church-of-scientology.
[18] C. Roth and P. Bourgine, “Epistemic communities: description and hierarchic categorization,” arXiv:nlin/0409013, Sep 2004.
[19] J. Lave and E. Wenger, Situated Learning: Legitimate Peripheral Participation. Cambridge: Cambridge University Press, 1991.
[20] S. Bryant, A. Forte, and A. Bruckman, “Becoming Wikipedian: transformation of participation in a collaborative online encyclopedia,” in Proceedings of the 2005 International ACM SIGGROUP Conference on Supporting Group Work, November 6–9, 2005.

What Cognition Does for Wikis

Rut Jesus
Center for Philosophy of Nature and Science Studies
University of Copenhagen, Denmark
vulpeto@nbi.dk

ABSTRACT

Theoretical frameworks need to be developed to account for the phenomenon of Wikipedia and writing in Wikis. In this paper, a cognitive framework divides processes into the categories of Cognition for Planning and Cognition for Improvising. This distinction is applied to Wikipedia to understand the many small and the few big edits by which Wikipedia’s articles grow. The paper relates the distinction to Lessig’s Read-Only and Read-Write, to Benkler’s modularity and granularity of contributions and to Turkle and Papert’s bricoleurs and planners. It argues that Wikipedia thrives because it harnesses a Cognition for Improvising surplus oriented by kindness and trust towards distant others, and proposes that Cognition for Improvising is a determinant mode for the success of Wikis and Wikipedia. The theoretical framework can be a starting point for a cognitive discussion of wikis, peer-produced commons and new patterns of collaboration.
Categories and Subject Descriptors

H.5.3 [Information Interfaces and Presentation]: Group and Organization Interfaces - Computer-supported cooperative work, Web-based interaction; K.4.3 [Computers and Society]: Organizational Impacts - Computer-supported collaborative work; J.4 [Social and Behavioral Sciences]: Miscellaneous.

General Terms

Algorithms, Design, Human Factors.

Keywords

Wikis, Wikipedia, Collaboration, Theoretical Development, Cognition for Planning, Cognition for Improvising, Cognitive Surplus.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. WikiSym '10, July 7-9, 2010, Gdańsk, Poland. Copyright © 2010 ACM 978-1-4503-0056-8/10/07.

INTRODUCTION

Motivation

It has been repeated that “The problem with Wikipedia is that it only works in practice. In theory, it can never work.” This claim has been made into the zeroeth law of Wikipedia [1]. This phrase can be understood in several ways, and therefore appeals to people from different quarters. The phrase appeals to those who would argue from a moral point of view and count the number of good-doers and bad-doers in the world, and who are surprised that the openness of Wikipedia attracts more people who contribute positively to the project than people who would destroy its viability. The phrase appeals to those pragmatists who like to point out that the success is visible, and that practice is what matters, not theories about whether utopias are or are not possible. As put by Clay Shirky: Wikipedia’s “utility is settled, interesting questions lie elsewhere” [2]. Even if accuracy is being studied, and it is important to develop tools to help navigate the trustworthiness of the content, it is also a fact that Wikipedia is a top-10 website and is widely used, cited or not. The phrase about Wikipedia working in practice and not in theory also applies to research: it may be the case that we can see the results of its success but lack accompanying theories to understand why and how Wikipedia works (game theory, for example, accounts only for people behaving out of direct self-interest).

The thread that will be pursued here is the development of a cognitive distinction to account for the phenomenon of the use of wikis and, specifically, Wikipedia. Substantial research on Wikipedia has been done in the last few years, and presented in conferences such as WikiSym, but there is a clear lack of philosophical approaches (one issue of Episteme dealt with the epistemology of Wikipedia [3]), and, with very few exceptions (such as the description of bot use in vandal fighting using distributed cognition [4]), cognitive theory has not been involved in Wikipedia research.

Moreover, there has been ample discussion about who writes Wikipedia, both in speech and in research papers. Jimmy Wales emphasized the community that makes most of the edits [5], Aaron Swartz emphasized the size of the edits to draw conclusions about the substantial additions by anonymous users [6], and a paper discussed the “Wisdom of the Crowds vs. the Rise of the Bourgeoisie” [7]. In the present paper, the focus is not on who writes Wikipedia but on the construction of a cognitive distinction to think about how Wikipedia is written: how to account for the many tinkering edits and the fewer substantial additions of content.

Cognitive Theory Umbrella

The distinction put forth in this paper between Cognition for Planning and Cognition for Improvising builds upon the dynamic and ecological views of cognition, which encompass Embodied, Situated and Distributed Cognitions (ESDC), a major trend in cognitive science [8-10].
In the last 15-20 years, these cognitive theories have put the focus on the embodiedness and embeddedness of cognitive processes. In other words, these theories are not satisfied with the computational and brain-limited cognitivist theories and hold, to a greater or lesser degree, that the environment, artifacts, and the body are important parts of the cognitive processes. The Extended Mind hypothesis, put forth by Clark and Chalmers [11], also plays a major role in these discussions, because it questions the philosophical place of the mind when confronted with the claim that the mind might do more than just sit in the brain and compute purely abstract issues; it supports, however, a stronger ontological claim than the weaker versions of ESDC. In this context, even a theory of cognition as ‘coordinated non-cognition’ [12] has been put forth. There has been a long discussion of what cognition (and cognizing) really means: is thinking purely mental symbol processing, is it problem-solving, or is it information-processing involving body and environment? Although it is beyond the scope of this paper to resolve the issue of what cognition really is, taking into account the notion of cognitive artifacts is useful when speaking about wikis. Cognitive artifacts can range from physical objects, to behaviors, to processes that are used to aid, enhance or improve cognition. Some examples are a calendar, a shopping list or a computer. Wikis play a role in the cognitive processes of collaboration, and in the case of Wikipedia, a wiki is the mediating technology in the writing of the biggest encyclopedia. The distinction in this paper between Cognition for Planning and Cognition for Improvising is inspired by ESDC approaches but is also transversal and complementary to those theories: the ESDC theories are mostly concerned with the spatial position of cognition, while this distinction is mostly concerned with the temporal position of cognition.
Plan
First, I will explain the two relations between wikis and cognition (what cognition does for wikis and what wikis do for cognition) to position this work as part of ‘what cognition does for wikis’. Then, I will construct a conceptual distinction between Cognition for Planning (CfP) and Cognition for Improvising (CfI) and show how it is a useful distinction to better understand Wikipedia. Finally, I will argue that Wikipedia’s success depends on the Cognition for Improvising surplus, a mode of great use in a project that grows incrementally. The theoretical framework proposed here is part of a PhD thesis on cognition and Wikipedia, which includes data harvesting studies on co-authorship networks in Wikipedia (see, for example, Jesus et al., 2009 [13]). Although the data was important for the insights created, it is not shown here, to keep the focus on developing a concise distinction.
WIKIS AND COGNITION
Wikis and cognition can be implicated in two ways: what cognition does for wikis, and what wikis do for cognition. Another way to understand the two directions of the implication between wikis and cognition is to consider two hypotheses, called weak and strong in relation to how much they alter our brain:
The Weak Hypothesis: Wikis work because, through them as a tool, particular aspects of human cognition are used. Cognition for Improvising was always “there”, and wikis profit from tapping into it.
The Strong Hypothesis: Not only do wikis harness this surplus in Cognition for Improvising, but they also “shape” it. In this hypothesis, human cognition is changed/enhanced/extended by the use of wikis.
In this paper, it is the Weak Hypothesis that is dealt with and investigated in greater depth.
The Strong Hypothesis is more speculative, and more relevant to the understanding of cognition and of the different cognitive milestones [14], than the Weak Hypothesis, which focuses on what cognitive processes are at play in the use of a wiki, and in the construction of Wikipedia.
COGNITION FOR PLANNING VS. COGNITION FOR IMPROVISING
Cognition for Planning (CfP) is the kind of cognition that we use when we sit down to reflect on an issue and make a decision. Cognition for Improvising (CfI) is the kind of cognition that we use when reaching for a glass of water, where the body ‘knows’ how to make the movements, one after the other, to reach the glass of water.
Goal Level: At the extremes, higher-level goals can look very different from lower-level goals. Many smaller cognitive processes can constitute a bigger goal, in a modular way. Writing an encyclopedic article is a higher-level goal than correcting a typo. Cognition for Planning is present when there are higher-level, very well-defined goals, while Cognition for Improvising is present when lower-level, even very low-level goals are the ones at stake. While making a calculation there is the clear goal of getting a result in the end. It involves making a computation in the mind, or using the help of pencil and paper, where several processes are applied (some of which we may not ‘know how we are doing them’). These processes constitute the ‘problem-solving’ process. Other cognitive processes can have much lower-level goals, so low-level that they may not even be called ‘goals’, such as saying one word, or moving an arm.
Units: The minimum unit of analysis and of processing for Cognition for Planning is bigger than the minimum unit of analysis and of processing for Cognition for Improvising, in terms of time, decision and work. Cognition for Improvising is constituted by many small decisions, as in an improvisational dance, where each small decision brings the opportunity for the next.
Cognition for Planning is constituted by greater decisions, like a rehearsed dance that encompasses decisions about the whole structure.
Action vs. Reaction: While Cognition for Planning is what we use in a coordinated effort to produce a specific result, acting upon the world (for example, saving food for the winter), Cognition for Improvising is what we use in replying to an immediate disturbance or interaction (for example, ducking if someone shoots), reacting to the world. Cognition for Planning allows us to construct futures and remember pasts, while Cognition for Improvising allows us to deal with the here-and-now challenges.
Relations Between CfP and CfI
Having described the distinction between Cognition for Planning and Cognition for Improvising, it is important to stress that these ‘types’ of cognition can happen in parallel. There may be activities where we use one of these types of cognition, and other activities for which we use the other. Research activity, for example, comprises paper writing, which uses Cognition for Planning, but many of the sources of inspiration come from conversation, which usually uses Cognition for Improvising, as it is a quick exchange of small units of thought, quite reactive to what is going on. Cognition for Planning is a more complex category that includes Cognition for Improvising, thus these two types of cognition are often present simultaneously. More Cognition for Improvising can be used if the cognitive overload is diminished. The notion of cognitive overload goes back at least to Simon [15], in “On How to Decide What to Do”. For example, some tasks take immense cognitive power, such as writing a thesis, where I need to decide what to write about, decide to sit down at this precise moment, and also decide what to write (and much more…). If the tasks can be broken down into parts that are already defined, then, instead of using so much Cognition for Planning, one can use more Cognition for Improvising.
A story from Zen and the Art of Motorcycle Maintenance [16] may elucidate this: the son is stuck wanting to write a letter to the mother and not knowing where to start. The father suggests that he keep it simple: he should first write a list of the things he wants to say and then make the decision of which one to say first. To decide both what to write and what to write first can be too big a cognitive task, and therefore there is a cognitive overload.
Benkler’s Modularity and Granularity
Wikipedia as an encyclopedia and text
Breaking a cognitive task into smaller parts relates to the modularity and granularity concepts proposed by Benkler [17]. “Modularity” describes the extent to which a project can be broken down into smaller components. These components can be produced independently and can later be assembled into a whole. “Granularity” describes the size of the components, in terms of the time and effort that an individual must invest in producing them. When a project has modules of small size, it more easily harnesses Cognition for Improvising. The nature of Wikipedia as an encyclopedia, and its support as text (in contrast with many FLOSS projects, which are programs written in code), increases the ways in which Cognition for Improvising can be used, as Wikipedia is quite modular and very fine-grained: it can be built from many small contributions. An encyclopedia is really a collection of articles; an article is a small module of cognitive ‘coherence’ (smaller than a book, for example). Moreover, text has a very small granularity, allowing contributions as small as the fixing of a comma or the addition of a reference.
COGNITION FOR IMPROVISING SURPLUS
Below it is argued that the success of Wikipedia and of other wikis is partially a result of harnessing a surplus of Cognition for Improvising.
Cognition for Improvising is used in very immediate, concrete surroundings, quite often embodied, or in interaction, in conversation, but encyclopedias were still being written using great amounts of Cognition for Planning. Someone would plan the distribution of work, and once given an assignment, a scholar would plan the writing of an encyclopedic article. This work wasn’t entirely individual: the article would be sent to the editors, and comments and corrections would be added. In the end, the editors would also check for style. Nonetheless, most of the cognitive work was being done with great amounts of Cognition for Planning. Cognition for Improvising is a mode that can be used for incremental writing. The use of this mode is independent of the ethical and motivational reasons that stimulate people to contribute to Wikipedia. The motivation of belonging to a greater project, the security of the copyleft license, and the interest in doing good are all crucial for Wikipedia’s success, as are many architectural decisions of the site and of wikis, which allow for discussion and negotiation, and the possibility of shaping the meta-level of Wikipedia. Wikipedia is possible because there is the mode of Cognition for Improvising, which can be used because there is a surplus. Clay Shirky speaks of the “cognitive surplus” [18], in anecdotal form, when in a lecture he tells the story of explaining to a TV producer the intricacies of making a Wikipedia article, to which he gets the question “But where do people find the time?” His witty answer is, “No one who works in TV gets to ask that question. You know where the time comes from. It comes from the cognitive surplus you've been masking for 50 years.”
Yochai Benkler, who has analyzed what he calls “commons-based peer production” from an economic perspective in the book The Wealth of Networks (2006) [17], speaks of the difference between market and nonmarket production and describes some of the characteristics that peer production needs in order to harness the excess capacity of time and interest in human beings. The processing, storage, and communications capacity of computers is available to be used for activities whose rewards are not monetary or monetizable, directly or indirectly. Benkler describes very succinctly the processes by which the harnessing of this excess capacity can be effective:
“For this excess capacity to be harnessed and become effective, the information production process must effectively integrate widely dispersed contributions, from many individual human beings and machines. These contributions are diverse in their quality, quantity, and focus, in their timing and geographic location. The great success of the Internet generally, and peer-production processes in particular, has been the adoption of technical and organizational architectures that have allowed them to pool such diverse efforts effectively. The core characteristics underlying the success of these enterprises are their modularity and their capacity to integrate many fine-grained contributions.” (in The Wealth of Networks [17])
Kindness-Trust Surplus
People have time, effort and kindness available to do things outside the markets and the quest for survival. Although people’s lives are complex, in the normal lives we lead we usually act out of kindness, and in ways that build trust, towards those close to us, and we do fewer acts of kindness for those farther away. We may, though, have a greater potential for these acts of kindness and of building trust than is necessary for building close relationships, and therefore there is a surplus that can be exploited.
It is possible to harness this potential because it responds to the human motivation of following ‘higher’ values, being part of something ‘greater than themselves’, contributing to the common good, altruism, and engaging in community. This tapping of the ‘kindness surplus’ is possible because there was an environment that felt trustworthy, safe and useful, and therefore the kindness-trust could be expressed. We were used to relying upon trust and kindness in a small immediate environment; now, with the right values, technologies and affordances, we can harness those capacities to produce something no longer at the small immediate scale, but at a greater scale. Some internet projects have been more egalitarian, providing a space for trust at a distance, despite their rich-white-western biases. These new peer-production models somehow ‘short-circuited’ these distances, and trust and kindness became visible.
Particular wiki characteristics
Wiki characteristics such as watch this page, recent changes (especially when wikis are smaller), and discussion pages all support an immediate, reactive, and concrete mode of interaction and contribution, which uses Cognition for Improvising. Just replying to a point in a discussion or fixing a typo in someone’s just-added paragraph are behaviors that contribute to the whole. Watch this page is an attention-grabber, which makes it easier to reply to a change that was made by correcting, improving, or reverting, depending on whether it was a small mistake, a good addition or an act of vandalism. Recent changes was the most important feature of wikis, as Ward Cunningham, their inventor, said: “we knew where the action was taking place” (Cunningham, Open Space, WikiSym’09, personal communication). It also directed attention to where something was happening.
A loose comparison would be to say that there is no great need for Cognition for Planning if one were to walk by the main square of one’s village and suddenly see a group of people gathered. It would only be natural to join them and improvise a conversation with a friend or an acquaintance, using Cognition for Improvising. Both watch this page and recent changes (and similar functions) also play on the stigmergic effect [19], whereby a change (an edit) left in the environment (an article) is a communication device about the possible next change to make. As for discussion pages, the implication of Cognition for Improvising is even more direct: engaging in a discussion is interacting back-and-forth, using more of the Cognition for Improvising than the Cognition for Planning.
Division of work
These two cognitions also reflect some of the spontaneous division of work that has been seen in Wikipedia. While the addition of a substantive piece of text is something that happens mostly using Cognition for Planning, the small tinkerings are done with the Cognition for Improvising. In terms of number of edits there is a clear split, where few edits add much previously thought-out content, while many edits add a small change that is a quick, reactive contribution. The division between these two groups of edits follows the division between Cognition for Planning and Cognition for Improvising. Bots (small programs that edit systematically) fall outside this distinction, as their ‘behavior’ is mostly syntactic (example: find ‘tpyo’, replace by ‘typo’) and bots do not use the more intricate, semantically rich notions of planning and improvising. Turkle and Papert [20] develop a brilliant distinction between planners and bricoleurs which relates to the distinction proposed here. Cognition for Improvising is certainly more present in bricoleurs’ activities, which deal more with the concrete, while Cognition for Planning is used in the abstract thinking of planners.
Nonetheless, distinguishing what cognitions are at play in the writing of Wikipedia is more appropriately done using a temporal distinction of cognition than a personal style of dealing with the world. It is not possible to divide people according to which type of cognition they use, because often both cognitions are used. In this sense, the distinction is more useful to understand contributions than contributors. The distinction put forth in this paper can also be used to understand greater patterns of the information and communication technologies. Lawrence Lessig, the scholar who started Creative Commons and who is the greatest advocate for a review of copyright to increase freedom, describes, in the book “Remix” [21], two cultures which are present on the Internet. RO, which stands for Read-Only, applies to sites where one can only consume the information, such as newspapers; and RW, which stands for Read-Write, applies to sites where one can directly intervene, by commenting, changing and engaging, such as blogs and wikis. These two cultures are examples of the two economies that are present, the commercial economy and the sharing economy. These have run in parallel for a long time. Lessig advocates that a change of the law is necessary in order not to criminalize the sharing economy, and shows that there are many possible hybrid models, in which both economies are present, such as Free and Open Source Software. The appeal of the RW culture is derived from the possibility of using the Cognition for Improvising surplus, which allows for a whole segment of remixes to exist and thrive.
CONCLUDING REMARKS
In relation to Wikipedia, it is obvious that both types of cognition are at play. This is true for the writing of Wikipedia articles, where some edits are plainly ‘adding information’, making more use of Cognition for Planning, while others are ‘clarify info’ and ‘fix typo’, making more use of Cognition for Improvising.
The massive use of Cognition for Improvising accounts for the many actions in Wikipedia that are ‘bottom-up’, such as the division of work. There is, though, also a hierarchy and a structure of policy, with norms that are top-down (even if they mostly arose bottom-up). To conclude, it is fascinating to see how this separation of big/small goals, planning/improvising, bottom-up/top-down also shows up in the self-reported motivations for contributing, which show people’s self-awareness of why they contribute. In the Wikipedia-wide survey (Philipp Schmidt, talk at WikiMania’09), the top two self-reported motivations were:
72% I like the idea of sharing knowledge and want to contribute to it
69% I saw an error I wanted to fix
These self-reported motivations show the inclination to the greater utopian hope (represented by the motto “The Free Encyclopedia That Anyone Can Edit” and by the decisions to be non-profit, with early GFDL licensing and later CC-BY-SA licensing), which includes the use of the Kindness-Trust Surplus. But these self-reported motivations also show the inclination to the possible use of the Cognition for Improvising Surplus by simply fixing an error and contributing incrementally.
REFERENCES
[1] “http://en.wikipedia.org/wiki/User:Raul654/Raul%27s_laws.”
[2] C. Shirky, Here Comes Everybody: The Power of Organizing Without Organizations, Penguin Press, 2008.
[3] D. Fallis, “Introduction: The Epistemology of Mass Collaboration,” Episteme, vol. 6, 2009, pp. 1-7.
[4] R.S. Geiger and D. Ribes, “The work of sustaining order in wikipedia: the banning of a vandal,” Proceedings of the 2010 ACM conference on Computer supported cooperative work, Savannah, Georgia, USA: ACM, 2010, pp. 117-126.
[5] “Wikipedia, http://en.wikipedia.org/wiki/Wikipedia.”
[6] A. Swartz, “Who Writes Wikipedia? (Aaron Swartz's Raw Thought).”
[7] A. Kittur, E. Chi, B. Pendleton, B. Suh, and T. Mytkowicz, “Power of the few vs. wisdom of the crowd: Wikipedia and the rise of the bourgeoisie,” 25th Annual ACM Conference on Human Factors in Computing Systems (CHI 2007), April 28 - May 3, San Jose, CA, 2007.
[8] P. Robbins and M. Aydede, The Cambridge Handbook of Situated Cognition, Cambridge University Press, 2008.
[9] N. Payette, Beyond the Brain: Embodied, Situated and Distributed Cognition, Cambridge Scholars Publishing, 2008.
[10] I.E. Dror and S. Harnad, Cognition Distributed: How cognitive technology extends our minds, John Benjamins Publishing Company, 2008.
[11] A. Clark and D. Chalmers, “The Extended Mind,” Analysis, vol. 58, 1998, pp. 7-19.
[12] L. Barsalou, C. Breazeal, and L. Smith, “Cognition as coordinated noncognition,” Cognitive Processing, vol. 8, Jun. 2007, pp. 79-91.
[13] R. Jesus, M. Schwartz, and S. Lehmann, “Bipartite networks of Wikipedia's articles and authors: a meso-level approach,” Proceedings of the 5th International Symposium on Wikis and Open Collaboration, Orlando, Florida: ACM, 2009, pp. 1-10.
[14] S. Harnad, “Post-Gutenberg Galaxy: The Fourth Revolution in the Means of Production of Knowledge,” Public-Access Computer Systems Review, vol. 2, 1991, pp. 39-53.
[15] H.A. Simon, “On How to Decide What to Do,” Bell Journal of Economics, vol. 9, Autumn 1978, pp. 494-507.
[16] R.M. Pirsig, Zen and the Art of Motorcycle Maintenance: An Inquiry into Values, Harper Perennial Modern Classics, 2008.
[17] Y. Benkler, The Wealth of Networks: How Social Production Transforms Markets and Freedom, Yale University Press, 2006.
[18] C. Shirky, “Cognitive Surplus Talk,” 2008.
[19] F. Heylighen, “Why is Open Access Development so Successful? Stigmergic organization and the economics of information,” 2006.
[20] S. Turkle and S. Papert, “Epistemological Pluralism: Styles and Voices within the Computer Culture,” Signs, vol. 16, 1990, pp. 128-157.
[21] L. Lessig, Remix: Making Art and Commerce Thrive in the Hybrid Economy, Penguin Press, 2008.
APPENDIX Z
1 COGNITIVE ONTO-PLAY
@ ‘What is Ontology?’, PhD course hosted by the Center for the Philosophy of Nature and Science Studies, University of Copenhagen
by rut jesus, Monday morning, April 2009
If it was just to read the texts we would have all stayed at home. We came because learning is much more than what is written, what is said. We came because a lecture helps to keep concepts in the mind, we came because we hoped for interesting discussions, we came, all in all, because we want to meet others. What is ontology? It is questioning. I am aware of the limits of an academic setting, unfortunately, so I tried to design something that fits in this world but tries to promote interaction, live reflection, and add some joy. It is a kind of ice-breaker/reading checking/innovative game session to allow people to pay attention to each other, and discuss some relevant topics. For those meta-inclined (it is, after all, an ontology course), there are two movements: one in which there is a wave of group and individual, where we alternate between relating to many and to few. The other movement is more of a progression between different interactions: first, we’ll walk around and look into each other, paying attention without words. Then, we’ll use the word to introduce ourselves to all we haven’t met, and then we get into longer conversations through the game “Ontologies are Conversations”:
Everyone takes a number of cards, different ones. Then you choose a first one and approach a person. You decide in your mind, from 0-10, how interested you are in the subject, whatever comes to your mind. You talk (for about a minute). The other person can engage in a little talk, and guesses the number you chose at the beginning (from the evidence in the talking). You change roles. Then (I’ll ring bells), in the end, you choose a partner, together you choose one card, and you talk for about 5 minutes on the topic. And you can keep the cards to use at lunch.
The activities are summarized in the table below, together with examples of the card questions:
Attention/Silence
All: Look around. Look into people’s eyes. Go to a color – of your eyes, of your shirt, of your liking. Try to understand why the others are there. Move to another color. (You may show, you may not speak) (3 minutes)
Few: Choose one to be nearby, and one to be far away. (3 minutes)
Word/Communication
All: Go to everyone you have never met: give your hand, say your name. (3 minutes) / Choose three words – or a small sentence – (put different emphasis) to talk about what you do: field, university, theme. Go around. Say these. Make the other repeat.
Few: Tell all your life in 1 minute to 2 people. The others do the same. (3 minutes)
Discussion: “Ontologies are conversations”
Take 5-7 cards. (you can choose which you prefer…) Phase one: approach a person, talk about it. (find a number from 1-10 on interest – the other should guess what your number was) (5 minutes) Phase two: find a partner – propose a card, agree. Have a little discussion on it. (5 minutes)
Examples of the Questions and Citations which were inspired by or taken from the readings for the Ontology course:
Indeed, such questions have seldom been perceived let alone posed.
Do you change what you observe?
How many hours a day do you live in the imaginary?
'Reality check': does the real give answers?
Reification: The mental conversion of a person or abstract concept into a thing. (Oxford English Dictionary)
"The problem for science is to understand the proper domain of explanation of each abstraction rather than become its prisoner." (Levins & Lewontin, 2006, p. 150)
"Do not presume, do not despair"
All is a reduction.
Where does emergentism come from?
What do you think of the existence of these people?
What if we traded minds and bodies?
How can ontology change your lives?
"To state the obvious: nothing is really something else"
Philosophers have always wondered whether ordinary truth is independent of the human mind.
"Of course, mathematics can often get on quite well without this philosophical interpretive work, and sometimes the interpretive work is premature and is a distraction at best."
2 ONTOLOGICAL LABYRINTH
catalogue of activities to enlighten about and beyond ontological issues in cooperation in Wiki articles
A collaboration between Ontology course and SiV, 2009, through Rut Jesus/Di Ponti
2.1 Setting/Invitation/Acknowledgments
thanks to all who still wanna play…
On the 10th of September 2009, an odd SiV will take place from 14:15-17:00 starting at Kc7, NBI, Blegdamsvej 17. All old SiVers are invited, plus the ontology course participants and also interested locals. Building upon the ontological issues raised in the course last spring, I will start by introducing the players in Wikipedia and its studies. Then we’ll focus on issues of collaboration, Wikis, free production of knowledge, network analysis, and physics/sociology/philosophy studies. Then more general ontological questions will be addressed, which will hopefully help each individual to reflect upon their quests and projects, and in the end we gather the knowledge and the experience. It will take longer because I would like, the whole way through, to try new formats (instead of a big group that discusses one at a time) that support the different parts of the presentation. We will play cards, make maps, go find treasures and use the space and each other, to think about ontology, collaboration, forms and norms.
2.2 Introduction
This is a catalogue of ways to put people to play, reflect, think. This is an innovative way to present my learnings, my thoughts and my investigations into the broad ontological theme within my research on cooperation, Wikis and networks.
The ontology course was a boost to try new structures that can help people to collaborate, think together and feel part, while investigating reality and the ontological standings of our studies. Moreover, this is naturally part of expontology, a concept created after the course, as can be seen below, from the Expontology Manifest:
Our approach,
• taking departure from case studies, is evidence based or data driven;
• allows also for qualitative kinds of data, and takes various forms of human experience as an additional source of knowledge; and
• aims at looking critically at alternative ways to construct ontologies and to question ontological presuppositions, and forms a part of rational inquiry, using the standard arsenal of reasoning and experimental techniques;
thus motivates the neologism expontology, combining experience, experimental research and ontology. We aim at pursuing expontology not as a unified investigation into one topic, but going stepwise forward in parallel with specific case studies, that will be compared for meta-analysis as the individual projects proceed.
The deepest learning on ontology is that ontology is really about reflection, as one should ask the questions: what is it that I am doing? What is this reality that I am presupposing? How do I methodologically support this research? In my case, I use a mixed qualitative and quantitative methodology: how does that support viewing cooperation in Wiki articles?
2.2.1 On research
Writing and citing are a tool, a language tool, very useful, and very dry. But when thinking of other tools: can they really bring something new and of quality? One could argue that "if you do what you always did you'll get the results you always got", which does not mean, though, that ‘if you do something else, you’ll get something else.’ It still seems worth trying these new formats that, even if they do not lead to something better, have already contributed to the enthusiasm.
And, as for methods of reasoning, it is only by opening up and trying something new that we can see a certain abduction that can possibly bring things up. One may not be able to come to a conclusion on which tools, then, are appropriate. The body and ‘playing’ are good tools to engage: good tools... but to think? To argue? How much of these old tools do we use because of a general acceptance/practice/normalization in science, and how much because of a real advantage in doing these things this way? How contingent in the structures of science is this process of inquiry? These are deep philosophy of science questions that I raise here, but will not be able to answer. There will still be a few pillars:
- Leaving footprints of the inquiry seems to be very useful, so I will write this report and document the process;
- Research is somehow a labyrinth/treasure-hunt/serendipity/abduction – one is not sure what the answer is when one starts the inquiry, so this investigation is also a labyrinth, a treasure-hunt [hosted @ NBI, which is also a labyrinth];
- Research needs the ability to go from the concrete to the abstract, and from the abstract to the concrete [up the hill, down the hill], so these activities will start with ‘my particular/concrete’ [the bottom of the hill on my side], abstract that to general questions [the top of the hill] and then be concretized again in your issues [the bottom of the hill on the other side].
2.2.2 On Interaction & On Facilitation
Thoughts, themes, lives are interwoven. To take apart, line by line, is also to undo the quilt.
Cooperation has been studied from many different perspectives: in 1984, Axelrod published on tit-for-tat strategies; cooperation has been an issue in ant studies; cooperation has been distinguished from collaboration (uniquely human) and from coordination (organization in order to do something); cooperation in biology is emphasized by Lynn Margulis on symbiosis – a deep cooperation as the beginning of life; and this list could go on and on. Cooperation happens through interaction, and interaction and cooperation can be promoted by facilitation. They all remind us that cognition no longer sits, or no longer needs to sit, in a brain alone, but also in the distribution with cognitive artifacts and with other people; cognition is no longer this ‘brain-in-a-vat’; it needs our interactions, with each other, with the bodies; cognition can then happen at a distance, through wires, computers, procedures, Wikis; cognition can happen right here and now, in the handling of the concrete, in deciding this and that.

Ib Ravn (and colleague) at the Learning Lab / DPU have theorized about facilitation in meetings. The minds should be stimulated (as we knew), but so should the emotions and people as social beings, able and interested in the interaction, in meeting each other, in learning from each other. But, I’d like to add, the body should also be part. The body should connect the stimulated minds, the engaged emotions, the interactive beings. Let’s not destroy the quilt – let’s use it for warmth and for connections.

2.2.3 On Inspiration

The process of thinking and creating can and should be stimulated. This is well known to theater artists, for example, who go through a long brainstorm and several exercises before they start rehearsing for a play. This is well known to top professionals who are also very good at something else (science/music; writing/dancing; executives/sports) and draw inspiration from their second activity in their daily work.
This is well known to many of us who go to lectures about unrelated topics and end up having ideas that are useful for our own work. Sometimes it is just an analogy, sometimes a process that can be used, sometimes a link that, although it cannot be explained, feels like a ‘eureka’. It is in this fashion that I propose that we use some ideas, even if directly unrelated, to play with.

2.3 Networks / Cooperation

How do networks enlighten us about what is happening? What is the ontology of networks? They ‘appear’ bottom-up; some links and clusters are found from these, using some mathematical formula. So, they do exist, at least in our use of them to describe the world. First of all, they emphasize more the links and less the nodes. In the case of my research, bipartite networks link the nodes of one side through the nodes of the other side. Two authors may be linked by having edited the same article – they may, though, have done very different things and been there at different times. Or they may have interacted in a stigmergic way, by leaving footprints in the environment that are then captured by others.

The existence of networks, or communities, or the presence of cooperation raises at least a concern with bottom-up vs. top-down. When using the word community: is that something that can be observed, that the people may not even be aware of? Is it a wisdom of the crowds, or is it something that has a number of roles, cultures, a sociological phenomenon? Do they need to call themselves a ‘community’ to ‘feel part’? It can certainly be argued that these two kinds of communities exist, for which different notions of the word community can apply. What if these communities match? What if they don’t match? Does community depend on being ‘conscious’ about it? Aspects of Wiki cooperation can be related to what ants do – also building hills, and leaving footprints in the environment. But are they conscious about what is going on? Let’s assume not.
In that case, this cooperation lacks the essential semantic aspect, keeping only the syntactical (cooperation, not collaboration). Otherwise, aspects of this cooperation are related to human collaboration, highly conscious and determined, like making governments or decisions.

2.3.1 Make a network of articles

People stand up, in a circle. One is ‘my article’ – half of them are blues and half are reds. Then they can mix, and talk to people of the other color about direct and/or indirect collaboration. To each person you talk to, you give a piece of thread. In the end you line up. And we have a network. In a way one ‘talks’ to one another by having talked with the same person. This exercise gives an insight into how the bipartite networks work, and what is being connected.

2.3.2 Make a lively Wiki Article

Let’s write a new article, re-enacting the data. People have cards where there are 14 tasks/roles – things to do. These exist in a proportion based on Pfeil et al (2006): Add Link, Format, Add Information, Clarify Information, Fix Link, Delete Information, Delete Link, Grammar, Mark-up Language, Reversion, Spelling, Style/Typography, Vandalism.

Task/role              Proportion   Count
Add Link               24.27%       19
Format                 15.21%       12
Add Information        13.27%       10
Fix Link                9.71%        8
Style/Typography        8.73%        7
Clarify Information     8.41%        7
Delete Link             5.18%        4
Grammar                 3.56%        3
Delete Information      3.24%        3
Reversion               3.24%        3
Spelling                2.27%        2
Mark-up Language        1.62%        1
Vandalism               1.29%        1

A poster is on a wall. Everyone can suggest a change, using a card that they have. This will give an insight into what kind of work is happening – many tinkerings, many links added, less vandalism or deletion.

2.4 Treasure Hunt

One can hear someone else’s research, in ontological and epistemological issues, and use it for one’s own questions. Most times, this is done implicitly, here more explicitly. But how to suggest that people reflect?
Moreover – research can mean a topic one professionally tries to find, but also, as we are all learners of life, we also research other questions, other issues, be it in private terms, or questions that cannot be grasped by the methods that we have nowadays. We think aloud sometimes, we think for ourselves sometimes, we don’t think sometimes. The general structure of the next part is a kind of treasure-hunt/path of exercises, where you have to go to different stations, and do different things at different places. Some of it is done alone; for some you need a partner, or two. The idea is to bring some questions up.

2.4.1 Generalities

2.4.1.1 Cards

There are always cards. Cards with directions in the beginning. Cards to take notes. Cards with questions. The first card is the “card of what I am doing”: everyone gets a card to note down where they went and briefly what they did. In the end this is used to tell one’s story (each story is personal). As for other cards, always make them, trade them, add on them, keep them, hide them, lose them, abuse them…

2.4.1.2 You get initial directions

You get four cards – with directions and very abstract ways of choosing (graphs – like glass bead game) – and can decide where to start and in which order to go.

2.4.1.3 You can play it alone, in duos or trios

If in duos or trios, you have to either agree on some issues before you write them down, or you have to write separate cards.

2.4.2 Games

2.4.2.1 Choose your cards @ TT (1 person)

Here is the THINK TANK. Also known as Heisenberg’s Bath Tub. It has been used by Heisenberg and by others after him to have good ideas. Write good ideas on cards. Add them to the Think Tank. Choose three cards from the Think Tank that you like, and take them with you.
2.4.2.2 Different zooms game (alone)

Theme: BOTTOM-UP VS TOP-DOWN

At the library, get one of three cards. In card 1, there are instructions for you to make many observations at a very zoomed in level: choose a book, take notice of its name, the color, the type, the quote on page 3. In card 2, there are instructions for you to make observations at a very zoomed out level: the size of the library, the categories, the purpose of libraries. In card 3, there are instructions to take notice of other senses: the lighting, the feelings present, the warmth.

2.4.2.3 Questions game (2 or 3 people)

Meet someone else and talk about ‘the library’. Meet three on three. In a round ask questions. You can only ask questions. But you should try to have a conversation. Talk about what you just saw at the library…

2.4.2.4 Write your thoughts

Theme: ONTOLOGICAL DISCIPLINARITY

2.4.2.4.1 Questions for you

Write here one question on a card related to your formal research. Write here one question on a card related to your informal research. Reflect upon what needs to exist – what are the ontological necessities – for those questions to even make sense. Answer the question: what is your field? If you can answer, what does that imply? If you can’t, how do you balance where you are? What is your reality?

2.4.2.4.2 Kid/Adult discussion (meet two on two)

Have a dialog – one plays the curious smart kid and another one the adult: explain to me what you are interested in. Why? Why? Why? What is reality made of? (The reason kids ask why is not to know if they should give you money, like financing institutions, but to know if it can fit in some reality.) Both of you do this. Write down a number of cards with insights you just generated.

2.4.2.5 Or play with a distinction

Hats: Gooey-Prickly. Look at our cards, and put on a gooey or prickly hat, and talk about them in different formats.
Choose a card: put on the gooey hat – talk about the card from that perspective; then put on the prickly hat – and talk about it from that perspective…

2.5 Gathering in the end

In the end, there are quite a few card-questions. We try to share what we learned/found interesting:
- By reading cards aloud and discussing them
- By posting them on the wall, in a mind map, as a kind of visual
- By telling each one’s narrative, from the ‘what am I doing card’.

3 THE FEATURED TASK GAME
by Diandanne Productions

What is the Featured Task of WikiSym 2010? This is a pervasive game to identify what WikiSym people enjoy doing, while they get to know each other, think about wikis and savor the conference. You have a task to perform. Once you've accomplished it, sign your name, and pass the task on to the person you talked to, or to anyone if there was no person involved. If you don't want to do it, put it back in the bowl. At the end of the conference, we'll gather the tasks and announce which were the most popular ones.

Examples of tasks:
Task: Explain to someone how you discovered wikis.
Task: Read a poem to someone.
Task: Discuss with someone a way to do WikiSym 2011 better.
Task: Learn a word in Polish.
Task: Introduce someone to a very obscure theory of yours.
Task: Introduce a newbie to someone they haven't met before.
Task: Tell someone about a book you care about.
Task: Say thank you to Phoebe, the conference chair.
Task: Brainstorm a new project with someone.
Task: Add a verse to the wiki song.
Task: Add a concept to our collnnective minds.
Task: Write a new task.

META-PHD DESCRIPTION

"There is only one success – to be able to spend your life in your own way." Christopher Morley

by Di Ponti
Defense day, in the afternoon, NBI

Many stories, many hopes and many processes get buried under a project. In this Meta-PhD I try to salvage some, and give them the place they deserve, as essential parts of this path.
I have spent many years in school, where one of my most typical frustrations has been to be asked to perform in ways that seemed not so efficient, not to speak of boring, annoying and even counter-productive (making me dislike subjects that I would otherwise have loved). Often, when preparing for an exam or writing a final paper, I had many parallel ideas of ways that information could be conveyed (that I could show that I had learned), in ways that I found far more challenging and creative. I wanted to write dialogues, plays and poems. I wanted to write children's books, short stories and paint murals. With few exceptions, those were thoughts I had, and plans I didn't carry through, for I figured out that I would have to do double work: I'd have to do the exam AND the play, the paper AND the mural.

So this time, when I started the enterprise of writing a PhD, I again thought of a parallel project. This time I kept it (almost) secret. This time I was going to reflect deeply about the process, for no one would interfere with my learning. It was mine. First I had an idea of an installation, and kept colorful pieces of paper in a drawer of the desk. Then it became more lively, with ideas of interactions and performances. And now, it is called the 'Meta-PhD Offense', a performance and exhibition at the NBI Library following the 'PhD Defense'.

This parallel project could not have happened without the PhD. It is, indeed, about it: about the process of learning, about the lessons, about the how’s, and less about the what’s. But the PhD wouldn't have happened without the Meta-PhD either. I recall well the first time I seriously thought of giving up the PhD, as life seemed to lie elsewhere, and the struggles of the institution were becoming exasperating: I looked at the Meta-PhD and thought, “if I stop here, I would lose the Meta-PhD as well, and I just can't do that.” I took the rest of the day off, and was back the next morning.

Many parts and longings constitute the Meta-PhD.
It has a principle of radical transparency, for the work done for a PhD is far more, and more complex, than what gets distilled into a thesis. It has a principle of inspiration: a phrase, a thought, or an anecdote can bring people far in their thought process, in their grasping of the difficulties they have. It has a principle of critical thinking, especially about the university institution, where the more I could orient myself in it, the less I wanted to be part of such a stiffened world. And it has a principle of honesty, of mixing the private and the political, the academic, the abstract, the concrete, the plural and the unique, and using those to look at the future, reflecting upon what was, what is and what will be (of me).

The central point of the Meta-PhD is using the "PhD as a way of knowing oneself". It is a reflection upon the last years and a construction of lessons to be taken forward. It is asking: what do I take with me? A process, a technique or a capacity? Isn’t it a wisdom, more than it is a knowledge?

The Official School of Being
EXISTENTIAL FACULTY
COPENHAGEN CITY
DENMARK

life Meta-PhD
Di Ponti
Cooperation and Cognition in Writing a PhD
A poetic, reflexive and experiential study

Advisors: Rhubarb & Barbarella
Center for the Philosophy of Nature and Science Studies
Offended: 4/9/2010