Thinking through networks: generating, visualising and analysing complex re-use graphs in the Humanities

Tom Brughmans
Archaeological Computing Research Group
Department of Archaeology
University of Southampton

Marco Büchler
Natural Language Processing Group
Department of Computer Science
University of Leipzig

Abstract

“Money does not smell.” This well-known phrase goes back to the Roman emperor Vespasian, who introduced a latrine tax in the 1st century AD. Through continuous re-use over almost 2000 years, the saying gradually became “general knowledge” or part of our “standard language skills”. It is not the only example. Taking a step back from the present and looking at our past from a long-term perspective reveals a wealth of such sayings, and people are often unaware of their historical roots. The academic process is not exempt from these phenomena of text re-use.

This workshop aims to illustrate issues related to text re-use through a practical example, using seven different and in some cases independent English translations of the Holy Bible. At a more general level, the workshop will discuss the potential of a networks perspective in the Humanities and introduce participants to generating and using complex graphs.

The main objective of the workshop is to introduce participants to the potential of a complex networks perspective in the Humanities. Through a practical example of text re-use techniques it will cover some basic graph visualisation, exploration and analysis. Some of the questions this workshop will address in particular are:

What are typical research topics in the Humanities that can be examined through a complex networks perspective?
How can applications in the Humanities make valuable contributions to ongoing discussions in network science?
How can networks be generated by both quantitative and qualitative methodologies?
What kinds of graph visualisation make sense in real-world Humanities applications?
Which forms of communication between developers and users are necessary for such macro-structure visualisations to benefit both groups?

Keywords: Networks, Graph Mining, Text re-use, Bible, eHumanities

Workshop outline

Introduction
Duration: 30 min
Interactive discussion on networks/graphs and their application in the Humanities.

What is a graph/network?
Duration: 30 min
Theoretical section introducing participants to common terms and techniques.
  Graph vs. network: what is this all about?
  Short history of research
  Graph visualisation techniques
  Popular graph analysis techniques from social network analysis and complex networks in physics

Introduction to re-use graphs
Duration: 20 min
Introduction to defining quotations and research interests, and how to explore these.
  What is a quotation?
  Research interests of humanists
  Close reading (text transmission)
  Distant reading (macro structures)

Working with text re-use graphs
Duration: 30 min
Examples of text re-use graphs and typical results through different visualisations.
  Macro view, a.k.a. distant reading
  Micro view, a.k.a. close reading
  Temperature view, a.k.a. what’s hot and what’s not, or mid-distance reading
  Dotview plot, a.k.a. mid-distance reading

Final discussion
Duration: 10 min
How can all this graph/network stuff be applied to the participants’ research?
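To make the idea of a text re-use graph in the outline above more concrete, the following minimal Python sketch may help. It is an illustration only and not the method used by TRACER or in the workshop: it assumes that re-use is approximated by the overlap of word trigrams ("shingles") between passages, measured with a Jaccard-style resemblance in the spirit of Broder (1997). The passage labels A, B and C, the toy texts, the function names and the threshold of 0.2 are all hypothetical choices made for this example.

```python
from itertools import combinations


def shingles(text, n=3):
    """Return the set of word n-grams (shingles) of a text.

    Assumption: the text is already stripped of punctuation.
    """
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def resemblance(a, b):
    """Jaccard overlap of two shingle sets, a value between 0 and 1."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_reuse_graph(passages, n=3, threshold=0.2):
    """Link every pair of passages whose resemblance reaches the threshold.

    Returns an undirected weighted graph as {node: {neighbour: weight}}.
    """
    sets = {name: shingles(text, n) for name, text in passages.items()}
    graph = {name: {} for name in passages}
    for u, v in combinations(passages, 2):
        weight = resemblance(sets[u], sets[v])
        if weight >= threshold:
            graph[u][v] = weight
            graph[v][u] = weight
    return graph


# Hypothetical toy passages; A and B differ in a single word.
passages = {
    "A": "in the beginning god created the heaven and the earth",
    "B": "in the beginning god created the heavens and the earth",
    "C": "the lord is my shepherd i shall not want",
}

graph = build_reuse_graph(passages)

# Degree (number of re-use links per passage) as a first, very simple
# graph measure.
degree = {node: len(neighbours) for node, neighbours in graph.items()}
print(degree)
```

In this toy setting A and B end up linked while C stays isolated. A graph built this way could then be exported to one of the tools listed under Software resources for visualisation; the threshold and shingle length would need tuning for real corpora.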
Software resources

Pajek: http://pajek.imfm.si/doku.php
UCINET: http://www.analytictech.com/ucinet/
Cytoscape: http://www.cytoscape.org/
Gephi: http://www.gephi.org/
Processing: http://www.processing.org/
Mathematica: http://www.wolfram.com/mathematica/new-in-8/graph-and-networkanalysis/index.html
Matlab: http://www.levmuchnik.net/Content/Networks/ComplexNetworksPackage.html
R: http://igraph.sourceforge.net/doc/R/00Index.html
Sci2: https://sci2.cns.iu.edu/
Network Workbench: http://nwb.cns.iu.edu/
Excel and NodeXL: http://nodexl.codeplex.com/
Python: http://networkx.lanl.gov/ or http://igraph.sourceforge.net/
TRACER: a text re-use tracing software, http://mbuechler.e-humanities.net/tracer/ (available by Q4/2011 as beta release)

Bibliographic resources

Introductory publications:
Albert, R. & Barabási, A., 2002. Statistical mechanics of complex networks. Reviews of modern physics, 74(January), pp.47-97.
Barabási, A.-L., 2002. Linked: The New Science of Networks, Cambridge, Massachusetts: Perseus.
Freeman, L., 2004. The development of social network analysis, Vancouver: Empirical Press.
Newman, M.E.J., 2010. Networks: an introduction, Oxford: Oxford University Press.
Watts, D.J., 2003. Six Degrees: The Science of a Connected Age, London: Vintage.
Watts, D.J., 2004. The “New” Science of Networks. Annual Review of Sociology, 30(1), pp.243-270.

Key publications in physics:
Barabási, A.-L. & Albert, R., 1999. Emergence of Scaling in Random Networks. Science, 286(5439), pp.509-512.
Newman, M.E.J., 2010. Networks: an introduction, Oxford: Oxford University Press.
Newman, M., Barabási, A.-L. & Watts, D.J., 2006. Structure and Dynamics of Networks, Princeton: Princeton University Press.
Watts, D.J. & Strogatz, S.H., 1998. Collective dynamics of “small-world” networks. Nature, 393(6684), pp.440-2.

Key publications in social network analysis:
Carrington, P.J., Scott, J. & Wasserman, S., 2005.
Models and methods in social network analysis, Cambridge; New York: Cambridge University Press.
Freeman, L.C., 1977. A Set of Measures of Centrality Based on Betweenness. Sociometry, 40(1), pp.35-41.
Granovetter, M., 1983. The strength of weak ties: A network theory revisited. Sociological theory, 1(1), pp.201–233.
Scott, J., 2000. Social Network Analysis. A Handbook. 2nd ed., London, Thousand Oaks, CA and New Delhi: Sage Publications.
Scott, J. & Carrington, P.J., 2011. The SAGE handbook of social network analysis, Sage.
Wasserman, S. & Faust, K., 1994. Social network analysis: methods and applications, Cambridge: Cambridge University Press.

Key publications in text re-use:
Broder, A., 1997. On the Resemblance and Containment of Documents. In Proceedings of the Compression and Complexity of Sequences 1997 (SEQUENCES '97). Washington, DC: IEEE Computer Society, 21-.
Clough, P., Gaizauskas, R., Piao, S.S.L. & Wilks, Y., 2002. METER: MEasuring TExt Reuse. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics (ACL '02). Stroudsburg, PA: Association for Computational Linguistics, pp.152-159. DOI: 10.3115/1073083.1073110, http://dx.doi.org/10.3115/1073083.1073110
Lee, J., 2007. A Computational Model of Text Reuse in Ancient Literary Texts. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, pp.472-479.
Seo, J. & Croft, W.B., 2008. Local text reuse detection. In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '08). New York, NY: ACM, pp.571-578. DOI: 10.1145/1390334.1390432, http://doi.acm.org/10.1145/1390334.1390432

Graph/network applications in the Humanities:
Bentley, R.A. & Maschner, H.D.G., 2003. Complex systems and archaeology, Salt Lake City: University of Utah Press.
Bergs, A., 2005. Social Networks and Historical Sociolinguistics.
Studies in Morphosyntactic Variation in the Paston Letters (1421-1503). (Topics in English Linguistics 51), Berlin/New York: Mouton de Gruyter.
Coward, F. & Gamble, C., 2008. Big brains, small worlds: material culture and the evolution of the mind. Philosophical transactions of the Royal Society of London. Series B, Biological sciences, 363(1499), pp.1969-79.
Ferrer i Cancho, R. & Solé, R.V., 2001. The small world of human language. Proceedings. Biological sciences / The Royal Society, 268(1482), pp.2261-5.
Green, S., 2002. Culture in a network: dykes, webs and women in London and Manchester. In N. Rapport, ed. British Subjects: An Anthropology of Britain. Oxford: Berg, pp.181-202.
Knappett, C., Evans, T. & Rivers, R., 2008. Modelling maritime interaction in the Aegean Bronze Age. Antiquity, 82(318), pp.1009–1024.
Lemercier, C., 2010. Formal network methods in history: why and how? In G. Fertig, ed. Social Networks, Political Institutions, and Rural Societies. Turnhout: Brepols Publishers.
Michel, J.-B., Shen, Y.K., Aiden, A.P., Veres, A., Gray, M.K., Pickett, J.P., Hoiberg, D., Clancy, D., Norvig, P., Orwant, J., Pinker, S., Nowak, M.A. & Aiden, E.L., 2010. Quantitative Analysis of Culture Using Millions of Digitized Books. Science, 176.
Newman, M.E.J., 2001. Scientific collaboration networks. I. Network construction and fundamental results. Physical Review E, 64(1), pp.1-8.
Newman, M.E.J., 2001. Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Physical Review E, 64(1), pp.1-7.
Padgett, J. & Ansell, C., 1993. Robust Action and the Rise of the Medici, 1400-1434. American Journal of Sociology, 98(6), pp.1259-1319.
Padgett, J.F. & McLean, P.D., 2006. Organizational Invention and Elite Transformation: The Birth of Partnership Systems in Renaissance Florence. American Journal of Sociology, 6(5), pp.1463-1568.
Ruffini, G.R., 2008. Social networks in Byzantine Egypt, Cambridge: Cambridge University Press.
Schich, M.
& Coscia, M., 2011. Exploring Co-Occurrence on a Meso and Global Level Using Network Analysis and Rule Mining. In Proceedings of the Ninth Workshop on Mining and Learning with Graphs (MLG ’11). San Diego: ACM.
White, D.R. & Johansen, U.C., 2005. Network analysis and ethnographic problems. Process models of a Turkish nomad clan, Oxford.