Thinking through networks:
generating, visualising and analysing
complex re-use graphs in the
Humanities
Tom Brughmans
Archaeological Computing Research Group
Department of Archaeology
University of Southampton
Marco Büchler
Natural Language Processing Group
Department of Computer Science
University of Leipzig
Abstract
“Money does not smell.” This well-known phrase goes back to the Roman emperor
Vespasian, who introduced a latrine tax in the 1st century AD. Through continuous
re-use over almost 2000 years, the saying gradually became “general knowledge” or
part of our “standard language skills”. It is not the only example. Taking a step back from
the present and looking at our past from a long-term perspective reveals a wealth of
such sayings, and people are often not aware of their historical roots. The academic
process is not exempt from these phenomena of text re-use. This workshop aims at
illustrating issues related to text re-use through a practical example, using seven
different and in some cases independent English translations of the Holy Bible. At a
more general level this workshop will discuss the potential of a networks perspective
in the Humanities and introduce participants to generating and using complex graphs.
The main objective of the workshop is to introduce participants to the potential of a
complex networks perspective in the Humanities. Through a practical example of text
re-use techniques it will cover some basic graph visualisation, exploration and
analysis. Some of the questions this workshop will address in particular are:

- What are typical research topics in the Humanities that can be examined through a complex networks perspective?
- How can applications in the Humanities make valuable contributions to ongoing discussions in network science?
- How can networks be generated by both quantitative and qualitative methodologies?
- What kinds of graph visualisation make sense in real-world Humanities applications?
- What kinds of communication between developers and users are needed for such macro-structure visualisations to benefit both groups?
Keywords: Networks, Graph Mining, Text re-use, Bible, eHumanities
Workshop outline
Introduction
Duration: 30 min
Interactive discussion on networks/graphs and their application in the Humanities.
What is a graph/network?
Duration: 30 min
Theoretical section introducing participants to common terms and techniques.

- Graph vs. network: what is this all about?
- Short history of research
- Graph visualisation techniques
- Popular graph analysis techniques from social network analysis and complex networks in physics
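To make the terms in this section concrete, the following minimal Python sketch builds a small undirected graph and computes three measures common in social network analysis: degree centrality, density and shortest path lengths. The node labels and edges are invented toy data and the function names are illustrative only; the packages listed under Software resources (for example Gephi, Pajek or NetworkX) provide tested implementations of these and many other measures.

```python
from collections import deque

# Toy undirected graph (invented data): each pair is one edge.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]


def build_adjacency(edge_list):
    """Return a mapping from each node to the set of its neighbours."""
    adjacency = {}
    for u, v in edge_list:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree_centrality(adjacency):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def density(adjacency):
    """Share of all possible undirected edges that are actually present."""
    n = len(adjacency)
    m = sum(len(nbrs) for nbrs in adjacency.values()) / 2
    return 2 * m / (n * (n - 1))


def shortest_path_lengths(adjacency, source):
    """Breadth-first search: number of steps from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


adjacency = build_adjacency(edges)
centrality = degree_centrality(adjacency)
print(max(centrality, key=centrality.get))    # node with the most connections
print(density(adjacency))
print(shortest_path_lengths(adjacency, "A"))
```

In this toy graph, node C has the highest degree centrality because it links the tightly connected group A, B, C to the chain D, E.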
Introduction to re-use graphs
Duration: 20 min
Introduction to defining quotations and research interests and how to explore these.

- What is a quotation?
- Research interests of humanists
- Close reading (text transmission)
- Distant reading (macro structures)
Working with text re-use graphs
Duration: 30 min
Examples of text re-use graphs and typical results through different visualisations.

- Macro view, a.k.a. distant reading
- Micro view, a.k.a. close reading
- Temperature view, a.k.a. what’s hot and what’s not, or mid-distance reading
- Dotview plot, a.k.a. mid-distance reading
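As a rough illustration of how a text re-use graph can be generated, the sketch below compares passages by their shared word trigrams (shingles) and uses the resemblance measure described by Broder (1997, listed under Bibliographic resources) as an edge weight. This is a toy example under stated assumptions, not the method used by TRACER: the passages are invented sentences rather than Bible verses, and the trigram size and the threshold of 0.2 are arbitrary choices.

```python
from itertools import combinations

# Invented toy passages standing in for versions of the same text.
passages = {
    "v1": "the quick brown fox jumps over the lazy dog",
    "v2": "the quick brown fox leaps over the lazy dog",
    "v3": "a completely different sentence about network analysis",
}


def shingles(text, n=3):
    """Return the set of word n-grams (shingles) in a passage."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def resemblance(text_a, text_b, n=3):
    """Shared shingles divided by all distinct shingles of both passages."""
    sa, sb = shingles(text_a, n), shingles(text_b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def reuse_graph(texts, threshold=0.2, n=3):
    """Return weighted edges (id_a, id_b, score) for pairs above the threshold."""
    graph_edges = []
    for (id_a, text_a), (id_b, text_b) in combinations(sorted(texts.items()), 2):
        score = resemblance(text_a, text_b, n)
        if score >= threshold:
            graph_edges.append((id_a, id_b, score))
    return graph_edges


reuse_edges = reuse_graph(passages)
for id_a, id_b, score in reuse_edges:
    print(id_a, id_b, score)
```

The resulting edge list is the kind of input a macro view works from and can be loaded into a graph tool for visualisation, while inspecting the shared shingles of a single pair corresponds to the micro view.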
Final discussion
Duration: 10 min
How can all this graph/network stuff be applied to the participants’ research?
Software resources
Pajek: http://pajek.imfm.si/doku.php
UCINET: http://www.analytictech.com/ucinet/
Cytoscape: http://www.cytoscape.org/
Gephi: http://www.gephi.org/
Processing: http://www.processing.org/
Mathematica: http://www.wolfram.com/mathematica/new-in-8/graph-and-networkanalysis/index.html
Matlab: http://www.levmuchnik.net/Content/Networks/ComplexNetworksPackage.html
R: http://igraph.sourceforge.net/doc/R/00Index.html
Sci2: https://sci2.cns.iu.edu/
Network workbench: http://nwb.cns.iu.edu/
Excel and NodeXL: http://nodexl.codeplex.com/
Python: http://networkx.lanl.gov/ or http://igraph.sourceforge.net/
TRACER: A text re-use tracing software http://mbuechler.e-humanities.net/tracer/
(available by Q4/2011 as beta release)
Bibliographic resources
Introductory publications:
Albert, R. & Barabási, A., 2002. Statistical mechanics of complex networks. Reviews of modern
physics, 74(January), pp.47-97.
Barabási, A.-L., 2002. Linked: The New Science of Networks, Cambridge, Massachusetts: Perseus.
Freeman, L., 2004. The development of social network analysis, Vancouver: Empirical Press.
Newman, M.E.J., 2010. Networks: an introduction, Oxford: Oxford University Press.
Watts, D.J., 2003. Six Degrees: The Science of a Connected Age, London: Vintage.
Watts, D.J., 2004. The “New” Science of Networks. Annual Review of Sociology, 30(1), pp.243-270.
Key publications in physics:
Barabási, A.-L. & Albert, R., 1999. Emergence of Scaling in Random Networks. Science, 286(5439),
pp.509-512.
Newman, M.E.J., 2010. Networks: an introduction, Oxford: Oxford University Press.
Newman, M., Barabasi, A.-L. & Watts, D.J., 2006. Structure and Dynamics of Networks, Princeton:
Princeton University Press.
Watts, D.J. & Strogatz, S.H., 1998. Collective dynamics of “small-world” networks. Nature,
393(6684), pp.440-2.
Key publications in social network analysis:
Carrington, P.J., Scott, J. & Wasserman, S., 2005. Models and methods in social network analysis,
Cambridge ; New York: Cambridge University Press.
Freeman, L.C., 1977. A Set of Measures of Centrality Based on Betweenness. Sociometry, 40(1),
pp.35-41.
Granovetter, M., 1983. The strength of weak ties: A network theory revisited. Sociological theory,
1(1), pp.201-233.
Scott, J., 2000. Social Network Analysis. A Handbook. 2nd ed., London, Thousand Oaks, CA and New
Delhi: Sage Publications.
Scott, J. & Carrington, P.J., 2011. The SAGE handbook of social network analysis, Sage.
Wasserman, S. & Faust, K., 1994. Social network analysis : methods and applications, Cambridge:
Cambridge University Press.
Key publications in text re-use:
Broder, A., 1997. On the Resemblance and Containment of Documents. In Proceedings of the
Compression and Complexity of Sequences 1997 (SEQUENCES '97). IEEE Computer Society,
Washington, DC, USA, 21-.
Clough, P., Gaizauskas, R., Piao, S. S. L., & Wilks, Y., 2002. METER: MEasuring TExt Reuse. In
Proceedings of the 40th Annual Meeting on Association for Computational Linguistics (ACL '02).
Association for Computational Linguistics, Stroudsburg, PA, USA, 152-159.
DOI=10.3115/1073083.1073110 http://dx.doi.org/10.3115/1073083.1073110.
Lee, J., 2007. A Computational Model of Text Reuse in Ancient Literary Texts. Proceedings of the 45th
Annual Meeting of the Association of Computational Linguistics (2007), Pages: 472-479.
Seo, J. and Croft, W. B., 2008. Local text reuse detection. In Proceedings of the 31st annual
international ACM SIGIR conference on Research and development in information retrieval (SIGIR
'08). ACM, New York, NY, USA, 571-578. DOI=10.1145/1390334.1390432
http://doi.acm.org/10.1145/1390334.1390432
Graph/network applications in the Humanities:
Bentley, R.A. & Maschner, H.D.G., 2003. Complex systems and archaeology, Salt Lake City:
University of Utah Press.
Bergs, A., 2005. Social Networks and Historical Sociolinguistics. Studies in Morphosyntactic Variation
in the Paston Letters (1421-1503). (Topics in English Linguistics 51), Berlin/New York: Mouton
De Gruyter.
Coward, F. & Gamble, C., 2008. Big brains, small worlds: material culture and the evolution of the
mind. Philosophical transactions of the Royal Society of London. Series B, Biological sciences,
363(1499), pp.1969-79.
Ferrer I Cancho, R. & Solé, R.V., 2001. The small world of human language. Proceedings. Biological
sciences / The Royal Society, 268(1482), pp.2261-5.
Green, S., 2002. Culture in a network: dykes, webs and women in London and Manchester. In N.
Rapport, ed. British Subjects: An Anthropology of Britain. Oxford: Berg, pp. 181-202.
Knappett, C., Evans, T. & Rivers, R., 2008. Modelling maritime interaction in the Aegean Bronze Age.
Antiquity, 82(318), pp.1009-1024.
Lemercier, C., 2010. Formal network methods in history: why and how? In G. Fertig, ed. Social
Networks, Political Institutions, and Rural Societies. Turnhout: Brepols publishers.
Michel, J.-B., Shen, Y. K., Aiden, A. P., Veres, A., Gray, M. K., Pickett, J. P., Hoiberg, D., Clancy, D.,
Norvig, P., Orwant, J., Pinker, S., Nowak, M. A., Aiden, E. L., 2010. Quantitative Analysis of Culture
Using Millions of Digitized Books. Science, 176.
Newman, M., 2001. Scientific collaboration networks. I. Network construction and fundamental
results. Physical Review E, 64(1), pp.1-8.
Newman, M.E.J., 2001. Scientific collaboration networks. II. Shortest paths, weighted networks, and
centrality. Physical Review E, 64(1), pp.1-7.
Padgett, J. & Ansell, C., 1993. Robust Action and the Rise of the Medici, 1400-1434. American
Journal of Sociology, 98(6), pp.1259-1319.
Padgett, J.F. & McLean, P.D., 2006. Organizational Invention and Elite Transformation: The Birth of
Partnership Systems in Renaissance Florence. American Journal of Sociology, 111(5), pp.1463-1568.
Ruffini, G.R., 2008. Social networks in Byzantine Egypt, Cambridge: Cambridge University Press.
Schich, M. & Coscia, M., 2011. Exploring Co-Occurrence on a Meso and Global Level Using Network
Analysis and Rule Mining. In Proceedings of the ninth workshop on mining and Learning with
Graphs (MLG ’11). San Diego: ACM.
White, D.R. & Johansen, U.C., 2005. Network analysis and ethnographic problems. Process models of
a Turkish nomad clan, Oxford.