In this video I describe how I built a network graph to visualize how declarations of unknowing appear in Old English texts. I want to break the process down into five steps:

1. I searched for these declarations in the Dictionary of Old English Corpus;
2. I created and analyzed a small data set of passages;
3. I came up with a way of representing these passages as a network graph;
4. I then translated the data to a form Gephi can understand; and
5. I prepared the visualization in Gephi.

Part 1: Mining the Dictionary of Old English Corpus

This article begins with the Dictionary of Old English Corpus, the resource that underlies my research. The Dictionary of Old English Corpus contains at least one version of every known Old English text. In my project, I used the 2014 version, its latest and most powerfully searchable, which has not yet been released, to search for the combinations of words that mark out declarations of unknowing. As I note in my paper, I searched the DOE Corpus for the words “nis (n)ænig” (there is no one), which mark out the declarations of unknowing that I am looking for. The search returns all citations from the Old English corpus in which the two words appear, in any order.

Part 2: Building the Data Set

Then I analyzed each of these passages in context to determine whether it contained a declaration of unknowing and, if so, what topic the declaration referred to. Given the small size of the data set, an Excel file was enough to capture it; a full-fledged SQL database was not needed. (show Excel file)

Part 3: Modelling the Passages as a Network Graph

Before even turning to Gephi (or any other network visualization tool), I had to think about how I would represent my data as a network graph. A network graph consists of a set of things, called nodes or vertices, connected pairwise by relationships called edges. Network graphs are extremely versatile. In humanities research specifically, network graphs can be used for almost anything.
They can represent relationships between characters (this network visualization of Les Miserables tracks interactions between characters in the novel); or correspondence among a group of intellectuals who exchanged letters over a period of time (this network visualization tracks letters between Enlightenment-era writers and thinkers); or networks of sale and purchase of medieval manuscripts (http://mappingbooks.blogspot.ca/2014/01/charting-former-owners-of-penns-codex.html); or, in my case, the distribution of Old English formulaic phrases that occur within a group of Old English texts. In short, you can use network graphs to visualize anything, as long as that thing can be imagined as a network of relationships.

Whatever your research, then, your first step is the conceptual framework: within your data, figure out what you want to model as things and what you want to model as relationships; in network terms, what you want to model as nodes (points in your network) and what you want to model as edges (linking lines in your network). At this point it might actually be helpful to sketch out a tiny part of your data on paper, as an easy way of prototyping your graph.

In my visualization, I decided to model the declarations of unknowing themselves as relationships between texts and topics (the latter usually supernatural). That is, the texts and topics were the nodes; the declarations of unknowing were the lines. This model is in keeping with the notion of traditional referentiality: in a tradition-based poetics, motifs exist not just in themselves or in the texts where they appear; they exist and carry with them the context of the entire tradition. I model the declarations as relationships because each of them mediates between internal and external context, between the textual moment where it occurs and the wider tradition it recalls.
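For anyone who prefers to prototype in code rather than on paper, the model just described can be sketched in a few lines of Python. This is only an illustration under assumptions: the first pair below is the example mentioned in this script, the other two are made-up placeholders rather than rows from the actual data set, and the names `passages`, `build_graph`, and `neighbours` are hypothetical.

```python
from collections import defaultdict

# Each record is one declaration of unknowing: (text, topic).
# Only the first pair comes from this script; the rest are placeholders.
passages = [
    ("Blickling Homily 5", "God & Heaven"),
    ("Text A", "God & Heaven"),
    ("Text A", "Topic X"),
]


def build_graph(records):
    """Return (nodes, edges): texts and topics are nodes, declarations are edges."""
    nodes = {}   # label -> "Text" or "Topic"
    edges = []   # one (text, topic) pair per declaration
    for text, topic in records:
        nodes[text] = "Text"
        nodes[topic] = "Topic"
        edges.append((text, topic))
    return nodes, edges


def neighbours(edges):
    """Map every node to the set of nodes it shares a declaration with."""
    adjacency = defaultdict(set)
    for text, topic in edges:
        adjacency[text].add(topic)
        adjacency[topic].add(text)
    return adjacency


nodes, edges = build_graph(passages)
adjacency = neighbours(edges)

# Which texts contain a declaration about this topic?
print(sorted(adjacency["God & Heaven"]))  # ['Blickling Homily 5', 'Text A']
```

Because every edge joins a text to a topic and never a text to a text, the result is what network people call a bipartite graph, which is why texts and topics can later be laid out in two separate rows.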
Part 4: Data for Gephi

Once I knew what shape I wanted my graph to take, I put my data in a form understandable to Gephi, the free, open-source network graphing software I am using. First, I downloaded Gephi and installed it on my machine. (Unlike, say, Google Fusion Tables, this is not an in-browser application.) Then, based on my dataset, I created two spreadsheets: Nodes and Edges.

The first spreadsheet is Nodes. Nodes contains the texts and topics. The spreadsheet has three columns: first, a unique ID number; second, a label for the text or topic (a short title for the text, e.g. the DOE Corpus short title “Sat” for the Old English poem Christ and Satan, or a short topic description, e.g. “Doomsday”); third, a field that says either “Text” or “Topic”, indicating whether the node in question is a text or a topic.

The second spreadsheet is Edges. Edges contains a list of the connections between nodes: an entry on the same row with a “source” node and a “target” node means that there’s a line between those two nodes. That is, if you have “61” under source and “2” under target, that means there is a declaration of unknowing in text 61 (Blickling Homily 5) referring to topic 2 (God & Heaven).

You can generate these two files manually, which is really tedious. Or you can generate them by running some SQL queries on the larger Excel file containing your data. However you generate them, make sure you save them as .csv (comma-delimited).

All right: now you have an idea of what you want your graph to look like; you have your data; and you have Gephi on your computer. Your next step is to get your data into Gephi and actually generate your network graph.

Part 5: Make the Graph

To import your data, open up Gephi. Select New Project, then Data Laboratory. Select Import Spreadsheet. First select your Nodes file and make sure you designate it as the Nodes table on the import screen. Once you have successfully imported the Nodes, add the Edges. Again, select Import Spreadsheet.
Select your Edges file and make sure you designate it as the Edges table on the import screen.

Now, time to see and edit the resulting graph. There are two kinds of edits you can make: structural (such as picking between different network representation algorithms) and appearance-related (such as making small tweaks to the representation of your network). Given the nature of my project (a small dataset, un-weighted edges, data firmly shaped by close reading), the changes I made were appearance-related. Thus the appearance of the graph is not entirely driven by the data; it is driven also by my interpretation of the data.

1. I centred the graph
2. I showed node labels (i.e. short titles as per DOE Corpus for the texts and topic names for the topics)
3. I dragged nodes into maximum visibility positions, with texts above and topics below
4. I sized nodes to correspond to the number of declarations of unknowing present in each text
5. I decided on a colour scheme for the nodes
6. I tweaked font size

Important to note: Gephi has no undo button. But you can save multiple versions of your project, so you can revert to an earlier version if you make any colossal mistakes along the way.

Despite some challenges with the documentation, what I liked about working with Gephi is that it let me model, visualize, and examine my data systematically while at the same time allowing me to handcraft the visualization. So the graph, as I noted earlier, is of course created from the textual data but not wholly driven by it: it is also very much shaped by my own interpretive and presentation decisions. This is in keeping with my own approach, which balances distant reading of a large corpus and the wider tradition with focused close readings of individual texts.
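As a supplement to the data-preparation step, here is a minimal Python sketch of how the Nodes and Edges spreadsheets described earlier could be produced without typing them by hand. It is not the method I used, and it rests on assumptions: the rows are placeholders, the function names are hypothetical, and the column headers (`Id`, `Label`, `Source`, `Target`, `Type`) follow the convention commonly used for Gephi spreadsheet imports, so check them against your Gephi version. The `Kind` and `Declarations` columns are my own illustrative names.

```python
import csv
import io
from collections import Counter

# Placeholder data, not rows from the real data set.
nodes = [
    (1, "Text A", "Text"),
    (2, "Topic X", "Topic"),
    (3, "Topic Y", "Topic"),
]
edges = [(1, 2), (1, 3)]  # (source id, target id), one per declaration


def nodes_csv(nodes, edges):
    """Build the Nodes table, adding a count of declarations per node."""
    counts = Counter()
    for source, target in edges:
        counts[source] += 1
        counts[target] += 1
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Id", "Label", "Kind", "Declarations"])
    for node_id, label, kind in nodes:
        writer.writerow([node_id, label, kind, counts[node_id]])
    return out.getvalue()


def edges_csv(edges):
    """Build the Edges table: one row per declaration of unknowing."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type"])
    for source, target in edges:
        writer.writerow([source, target, "Undirected"])
    return out.getvalue()


print(nodes_csv(nodes, edges))
print(edges_csv(edges))
```

Saving the two strings as comma-delimited .csv files gives the pair of files to import. A per-node count like the `Declarations` column is one possible basis for sizing nodes by the number of declarations they take part in.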