Seth C. Lewis • Proposed special issue in Digital Journalism • Page 1

Proposal for a Special Issue of Digital Journalism

‘Journalism in an Era of Big Data’

Guest Editor: Dr. Seth C. Lewis, Assistant Professor, School of Journalism & Mass Communication, University of Minnesota–Twin Cities, USA

Introduction and overview

The term “Big Data” is often invoked to describe the overwhelming volume of information produced by and about human activity, made possible by the growing ubiquity of mobile devices, tracking tools, always-on sensors, and cheap computing storage. “In a digitized world, consumers going about their day—communicating, browsing, buying, sharing, searching—create their own enormous trails of data” (Manyika et al., 2011, p. 1). Technological advances have made it easier than ever to harness, organize, analyze, and visualize massive repositories of these digital traces, for corporations, governments, and researchers as well as for any individual with an internet connection (Manovich, 2011).

These streams of data, stored and structured in reusable ways, increasingly provide the “raw material of journalism” (Bell, 2012, p. 48)—the potential for new forms of gathering, filtering, and disseminating news information, refined by automation and algorithms. As Bell suggests, “One of the most important questions for journalism’s sustainability will be how individuals and organizations respond to this availability of data” (2012, p. 48).

Contours of this response are beginning to emerge in the journalism field. The role of computer scientists, working as “programmer-journalists,” has grown substantially since the mid-2000s, as several leading news organizations have assembled teams dedicated to “news applications” and software-driven variations on traditional computer-assisted reporting (Parasie & Dagiral, 2012; Powers, 2012).
Overall, the work of collecting, processing, assessing, and visualizing data sets as news can be seen as part of a larger development, what several scholars have referred to as computational journalism. Anderson (2012) describes this as “increasingly ubiquitous forms of algorithmic, social scientific, and mathematical forms of newswork adopted by many 21st-century newsrooms and touted by many educational institutions as ‘the future of news’” (p. 1). Computational journalism, also called data journalism, can be seen as an outgrowth of and a departure from the computer-assisted reporting (CAR) of decades past (Flew, Spurgeon, Daniel, & Swift, 2012; Powers, 2012), building on the growing abundance of Big Data and its appropriation by computer programmers.

These developments raise all manner of questions for journalism and its role in public life. To the extent that Big Data represents a major social, cultural, and technological phenomenon (boyd & Crawford, 2012), one that challenges traditional modes of information organization and expression (Manovich, 2011), what are the implications of Big Data for journalism? How are journalists learning to cope with and take advantage of new data sets? Are there emerging sets of routines, values, and norms associated with Big Data in journalism? What forms and functions does data journalism serve, and with what kinds of consequences for the overall character of news production? What are the consequences of data for the epistemology of news, and for the nature of professional identity and expertise in journalism (see Parasie & Dagiral, 2012)? How might scholars come to understand the sociology of data/computational/programmer journalism; which theoretical frameworks or conceptual approaches are most fruitful in helping to explain and predict the processes and outcomes of such journalistic activities (see Anderson, 2012)?
Beyond the traditional journalism field, how are other actors, such as technology entrepreneurs and information scientists, transforming “information” into novel forms of “journalism” (Lewis, 2012, p. 851)? While some scholars, like those cited above, are beginning to address these questions, the academic literature is thus far rather thin.

Perhaps most critically, there are key normative questions that need to be examined. The Big Data phenomenon raises serious concerns about consumer privacy, data accuracy, and the overall quantification of social life (boyd & Crawford, 2012; Oboler, Welsh, & Cruz, 2012). “This data layer,” one observer noted, “is a shadow. It’s part of how we live. It is always there but seldom observed” (quoted in Bell, 2012, p. 48). Like corporations and governments, journalism is just beginning to learn how to navigate this shadow layer of public life. What constitutes ethical practice in handling personal data when it is so easily gathered and analyzed? What special obligations might journalists have with regard to Big Data tools and techniques, and how should those be understood within existing or emerging sets of journalistic norms?

One example: The transition from stories to data

These descriptive, conceptual, and ethical questions point to the need for research to explore the implications of Big Data for journalism. Below, I offer an example from my own ongoing work that illustrates the pressing need for research in this area.

Manovich (1999) famously argued that if narrative (first the novel, then the cinema) was the defining form of cultural expression in the modern age, the computer age has provided its correlate—the database. Unlike the cause-and-effect trajectory of linear items (events) in narrative, the database “represents a world as a list of items which it refuses to order” (Manovich, 1999, p. 85)—without beginning or end, without narrators or key actors, but rather a repository of spreadsheet-like information that reveals different permutations according to the actions that database users take. Commenting on Manovich’s “oversimplified but brilliant and provocative formulation,” Schudson (2010) acknowledged, “This idea implies quite a lot, I believe, about the future of news” (p. 100).

Journalists traditionally informed the public with episodic, narrative accounts. But in an era of Big Data, journalists are fast adopting the tools of computation, statistics, and social science to explain difficult problems through databases (Cohen, Hamilton, & Turner, 2011; Flew et al., 2012). These databases present public data that users can search, scrutinize, and visualize in novel ways, adding layers of interactivity and customization that upend the traditional one-way model of news storytelling. In essence, the character of news information is undergoing a pivotal shift from “stories” to “data”—from linear, qualitative, narrator-guided modes of expression, to nonlinear, quantitative, and user-guided modes of exploration. While not replacing the traditional focus on “stories,” this data journalism phenomenon is sweeping through the profession, as major news organizations such as the Guardian, The New York Times and NPR deploy “data teams” comprised of reporters working alongside data scientists, information designers, and computer programmers (Weber & Rall, 2012).

A few questions naturally arise: How does this shift from stories to data affect the professional culture, ethics, and values of journalism? And, by extension, how does this shift affect the overall nature of news that is produced—the news on which people rely for knowledge about public affairs?
The transition from story to data is more than a technical shift in news format; it speaks to a larger socio-cultural transformation occurring in the information environment, as databases become a critical mode through which “stories are told” about public affairs. Such questions remain unanswered even as some researchers have heralded the potential of data journalism, arguing that computer scientists can strengthen the democratic functions of journalism at a time when many professional news staffs are shrinking. “Could computing technology—which has played no small part in the decline of the traditional news media—turn out to be a savior of journalism’s watchdog online?” (Cohen, Li, Yang, & Yu, 2011, p. 1). But these aspirations are based on scattered anecdotes, and have yet to be examined empirically in the academic literature.

The proposed special issue

Against that backdrop, the proposed special issue aims to make space for scholarship that interrogates the evolving nature of journalism in an era of Big Data, seeking to understand the implications of this social, cultural, and technological phenomenon for news and its role in public life. The special issue would thus explore a range of phenomena at the junction between journalism and the social, computer, and information sciences; these phenomena are organized around the contexts of digital information technologies being used in contemporary newswork—such as algorithms, applications, sophisticated mapping, real-time analytics, automated information services, dynamic visualizations, and other innovations that rely on massive data sets and their maintenance. What are the implications of such Big Data phenomena for journalism’s professional norms, routines, and ethics? For its modes of production, distribution, and audience reception? For the overall social organization, epistemology, and normative character of news in democratic society?
The special issue would not be merely a descriptive, reductionist catalog of these various tools and their impact in journalism, nor would it be focused heavily on “best practices.” Instead, the special issue would follow Anderson’s (2012) lead in encouraging a “sociological approach to computational journalism” (p. 13, emphasis in original). This approach means undertaking a research program that brackets, at least temporarily, questions of practical utility to maintain a healthy skepticism—keeping in mind the tradeoffs, embedded values, and power dynamics associated with technological change. This special issue thus would emphasize a range of critical engagements with the issues surrounding Big Data and journalism.

The special issue would welcome papers drawing on a variety of theoretical and methodological approaches, with a preference for empirically driven or conceptually rich accounts. These papers might touch on a range of themes, including but not limited to the following:

• The history (or histories) of computational forms of journalism;
• The epistemological ramifications of “data” in contemporary newswork;
• Norms, routines, and values associated with emerging forms of data-driven journalism (e.g., data visualizations, applications, interactives, and alternative forms of storytelling);
• The sociology of new actors connected to computational forms of journalism, within and beyond newsrooms (e.g., news application teams, programmer-journalists, tech entrepreneurs, web developers, and hackers);
• The social, cultural, and technological roles of algorithms, automation, real-time analytics, and other forms of mechanization in contemporary newswork, and the implications of such for journalistic roles and routines;
• The ethics of journalism in the context of Big Data;
• Approaches for conceptualizing the distinct nature of emerging journalisms (e.g., computational journalism, data journalism, algorithmic journalism, and programmer journalism);
• The blurring boundaries between “news” and other types of information, and the role of Big Data and its related implications in that process.

Proposed timeline

Deadline for abstracts: May 1, 2013
Completed papers: November 1, 2013
Final revised papers due: March 1, 2014

Possible contributors

C. W. Anderson, City University of New York (USA); he is working on a cultural history of “data” in journalism, and likely could add a historical dimension to this discussion.
Mike Ananny, University of Southern California (USA); he is conducting research on the sociology of news algorithms.
Nick Diakopoulos; an independent researcher trained in computer science, he has done pathbreaking work in studying the particulars of computational journalism practice.
Terry Flew, Queensland University of Technology (Australia); he and his colleagues published an important conceptual article on computational journalism; perhaps there is an opportunity for a follow-up empirical investigation?
Susan McGregor, Columbia University (USA); she has written about data management in journalism.
Gunnar Nygren, Södertörn University (Sweden); he and colleagues have an ongoing research project examining data journalism at news organizations in Sweden.
Sylvain Parasie, University of Paris-Est Marne-la-Vallée (France); the work of this French sociologist (and his colleagues) has focused on the development of news applications and data journalism in news organizations.
Cindy Royal, Texas State University (USA); hers is one of the first scholarly accounts of “programmer-journalists,” and she could build on that work for this special issue.
Nikki Usher, George Washington University (USA); in collaboration with the guest editor, she has done substantial research on the growing integration of computer programmers, web developers, and “hackers” (in the pro-social sense of the term) into journalism, both in newsrooms and beyond.
Farida Vis, The University of Sheffield (UK); a social media researcher, she could speak to data visualizations, as a practitioner in this area as well as a scholar.

References

Anderson, C. W. (2012). Towards a sociology of computational and algorithmic journalism. New Media & Society. doi:10.1177/146144481246513
Bell, E. (2012). Journalism by numbers. Columbia Journalism Review, 51(3), 48–49.
boyd, d., & Crawford, K. (2012). Critical questions for Big Data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society (iFirst online), 1–18.
Cohen, S., Hamilton, J. T., & Turner, F. (2011). Computational journalism. Communications of the ACM, 54(10), 66–71.
Cohen, S., Li, C., Yang, J., & Yu, C. (2011). Computational journalism: A call to arms to database researchers. In Proceedings of the 5th Biennial Conference on Innovative Data Systems Research.
Flew, T., Spurgeon, C., Daniel, A., & Swift, A. (2012). The promise of computational journalism. Journalism Practice, 6(2), 157–171.
Lewis, S. C. (2012). The tension between professional control and open participation: Journalism and its boundaries. Information, Communication & Society, 15(6), 836–866.
Manovich, L. (1999). Database as symbolic form. Convergence: The International Journal of Research Into New Media Technologies, 5(2), 80–99.
Manovich, L. (2011). Trending: The promises and the challenges of big social data. In M. K. Gold (Ed.), Debates in the digital humanities (pp. 460–475). Minneapolis, MN: The University of Minnesota Press.
Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., & Byers, A. H. (2011). Big Data: The next frontier for innovation, competition, and productivity. McKinsey Global Institute, 1–137.
Oboler, A., Welsh, K., & Cruz, L. (2012). The danger of big data: Social media as computational social science. First Monday, 17(7-2). Retrieved from http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/3993/3269
Parasie, S., & Dagiral, E. (2012). Data-driven journalism and the public good: “Computer-assisted-reporters” and “programmer-journalists” in Chicago. New Media & Society. doi:10.1177/1461444812463345
Powers, M. (2012). “In forms that are familiar and yet-to-be invented”: American journalism and the discourse of technologically specific work. Journal of Communication Inquiry, 36(1), 24–43. doi:10.1177/0196859911426009
Schudson, M. (2010). Political observatories, databases & news in the emerging ecology of public information. Dædalus, 139(2), 100–109.
Weber, W., & Rall, H. (2012). Data visualization in online journalism and its implications for the production process. In 16th International Conference on Information Visualisation (IV) (pp. 349–356). IEEE.