Information Seeking Behavior in Computer and Network Security Mailing Lists

Lance Hayden
University of Texas at Austin, 2105 Sage Creek Loop, Austin, Texas 78704. Email: lhayden@ischool.utexas.edu.

My Research Interest

Email represents one of the most successful applications of network technology, particularly in the area of social, academic, and professional communication and collaboration. In the field of computer and network security, email lists are a primary channel of information about security issues, vulnerabilities in technology (and the exploitation of those vulnerabilities), and defensive tools and practices for improving computer and network security. This study uses content analysis as a research tool with which to explore research questions regarding the information seeking behaviors of users of these mailing lists. Understanding these behaviors may yield insights that enable more effective use of email and other computer-mediated communication mechanisms to improve collaboration within the field. Such improvements could strengthen the capability to detect and respond to network security incidents, such as worms and computer attacks, more rapidly and effectively.

Computer Security Mailing Lists

General Description

A variety of mailing lists are devoted to the topic of computer and network security. Some are highly specific to a particular issue or technology, while others provide general “clearinghouse” services for discussion and queries on topics of interest to security professionals, researchers, and hackers. In addition to supporting operational matters, these lists may also provide social benefits to individuals across wide geographic areas, as well as a forum for rivalry, debate, and even hostility among individuals.

Specific Lists and Samples

In choosing appropriate email list archives for these studies, three lists were selected for their apparent similarities in function and format, as well as their increasing scale of users and subscribers. The first archive is of an email list maintained by a small group of network security consultants within a large technology company. The purpose of the list is collaboration and communication among the consulting practice engineers on technical security matters that affect their daily work. The second archive is of a similar list used for collaboration and information exchange, but open to everyone in the technology company regardless of role or position. The third archive is of a public email list dedicated to broad security issues and open to anyone with email access.

The available archives represent years of email traffic. The small consulting practice list alone includes email from 1999 through 2003 and contains 3677 messages. The other lists cover similar periods but are much larger, both in number of messages and in file size. To facilitate analysis, the study will be limited to the same twelve-month sample for each list archive. A sample of this size should provide ample units of analysis (in this case, individual email messages) while making the analysis process more efficient.

Theory and Research Questions

One challenge that has already plagued this researcher is limiting the focus of analysis so as to maximize the value and efficiency of the findings. Many interesting analyses could be conducted on the data, even when focusing only on the information seeking behavior of the list participants. Choosing appropriate questions, variables, and methods has proven difficult. Two areas of research have proved interesting to me, and both offer the benefit of guiding theory and previous studies.

Zipf’s Principle of Least Effort may provide a theoretical framework through which to view the results of the analysis. Zipf’s theory corresponds to general stereotypes of the “hacker” or engineer and the way they approach problems, although its usefulness will be limited by the dated nature of the available research based on Zipf’s ideas. In contrast, much more current research into network theory and “small worlds” of social and technical hubs and links may prove complementary in this study. Here the limiting factors may instead prove to be the popularity and occasional faddishness of the research, although respected studies have been conducted and will be reviewed during this analysis.

Tools and Methods

Appropriate tools and methods for this study are still under consideration for the three mailing lists. Computer-aided coding and analysis will be used for the first email archive, with the possibility of additional analysis tools being introduced in subsequent studies. A variety of tools exist for the analysis of text, including VBPro, which is the current first choice for this study. However, a variety of text processing and analysis tools available under the GNU General Public License are also of potential interest. At the time of this writing I am reviewing and finalizing decisions on the tools, research questions, and variables that will make up the final analysis.

Ethical Considerations for the Study

An important component of this research is the set of ethical considerations surrounding content analysis and email. Some consider such analysis to be non-intrusive, and therefore of limited ethical concern from a human research subjects perspective. Other advisors to the author, however, strongly disagree and have suggested that the nature of the artifacts in question (personal email messages) requires that their authors be considered the subjects of research.

In the case of the small consulting team mailing list, this is less problematic because the list is owned and maintained by the management of the team. All participants use the list in full knowledge that it is corporate property. The list owner has approved the use of the list in this study, with certain restrictions placed upon the information that may be revealed publicly. In the case of the remaining two email archives, the ownership and privacy issues involved are unresolved. It will be necessary to review these considerations further prior to each subsequent phase of the study.

Planned Schedule and Milestones for the Study

This research study is currently in progress as part of ongoing doctoral courses. The following milestones are anticipated:

Summer 2004: Identify archives, possible tools, research questions, and variables.
September 2004: Finalize tool choice and research questions. Determine variables and begin literature review.
October 2004: Complete literature review; conduct analysis of first mailing list archive (small security consulting practice).
November 2004: Present initial findings as part of ASIS&T poster session.
November – December 2004: Complete analysis of findings.

Results and Discussion

Results of those portions of the study completed at the time of the ASIS&T Annual Meeting will be provided in conjunction with this poster session. It is anticipated that the consulting team mailing list analysis will be complete by that time.

REFERENCES

Case, D.O. (2002). Looking for information: A survey of research on information seeking, needs, and behavior. San Diego, CA: Academic Press.
Krippendorff, K. (2004). Content analysis: An introduction to its methodology (2nd ed.). Thousand Oaks, CA: Sage.
Neuendorf, K. (2002). The content analysis guidebook. Thousand Oaks, CA: Sage.
Weber, R.P. (1990). Basic content analysis (2nd ed.) (Sage University Paper Series on Quantitative Applications in the Social Sciences, series no. 07-049). Newbury Park, CA: Sage.
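
As an illustration of the twelve-month sampling and computer-aided coding described under Specific Lists and Samples and Tools and Methods, the following is a minimal Python sketch. It is not the tool chosen for the study (VBPro is the stated first choice). The coding categories, their indicator terms, and the function names are all hypothetical and stand in for a coding scheme that has not yet been finalized.

```python
from collections import Counter
from datetime import datetime, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime
import re

# Hypothetical coding scheme (an assumption, not the study's variables):
# each category maps to a set of indicator terms.
CATEGORIES = {
    "question": {"how", "why", "anyone", "help"},
    "vulnerability": {"exploit", "vulnerability", "patch", "worm"},
}


def in_window(msg, start, end):
    """Return True if the message's Date header falls in [start, end)."""
    raw = msg.get("Date")
    if raw is None:
        return False
    try:
        sent = parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        return False  # unparseable date: leave the message out of the sample
    if sent.tzinfo is None:
        sent = sent.replace(tzinfo=timezone.utc)  # treat naive dates as UTC
    return start <= sent < end


def code_message(msg):
    """Count indicator terms per category in the subject and body."""
    # Simplification: multipart bodies are skipped rather than decoded.
    body = "" if msg.is_multipart() else msg.get_payload()
    text = f"{msg.get('Subject', '')} {body}".lower()
    words = Counter(re.findall(r"[a-z']+", text))
    return {cat: sum(words[t] for t in terms)
            for cat, terms in CATEGORIES.items()}


def summarize(messages, start, end):
    """Return (number of sampled messages, category totals) for the window."""
    totals = Counter()
    sampled = 0
    for msg in messages:
        if in_window(msg, start, end):
            totals.update(code_message(msg))
            sampled += 1
    return sampled, dict(totals)
```

If an archive happened to be stored in mbox format (an assumption; the archive formats are not stated here), the standard-library `mailbox.mbox` class could supply the message objects passed to `summarize`. The window boundaries would be the same pair of dates for all three lists, so that each archive is sampled over the same twelve months.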