Seeking Information from Government Resources: A Comparative Analysis of Two Communities’ Web Searching of Municipal Government Web Sites

Frank Lambert, Ph.D.
Assistant Professor
School of Library and Information Science
Kent State University
P.O. Box 5190
314 University Library
Kent, OH 44242
flamber1@kent.edu

Keywords
e-Government informatics; Web log analysis; Web search; Government information.

INTRODUCTION
The published Web search literature lacks analyses of large amounts of unobtrusively collected data, such as those found in Web logs, that determine the types of information sought through municipal government Web sites and how these data differ or are similar when two municipal government Web sites are compared. This study analyzes a large purposive sample of three years’ worth of query data that resulted from hundreds of thousands of Web searches submitted by information seekers through two urban municipal government Web sites: one for London, Ontario (www.london.ca) and the other for Kitchener, Ontario (www.kitchener.ca).

Conceptual Framework
Broder’s (2002) model for information retrieval (IR) augmented for the World Wide Web (WWW) frames this study. However, this study moves beyond cursory classifications of the queries collected through the City of London’s and Kitchener’s Web sites (cf. Broder, 2002; more recently, Jansen, Booth, and Spink, 2008). This is done by categorizing queries conceptually to address the research questions presented in this paper.

Rationale for the Study
This study addresses Spink and Jansen’s (2004) assertion that “further single Web site studies are needed to replicate and extend the previous studies” (p. 25) cited by the authors, as well as other studies that have been published since 2004.
One of the benefits of this current study is that it analyzes a large, unobtrusively collected purposive sample of Web query data from two urban municipal government Web sites. The queries were submitted through specially designed search bars created by an online community information organization in London, Ontario, called mycommunityinfo.ca (MCI).

RESEARCH QUESTIONS
In light of this paper’s research goals explained thus far and the current state of the literature focusing on Web log query analysis, this study attempts to address the following research questions:
• What types of conceptual information are being sought in an online environment through municipal government Web sites using keyword querying? Does the information being sought change over time?
• What major differences in the types of information sought through the government Web sites of two urban municipalities located less than 100 kilometres from one another may be discerned through this analysis?
• Do information seeking patterns through municipal government Web sites differ, if at all, from those reported in relatively similar published studies?
It is also hoped that the findings of this study will provide a perspective on information seeking that may lead other municipal governments to consider further the design and content of their respective Web sites.

METHODS
Three years of query data covering March 2006-February 2009 were supplied for this study. Once the queries were cleaned, the 100 most frequently occurring queries were gathered, grouped, and classified by their manifest, or visible surface, content (Babbie, 2008) using categories from similar previously published studies (Lambert, 2010a; 2010b). Once the categories were deemed sufficient to ensure mutual exclusivity, a graduate student classified a representative systematic sample, first to confirm the content validity of the categories and second to test the reliability of the categories for coding (Babbie, 2008).
The end result was an intercoder rate of agreement of 89%, very close to Miles and Huberman’s (1994) recommendation of 90% intercoder agreement.

FINDINGS
Mean Number of Terms per Query
The mean number of terms used per query for the top 100 most frequently occurring queries and for all cleaned queries submitted March 2006-February 2009 was calculated to compare the descriptive statistical findings of past Web log studies to this study. Wang, Berry, and Yang (2003) report an average of two words per query submitted overall by users. Beitzel et al. (2007) found users submitted on average 2.2 to 2.7 terms per query session. Wang, Berry, and Yang’s and Beitzel et al.’s studies match more closely the mean number of terms per query for all queries shown in Figure 1, the exception being that Beitzel et al.’s mean terms over 6 months is higher than the mean terms for all queries submitted over three years. This suggests that the length of time over which these data are collected is not necessarily a variable that affects the average number of terms submitted per query. This is supported further by Lambert’s (2010a) original study, which included an analysis of the City of London, among others, for one year only. He found that the mean words per query for all cleaned queries and for the top 100 most frequently occurring queries were 2.1 and 1.28, respectively. In other words, there has been very little change over time and very little change attributable to the greater quantity of data collected. A considerably larger proportion (21.6%) of Kitchener’s queries is in the top 100 compared to London’s (16.5%). In Figures 2 and 3 below, the slopes of the category distribution curves for the two cities are quite different. Since the frequency of the queries is represented equally by the frequency of the respective categories, the categories’ distribution is still an accurate representation of this weighting of the proportion of the top 100 queries.
This indicates that information seekers using Kitchener’s Web site are querying using the same terms over again more frequently than information seekers using London’s Web site. Kitchener has a lower proportion of distinct queries (29%) in relation to the total number of cleaned queries than does London (35%). This indicates that in Kitchener there is a greater variety of distinct queries submitted beyond the top 100 most frequently occurring queries (or, in the “long tail” of a distribution) than there is in London. Thus, query submission in Kitchener shows “extremes,” for lack of a better word; most frequently submitted terms occur with more repetitive frequency, and less frequently submitted terms occur with less repetitive frequency in comparison to information seekers’ querying behaviours in London.

Figure 1: Descriptive Statistics of Queries for London and Kitchener Municipal Government Web Sites, March 2006-Feb. 2009

Online Information Seeking through Municipal Government Web Sites
As Figures 2 and 3 demonstrate clearly, queries pertaining to “Recreation, Entertainment, & Leisure” formed the bulk of queries, accounting variably for roughly 25% of the top 100 most frequently occurring queries submitted through each city’s Web site. This happened regardless of the effects of the recent Great Recession (Daniszewski, 2009; Waterloo Region’s jobless rate jumps to 9.9 per cent while national rate dips, Feb. 6, 2010). An information seeking category such as “Municipal Government Business,” for which one might assume information seekers would use municipal government Web sites, is quite reasonably popular compared to many of the other categories in the figures above. However, this category’s popularity depends very much on the community. Closely related to this category is “Solid Waste Collection & Recycling.” Arguably, this may be considered a part of “Municipal Government Business,” but it is such a highly ranking query that it deserved its own category.
This indicates a fairly significant reliance on these sites to find information about governments’ operations and policies. Examples of these inquiries include ‘bylaws’, ‘zoning’, ‘development charges’, ‘social services’, ‘garbage pickup’, and other similar queries that are simply too numerous to list. What really distinguishes these data from other Web log analysis studies is how they differ at a conceptual level from what is sought on other government Web sites. The findings presented by Chau, Fang, and Sheng (2005) in their analysis of queries submitted through the Utah state government Web site show a considerably different tone and tenor of what the Web site’s users are seeking; for example, the top three most frequently occurring queries they discerned were ‘dmv’, ‘tax forms’, and ‘sex offenders’ (p. 1369). This indicates on its face that information seekers have different desires and expectations of the types of information available from different levels of government.

DISCUSSION
As might be expected, the brief findings presented above demonstrate a number of differences and similarities between the communities of London and Kitchener. However, this helps offer some insight on the design of municipal government Web sites in terms of their architecture and in terms of their information content.

Effectiveness and Importance of Well Designed Home Pages
According to Herrera-Viedma and Pasi (2006), the likelihood of a successful search for information based on a point-and-click access paradigm is dependent largely on the design and related information provided on the starting Web page. One of the ironies for the City of London particularly is that, within certain limitations, one can say its Web site designers are doing their best to live by this spirit. For all three years under examination for London, the query ‘spectrum’ occurred most frequently. ‘Spectrum’ is a guide of recreation and leisure programs organized, funded or subsidized heavily, and staffed year ‘round by the city through its Parks and Recreation department. It is a very popular program with activities for children, adolescents, and adults. As confirmed by querying the Internet Archive (www.archive.org), London has had a link to the program on its home page since 2006 to allow citizens to view the program offerings and register for the desired programs online. The question that is raised, then, is why are keyword queries being submitted through the Web site to find ‘spectrum’ when a relatively prominent link has been provided on the main page? Further research into this aspect of human-computer interaction may reveal aspects of online searching behaviour that should be taken into account in the design of Web sites. Baker (2009), for instance, proposed an improved content analysis approach following previously completed usability studies to refine usability scrutiny.

Emergence of Help-Seeking Mismatches
A relatively large number of Web searchers using the London and Kitchener municipal government Web sites seem to perceive these resources as a relatively important source for “Recreation, Entertainment, & Leisure” information. For instance, ‘movies’, ‘restaurants’, ‘bars’, ‘shopping’, and ‘malls’ are all examples of very frequently occurring navigational and informational queries that are part of this category submitted specifically through municipal government Web sites. However, neither Web site contains any relevant pages that would pertain to these activities. Those who are querying London’s and Kitchener’s Web sites are experiencing what Dewdney and Harris (1992) define as a help-seeking mismatch, where “the types of help that might be expected from an agency are not those which it provides” (p. 23). Thus, users looking for some of the types of information outlined above are not always using the best source to retrieve it. The same is true for the category “Work, Employment, & Training”. If it is indeed very common that these information seeking mismatches occur on a regular basis on other municipal government Web sites, then it might behoove municipal governments to evaluate better how their Web sites are used to help minimize this unintentional information barrier. This does not mean that municipal government must provide the actual information content, but it might consider providing access to this type of information through Web links at the very least. The online local information inquiries of a particular population or populations, and how these inquiries are being addressed, are anything but simple and predictable. Sources of online information such as those of two urban municipalities with reasonably matching demographics and in close proximity to one another often demonstrate that those communities’ Web searching will vary depending on how the information seeker perceives the scope of information and the potential utility of the information that those sources provide.

CONCLUSION
Municipal governments have followed the lead of the federal and provincial governments in offering more services and information through the WWW. This has occurred despite the fact that municipal governments already tended to interact with a country’s citizens much more closely than do national governments. However, if municipal governments in particular make more informed efforts, in their electronic delivery of services, to understand what information their citizens are seeking, then that would enhance the role of government in people’s lives by empowering them to seek and find the appropriate government-related information that they require.
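The descriptive measures reported in the Findings (mean terms per query, the share of all queries accounted for by the most frequent queries, and the proportion of distinct queries) and the intercoder rate of agreement reported in the Methods could be computed along the following lines. This is a minimal sketch, not the procedure used in the study. The cleaning rule (lowercasing and collapsing whitespace), the frequency weighting of the top-query mean, and all function names are assumptions made for illustration only.

```python
from collections import Counter


def clean_query(raw):
    """Assumed cleaning rule: lowercase and collapse whitespace.

    The study does not describe its cleaning procedure, so this is a stand-in.
    """
    return " ".join(raw.lower().split())


def describe_queries(raw_queries, top_n=100):
    """Return descriptive statistics for a list of raw query strings."""
    cleaned = [q for q in (clean_query(r) for r in raw_queries) if q]
    if not cleaned:
        raise ValueError("no non-empty queries to describe")

    counts = Counter(cleaned)
    top = counts.most_common(top_n)  # [(query, frequency), ...]
    total = len(cleaned)
    top_total = sum(freq for _, freq in top)

    return {
        "total_queries": total,
        # Distinct queries as a proportion of all cleaned queries.
        "distinct_share": len(counts) / total,
        # Proportion of all cleaned queries that fall in the top N.
        "top_share": top_total / total,
        # Terms are counted as whitespace-separated tokens.
        "mean_terms_all": sum(len(q.split()) for q in cleaned) / total,
        # Assumption: the top-N mean is weighted by query frequency.
        "mean_terms_top": sum(len(q.split()) * f for q, f in top) / top_total,
    }


def percent_agreement(codes_a, codes_b):
    """Simple intercoder rate of agreement: matching codes / items coded."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty sample")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)
```

Calling describe_queries on each city's query list with top_n=100 would yield values comparable in kind to those summarized in Figure 1, and percent_agreement would be applied to the category labels that two coders assigned to the same sample of queries.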
Acknowledgements: The author wishes to thank Melissa Higey for her assistance with data preparation and analysis and the City of London, Ontario, for access and use of the query data.

REFERENCES
Babbie, E. (2008). The basics of social research (4th ed.). Belmont, CA: Wadsworth.
Baker, D. (2009). Advancing E-Government performance in the United States through enhanced usability benchmarks. Government Information Quarterly, 26(1), 82-88.
Beitzel, S.M., Jensen, E.C., Chowdhury, A., Frieder, O., and Grossman, D. (2007). Temporal analysis of a very large topically categorized Web query log. Journal of the American Society for Information Science and Technology, 58(2), 166-178.
Broder, A. (2002). A taxonomy of web search. SIGIR Forum, 36(2), 3-10. ACM.
Chau, M., Fang, X., and Sheng, O. (2005). Analysis of the query logs of a web site search engine. Journal of the American Society for Information Science and Technology, 56(13), 1363-1376.
City of Kitchener (Feb. 2007). Demographic profile/labour force profile: Fast Facts. City of Kitchener. Retrieved April 19th, 2007, from http://www.kitchener.ca/pdf/fast_facts.pdf.
Daniszewski, H. (2009, Oct. 10). London's unemployment rate rising. The London Free Press. Retrieved June 19, 2010, from http://cnews.canoe.ca/CNEWS/Canada/2009/10/10/11366101-sun.html.
Dewdney, P. and Harris, R.M. (1992). Community information needs: The case of wife assault. Library and Information Science Research, 14, 5-29.
Herrera-Viedma, E. and Pasi, G. (2006). Soft approaches to information retrieval and information access on the Web: An introduction to the special topic section. Journal of the American Society for Information Science and Technology, 57(4), 511-514.
Jansen, B.J., Booth, D.L., and Spink, A. (2008). Determining the informational, navigational, and transactional intent of Web queries. Information Processing & Management, 44, 1251-1266.
Lambert, F. (2010a). Online community information: The queries of three communities in southwestern Ontario. Information Processing & Management, 46(3), 343-361.
Lambert, F. (2010b). Web searching to meet everyday information needs: A comparative longitudinal study of queries submitted to an online community information system. Prato Community Informatics Research Network (CIRN) Conference 2010: Tales of the Unexpected: Vision and Reality in Community Informatics. CIRN-DIAC Conference: Monash University Centre, Prato, Italy, 27-29 October 2010. Larry Stillman and Ricardo Gomez, eds.
Miles, M.B. and Huberman, A.M. (1994). Qualitative data analysis: An expanded sourcebook (2nd ed.). Thousand Oaks, CA: Sage Publications.
Spink, A. and Jansen, B.J. (2004). Web search: Public searching of the Web. Dordrecht: Kluwer Academic Publishers.
Wang, P., Berry, M.W., and Yang, Y. (2003). Mining longitudinal Web queries: Trends and patterns. Journal of the American Society for Information Science and Technology, 54(8), 743-758.
Waterloo Region’s jobless rate jumps to 9.9 per cent while national rate dips (Feb. 6, 2010). theRecord.com. Retrieved June 27, 2010, from http://news.therecord.com/article/667039.