A. McCoy, S. Velasquez, K. Patton. Understanding Online Search Engines Exploring the history, the users and how they work.

Introduction

Search engines are an everyday part of modern life. We search for information on a whim, for work, for school, and for various other reasons. Searching has become so commonplace that the name of one of the most widely used search engines has turned into a verb for searching the web, making “Google” part of everyday conversation. The Pew Research Center reported in “Search Engine Use 2012” that 73% of all Americans use search engines, a figure that has increased by over one-third since 2002 (Purcell, Brenner, & Rainie, 2012). Additionally, 54% of U.S. adults report using a search engine at least once a day.

The number of users of the World Wide Web (WWW) is large. The Internet Systems Consortium reported in its July 2012 domain survey that there were over 900 million hosts advertised in the DNS (Internet Systems Consortium, 2012). On top of that, there are hundreds of search engines to find and organize this content in a useful way. With the Web growing so rapidly, and so much information being produced, it is important to understand how individuals find the information they need.

History of Search Engines

Although the World Wide Web seems like it has been around forever, the history of search engines is actually quite short. The first tool used to search the Internet was the program Archie, which was created in 1990 by computer science students Alan Emtage, Bill Heelan, and J. Peter Deutsch at McGill University (Asadi & Jamali, 2004; Seymour, Frantsvog, & Kumar, 2011). The name is the word “archive” without the “v”. The program worked by downloading directory listings of all files located on FTP sites and arranging them in a database that could be searched by name.
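The name-based lookup described for Archie can be illustrated with a minimal sketch. It is not Archie's actual implementation: the host names, file paths, and the functions build_name_index and search_by_name are all hypothetical, chosen only to show how directory listings can be arranged into a database searchable by file name.

```python
from collections import defaultdict

# Hypothetical directory listings, keyed by FTP host (illustrative data only).
LISTINGS = {
    "ftp.example.edu": ["/pub/tools/archie.tar", "/pub/docs/README"],
    "ftp.example.org": ["/mirror/README", "/mirror/games/chess.zip"],
}


def build_name_index(listings):
    """Map each lowercase file name to the (host, path) pairs where it appears."""
    index = defaultdict(list)
    for host, paths in listings.items():
        for path in paths:
            # Keep only the final path component, i.e. the file name.
            name = path.rsplit("/", 1)[-1].lower()
            index[name].append((host, path))
    return index


def search_by_name(index, name):
    """Exact-name lookup: the user must already know the file name."""
    return index.get(name.lower(), [])


index = build_name_index(LISTINGS)
print(search_by_name(index, "README"))
```

Because the lookup is by exact file name, a user who does not know the name finds nothing, which is the limitation of pre-Web indexers that the paper describes in a later section.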
In 1991 Mark McCahill from the University of Minnesota created Veronica and Jughead, which searched file names in the Gopher index system (Seymour et al., 2011). Veronica facilitated the keyword search of menu titles, and Jughead obtained menu information from the Gopher servers.

After the Web materialized, robots were created to keep up with all the emerging websites (Seymour et al., 2011; Vlachynsky, 2010). The first robot, Wanderer, was developed in 1993 by Matthew Gray (Asadi & Jamali, 2004). Wanderer counted and indexed the pages of the Web in an index called Wandex, but it was not intended as a search tool (Seymour et al., 2011; Vlachynsky, 2010). Also in 1993, at the University of Geneva, Oscar Nierstrasz created a series of scripts that mirrored and rewrote the pages of the Web in a standard format, which developed into the W3Catalog (Seymour et al., 2011).

The next search engine of the Web, Aliweb, was also developed in 1993 (Seymour et al., 2011). Instead of using a robot to index information, Aliweb depended on web administrators to notify the system and submit an index file. This reduced bandwidth overload and gave users more information to search. The problem with this system was that many administrators failed to keep their information up to date, which resulted in a small and incomplete database (Vlachynsky, 2010).

After this failed attempt, Jump Station was released in December 1993 (Seymour et al., 2011). This system again used a robot to find webpages. It was the first Web tool to combine crawling, indexing, and searching. The index could be searched by keyword through a simple Web form, although the results were not ranked. Although the project was innovative, it could not be sustained for lack of funding (Vlachynsky, 2010).
Ranking became the next feature that search engines started touting. RBSE, created by NASA, was one of the first to rank websites (Vlachynsky, 2010). However, it was not intended as a tool for public use. Excite was a more complicated tool that could detect relationships between words, which improved searching. WebCrawler, released in 1994, further enhanced searching by scanning the whole webpage for keywords instead of just titles or descriptions (Seymour et al., 2011; Vlachynsky, 2010). Lycos was the first advanced search engine, offering features such as a large index, links, excerpts from the websites, and ranking (Seymour et al., 2011; Vlachynsky, 2010). Similar systems followed, such as AltaVista, AskJeeves, Infoseek, Magellan, Northern Light, and OpenText (Vlachynsky, 2010).

In 1995 Yahoo! joined the group. The flaw with Yahoo! was that it searched its own web directory rather than full webpages, and the directory was exclusive and charged for inclusion. In 1996 the search engines competed when Netscape offered a deal for a single search engine to be used in its web browser. However, interest was so great that Netscape instead contracted with five major search engines (Excite, Infoseek, Lycos, Magellan, and Yahoo!) on a rotation for $5 million per year (Seymour et al., 2011).

Less than five years later Google joined the group as a leader (Seymour et al., 2011). Google became very popular because of its ability to rank websites based on their backlinks. Over the years Google has refined its algorithm using backlinks, relevancy, age, and many other indicators. Google has also been favored for its simple design. As of 2010, Google accounted for about two-thirds of the searches in the United States (Vlachynsky, 2010). By 2012, 83% of users reported Google as their most common search engine, with Yahoo!
in second at 6% (Purcell et al., 2012). The most recent popular search engine to date is Bing, which was released in 2009 by Microsoft (Seymour et al., 2011). In July of that year Yahoo! and Microsoft signed a deal under which Yahoo! Search would use the Bing technology.

How Do They Work?

Search engines of the Web and of the pre-Web Internet worked differently (Vlachynsky, 2010). On the pre-Web Internet, users had to know the exact names of files because the search engines of the time were only indexers, listing the files on an FTP server. This caused problems as the number of files increased, slowing down searches and causing confusion between similarly named files.

Initial Web search engines were based on the old FTP retrieval method of indexing. This worked at first because there were only a few websites on the Web. However, as the number of websites grew exponentially, it became harder and harder to keep the indexes up to date (Asadi & Jamali, 2004).

Common search engines work by using web crawlers to follow all links and collect information for their indexes. Some store part of each website in the system for quick and easy retrieval, while others store every aspect of every page, which is useful if pages change or update (Kuyoro et al., 2012). Modern search engines have evolved from specific keyword searches, to combinations of Boolean operators, to proximity searches, to concept-based searching.

Evaluation

Because of both the way they index and the method they use for retrieval, most online search engines are only somewhat effective. The crawlers used for most of the indexing can only interpret text, so pictures and graphics are lost unless they are accompanied by a caption (Taylor & Joudry, 2009). This automated method of indexing also fails to discern a site’s purpose, history, policies, and bias.
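The crawl-and-index process described under How Do They Work? can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how any particular search engine is built: the PAGES dictionary is a hypothetical in-memory stand-in for the Web, and crawl, build_index, and search_all are illustrative names. The crawler follows links breadth-first from a seed page, the indexer maps each word to the pages containing it, and the search function applies a Boolean AND across terms.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example/": ("search engines index the web", ["a.example/history", "b.example/"]),
    "a.example/history": ("archie indexed ftp file names", ["a.example/"]),
    "b.example/": ("web crawlers follow links", []),
    "c.example/": ("unlinked page about search", []),
}


def crawl(seed, pages):
    """Breadth-first crawl: visit every page reachable from the seed exactly once."""
    seen, queue, order = {seed}, deque([seed]), []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in pages[url][1]:
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return order


def build_index(urls, pages):
    """Inverted index: map each word to the set of crawled URLs containing it."""
    index = defaultdict(set)
    for url in urls:
        for word in pages[url][0].lower().split():
            index[word].add(url)
    return index


def search_all(index, *terms):
    """Boolean AND: return the URLs that contain every query term."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*sets)) if sets else []


crawled = crawl("a.example/", PAGES)
index = build_index(crawled, PAGES)
print(search_all(index, "web"))
print(search_all(index, "web", "crawlers"))
```

The page at c.example/ is never indexed because no crawled page links to it, and only text is indexed, which mirrors the limitation noted in the Evaluation section.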
As previously mentioned, retrieval for online search engines is usually done through keyword searching. Sites that have more instances of the searched term, or that more users have chosen when searching with similar terms, are ranked higher, whether or not the content is fitting. This type of retrieval can work well for topical searches, but it falls short when a specific document or site is desired.

Despite this apparent lack of effectiveness, the effectiveness perceived by users of online search engines is quite a different story. In a study conducted by the Pew Research Center, 91% of respondents said that when using an online search engine they always or almost always find the information for which they are searching (Purcell et al., 2012). This is no small thing, given that 91% of online adults use a search engine to find information. There is no doubt that this method of search is widely used and growing more accessible by the day. Websites are also capable of changing minute by minute as information changes or news unfolds. This makes it possible for users of online search engines to draw on the most up-to-date resources available.

It has already been established that keyword searching is not the most effective method of search, but online search engines also suffer from an inability to tell the difference between homographs or to connect synonymous search terms (Taylor & Joudry, 2009). Another, much more ominous weakness is also emerging: targeted searching. Companies are beginning to appreciate the number of people using online search engines on a regular basis and the boon it would be to get their products or services in front of all these potential customers. These companies are paying to get their results listed first, no matter how relevant they are to the user’s search.
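The keyword ranking described above, in which sites with more instances of the searched term rank higher, can be illustrated with a short sketch. It assumes ranking by raw term count alone, which is a deliberate simplification, and the documents and the function rank_by_term_count are hypothetical.

```python
# Hypothetical documents (illustrative data only).
DOCS = {
    "river-guide": "the river bank is muddy and the bank erodes each spring",
    "finance-faq": "how to open an account at a bank",
    "spam-page": "bank bank bank bank cheap deals",
}


def rank_by_term_count(query, documents):
    """Return (name, count) pairs, highest count of the query term first."""
    term = query.lower()
    scores = []
    for name, text in documents.items():
        count = text.lower().split().count(term)
        if count:
            scores.append((name, count))
    # Sort by descending count; break ties alphabetically by name.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


print(rank_by_term_count("bank", DOCS))
```

In this toy data the page that merely repeats the term ranks first, and the count cannot separate the two senses of the homograph “bank”, which are the two weaknesses of keyword searching noted in this section.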
The profit-maximizing trend toward paid placement by the search engines is causing the quality of results to suffer (Ahuja, 2010).

The evolution of search engines now allows them to remember users and modify their experiences, which includes directing users to certain sites and organizing search results based on past usage. As a result, search engine development and assessment is ongoing, requiring companies to constantly evaluate search engine users. Understanding what users need and want, and how they use the internet, is critical to keeping search engines fresh and competitive.

Exploring Search Engine Users

Student Users

Ismail and Kareem (2011) studied ways to identify the information needs of novice researchers in order to create a supportive research environment. The subjects of the study were first-year postgraduate students. The authors considered these students to be early-stage researchers because their experience was confined to small-scale research projects for class assignments and final-year projects during their undergraduate studies, which is considered limited. Students overwhelmingly prefer to use search engines, even when it means sometimes being overwhelmed by the results or not knowing how to judge the credibility of the sites that were found (Ismail & Kareem, 2011). Participants in the study typically began their search process by searching for keywords or subject matter. Their second choice was the title or author of a specific document. These same participants did not spend much time looking for resources, and quickly became discouraged if they either found too many resources or could not find enough. The results of this study confirm that students need more instruction and guidance on search engine usage and web searches (Ismail & Kareem, 2011).
Georgas (2013) explored the use of both federated searches and Google in a comparison study of student information seekers at Brooklyn College. Federated searching is the use of a program that enables users to search several databases at a time using a single search term. This method was once thought to be the library’s answer to Google because it allowed a one-stop shopping method for users (Georgas, 2013). However, federated search technology is a costly tool, and as Google continues to provide a constantly evolving, free search method, libraries must be able to justify the cost of providing this service. The question is, which method do students really prefer?

Georgas (2013) examined literature that addressed whether students prefer federated searching or Google; whether students are able to identify relevant research resources using both a federated search tool and Google; and whether students possess adequate information literacy skills to use each of these search tools effectively. One study asked librarians to respond about their students’ preferences. The librarians surveyed stated that federated searching did have its drawbacks, such as not providing seamless searching, being slow, and needing improvements. Even so, much like the instructors at Brooklyn College, they thought it was the best rival to Google (Georgas, 2013). However, there is often the problem of information overload with search engines such as Google. A study by the Research Libraries Group found that federated searching was viewed as a good tool for students to “get started finding stuff,” and not a good tool for “advanced research.” Other studies mentioned by Georgas (2013) looked at what students think about federated searching; student feedback on the implementation of cross-database searching; and satisfaction with individual databases and online search engines.
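Federated searching, defined above as searching several databases at a time with a single search term, can be sketched as follows. This is a minimal illustration and not the tool studied by Georgas (2013): the two databases, their records, and the functions search_database and federated_search are hypothetical, and real federated search products query remote databases rather than in-memory dictionaries.

```python
# Hypothetical stand-ins for separate library databases: title -> descriptive text.
DATABASES = {
    "catalog": {
        "Cataloging Handbook": "cataloging indexing and retrieval",
        "Web Basics": "introduction to the web",
    },
    "journals": {
        "Trends in Search": "retrieval trends in web search engines",
        "Library Budgets": "funding and cost",
    },
}


def search_database(records, term):
    """Search one database: return titles whose text contains the term."""
    term = term.lower()
    return [title for title, text in records.items() if term in text.lower()]


def federated_search(databases, term):
    """Send one search term to every database and merge hits, tagged by source."""
    results = []
    for source, records in databases.items():
        for title in search_database(records, term):
            results.append((source, title))
    return results


print(federated_search(DATABASES, "retrieval"))
```

The user enters the term once and receives a single merged list. Querying each database in turn, as this sketch does, is also one plausible reason such tools were described as slow.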
Students stated that they actually preferred using a federated search tool over Google, as they found it more efficient and would recommend it to a friend. Students want efficiency and ease of use, but they realize the limitations of Google.

General Users

- Most users are confident in their search abilities; only 6% say they are not too or not at all confident.
- 38% report having gotten so much information in a set of results that they felt overwhelmed.
- According to Pew, all age groups, races, and sexes use search engines. Daily use is more common among the young and educated.

User Perceptions of Search Engines

Beyond how users utilize search engines, many have opinions about how search engines perform and how they use private information. The Pew Research Center reported that most users are happy with their search engine experiences, and two-thirds feel that search engines are fair and unbiased, with younger users having more faith in the impartiality of their results (Purcell et al., 2012). However, only 28% of users feel that the information provided in searches is accurate or trustworthy. Over half of adult searchers said that their search results had gotten more useful and relevant over time, although it is not known whether this is a result of search engines collecting information and tailoring search results, or of searchers’ skills improving as they learn which search methods are most effective.

Many users are not excited about search engines collecting user information and tailoring searches based on this information. Sixty-five percent of participants in the Pew report view personalized search results as a bad idea, citing the exclusion of potentially useful or important information as the main reason (Purcell et al., 2012). When asked about targeted advertising as a result of personalized searching, 68% of internet users reported negative feelings.
Future

The future of online search engines and how they will affect users is unknown. Changes in the world of online search engines are swift, making their future difficult to predict. However, a few possibilities are rising to the top. The push to create a Semantic Web, where information on the internet would be semantically defined and connected to relevant data, would give online search engines a much more structured source from which to pull, drastically improving search results (Taylor & Joudry, 2009). Additionally, as noted by Palatnik (2007), social bookmarking is driving up the effectiveness of online search. Search engines are beginning to factor these user-created tags into their results. In a similar vein, Google now “recruits hundreds of individuals to manually assess the quality of content on specific URLs” (Purtell, 2012). These assessments are also used to alter the algorithms used by online search engines. Both of these mark a trend back toward human-indexed content, the method employed by successfully indexed collections.

References

Ahuja, B. (2010). The future of search engines if paid search is given more importance than organic search. Search Engine Journal. Retrieved from http://www.searchenginejournal.com/future-of-search-engines-if-paidsearch/24394/

Asadi, S., & Jamali, H. R. (2004). Shifts in search engine development: A review of past, present and future trends in research on search engines. Webology, 1(2). Retrieved from http://www.webology.org/2004/v1n2/a6.html

Georgas, H. (2013). Google vs. the Library: Student Preferences and Perceptions When Doing Research Using Google and a Federated Search Tool. portal: Libraries and the Academy, 13(2), 165-185.

Internet Systems Consortium. (2012). Internet Domain Survey, July, 2012: Number of Hosts advertised in the DNS.
Retrieved from http://ftp.isc.org/www/survey/reports/current/

Ismail, M., & Kareem, S. (2011). Identifying how novice researchers search, locate, choose and use web resources at the early stage of research. Malaysian Journal of Library & Information Science, 16(3), 67-85.

Kuyoro, S. O., Okolie, S. O., Kanu, R. U., & Awodele, O. (2012). Trends in Web-Based Search Engine. Journal of Emerging Trends in Computing and Information Sciences, 3(6), 942-948.

Palatnik, P. (2009). Are social powered search engines the future of search? Search Engine Journal. Retrieved from http://www.searchenginejournal.com/are-socialpowered-search-engines-the-future-of-search/4912/

Purcell, K., Brenner, J., & Rainie, L. (2012). Search engine use 2012. Pew Internet and American Life Project. Retrieved from http://www.pewinternet.org/Reports/2012/Search-Engine-Use-2012.aspx

Purtell, M. (2012). Reverse engineering human rating to predict the future of search. Search Engine Journal. Retrieved from http://www.searchenginejournal.com/predict-the-future-of-search/52930/

Seymour, T., Frantsvog, D., & Kumar, S. (2011). History of search engines. International Journal of Management & Information Systems, 15(4), 47-58.

Taylor, A., & Joudry, D. (2009). The organization of information. Westport, CT: Libraries Unlimited.

Vlachynsky, M. (2010). Principles and History of the Web Search Engines. Retrieved from http://www.econoir.sk/web/stuff/Principles-History-Web-Search-Engines.pdf