Chapter 3
Search Before Google

Briefly describe search engines before Google
• Innovations (introduction of something new)
• Mistakes or things that these search engines lacked

Archie and Veronica
• First Internet search engines
• Had an architecture similar to today’s search engines
• Crawled sources, built indexes, and had search interfaces
• Setbacks: users had to know the file name and address; the engines also lacked semantics

Archie and Veronica (cont’d)
• Focused on indexing page titles
• Who do you think used these search engines and why?
• Only technical users, who used them for academic purposes

Wanderer and WebCrawler
• Objective was to index as many sites as possible
• Focused on the importance of links in relating one site to another
• What are some of the limitations of search engines that focused only on indexing as many sites as possible?
• The index fills up with too much junk and not enough useful substance

Wanderer
• Challenge: “many Webmasters felt the Wanderer ate up too many processing and bandwidth cycles as it indexed a site’s contents.”
• In response, Matthew Gray (Wanderer founder) tweaked his crawler to use a breadth-first algorithm, mainly because it could span many sites before drilling down into any one of them (a more efficient process).

WebCrawler
• This was the first search engine to index all the text on web pages.
• This was the opposite of Archie and Veronica, which indexed only page titles.

Lycos
• Originated at CMU
• Innovations: analysis of anchor text to get a better sense of what the linked page means
• Mistakes: continued to expand but lacked the technology to keep up

Excite
• Created by 6 Stanford alumni
• First search engine to transcend classic keyword search
• Statistical analysis of word relationships
• Known for hyper-searching the Web

Excite (cont’d)
• Other innovations: My Excite, free e-mail
• Setbacks: veered away from search

AltaVista
• Came from DEC (Digital Equipment Corp.), which had a superfast Alpha processor and wanted to prove its strength with a new search engine
• Known as the Google of its era
• Main objective was to generate publicity for DEC
• Had the biggest index

AltaVista (cont’d)
• Could have 1,000 crawlers running at once, whereas Google initially had only 3 crawlers running
• Setbacks:
• Could not generate money from searches
• Could not keep up with the PC business
• Used only first-generation search

Tracking the History of DEC, Compaq, and AltaVista
• DEC: wanted to use AltaVista as a means to sell more hardware. DEC later realized that AltaVista needed capital and a public currency to grow, so it invested in making AltaVista a public company. At this time AltaVista was a threat in the search world and competed closely with AOL and Yahoo. DEC later handed AltaVista over to Compaq.

Tracking the History of DEC, Compaq, and AltaVista (cont’d)
• AltaVista: tried to be a great portal and, in the process, neglected to improve its search technology.
• In comparison with Google: Google initially focused completely on building good search technology.

Tracking the History of DEC, Compaq, and AltaVista (cont’d)
• Compaq: saw AltaVista as a source of immediate cash. Rod Schrock, a Compaq executive, turned AltaVista into a Yahoo clone within a year (with e-mail, directories, comparison shopping, topic boards, and advertising). Compaq sold AltaVista to CMGI, but soon after the move CMGI began losing value. As a result, AltaVista lost its spark and fell back to the old ways of a search engine (a search box, a blinking cursor, and scads of white space).

Yahoo
• Created by Stanford engineering students
• Crawled Internet sites to help win a fantasy basketball league
• What made Yahoo different from other search engines?
• Its directory stood out: it organized the Web in a fashion that made sense to techies and first-time Web surfers alike

Yahoo (cont’d)
• It also linked to competitors’ sites in case a searcher could not find what he or she was looking for
• It listed “what’s hot” prominently on its home page, thereby driving extraordinary amounts of traffic to other sites

Yahoo (cont’d)
• Interesting info: the directory vs. search debate
• Yahoo’s organization of the Web was simpler for any user to follow and was geared toward the quality of the results. Early on, most people didn’t know what to do when a search box was put in front of them, but as users moved from exploration to expectation, search made more sense. So in 1995, Yahoo added search to its directory.

Yahoo (cont’d)
• What was Yahoo’s main objective?
• Wanted to connect people to the Web
• Tim Koogle (Yahoo’s first CEO) states: “The Net is all about connection, but you can’t connect people without good navigation.” Yahoo provided that connection.
• Innovations: What’s Hot, hubs