Search Engines, Optimisation (SEO) and Web Search: Overview and Critique
By Alessandro Ballarin - autumn 2002

Abstract

What are people truly seeking when they approach a web search engine? And what determines 'relevant information'? In the field of Information Retrieval restricted to the web, these are complex questions: two or more people may well consider different answers relevant when they process exactly the same query. Search engines attempt to use automated methods and techniques to deliver relevant results matching a search query. It is a complex process that sometimes works wonderfully and other times can be frustrating for any user. But is there anything that can be changed to improve it? With this in mind, the main focus of this research has been an understanding of the search technologies available today, together with a brief analysis of the emerging field of search engine optimisation (SEO), which aims to improve users' web searches. This paper gives an overview of how search engines work, takes a critical look at various developments in the search environment and puts forward a proposal on how to make better use of relevancy systems. Search engines rank by their relevance scores, and even for those that offer other advanced search options, relevance ranking is the default. Why not make it an option? Finally, while it is widely believed that search engines are in an excellent position to keep adapting their relevance techniques and achieve ever better results, it would be a big step forward for all of us as users to have a more transparent and better regulated environment in which to search for information, and a way to personalise the subjective concept of relevancy in our web searches.

Contents

Abstract
Contents
Discussion notes
Questions for discussion
1. How search engines work
   1.1 Overview
   1.2 Spiders and indexes
   1.3 Search interface
   1.4 Search features
   1.5 Search results
   1.6 Relevancy algorithms
2. What is SEO?
   2.1 Overview
   2.2 Origins and establishment of the SEO industry
   2.3 Optimising
   2.4 Link analysis approach
   2.5 Spamming
3. How to improve web searches?
   3.1 The need for regulation
   3.2 An idea: ranking algorithm customisation
Conclusions
References

Discussion notes

Nowadays we are all aware that search engines have come a long way since they first appeared on the web in the mid-1990s. Today's search engines are not only far more likely to deliver accurate results, but they also make use of images, audio files and databases in the process of delivering them. Even though we give search engines so little information, we still expect the 'ideal search engine' to understand exactly what we mean and give back exactly what we are looking for. Here too, we are all aware, search engines have a long way to go. With search currently the second most popular internet activity after e-mail, the race towards creating the 'ultimate' search engine is being hotly contested in a packed marketplace. Possibly, the future of search lies in creating personal profiles of searchers. The search company Inktomi (inktomi.com) has just started experimenting with personalised searches within the corporate arena, using information that is already available about employees to create customised searches within a closed environment.
However, privacy and technological issues stand in the way of this kind of solution. As the web gets bigger, an entire industry is springing up around the business of getting a website listed high on a search engine results page, now that advertising has become a regular feature of such pages. Some companies, such as LookSmart, specialise in pay-for-performance content. Meanwhile, companies such as iProspect and NetBooster specifically help advertisers find marketing strategies for getting to the top of the results listing, practising what is called SEO, short for search engine optimisation. For some search providers, such as Convera (formerly Excalibur Technologies) and iPhrase (iphrase.com), the future of search lies in the automation of human thought and linguistic processes. iPhrase's and Convera's products aim to abstract the meaning of documents rather than just looking at their syntactic properties, such as matching keywords. Many are sceptical about the prospects of such "semantic engines". "The history of trying to bridge the syntactic-semantic cut in artificial intelligence has been a history of ignominy," says Anil Seth, postdoctoral fellow in theoretical neurobiology at the Neurosciences Institute in San Diego. "Semantics cannot simply be encoded or decoded from a syntactic foundation. Too many other factors, such as culture and natural language, get in the way." Generally, search engines move in the direction of improving the relevance of the information given to users in the results listings by continuously updating their ranking algorithms, partly in order to defeat attempts to cheat the way into the mythical 'top ten' positions. But for all the improvements in relevance ranking techniques, there are plenty of searches where the techniques fail significantly. The technology will continue to improve.
But while the science of relevance ranking may retrieve ever better possibilities, finding accurate and comprehensive answers will remain an art for some time to come1 (Greg R. Notess).

1 Greg R. Notess, "The Never-Ending Quest: Search Engine Relevance", ONLINE magazine, May 2000 (www.notess.com). Greg R. Notess is a reference librarian at Montana State University.

Questions for discussion

1. The user, as a consumer, should certainly have adequate rights and protection within the context of web search too. Between pay-for-placement, adverts and unethical or excessive SEO techniques, users cannot tell whether the results shown are the best possible ones out there. Could a 'better regulated search environment' be a key solution for ensuring the transparency of results in web searches?

2. SEO stands for 'search engine optimisation'. It is essentially the art and science of increasing a web site's visibility in the major search engines and directories across a strategically defined list of keyword phrases that relate to the products, services or information offered on the web site. Do you think search engine optimisation (SEO) professionals will grow in number or disappear entirely in ten years' time? Why?

3. What would you think of personalising the search engine ranking algorithm in each of your web searches? Perhaps it could be included as a super-advanced feature for expert search users.

Introduction

The Web is largely lacking in organisation and is so vast that even long-experienced searchers express great frustration in using it to find information. In fact many people who use search engines do not really understand what they are, and do not bother to learn how to take full advantage of their capabilities. Most simply type one or two words into a search form and are unpleasantly surprised when they are presented with thousands or even millions of pages.
People easily get the impression that it is possible to ask their favourite search engine anything and get the relevant answer in seconds. They are also not aware at all of how the transparency and relevancy of the results listing, obtained by matching the search query, can be affected by a pay-for-placement policy adopted by the search engine, or by search engine optimisation (SEO) practices. The main focus of this report is to help search users improve their web search experience by giving them the know-how to obtain these improvements. Understanding what search engines do, how they work, and what effect search engine optimisation has on the results listing is the key to better web searches. The next chapter gives a brief overview of the available search engine technologies; a chapter then explains what the SEO industry is there for; and the report concludes with a theoretical approach to how web search could be improved from a user's perspective.

1. How search engines work

1.1 Overview

Generally speaking, search engines work by matching user queries against indexes previously created from web pages; they then rank the relevant documents and end the process by displaying a results listing accordingly. In other words, search engines are tools that let you explore databases containing the text of hundreds of millions of web pages. They are designed to make searching as easy as possible for users, or at least the major search engines2 are. But with millions and millions of web pages out there on the Web, and more being added all the time, how can search engines possibly collect them all? The answer is: through spiders, even though they do not collect them all, just a part of the web3, depending on the different approaches the search engines adopt.
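The index-and-match cycle described above can be illustrated with a minimal sketch (my own toy illustration, not any real engine's code, and all the names in it are invented for the example): build an inverted index mapping each word to the pages containing it, then answer a query by intersecting the word lists and ranking by how often the terms occur.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the pages it appears in, with an occurrence count."""
    index = defaultdict(dict)                 # word -> {url: occurrences}
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index

def search(index, query):
    """Return pages containing every query term, best matches first."""
    terms = query.lower().split()
    postings = [index.get(t, {}) for t in terms]
    if not postings:
        return []
    hits = set(postings[0]).intersection(*postings[1:])
    # Rank by total occurrences of the query terms: a crude relevancy score.
    return sorted(hits, key=lambda url: -sum(p[url] for p in postings))

pages = {
    "a.com": "web search engines rank web pages by relevance",
    "b.com": "recipes for apple pie",
    "c.com": "how search engines index the web and rank results",
}
index = build_index(pages)
print(search(index, "rank web"))   # ['a.com', 'c.com'] - 'web' occurs twice on a.com
```

Real engines differ enormously in scale and in the sophistication of the ranking step, but the pipeline sketched here (crawl, index, match, rank) is the same one described in the rest of this chapter.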
1.2 Spiders and indexes

Search engines gather their data through the use of spiders (or crawlers): robot software designed to track down web pages, follow the links they contain and add any new information encountered to a master search engine database, or index. Each of the search engines, as mentioned before, has its own way of doing things. Some, for example, program their spiders to index only the titles of web pages and the first few lines of text. Others index every single word contained in the web page, so that all of them will be searchable. The spiders' work is, in some cases, complemented by the work of human beings, who spend time visiting, selecting and classifying web sites based on their content. This approach is used by Yahoo (yahoo.com), which still maintains its own staff of surfers to perform this function. The great thing about spiders is that they operate tirelessly around the clock; even so, some take longer than others to visit web pages, so one search engine might index a page now and another only later. This is also one of the reasons why the same query on different search engines can give different results. There are also some problems regarding indexing: for example, the use of applications like Shockwave and Flash, or of text embedded in graphics, makes content invisible to the indexing spiders, so any page that makes extensive use of these features will not be indexed properly unless it also carries the text in HTML.

2 The major search engines are Google, AllTheWeb, Yahoo and MSN Search, followed by Lycos, AskJeeves and AOL Search.
"The Major Search Engines", October 12, 2002, www.searchenginewatch.com/links/major.html
3 Statistics about the sizes of search engines at www.searchenginewatch.com/reports/sizes.html

1.3 Search interface

Personalisation of web search is increasingly what all the major search engines are trying to achieve at the moment. They do this by providing a set of configurable search pages, traditionally two: a simple form with a few checkboxes or radio buttons, and an advanced form with more complex options.

Basic form

The simplest basic search interface is a small form with a text field and a Search (submit) button, as shown in the screenshot below. This form usually includes a number of default fields hidden from end-users, such as one that determines whether the search engine should find pages matching all of the search terms or any one of them.

Advanced form

The advanced search form will usually be used by only a small group of expert users. It can display many possible search options, such as limiting a search to headers, limiting it to a specific date range, and whether to use substrings. The following screenshot shows the advanced search form of Yahoo. This form lets search users choose whether to search the entire site, certain sections, or just the newest files. The available options usually depend on the abilities of the search engine and indexer.

1.4 Search features

Search engines offer a number of search features to users. The following are some of the most common:

- Boolean operators and emerging standards (+ for a required word, - for a disallowed word, and "quotes" for an exact phrase)
- Graphical interface using radio buttons, checkboxes, etc.
- Specific search zones (e.g. Geography, Criminology, Computing Science, etc.)
- Date range searching
- Parentheses or multiple text-search fields
- Field searching (host name, title, URL, date, size, metadata)
- Language: limiting searches to documents in specific languages, or even cross-language retrieval (e.g. translating searches to match other vocabularies)
- And more ...

As advanced features, some search engines also advertise natural-language queries, thesaurus and synonym lists, graphical results visualisation, concept mapping and so forth. In practice, most web search users are used to submitting queries without these features, but for a more complete and accurate search experience, starting to use them would not be a bad idea.

1.5 Search results

The final step for a search engine, after a user's query has been matched against the indexed database and the most relevant documents found, is to display a search results listing. This consists mainly of a list ordered in the way the search engine ranks its results. It is also generally possible to customise the results in order to arrange them in a clear, useful and personalised manner. The screenshot below shows an example results listing for the web search "web page ranking" on Google. Generally, all results are listed in order of relevancy; each includes the page title, which is a link to the web page itself, plus other information that varies between search engines, such as the first few lines of the text, the modification date, the URL, and possibly an evaluation of how closely the page matches the search term (not shown in Google), again according to the engine's own criteria and ranking algorithm.

1.6 Relevancy algorithms

These are the methods or procedures by which search engines calculate and rank search results. Usually called ranking algorithms, they are different for each search engine but all seem to follow to some
degree the same pattern. They can depend on a wide variety of factors, including the domain name, matching keywords appearing near the top of the web page, spiderable content, submission practices, HTML code and link popularity. They are generally kept top secret by each search engine and are changed periodically for various reasons. The main reason is to guard against the work of extreme SEO professionals, who try to reverse-engineer the ranking algorithms in order to get their clients' web sites well positioned on the top results listing page.

2 What is SEO?

2.1 Overview

SEO, which stands for search engine optimisation, is effectively the process of manipulating ranking algorithms to improve the search result positioning of a given web page. When performing a web search, the majority of users still naively believe that any search engine simply delivers the best possible results and matches from the web, according to its ranking algorithm. In fact, it does not take long to find out that, chances are, many company sites with high rankings will have paid for the privilege: either directly to the search engine for a placement or, through the use of search engine optimisation (SEO) specialists, to reach the top results listing.

2.2 Origins and establishment of the SEO industry

SEO is one of the new realities of the internet economy, a field probably brought to light by the American company iProspect when it was founded in 1996 (iProspect.com). They intuitively built a huge business on the basis that 80% of web users rely on the top six search engines (Yahoo, Google, MSN, AOL, Lycos and AltaVista) to look for information; and obviously, which company would not wish to be 'positioned' on their first page of results?
The discipline of SEO is also difficult and tricky: through a simple lack of understanding of what the search engine "spiders" that trawl the web are looking for, site owners risk non-placement in the top six search engines, with the consequent loss of benefits. A survey by iProspect, the SEO pioneer, found that a large number of search engine users will assign brand value, or equity, to a top-ranked web site, disregarding the fact that the search engine's mathematical algorithm is the cause. This demonstrates, especially among new internet users, that top search engine listings transmit brand equity, so that, for example, lesser-known brands or reseller companies could increase their perceived brand equity just by improving their position among the top search matches.4 SEO experts say that to optimise a site effectively, it is important to understand that search engines essentially do two basic things: index text and follow links. If a site supports neither, it can be considered 'invisible' from a web search perspective.

4 Survey results at www.iprospect.com/branding-survey by Dr Amanda Watlington and Fredrick Marckini, May 2002.

In the USA the SEO industry is at a very advanced stage, and in Europe, too, the market already boasts dedicated SEO professionals, including Sticky Eyes, NetBooster, Search Engineers and Web Gravity, to mention just a few.

2.3 Optimising

An understanding of the work of the search engine optimisation industry helps in interpreting the "relevance ranking" that search engines deliver. Basically, any company that owns a web site would like people to find it easily; especially for e-commerce companies, the more customers they can attract the better. The usual first step is to ensure that the web site is indexed, and so included in the databases of the directories and search engines.
Optimising, also synonymous with 'positioning', consists of going far beyond just getting a website included in the databases of directories and search engines: it consists of bringing it all the way into the "top ten" results listing!

2.4 Link analysis approach

In a way, search engines do their best to ensure that the relevancy of documents has nothing to do with extreme optimisation. Google was in fact the first search engine to make use of 'link analysis', which now plays an important role in all of the major search engines. The basic principle of link analysis is to rank highest those pages that most other pages point to using the search term. In other words, if 100 pages link to the sbu.ac.uk homepage using the word sbu.ac.uk as the link text, then it will rank higher than wmin.ac.uk (the Westminster University homepage) if only 20 pages point to that web site. The strength of the link analysis approach, from a search user's perspective, is that it makes inappropriate optimisation much more difficult, since that would require changing other people's web pages. Previous relevance criteria were all determined by the words and word patterns on the page itself; link analysis looks at many other pages to see where they link. There are some drawbacks as well: link analysis seems to be an excellent ranking mechanism for some searches, but newly born sites are at a distinct disadvantage. When someone puts up a new web site, they can submit it to several search engines, but it takes time for a search engine to spider new sites; and even after the site is indexed, there may not yet be other pages linking to it.

2.5 Spamming

Spamming, in general, is an attempt to feed misleading information, or different web pages from the actual ones, to search engines in order to gain favourable positioning.
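The link-analysis principle described in section 2.4 can be sketched as a toy PageRank-style iteration (a simplified illustration under my own assumptions, not Google's actual formula): every page starts with an equal score, and each round a page passes a damped share of its score along its outgoing links, so well-linked pages accumulate higher scores.

```python
def link_scores(links, damping=0.85, rounds=50):
    """Toy link analysis: `links` maps each page to the pages it links to."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    score = {p: 1.0 / len(pages) for p in pages}
    for _ in range(rounds):
        new = {p: (1 - damping) / len(pages) for p in pages}   # base score
        for page, targets in links.items():
            for t in targets:              # pass on a share of this page's score
                new[t] += damping * score[page] / len(targets)
        score = new
    return score

# Three pages link to sbu.ac.uk but only one links to wmin.ac.uk,
# so sbu.ac.uk ends up with the higher score.
links = {
    "a.com": ["sbu.ac.uk"],
    "b.com": ["sbu.ac.uk"],
    "c.com": ["sbu.ac.uk", "wmin.ac.uk"],
    "sbu.ac.uk": [],
    "wmin.ac.uk": [],
}
scores = link_scores(links)
print(scores["sbu.ac.uk"] > scores["wmin.ac.uk"])   # True
```

Note that to inflate sbu.ac.uk's score here, one would have to edit a.com, b.com or c.com; that is exactly why link analysis resists the on-page optimisation tricks discussed in this chapter.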
Search engines take spamming seriously, as it compromises the quality of their results and, above all, their users' experience. Unfortunately, there is no exact definition of what is and what is not 'spam'; it again varies between the different search engines, which in fact often change their own definition of spamming several times a year. Some examples of spam are listed here5:

- Pages which harm the accuracy, diversity or relevance of search results
- Pages whose sole purpose is to direct the user to another page
- Pages which have substantially the same content as other pages
- Sites with numerous, unnecessary virtual host names
- Pages in great quantity, automatically generated or of little value
- Pages using methods to artificially inflate search engine ranking
- The use of text that is hidden from the user
- Giving the search engine a different page than the public sees (cloaking)
- Cross-linking sites excessively to inflate a site's apparent popularity
- Pages built primarily for the search engines
- Misuse of competitor names
- Multiple sites offering the same content
- Pages which use excessive pop-ups, interfering with user navigation
- Pages that are deceptive, fraudulent or provide a poor user experience
- And more...

5 From the Inktomi spam policy at www.inktomi.com/products/web_search/guidelines.html

3 How to improve web searches?

3.1 The need for regulation

If we look at it from a user's, and so a consumer's, perspective, it could be argued that SEO unfairly skews search results in favour of companies rather than users; but there seems to be a general consensus nowadays that ethical SEO can actually help consumers find the services they want without affecting consumers' rights.
I personally would not take that for granted, and would put the web search environment up for discussion: as the use of search engines and the web continues to grow, I believe that the lack of strict and specific regulation in this matter gives both the search engines and the SEO industry too much room to act out of sight. Here I have collected two very interesting views on the web search environment:

Henrik Hansen, director of marketing for enterprise search at Inktomi (Inktomi.com), says his company's engineers actively work to combat unethical behaviour through increasingly sophisticated anti-spamming algorithms and regular human intervention by a team of editors, who check search results for accuracy and relevance.

Danny Sullivan, a well-known search specialist and editor of the online publication SearchEngineWatch.com, says: "Paid listings are not going to go away for a long time; people are being shown too many paid links in comparison to editorial content. Search engines are going to have to provide a filter for search results as well as ads."

Hansen's view clearly indicates, from a search engine perspective, the need for a better regulated search environment: ideally, the search engines' job is to give us end users the best possible answers in terms of relevancy, not to worry about investing in anti-spamming policies. Sullivan's opinion also gives some idea of what could and should be regulated regarding pay-for-placement results, or advertisements, that are not clearly highlighted as such; otherwise, how is any user supposed to know what is and what is not a 'real' relevant result in the listing proposed? And more from Danny Sullivan: "Using techniques that try to trick the engines into doing what they want is not where companies should be putting their web development effort. Properly done, SEO can be highly effective, generating qualified traffic for site owners, improving search engine accuracy and delivering relevant, useful information to users."
That is ideally the point where everyone would like to be, but can it really be reached without specific regulation?

3.2 An idea: ranking algorithm customisation

What I propose here is just a hypothetical idea for personalising the subjective concept of relevancy during web searches. To do that, it has to be clear that there is no better judge than ourselves of which information is most relevant to our queries. The way search engines work at the moment, as previously described, is to judge through a ranking algorithm which documents are most relevant to a query. A ranking algorithm, say for example Google's PageRank6, is a very complex formula that hardly any average user would be able to understand in depth. My idea is to let the user choose, in a customised fashion, the ranking algorithm for each web search. This would mean offering a choice between a number of different algorithms, each of which emphasises one or more different variables. This is how it could look:

6 This paper is where I learnt how complex PageRank is: "The PageRank Citation Ranking: Bringing Order to the Web" (1998), Larry Page, Sergey Brin, R. Motwani, T. Winograd.

ALGO 1: The order in which the keyword terms appear. If a keyword appears early in the web page, rank it higher.
ALGO 2: The frequency of the keyword. The more times a keyword appears, the higher the rank.
ALGO 3: The occurrence of a keyword in the title. If the keyword entered appears in the document's title or meta-tag fields, rank it higher.
ALGO 4: Rare or unusual keywords. If they do not appear often in the index, rank them higher than common terms.
ALGO 5: Link-popularity-based algorithm. The more links that point at a particular site or keyword, the higher the rank.
ALGO 6: Natural-language-processing-based algorithm. Try to guess the meaning of the query and rank accordingly.
ALGO 7: Text-analysis-based algorithm.
The text or content of the web page determines the rank.
ALGO 8: Latest date. Documents with the most recent update rank higher.
ALGO 9: Size of the document. Documents of bigger (or smaller) size rank higher.
And so forth...

where the ALGOs are the different ranking algorithms for the user to choose from. The idea then extends to allowing several ALGOs to be selected simultaneously for the same search, e.g. a web search with a combination of ranking ALGOs 2, 3 and 7. Furthermore, the user would have a range of weights to select for each ALGO: in the previous example, the user could increase or decrease the weight given to ALGOs 2, 3 and 7 by opting to rank them high or low. As I have already mentioned, this is simply a hypothetical idea, one which could go towards solving the nowadays extremely sophisticated problem, for search engines and users alike, of automating and personalising the concept of relevancy.

Conclusions

Throughout this report I have tried to follow the same pattern I followed once I was assigned this research task: in order, I acquired a broad understanding of search engines, gained knowledge of the developments and commercial sides of the field, and then sketched ideas and conclusions on what could or should be done. Those of us who use the "open web" as a research tool want timely and authoritative answers, without advertising or other kinds of influence getting in the way of the best possible answer available. Using the Web effectively without general-purpose search engines would be difficult, time-consuming and in many cases impossible. Once aware of that, the question is: can the needs of all the communities (search engines, SEO, advertisers and search users) coexist?
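The customised, weighted combination of ALGOs proposed in section 3.2 could be sketched as follows. The three signal functions stand in for ALGOs 1, 2 and 3; every name and weight here is an invented illustration of the idea, not an existing system.

```python
def position_score(page, terms):          # ALGO 1: keyword appears early
    words = page["text"].lower().split()
    return sum(1.0 / (words.index(t) + 1) for t in terms if t in words)

def frequency_score(page, terms):         # ALGO 2: keyword frequency
    words = page["text"].lower().split()
    return sum(words.count(t) for t in terms)

def title_score(page, terms):             # ALGO 3: keyword in the title
    return sum(t in page["title"].lower() for t in terms)

def rank(pages, query, weights):
    """Combine the user's chosen ALGOs, each scaled by its selected weight."""
    terms = query.lower().split()
    def combined(page):
        return sum(w * algo(page, terms) for algo, w in weights.items())
    return sorted(pages, key=combined, reverse=True)

pages = [
    {"title": "Fruit growing", "text": "how to grow apple trees at home"},
    {"title": "Apple pie recipe", "text": "a classic apple pie recipe"},
]
# This user weights title matches (ALGO 3) above frequency and position.
best = rank(pages, "apple pie",
            {title_score: 3.0, frequency_score: 1.0, position_score: 0.5})
print(best[0]["title"])                   # Apple pie recipe
```

Raising or lowering a weight, or dropping an ALGO from the dictionary altogether, is exactly the "rank high or low" choice described in section 3.2.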
It is in everybody's interest to make this happen, and I am positive that it will; but knowledge and continuing education, for both information professionals and users, is the key to continuing to use general-purpose web search tools as enjoyable and effective resources.

References

Papers & Articles

Rappoport, Avi. "Site Search That Doesn't Stink". Internet World Conference, December 11, 2001. Very clear presentation slides about search engines and web search issues. http://www.searchtools.com/slides/iw2001/index.html

Notess, Greg R. "The Never-Ending Quest: Search Engine Relevance". ONLINE vol. 24 no. 3, May 2000. An article about relevancy. http://www.infotoday.com/online/OLtocs/OLtocmay00.html

Watlington, Amanda & Marckini, Fredrick. "Branding Survey", May 2002. An interesting survey result demonstrating how psychologically powerful search engines' results listings are for users. www.iprospect.com/branding-survey

Grossan, Bruce. "Search Engines: What They Are, How They Work, and Practical Suggestions for Getting the Most Out of Them", February 21, 1997. A very good paper to get started with all the main issues regarding the search environment. http://webreference.com/content/search/index.html

Price, Gary. "Web Search Engine FAQs: Questions, Answers, and Issues". Searcher vol. 9 no. 9, October 2001. A complete article about features, secrets and everything a user needs to know when using search engines. http://www.infotoday.com/searcher/oct01/price.htm

Ensor, Pat. "Toolkit for the Expert Web Searcher". LITA. A useful collection of resources about the search engines field. http://www.lita.org/committe/toptech/toolkit.htm#engines

Fifield, Craig. "Effective Search Engine Design". SearchDay no. 394, November 7, 2002. Daily newsletter from SearchEngineWatch about the search engine design issues of Google, Yahoo and Lycos. http://searchenginewatch.com/searchday/02/sd1107-seusers.html

SearchEngineWatch staff. "The Major Search Engines", October 12, 2002. An article about the top search engines and the ones to watch. http://www.searchenginewatch.com/webmasters/intro.html

Sullivan, Danny. "Intro to Search Engine Optimisation", October 14, 2002. A one-screen article about SEO. http://www.searchenginewatch.com/webmasters/intro.html

Sullivan, Danny. "How Search Engines Work", October 14, 2002. An introduction to how search engines work. http://www.searchenginewatch.com/webmasters/work.html

Sullivan, Danny. "How Search Engines Rank Web Pages", October 14, 2002. An introduction to search engines' automation of the relevancy concept. http://www.searchenginewatch.com/webmasters/rank.html

Price, Gary. "Specialized Search Engine FAQs". Searcher vol. 10 no. 9, October 2002. A very recent article on the specialised resources offered by three major search engines: Google, AllTheWeb and AltaVista. http://www.infotoday.com/searcher/oct02/price.htm

Turau, Volker. "Internationalization, Accessibility, and Ranking of Web Pages". Technical Report 0598, June 1998. A very interesting survey-based report about how the accessibility of web pages on the WWW could be improved through new web design techniques. http://www-1.informatik.fh-wiesbaden.de/~turau/reports/fortune.html

Bianchini, M., Gori, M. & Scarselli, F. "PageRank: A Circuital Analysis", 2002. A research paper about the pros and cons of PageRank. http://www2002.org/CDROM/poster/165.pdf

Jansen, B. J. & Pooch, U. "A Review of Web Searching Studies and a Framework for Future Research". Journal of the American Society of Information Science, 2000. An extensive research paper on what has been done in the field of web search research. http://jimjansen.tripod.com/academic/pubs/wus.pdf

Inman, Dave. "Introduction to IR". Presentation slides giving an overview of the various topics of Information Retrieval. http://www.scism.sbu.ac.uk/inmandw/ir/IRintro.ppt

Page, Larry, Brin, Sergey, Motwani, R. & Winograd, T. "The PageRank Citation Ranking: Bringing Order to the Web". Stanford Digital Library Technologies Project, 1998. In this paper PageRank is proposed and fully explained by its popular creators. http://citeseer.nj.nec.com/cache/papers/cs/7144/http:zSzzSzwwwdb.stanford.eduzSz~backrubzSzpageranksub.pdf/page98pagerank.pdf

Lifantsev, Maxim. "Rank Computation Methods for Web Documents", 1999. In this paper the author reviews the available web ranking systems and proposes the 'voting model', another way to estimate relevancy on the web. http://citeseer.nj.nec.com/cache/papers/cs/14194/http:zSzzSzwww.ecsl.cs.sunysb.eduzSztrzSzTR76.pdf/lifantsev99rank.pdf

URLs

A brief but complete tutorial on all the main search engine issues. http://www.internet-handbooks.co.uk/izone/search/intro_search.htm

A web guide with search tools, reviews, surveys and interesting material about search environments. http://www.searchtools.com

Information page about PageRank, Google's ranking algorithm. http://www.google.com/technology/index.html

Spam policy from the Inktomi search company. www.inktomi.com/products/web_search/guidelines.html

Books

Glossbrenner, Alfred & Emily. "Search Engines for the World Wide Web", 3rd ed. Peachpit Press, 2001. This recently revised book gives a detailed description of the major search engines.