Hyper Searching the Web
• Search Engines: Basic Search (Index), Cluster Search (Themes), Meta-Search (Outsource), “Smarter” Meta-Search (Themes & Outsource)
• Basic Search Engines: e.g., AltaVista, InfoSeek, Lycos, Excite, HotBot, Google
  - Maintain an index of every word found; work by crawling, indexing, and returning results
  - Ranking systems differ: the most common heuristic (the easiest solution) counts how many times the keywords appear; Google uses PageRank
  - No idea of the searcher’s intent, so the “best” result is hard to achieve
  - Problems with synonymy (e.g., “car” and “automobile”) and polysemy (e.g., “jaguar”)
  - One solution: store semantic relations; this helps only with synonymy
  - Cannot identify concepts or the author’s intent: e.g., the IBM site does not say “computer”
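The index-and-count approach above can be sketched in a few lines. This is a minimal illustration, not how any of the named engines is actually built: the page names, the page text, and the `SYNONYMS` table are all made up, and ranking is a plain keyword count.

```python
from collections import defaultdict

# Toy corpus; the page names and text are invented for illustration.
PAGES = {
    "cars.example": "used car prices and car reviews",
    "autos.example": "automobile dealers and automobile loans",
    "zoo.example": "the jaguar is a big cat",
    "jaguar-motors.example": "jaguar car models",
}

# Hypothetical table of stored semantic relations (synonyms only).
SYNONYMS = {"car": {"automobile"}, "automobile": {"car"}}


def build_index(pages):
    """Map each word to {page: number of occurrences} (an inverted index)."""
    index = defaultdict(dict)
    for page, text in pages.items():
        for word in text.lower().split():
            index[word][page] = index[word].get(page, 0) + 1
    return index


def search(index, query, expand_synonyms=False):
    """Rank pages by how many times the query words appear in them."""
    words = set(query.lower().split())
    if expand_synonyms:
        for word in list(words):
            words |= SYNONYMS.get(word, set())
    scores = defaultdict(int)
    for word in words:
        for page, count in index.get(word, {}).items():
            scores[page] += count
    # Highest count first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda p: (-scores[p], p))


index = build_index(PAGES)
plain = search(index, "car")
expanded = search(index, "car", expand_synonyms=True)
jaguar = search(index, "jaguar")
```

In this toy corpus the plain search for “car” misses `autos.example`, and the synonym table recovers it. The “jaguar” query still returns both the zoo page and the car page with nothing to separate the two senses, which is the polysemy problem the stored relations cannot fix.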
• Cluster Search Engine: e.g., the site “Clusty”; clusters results into categories/themes
  - Can surface results that another search engine would rank lower: because results are grouped by the different meanings of a word, sites for the less searched-for meanings stay visible
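Grouping results by theme can be illustrated with a very small clustering pass. This is only a sketch under stated assumptions: the snippets are invented, the greedy word-overlap method and the `threshold` value are choices made for this example, and nothing here reflects how Clusty itself works.

```python
def jaccard(a, b):
    """Overlap between two word sets: |a & b| / |a | b|."""
    return len(a & b) / len(a | b) if a | b else 0.0


def cluster_results(snippets, query, threshold=0.2):
    """Greedy single-pass clustering of result snippets by word overlap."""
    clusters = []  # each cluster: {"words": set, "members": [snippet, ...]}
    for snippet in snippets:
        # Ignore the query word itself: every result contains it.
        words = set(snippet.lower().split()) - {query}
        for cluster in clusters:
            if jaccard(words, cluster["words"]) >= threshold:
                cluster["members"].append(snippet)
                cluster["words"] |= words
                break
        else:
            # No existing cluster is similar enough: start a new theme.
            clusters.append({"words": set(words), "members": [snippet]})
    return [c["members"] for c in clusters]


# Invented result snippets for the ambiguous query "jaguar".
results = [
    "jaguar car dealer prices",
    "jaguar big cat habitat",
    "jaguar car models and prices",
    "jaguar cat rainforest habitat",
]
themes = cluster_results(results, "jaguar")
```

Here the four snippets fall into two themes, one for the car and one for the animal, so the less common meaning gets its own group instead of being buried in a single ranked list.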
• Meta-Search Engine: e.g., DogPile, SurfWax, Copernic
  - Sends the searcher’s query to a database of search engines
  - Said to be no better than the database it draws on; the referenced search engines are often small, free, or commercial; users can create their own on Google, with up to 5,000 URLs as the “database”
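A meta-search engine forwards the query and merges what comes back. The sketch below assumes three stand-in engines (`engine_a`, `engine_b`, `engine_c`, all hypothetical) that return fixed ranked lists, and it merges them by summing reciprocal ranks; that merge rule is an assumption for illustration, not the method of any engine named above.

```python
from collections import defaultdict


# Stand-ins for real search engines: each returns a ranked list of URLs.
def engine_a(query):
    return ["a.example", "b.example", "c.example"]


def engine_b(query):
    return ["b.example", "d.example"]


def engine_c(query):
    return ["b.example", "a.example"]


def meta_search(query, engines):
    """Send the query to every engine and merge the ranked lists."""
    scores = defaultdict(float)
    for engine in engines:
        for rank, url in enumerate(engine(query), start=1):
            # A URL earns more credit the higher an engine ranks it.
            scores[url] += 1.0 / rank
    return sorted(scores, key=lambda u: (-scores[u], u))


merged = meta_search("jaguar", [engine_a, engine_b, engine_c])
```

The merged list can only contain URLs that some underlying engine returned, which is one way to read the claim that a meta-search engine is no better than its database.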
• “Smarter” Meta-Search Engine: e.g., the Clever project (not yet available online)
  - Includes clustering and linguistic analysis
  - Uses hyperlinks to locate hubs and authorities: “a respected authority is a page that is referred to by many good hubs; a useful hub is a location that points to many valuable authorities”
• The Clever Project: obtains a list of web pages from a standard index and follows hyperlinks to enlarge its own database
  - The resulting collection is the “root set”
  - Each page gets a numerical hub score and a numerical authority score
• The Clever Project: computes its scores much as PageRank does, starting from initial guesses and recalculating repeatedly (a useful by-product: clusters of related sites)
  - Handles competing sites: competitors rarely acknowledge one another through hyperlinks, so they are connected only by hubs that point to each of them
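The mutual definition of hubs and authorities translates into a short iteration: start from a guess, update authority scores from hub scores and hub scores from authority scores, normalize, and repeat. The sketch below is a generic hub/authority iteration under stated assumptions; the page names in `ROOT_SET`, the iteration count, and the normalization are choices for this example, not details taken from the Clever project.

```python
import math


def hits(links, iterations=50):
    """links: {page: [pages it links to]}. Returns (hub, authority) scores."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}   # initial guess
    auth = {p: 1.0 for p in pages}  # initial guess
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        # Hub score: sum of the authority scores of pages linked to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalize so the scores stay bounded between rounds.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth


# Invented root set: three hub-like pages pointing at two authorities.
# The authorities do not link to each other, like competing sites.
ROOT_SET = {
    "hub1": ["auth1", "auth2"],
    "hub2": ["auth1", "auth2"],
    "hub3": ["auth1"],
    "auth1": [],
    "auth2": [],
}
hub, auth = hits(ROOT_SET)
```

Even though `auth1` and `auth2` never link to each other, both receive authority scores because the same hubs point to them; `auth1` scores higher because one more hub refers to it.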
• Clever vs. Google
  - Google: assigns initial rankings that are independent of any query, which makes it faster; looks only forward, from link to link
  - Clever: builds a root set for each keyword; prioritizes pages in the context of the query; looks both forward and backward along links (hub and authority); results are sometimes too broad, e.g., “Fallingwater”
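The query-independent side of this comparison can be sketched as a simplified PageRank computation: one score per page, calculated once from the link graph with no reference to any query. The graph `WEB`, the damping value of 0.85, and the handling of pages without outgoing links are assumptions made for this illustration, not a description of Google’s production system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: {page: [pages it links to]}. Returns one score per page."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # initial guess: equal rank
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Pass this page's rank forward along its outgoing links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Page with no outgoing links: spread its rank evenly.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    return rank


# Invented four-page web; no query is involved anywhere.
WEB = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
rank = pagerank(WEB)
```

Because the scores depend only on the links, they can be computed ahead of time and reused for every query, which fits the note that Google’s approach is faster. A hub/authority computation, by contrast, has to be rerun on the root set built for each query.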