Getting the Most from
Joe Barker
Winter/Spring 2006
jbarker@library.berkeley.edu
An Infopeople Workshop

This Workshop Is Brought to You By the Infopeople Project
  Infopeople is a federally-funded grant project supported by the California State Library. It provides a wide variety of training to California libraries. Infopeople workshops are offered around the state and have open registration on a first-come, first-served basis. For a complete list of workshops, and for other information about the project, go to the Infopeople website at infopeople.org.

Introductions
  Name
  Library
  What you work in
  Excluding Google, the ONE web search/info tool you use the most?

Course Overview
  ?? Why Web and Google timeline
  Today's Search Engine Choices
    Similarities in searching
    Comparing results more deeply
    What's different, rare, cool, promising
  The Post-Google Expanding Web
    Blogs and RSS feeds
    Tags and wikis
  The Best Not-Web-Search Services from Search Engines

Basic Web Search Skills
  Search techniques
    “ ” to search words as phrases
    common words usually ignored
      + or enclosing in " " forces common words to be searched
    - excludes a word or phrase in quotes
  Basic Boolean
    OR must be capitalized if used as Boolean operator
    AND (all your words) is assumed
      default between words
      no need to type AND
  Advanced Boolean
    AND NOT to exclude a word or phrase in quotes
    NEAR (words within 16 words of each other)
    ( ) to group terms joined by OR or NEAR
    not available in Google or Ask.com
  Examples in Basic Search Tips and Advanced Boolean Explained

POST-Google? Search Engine Timeline
  The Web before Google 1995-1998
  Google's rise to the top 1999-2000
  Google reigns supreme 2001-2002
  Google sets the pace 2002-2004
  The Post-Google Web 2005
    Google wows - book digitization program, Scholar, Earth, Moon
    Google's purity in question
      motives shifting to profit and portal?
      huge profit from AdWords™ - how important is information?
      new products are not web search: Gmail, desktop search, maps, blogsearch
      lawsuits over book digitization - Google on the defensive
    Size wars become meaningless, ugly
      Yahoo 22 billion "web objects"
      Google won't disclose size anymore, teases us to guess
    Search business taking new directions
      Microsoft/Gates - reinvent search and outdo Google and Yahoo
      Exalead, Gigablast, Ask.com - fresh look at search, search services
      rumblings: Google's simplicity is old-fashioned
        the "Brady Bunch of search" - unbelievably good
      Google Video and Books searches designed as marketplace sites
    Stage is set on all sides for a different web future

When did you start using Google?
Do you think all search engines give you the same results pages?

Using Bookmarks in Class
  1. Go to: bookmarks.infopeople.org
  2. Look for the class bookmark file
  3. Click on it so it shows on the screen
  4. With the class bookmark file showing in Internet Explorer, click the Favorites menu, choose Add to Favorites…
  5. Notice the name in the Name: box so that you can use the Favorites list to get back to the class bookmarks for the rest of the day

Search Engine Choices Today

The Major Search Engines
  Google – claims largest
    first with popularity ranking for relevance
    famous clean look
    ads never mixed in results
  Yahoo! Search – claims largest
    uses Inktomi, Yahoo! directory, and Overture (paid results)
    Directory now subordinate to search
  Ask.com – claims best results
    good search features using Teoma technology
    renewed search emphasis – hired Gary Price
    quality search results
    natural language searching
  MSN Search – promises to revolutionize search
    Microsoft's response to Google/Yahoo successes
    plans to "re-invent" search, outdo Google

Exercise 1
  Comparing the top 10 results from the "Big Four"

DISCUSSION: Overlap Between Search Engines
  If you think you found what you want from Engine A, why look further?
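One way to make the overlap discussion concrete is to tally which results appear in more than one engine's top 10. The sketch below is only an illustration, not part of the workshop materials: the function name overlap_report, the engine labels, and the example.org URLs are hypothetical placeholders, and it assumes you have already copied each engine's result URLs by hand and that two results count as "the same" only when the URLs match exactly.

```python
from collections import Counter


def overlap_report(results):
    """Summarize overlap between engines.

    results: dict mapping an engine label to its list of result URLs.
    Returns (shared, unique): the set of URLs found by more than one
    engine, and a dict of the URLs each engine found alone.
    """
    counts = Counter()
    for urls in results.values():
        # set() so a URL is counted once per engine, even if listed twice
        counts.update(set(urls))
    shared = {url for url, n in counts.items() if n > 1}
    unique = {
        engine: [url for url in urls if counts[url] == 1]
        for engine, urls in results.items()
    }
    return shared, unique


# Hypothetical placeholder data; replace with URLs copied from real result pages.
sample = {
    "Engine A": ["example.org/a", "example.org/b", "example.org/c"],
    "Engine B": ["example.org/b", "example.org/d"],
    "Engine C": ["example.org/c", "example.org/b", "example.org/e"],
}

shared, unique = overlap_report(sample)
print("Found by more than one engine:", sorted(shared))
for engine, urls in unique.items():
    print(engine, "alone found:", urls)
```

If the "alone found" lists are long, that suggests looking past Engine A is worth the time for that search.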
Smaller Search Engines with Good Features
  Not “Googlized” – unconventional
  Much smaller – 2-4 billion web pages
  Gigablast
    by Matt Wells from old Infoseek
    good features & results
  Exalead
    French entry into search engine wars
    many features packed into the screen
    growing – now 4 billion documents

How Does Size Matter?
  For most information, relevancy ranking matters most
    popularity ranking is a form of relevancy ranking
    look at the first 10-20 results
    try another search engine
  For the obscure, hard-to-find, larger may be best
    more comprehensive
    requires a distinctive word or phrase to zero in on pages you need

How do I find distinctive words?
  Sometimes a two-step search:
    1. do some searches to collect unique jargon
       use smaller search engines with good suggested terms
       use directories to learn enough about language for aspects of your subject
       learn until you can be specific
    2. then choose your search engine
  You may have to read some web pages first

What's a reasonable dose of Neurontin?
  search neurontin
    learn it's gabapentin, the conditions it treats, side effects, drug interactions; dosage varies with condition
  search gabapentin OR neurontin with specific names of conditions, side effects
    might choose a search engine that suggests related searches
  Big search engines dump the haystack on your head. To find a needle, be specific.

The Size Numbers Don't Mean Much
  What is being counted?
    More than web pages
    "Web objects"
      images? blog messages? feeds? chats? personals?
  Depth of crawl may be just as important
    how much of a page does a search engine make full-text searchable?
    how much of a website does it index?
    how many links in a page full of links will it crawl?
  See Cheat Sheet #1 General Search Engine Features Comparison Chart

Easy Comparison Searching Tools
  Switch bookmarklets
    JavaScript programs that instruct your browser to do something:
      1. go to a different search engine
      2. copy, paste, and run your current search
    used like Bookmarks or Favorites
      add them like any other bookmark
      click on them to make them work
  Bookmarklets for many other purposes
    find with a Google search such as bookmarklets libraries

Exercise 2
  Installing and Using "Switch Bookmarklets" for Deeper Comparison Searching

Switch Bookmarklets Good for Basic Searches
  Default AND between words
  " " for phrases
  + or " " to search common words
  Problems arise in more advanced searching
    OR will not always switch accurately
    some limiter commands not standardized
    rare and unique features cannot be switched

Search Engines with Full Boolean Searching: OR, AND, AND NOT, ( )
  Yahoo, MSN, Exalead, Gigablast
    Must put parentheses around ORed terms
    AND before parentheses
      search engines AND (web OR internet)
    to exclude, use AND NOT
      web search engines AND NOT (google OR yahoo)
  But in Google, Ask.com
    OR and - only
      search engines web OR internet
      web search engines -google -yahoo
  You can switch OR and other Boolean searches:
    Google, Ask.com
    Yahoo, MSN, Exalead, Gigablast

Useful Limiter Commands
  Focus on primary “aboutness” of a page
    intitle:
      intitle:tutorial “web search”
    inurl:
      inurl:tutorial “web search”
  Limit to a domain or search within a site
    site:
      site:org hurricanes
      site:noaa.gov hurricanes
  Limit to a non-HTML format
    filetype:
      filetype:ppt web search tutorial
  More in Cheat Sheet #2 Search Engine Limiter Commands Comparison Chart

Limiters Not Entirely Consistent
  Cannot always switch
    single limiters usually ok except in Gigablast
      intitle:mileage “hybrid cars” site:gov
    OR with limiters
      switch among similar search engines
        asthma site:edu OR site:org
        asthma AND (site:edu OR site:gov)
  Some search engines' limiters not like the others
    Gigablast
      title: for intitle:
      suburl: for site:
      type: for filetype:
    Yahoo
      hostname: to limit to a site
        hostname:www.infopeople.org
        everybody else uses site:www.infopeople.org
  See examples in Cheat Sheet #2 (Limiter Commands)

Rare, Cool
Search Features
  Stemming – Google, Exalead
    other word endings automatic
      librarian skill may retrieve libraries librarians skills skilled
    to turn on/off
      +librarian +skill
      activate in Exalead Preferences (default=OFF)
  Hyphen power – Google, Ask.com
    hyphen retrieves hyphen, space, single word
      Google: out-do finds out-do, out do, outdo
      Ask.com: out-do finds out-do, out do
  Clustering, search suggestions
    Gigablast – Giga Bits, Related Searches, Reference pages
    Ask.com – Narrower/Broader Search Terms
    Exalead – Related Terms box, extracts within results
  Google – synonym search
    ~FAQ finds help, manual
  Google – define:word or expression
    finds web definitions

Exercise 3
  Exploring Cool and Less Standardized Search Engine Features

Attempts to Customize Relevancy Ranking
  MSN Search results Ranking sliders in Search Builder
  mindset.research.yahoo.com (beta) slider above search results
  Exalead date sort setting in Advanced Search
    oldest to newest, newest to oldest - what date?
  MSN's prefer: and Exalead's opt:
    requests a word without requiring it, gives it preference

Choosing Search Engines Wisely
  Size
    do you want comprehensive?
  Ranking
    popularity or something else?
  Want suggestions?
    try using clustering of results
    look at narrower/broader terms
  Need Boolean beyond OR?
    rarely better than several simpler searches
    NEAR (within 16 words) in Exalead
  Is a search engine the best place to start?
    directory?
    need to learn how to be specific?

Post-Google Expanding Web
  Blogs and RSS feeds
  Tags
  WIKIs
  Media searches
  Personalized spaces and services
  Trend toward web virtual communities, sharing

Do you blog or follow any RSS feeds?

Blogs – "the Blogosphere"
  Blog – short for "web log"
    a webpage of brief entries, arranged chronologically
  the fastest growing medium since the WWW
    15 - 50 million blogs now
    new blog born every second (over 30 million/year!)
  Simple to use and manage
    free
    minimal technical skill required to make and run one
  Useful to track topics and websites you want to stay on top of if you want
    current information - outreach in libraries, by publishers, from gov'ts
    opinions - consumer, product, company, reviews - new stuff
    news - most news sites put out a feed of what's added
    insider commentary - individuals and group forums
  Noise and spam too
    amateur diaries, notes, sharing, some ads, gunk, "splogging"

Exercise 4
  When you reach a stop sign, please wait for group discussion

Features in Most Blogs
  Blog title, brief description or slogan
  Messages (called "postings")
    most recent first
    usually short
    sometimes snippet, with link to full posting
    comments can be added to messages in the blog
  In margins, usually:
    "search this blog" box
    links to archive or previous/recent postings
    categories or subjects in which postings are grouped
    about the author or the blog
    links to other sites of interest
  Ability to SUBSCRIBE to the blog with RSS feed
    means to keep up on what's new in blogs you like

RSS Feeds Can Save You Time
  RSS = "Really Simple Syndication"
  Subscribe and receive current postings from blogs
    no need to go to each blog and look for what's new
  Useful if you want to keep up on what the blog is saying
  RSS requires a reader to read and manage it
    RSS feeds written in XML code, not easy to understand
    bloglines.com offers the best feed reader
      free, from AskJeeves

Any Website Can Offer an RSS Feed
  Most blogs have them
    keep current with blogs you like
  Many other web pages with frequent updates let you subscribe to an RSS feed
    many news sites
    Dilbert.com
    many journals alert to current tables of contents
    many government agencies
    consumer reviews
  Look at Bloglines' "Most Popular Feeds"
    click Directory tab in Bloglines
    some are not from blogs, some are from blogs

Finding RSS Feeds
  Search for RSS feeds
    bloglines.com – browse feeds or search feeds
    MSN Search feed: limiter
      requires terms to be in RSS feed documents
      feed:reviews "coffee grinder" cuisinart
    blogsearch.google.com – postings from blogs with feeds
    Yahoo Advanced Search – limit file type to RSS/XML
    pubsub.com – clipping/alert service for blogs/feeds
  Search for blogs and subscribe to their feeds

Finding Blogs
  Search as for web pages
    include the term blog in search
    use subject directories
      Yahoo! or Google directories
    use web search engines like Google
      blogs/feeds mixed in with web pages
  Follow referrals from other blogs
  Search special RSS feed or blog databases
    many to choose from, rapidly evolving
    most find feeds or blog postings
    see Bookmarks Finding RSS Feeds and/or Blogs - Specialized searches

Exercise 5
  Finding RSS feeds and blogs

Strategies for Finding Quality Feeds and Blogs
  Look for "voices" that you respect
    monitor a feed for a while
    follow its links to other blogs
    search by author in Google Advanced Blog Search
  Subscribe to new feeds for a while
    ruthlessly drop what you don't always read
  Try new ones
  Don't try to read everything!

Tags – Social Bookmarks
  Go to del.icio.us
    Click (in Bookmarks)
    more tags (under popular tags)
  Tag cloud – size shows how many are tagging
  How do you create a Tag in del.icio.us?
    create an account
    install an "Add to del.icio.us" bookmarklet
    when viewing a web page you want to tag, click on this bookmarklet
    assign as many tags as you wish up to 20

Tags – Collaborative Categorization of Websites, Unstructured Cataloging
  TAGS - shared "votes" for pages
    databases of "votes" for pages
    popularity ranking in the raw
    subset of the billions of pages on the web
    useful to find other links on a theme or topic
      weighted by how many other people have bookmarked a page in that database
  del.icio.us
    gives you a webpage of your links
    lets you see who else tags a site
  tagcloud.com
    analyzes your RSS feeds
  FURL.net
    keeps a copy of every page you TAG
    searchable research organizer, pages safely cached

What's a Wiki?
  A web application where users add, edit content
    wiki = "quick, rapid" in Hawaiian
    group projects, global possibilities
    shared wisdom and discoveries
    open dialogue, sometimes anonymous
  Focus on content, not page layout
    software mostly Open-source or other freeware
  wikipedia.org encyclopedia – many examples
    list of wikis en.wikipedia.org/wiki/List_of_wikis
  Search engine searches to locate wikis
    wiki [your subject]
    inurl:wiki [your subject]
  Learn more en.wikipedia.org/wiki/WIKI

Search Engines' Not-Web-Search Services
  Search engines compete with usefulness
    lure you to the site
    hope you’ll click on an ad

Two Kinds of Not-Search Services
  Specialized searchable databases
    subject directories
    government pages
    maps, businesses by locality, directions
    shopping, travel
    news
    publications – journals, books, subscriptions
    some multi-media
  Collections of links to resources with helpful context and organization
    educational information sites
    financial, investment information
    some videos, music, other media
  Cheat Sheet #3 - Search Engine Not-Web-Search Services

Large Subject Directories
  Selected sites, "cataloged" into subject categories
    to learn about a topic, find key pages
    to get a more reliable subset of web pages
  Yahoo Directory (dir.yahoo.com)
    3-4 million selected, subject-organized sites
    searchable, category searches
    sometimes integrated in Yahoo Search
  Others use Open Directory (dmoz.org)
    5+ million selected, subject-organized sites
    Gigablast (gigablast.com), Exalead (exalead.com)
      integrate categories into search results
    Google (directory.google.com)
      stand-alone search enhanced by Google popularity ranking
      can sort alphabetically

Maps, Directions, Local Searches
  Google Local/Maps (local.google.com)
    maps search default – enter an address or zip code
    Find Businesses link – yellow pages+ search
    Get Directions link – driving directions
    features
      draggable maps
      streets/roads, satellite, hybrid views
      hyperlinked steps in directions give thumbnail maps
      US, Canada, some UK and Japan
      business search has links to web pages, user reviews
  Yahoo Maps (maps.yahoo.com)
    arrows to move, no draggability
    streets/roads, current traffic/construction option
    US, Canada
  Yahoo Local (local.yahoo.com)
    "city page" directory-type hierarchy, sub-categories
  New Yahoo Local Maps beta (maps.yahoo.com/beta)
    integrated, draggable, like Google
    doesn't work reliably now, it seems

Education, Reference
  Yahoo Education (education.yahoo.com)
    specialized directory of resources
      reference books
      education sections (K-12, college, grad sch)
      courses & degrees
      homework help
      career training
      school ratings
      sample tests
    searchable
  MSN Encarta (encarta.msn.com)
    some comparable resources
    2-hour free subscription can be renewed with a new search

Governments, Finance
  Gigablast Gov Search (gov.gigablast.com)
    34+ million pages
    gov, mil, us, other domains with gov't info or links
    suggestions from clusters
    Reference page link collections can be useful
  Google Uncle Sam (www.google.com/unclesam)
    gov, mil, us with Google ranking
  Yahoo!
  Finance (finance.yahoo.com)
    current, historical
    stocks, funds, bonds, charts, research reports, loan rates
    upgrades/downgrades, company profiles, IPOs, portfolios
    domestic, foreign
    useful help screens, glossary, definitions
  MSN Money (moneycentral.msn.com)
    similar service aims, less user-friendly

Multi-Media on the Web
  Google Video Search (video.google.com)
    archived videos - TV, contributed, personal
    search text descriptions, closed captioning
    view stills, snippets of available closed captioning
    icon in image can be viewed using Flash in browser
    View this show - TV listings, several channels
  Yahoo Video Search (video.search.yahoo.com)
    millions of videos from web pages
    links to web pages with videos
    various players required (Real, QuickTime, Windows Media)
  Yahoo Podcast Search (podcasts.yahoo.com)
    search, browse, download or listen
    works in browsers or MP3 players
  Exalead (exalead.com)
    in web search results click AUDIO, VIDEO to extract links (in bar above results)

Exercise 6
  Exploring Some of the Best Not-Web-Search Services and Databases

Current News
  Google News (news.google.com)
    4500+ sources, English language
    customize entry page to topics you want
    RSS available for selected topics
  Yahoo News (news.yahoo.com)
    7000+ sources, English language, media
    personalize home page to sections you want
    personalize sources in sections
  MSN Newsbot beta (newsbot.msnbc.msn.com)
    4800+ sources augment MSNBC news
    by tracking what is read, aims to provide what is wanted
  No real "winner" – depends on your needs

Images
  Google Images (images.google.com)
    1.3+ billion
    search words near images, in surrounding text
    Adv. Search to limit by image size, filetype:, color/BW and site:
    accepts OR and " "
  Yahoo Images (images.yahoo.com)
    1.5 billion
    search similar to Google's, different images
    Adv.
    Search to limit
    no filetype:
  No real "winner" – search both

Shopping and Traveling
  Froogle (froogle.google.com)
    searchable, some categories, price range settings
    merchant listings, web pages
    sometimes more results – sometimes helpful
  Yahoo Shopping (shopping.yahoo.com)
    search or browse by category
    refine with related product suggestions
    shopper reviews, price and product comparisons
    saved products
    aggregates your shopping in "myYahoo"
  Gigablast Travel (travel.gigablast.com)
    5.4+ million pages of travel information
    some useful advice, things to see, places to stay
    not just selling travel and trips
  Yahoo Trip-Planner (travel.yahoo.com/trip)
    trips from Yahoo's millions of members

Finding People, Groups
  Yahoo People Finder (people.yahoo.com)
    phone/address search
      white pages + individual submissions, corrections
    email search
      largely Yahoo members, contributors, old directories
    useful Advanced Search: limit by location, SmartName flexibility
      organization emails
  No free web people finder is very reliable or comprehensive
  Yahoo Groups (groups.yahoo.com)
    support groups, communities
    searchable, directory-like browsability
    RSS often available in public groups
  Google Groups (groups.google.com)
    current groups not very large or active
    historic file of Usenet newsgroups to 1981, outdone by blogging

Publications on the Web
  Google Scholar (scholar.google.com)
    scholarly article journal citations - many sources
      some free, some fee, some unavailable online
      links to pages which cite a citation
      links to publishers, some holding libraries
    integration with many libraries' online journals
      OpenURL links
      free for libraries who submit list of ejournals
    See article in Searcher, v. 13 (Oct 2005), pp. 39-46
  Google Book Search/Google Library (print.google.com)
    full text of publisher-allowed books, snippets of others
    plan to digitize millions of books in public domain
      lawsuits
      opt-out solution
    hardly better than Amazon without more o.p.
    books
  Yahoo Subscriptions beta
    in Advanced Search (fee - can search, compare for free)
  Yahoo promises a full-text book search

Why So Many Databases and Services?
  Competition between search engines
  Most search engines emphasize media
    portals with things to do
    appeal to people's escapist/entertainment needs
    search secondary to clicking, sharing, feeling good
  Google's needs
    huge success with search measured in ad profits
    to remain in the lead as others innovate with media
    new ways to "make all information freely available"
  Technological advances
    things made possible by cheap technology, inventions, genius
    Google's smartest technology finds ads you are likely to want to view
  PR, "me too," and greed
    search engines get paid if you click ads, find ads in your path
    all media portals and Google's "information" services have ads
  Imagine a world with interesting ads lurking everywhere

Exercise 7
  Make your own Cheat Sheet
    Write down seven things from today that you especially want to remember
    Circle the ONE you really want to make time for

Course Evaluation
  www.infopeople.org/WS/eval