Getting the Most from
Joe Barker
Winter/Spring 2006
jbarker@library.berkeley.edu
An Infopeople Workshop
This Workshop Is Brought to You By the
Infopeople Project
Infopeople is a federally-funded grant project
supported by the California State Library. It
provides a wide variety of training to California
libraries. Infopeople workshops are offered around
the state and are open registration on a first-come, first-served basis.
For a complete list of workshops, and for other
information about the project, go to the
Infopeople website at infopeople.org.
Introductions
 Name
 Library
 What you work in
 Excluding Google, the ONE web
search/info tool you use the
most?
Course Overview
??
 Why
 Web and Google timeline
 Today's Search Engine Choices
 Similarities in searching
 Comparing results more deeply
 What's different, rare, cool, promising
 The Post-Google Expanding Web
 Blogs and RSS feeds
 Tags and wikis
 The Best Not-Web-Search Services from Search Engines
Basic Web Search Skills
 Search techniques
“ ” to search words as phrases
common words usually ignored
+ or enclosing in " " forces common words to be searched
- excludes a word or phrase in quotes
Basic Boolean
OR must be capitalized if used as Boolean operator
AND (all your words) is assumed default between words
 no need to type AND
 Advanced Boolean
AND NOT to exclude a word or phrase in quotes
NEAR (words within 16 words of each other)
( ) to group terms joined by OR or NEAR

not available in Google or Ask.com
Examples in Basic Search Tips and Advanced Boolean Explained
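What the operators above mean can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the helper names (`words`, `contains`, `matches`) are hypothetical, common words are not ignored, and real search engines do far more than this. It shows AND as the default between terms, OR as "any of these", - as exclusion, and a multi-word term treated as a phrase.

```python
import re


def words(text):
    """Split text into lowercase words (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains(page_words, term):
    """True if the term's words appear next to each other, in order.

    A one-word term is simply a phrase of length one.
    """
    wanted = words(term)
    n = len(wanted)
    return n > 0 and any(
        page_words[i:i + n] == wanted for i in range(len(page_words) - n + 1)
    )


def matches(page, all_of=(), any_of=(), exclude=()):
    """Apply the basic operators to one page of text.

    all_of  -- every term must be present (the assumed default AND)
    any_of  -- at least one term must be present (OR)
    exclude -- none of these may be present (-)
    """
    page_words = words(page)
    return (
        all(contains(page_words, t) for t in all_of)
        and (not any_of or any(contains(page_words, t) for t in any_of))
        and not any(contains(page_words, t) for t in exclude)
    )


page = "A tutorial on web search for librarians"
print(matches(page, all_of=["web search"]))                    # phrase, in order
print(matches(page, all_of=["web"], any_of=["guide", "tutorial"]))
print(matches(page, all_of=["web"], exclude=["tutorial"]))
```

The three calls correspond to the searches "web search", web guide OR tutorial, and web -tutorial.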
POST-Google?
Search Engine
Timeline
 The Web before Google 1995-1998
 Google's rise to the top 1999-2000
 Google reigns supreme 2001-2002
 Google sets the pace 2002-2004
The Post-Google Web 2005
 Google wows - book digitization program, Scholar, Earth, Moon
 Google's purity in question
 motives shifting to profit and portal?
 huge profit from AdWords™ - how important is information?
 new products are not web search:
Gmail, desktop search, maps, blogsearch
 lawsuits over book digitization - Google on the defensive
 Size wars become meaningless, ugly
 Yahoo - 22 billion "web objects"
 Google - won't disclose size anymore, teases us to guess
 Search business taking new directions
 Microsoft/Gates - reinvent search and outdo Google and Yahoo
 Exalead, Gigablast, Ask.com - fresh look at search, search services
 rumblings: Google's simplicity is old-fashioned
 the "Brady Bunch of search" - unbelievably good
 Google Video and Books searches designed as marketplace sites
Stage is set on all sides for a different web future
 When did you start using Google?
 Do you think all search engines give
you the same results pages?
Using Bookmarks in Class
1. Go to: bookmarks.infopeople.org
2. Look for the class bookmark file
3. Click on it so it shows on the screen
4. With the class bookmark file showing in Internet Explorer, click the Favorites menu, choose Add to Favorites…
5. Notice the name in the Name: box so that you can use the Favorites list to get back to the class bookmarks for the rest of the day
Search Engine
Choices Today
The Major Search Engines
 Google – claims largest
first with popularity ranking for relevance
famous clean look
ads never mixed in results
 Yahoo! Search – claims largest
uses Inktomi, Yahoo! directory, and Overture (paid results)
Directory now subordinate to search
 Ask.com – claims best results
good search features using Teoma technology
renewed search emphasis – hired Gary Price
 quality search results
 natural language searching
 MSN Search – promises to revolutionize search
Microsoft's response to Google/Yahoo successes
plans to "re-invent" search, outdo Google
Exercise 1
Comparing the top 10 results
from the "Big Four"
DISCUSSION:
Overlap Between Search Engines
 If you think you found what you
want from Engine A, why look
further?
Smaller Search Engines with Good
Features
 Not “Googlized” – unconventional
 Much smaller – 2-4 billion web pages
 Gigablast
 by Matt Wells from old Infoseek
 good features & results
 Exalead
 French entry into search engine wars
 many features packed into the screen
 growing – now 4 billion documents
How Does Size Matter?
 For most information, relevancy ranking matters most
 popularity ranking is a form of relevancy ranking
 look at the first 10-20 results
 try another search engine
 For the obscure, hard-to-find, larger may be best
 more comprehensive
 requires a distinctive word or phrase to zero in on pages you need
How do I find distinctive words?

Sometimes a two-step search:
1. do some searches to collect unique jargon
 use smaller search engines with good suggested terms
 use directories to learn enough about language for aspects of your subject
 learn until you can be specific
2. then choose your search engine
You may have to read some web pages first
What's a reasonable dose of Neurontin?






search neurontin
learn it's gabapentin, the conditions it treats, side effects, drug interactions; dosage varies with condition
search gabapentin OR neurontin with specific names of conditions, side effects
might choose a search engine that suggests related searches
Big search engines dump the haystack on your head.
To find a needle, be specific.
The Size Numbers Don't Mean Much
 What is being counted?
More than web pages
"Web objects"
 images? blog messages? feeds? chats? personals?
 Depth of crawl may be just as important
 how much of a page does a search engine make full-text searchable?
 how much of a website does it index?
 how many links in a page full of links will it crawl?
See Cheat Sheet #1
General Search Engine Features Comparison Chart
Easy Comparison Searching Tools
 Switch bookmarklets
 JavaScript programs that instruct your browser to do something
1. go to a different search engine
2. copy, paste, and run your current search
 used like Bookmarks or Favorites
 add them like any other bookmark
 click on them to make them work
 Bookmarklets for many other purposes
 find with a Google search such as
bookmarklets libraries
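The two steps a switch bookmarklet performs (read the current search out of the results page address, then run it on another engine) can be sketched as follows. Real bookmarklets are written in JavaScript; this is the same logic in Python for illustration only. The engine addresses and query parameter names in `ENGINES` are assumptions, not taken from the workshop materials, and the function names are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

# Assumed results-page addresses and query parameter names (illustrative only).
ENGINES = {
    "google": ("https://www.google.com/search", "q"),
    "yahoo": ("https://search.yahoo.com/search", "p"),
}


def current_query(url, param):
    """Step 1: pull the search terms out of a results-page URL."""
    values = parse_qs(urlsplit(url).query).get(param, [])
    return values[0] if values else ""


def switch(url, from_engine, to_engine):
    """Step 2: build the address that runs the same search on another engine."""
    _, from_param = ENGINES[from_engine]
    base, to_param = ENGINES[to_engine]
    return base + "?" + urlencode({to_param: current_query(url, from_param)})


print(switch("https://www.google.com/search?q=hybrid+cars+mileage", "google", "yahoo"))
```

Because the search text is copied unchanged, anything engine-specific in it (operators, limiters) is carried over as typed, which is why switching works best for basic searches.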
Exercise 2
Installing and Using
"Switch Bookmarklets" for
Deeper Comparison Searching
Switch Bookmarklets Good for Basic Searches
 Default AND between words
 " " for phrases
 + or " " to search common words
 Problems arise in more advanced searching
OR will not always switch accurately
some limiter commands not standardized
rare and unique features cannot be switched
Search Engines with Full Boolean Searching:
OR, AND, AND NOT, ( )
 Yahoo, MSN, Exalead, Gigablast
 Must put parentheses around ORed terms
 AND before parentheses
search engines AND (web OR internet)
 to exclude, use AND NOT
web search engines AND NOT (google OR yahoo)
 But in Google, Ask.com
 OR and - only
search engines web OR internet
web search engines -google -yahoo
 You can switch OR and other Boolean searches:
Google ↔ Ask.com
Yahoo ↔ MSN ↔ Exalead ↔ Gigablast
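The difference between the two families of syntax can be made concrete with a small sketch that writes the same search both ways. The function names and parameters are hypothetical, and the output simply follows the examples on this slide; it is not a guarantee of how any engine parses the result.

```python
def full_boolean(all_of, any_of=(), none_of=()):
    """Style shown for Yahoo, MSN, Exalead, Gigablast: AND, AND NOT, ( )."""
    query = " ".join(all_of)
    if any_of:
        query += " AND (" + " OR ".join(any_of) + ")"
    if none_of:
        query += " AND NOT (" + " OR ".join(none_of) + ")"
    return query


def simple_boolean(all_of, any_of=(), none_of=()):
    """Style shown for Google and Ask.com: only OR and -."""
    parts = list(all_of)
    if any_of:
        parts.append(" OR ".join(any_of))
    parts += ["-" + term for term in none_of]
    return " ".join(parts)


terms = ["web", "search", "engines"]
print(full_boolean(terms, none_of=["google", "yahoo"]))
print(simple_boolean(terms, none_of=["google", "yahoo"]))
```

Keeping the search as three lists (required, alternatives, excluded) rather than as a typed string is what makes it easy to render in either style.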
Useful Limiter Commands
 Focus on primary “aboutness” of a page
 intitle:
intitle:tutorial “web search”
 inurl:
inurl:tutorial “web search”
 Limit to a domain or search within a site
 site:
site:org hurricanes
site:noaa.gov hurricanes
 Limit to a non-HTML format
 filetype:
filetype:ppt web search tutorial
More in Cheat Sheet #2
Search Engine Limiter Commands Comparison Chart
Limiters Not Entirely Consistent
 Cannot always switch
 single limiters usually ok except in Gigablast
intitle:mileage “hybrid cars” site:gov
 OR with limiters switch among similar search engines
asthma site:edu OR site:org
asthma AND (site:edu OR site:gov)
 Some search engines' limiters not like the others
 Gigablast
 title: for intitle:
 suburl: for site:
 type: for filetype:
 Yahoo
 hostname: to limit to a site
hostname:www.infopeople.org
 everybody else uses site:www.infopeople.org
See examples in Cheat Sheet #2 (Limiter Commands)
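Since Gigablast names its limiters differently, a search typed for the other engines has to be rewritten before it is switched. A minimal sketch of that rewrite, assuming only the three substitutions listed above (the function name `to_gigablast` is hypothetical):

```python
import re

# Limiter names that differ in Gigablast, as listed on the slide above.
GIGABLAST = {"intitle": "title", "site": "suburl", "filetype": "type"}


def to_gigablast(query):
    """Rewrite the common limiter prefixes into their Gigablast forms.

    This is a plain text substitution; it does not check that Gigablast
    accepts the resulting combination of limiters.
    """
    def swap(match):
        return GIGABLAST[match.group(1)] + ":"

    return re.sub(r"\b(intitle|site|filetype):", swap, query)


print(to_gigablast('intitle:mileage "hybrid cars" site:gov'))
```

The word-boundary check keeps the substitution from touching ordinary words that merely end in one of the limiter names.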
Rare, Cool Search Features
 Stemming – Google, Exalead
other word endings automatic
librarian skill may retrieve libraries librarians skills skilled
to turn on/off
+librarian +skill
 activate in Exalead Preferences (default=OFF)
 Hyphen power – Google, Ask.com
hyphen retrieves hyphen, space, single word
Google: out-do finds out-do, out do, outdo
Ask.com: out-do finds out-do, out do
 Clustering, search suggestions
Gigablast – Giga Bits, Related Searches, Reference pages
Ask.com – Narrower/Broader Search Terms
Exalead – Related Terms box, extracts within results
Google – synonym search ~FAQ finds help, manual
Google – define:word or expression finds web definitions
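The "hyphen power" rule above amounts to a small expansion of the typed word. As an illustration only (the function name and the `engine` labels are hypothetical, and the behavior is taken from the slide's out-do example, not from any engine's documentation):

```python
def hyphen_variants(word, engine="google"):
    """List the forms a hyphenated search word is described as retrieving."""
    variants = [word, word.replace("-", " ")]
    if engine == "google":
        # Per the slide, Google also finds the closed-up single word.
        variants.append(word.replace("-", ""))
    return variants


print(hyphen_variants("out-do"))
print(hyphen_variants("out-do", engine="ask"))
```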
Exercise 3
Exploring Cool and Less Standardized
Search Engine Features
Attempts to Customize Relevancy Ranking
 MSN Search
Results Ranking sliders in Search Builder
 mindset.research.yahoo.com (beta)
slider above search results
 Exalead date sort setting in Advanced Search
oldest to newest, newest to oldest - what date?
 MSN's prefer: and Exalead's opt:
requests a word without requiring it, gives it preference
Choosing Search Engines Wisely
 Size
 do you want comprehensive?
 Ranking
 popularity or something else?
 Want suggestions?
 try using clustering of results
 look at narrower/broader terms
 Need Boolean beyond OR?
 rarely better than several simpler searches
 NEAR (within 16 words) in Exalead
 Is a search engine the best place to start?
 directory?
 need to learn how to be specific?
Post-Google Expanding Web
 Blogs and RSS feeds
 Tags
 WIKIs
 Media searches
 Personalized spaces and services
Trend toward web virtual communities, sharing
Do you blog
or follow any RSS feeds?
Blogs – "the Blogosphere"
 Blog – short for "web log"
a webpage of brief entries, arranged chronologically
the fastest growing medium since the WWW
 15 - 50 million blogs now
 new blog born every second (over 30 million/year!)
 Simple to use and manage
 free
 minimal technical skill required to make and run one
 Useful
to track topics and websites you want to stay on top of
if you want current
information - outreach in libraries, by publishers, from gov'ts
opinions - consumer, product, company, reviews - new stuff
news - most news sites put out a feed of what's added
insider commentary - individuals and group forums
 Noise and spam too
amateur diaries, notes, sharing, some ads, gunk, "splogging"
Exercise 4
When you reach a stop sign, please wait for group discussion
Features in Most Blogs
 Blog title, brief description or slogan
 Messages (called "postings")
 most recent first
 usually short
 sometimes snippet, with link to full posting
 comments can be added to messages in the blog
 In margins, usually:
 "search this blog" box
 links to archive or previous/recent postings
 categories or subjects in which postings are grouped
 about the author or the blog
 links to other sites of interest
 Ability to SUBSCRIBE to the blog with RSS feed
 means to keep up on what's new in blogs you like
RSS Feeds Can Save You Time
 RSS = "Really simple syndication"
 Subscribe and receive current postings from blogs
 no need to go to each blog and look for what's new
 Useful if you want to keep up on what the blog is saying
 RSS requires a reader to read and manage it
 RSS feeds written in XML code, not easy to understand
 bloglines.com offers the best feed reader
 free, from AskJeeves
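To show why a reader is needed, here is a rough sketch of what one does with the XML: it takes the raw feed and lists the postings. The feed below is a made-up RSS 2.0 example (the titles and example.org addresses are placeholders), and `list_postings` is a hypothetical helper, not part of Bloglines or any other reader.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed; all titles and links are placeholders.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Library Blog</title>
    <link>http://example.org/blog</link>
    <description>News from an imaginary library</description>
    <item>
      <title>New database trial</title>
      <link>http://example.org/blog/2</link>
    </item>
    <item>
      <title>Holiday hours</title>
      <link>http://example.org/blog/1</link>
    </item>
  </channel>
</rss>"""


def list_postings(feed_xml):
    """Return (title, link) pairs for each posting, in feed order."""
    channel = ET.fromstring(feed_xml).find("channel")
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]


for title, link in list_postings(FEED):
    print(title, "-", link)
```

A real reader would also fetch the feed from the web at intervals and remember which postings you have already seen.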
Any Website Can Offer an RSS Feed
 Most blogs have them
 keep current with blogs you like
 Many other web pages with frequent updates let
you subscribe to an RSS feed
 many news sites
 Dilbert.com
 many journals alert to current tables of contents
 many government agencies
 consumer reviews
 Look at Bloglines' "Most Popular Feeds"
 click Directory tab in Bloglines
 some are not from blogs, some are from blogs
Finding RSS Feeds
 Search for RSS feeds
 bloglines.com – browse feeds or search feeds
 MSN Search feed: limiter
 requires terms to be in RSS feed documents
feed:reviews "coffee grinder" cuisinart
 blogsearch.google.com – postings from blogs with feeds
 Yahoo advanced search – limit file type to RSS/XML
 pubsub.com – clipping/alert service for blogs/feeds
 Search for blogs and subscribe to their feeds
Finding Blogs
 Search as for web pages
 include the term blog in search
 use subject directories
 Yahoo! or Google directories
 use web search engines like Google
 blogs/feeds mixed in with web pages
 Follow referrals from other blogs
 Search special RSS feed or blog databases
 many to choose from, rapidly evolving
 most find feeds or blog postings
 see Bookmarks
Finding RSS Feeds and/or Blogs - Specialized searches
Exercise 5
Finding RSS feeds and blogs
Strategies for Finding Quality
Feeds and Blogs
 Look for "voices" that you respect
monitor a feed for a while
follow its links to other blogs
search by author in Google Advanced Blog
Search
 Subscribe to new feeds for a while
ruthlessly drop what you don't always read
 Try new ones
 Don't try to read everything!
Tags – Social Bookmarks
 Go to del.icio.us (in Bookmarks)
 Click more tags (under popular tags)
 Tag cloud – size shows how many are tagging
 How do you create a Tag in del.icio.us?
 create an account
 install an "Add to del.icio.us" bookmarklet
 when viewing a web page you want to tag, click on this bookmarklet
 assign as many tags as you wish up to 20
Tags – Collaborative Categorization of
Websites, Unstructured Cataloging
 TAGS - shared "votes" for pages
 databases of "votes" for pages
 popularity ranking in the raw
 subset of the billions of pages on the web
 useful to find other links on a theme or topic
 weighted by how many other people have bookmarked a page in that database
 del.icio.us gives you a webpage of your links
 lets you see who else tags a site
 tagcloud.com analyzes your RSS feeds
 FURL.net keeps a copy of every page you TAG
 searchable research organizer, pages safely cached
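The idea of tags as counted "votes", and of a tag cloud whose text size reflects those counts, can be sketched as below. The sample data is invented, the function name is hypothetical, and the linear scaling between a smallest and largest font size is an assumption made for illustration; it is not a description of how del.icio.us or any other service computes its cloud.

```python
from collections import Counter


def tag_cloud(bookmarks, min_size=10, max_size=30):
    """Map each tag to a font size that grows with how many users applied it.

    bookmarks -- iterable of (user, tag) pairs
    """
    # set() ensures each user counts once per tag, like one vote each.
    counts = Counter(tag for _, tag in set(bookmarks))
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = (high - low) or 1  # avoid dividing by zero when all counts match
    return {
        tag: min_size + (n - low) * (max_size - min_size) // span
        for tag, n in counts.items()
    }


# Invented sample: three people tag "search", two tag "rss", one tags "wiki".
sample = [
    ("ann", "search"), ("bob", "search"), ("cy", "search"),
    ("ann", "rss"), ("bob", "rss"),
    ("ann", "wiki"),
]
print(tag_cloud(sample))
```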
What's a Wiki ?
 A web application where users add, edit content
 wiki = "quick, rapid" in Hawaiian
 group projects, global possibilities
 shared wisdom and discoveries
 open dialogue, sometimes anonymous
 Focus on content, not page layout
 software mostly Open-source or other freeware
 wikipedia.org encyclopedia – many examples
 list of wikis
en.wikipedia.org/wiki/List_of_wikis
 Search engine searches to locate wikis
wiki [your subject]
inurl:wiki [your subject]
 Learn more
en.wikipedia.org/wiki/WIKI
Search Engines'
Not-Web-Search Services
 Search engines compete with usefulness
lure you to the site
hope you’ll click on an ad
Two Kinds of Not-Search Services
 Specialized searchable databases
 subject directories
 government pages
 maps, businesses by locality, directions
 shopping, travel
 news
 publications – journals, books, subscriptions
 some multi-media
 Collections of links to resources with helpful context and organization
 educational information sites
 financial, investment information
 some videos, music, other media
Cheat Sheet #3 - Search Engine Not-Web-Search Services
Large Subject Directories
 Selected sites, "cataloged" into subject categories
to learn about a topic, find key pages
to get a more reliable subset of web pages
 Yahoo Directory (dir.yahoo.com)
3-4 million selected, subject-organized sites
searchable, category searches
sometimes integrated in Yahoo Search
 Others use Open Directory (dmoz.org)
5+ million selected, subject-organized sites
 Gigablast (gigablast.com)
 Exalead (exalead.com)
 integrate categories into search results
 Google (directory.google.com)
 stand-alone search enhanced by Google popularity ranking
 can sort alphabetically

Maps, Directions, Local Searches
 Google Local/Maps (local.google.com)
 maps search default – enter an address or zip code
 Find Businesses link – yellow pages+ search
 Get Directions link – driving directions
 features
 draggable maps
 streets/roads, satellite, hybrid views
 hyperlinked steps in directions give thumbnail maps
 US, Canada, some UK and Japan
 business search has links to web pages, user reviews
 Yahoo Maps (maps.yahoo.com)
 arrows to move, no draggability
 streets/roads, current traffic/construction option
 US, Canada
 Yahoo Local (local.yahoo.com)
 "city page" directory-type hierarchy, sub-categories
 New Yahoo Local Maps beta (maps.yahoo.com/beta)
integrated, draggable, like Google
doesn't work reliably now, it seems
Education, Reference
 Yahoo Education (education.yahoo.com)
 specialized directory of resources
 reference books
 education sections (K-12, college, grad sch)
 courses & degrees
 homework help
 career training
 school ratings
 sample tests
 searchable
 MSN Encarta (encarta.msn.com)
 some comparable resources
 2-hour free subscription can be renewed with a new
search
Governments, Finance
 Gigablast Gov Search (gov.gigablast.com)
 34+ million pages
 gov, mil, us, other domains with gov't info or links
 suggestions from clusters
 Reference page link collections can be useful
 Google Uncle Sam (www.google.com/unclesam)
 gov, mil, us with google ranking
 Yahoo! Finance (finance.yahoo.com)
 current, historical
 stocks, funds, bonds, charts, research reports, loan rates
 upgrades/downgrades, company profiles, IPOs, portfolios
 domestic, foreign
 useful help screens, glossary, definitions
 MSN Money (moneycentral.msn.com)
 similar service aims, less user-friendly
Multi-Media on the Web
 Google Video Search (video.google.com)
 archived videos - TV, contributed, personal
 search text descriptions, closed captioning
 view stills, snippets of available closed captioning
  icon in image can be viewed using Flash in browser
 View this show - TV listings, several channels
 Yahoo Video Search (video.search.yahoo.com)
 millions of videos from web pages
 links to web pages with videos
 various players required (Real, Quicktime, WindowsMedia)
 Yahoo Podcast Search (podcasts.yahoo.com)
 search, browse, download or listen
 works in browsers or MP3 players
 Exalead (exalead.com)
 in web search results
 click AUDIO, VIDEO to extract links (in bar above results)
Exercise 6
 Exploring Some of the Best
Not-Web-Search Services
and Databases
Current News
 Google News (news.google.com)
 4500+ sources, English language
 customize entry page to topics you want
 RSS available for selected topics
 Yahoo News (news.yahoo.com)
 7000+ sources, English language, media
 personalize home page to sections you want
 personalize sources in sections
 MSN Newsbot beta (newsbot.msnbc.msn.com)
 4800+ sources augment MSNBC news
 by tracking what is read, aims to provide what is wanted
No real "winner" – depends on your needs
Images
 Google Images (images.google.com)
 1.3+ billion
 search words near images, in surrounding text
 Adv. Search to limit by image size, filetype:, color/BW and site:
 accepts OR and " "
 Yahoo Images (images.yahoo.com)
 1.5 billion
 search similar to Google's, different images
 Adv. Search to limit
 no filetype:
No real "winner" – search both
Shopping and Traveling
 Froogle (froogle.google.com)
searchable, some categories, price range settings
merchant listings, web pages
sometimes more results – sometimes helpful
 Yahoo Shopping (shopping.yahoo.com)
search or browse by category
refine with related product suggestions
shopper reviews, price and product comparisons
saved products aggregates your shopping in "myYahoo"
 Gigablast Travel (travel.gigablast.com)
5.4+ million pages of travel information
some useful advice, things to see, places to stay
not just selling travel and trips
 Yahoo Trip-Planner (travel.yahoo.com/trip)
trips from Yahoo's millions of members
Finding People, Groups
 Yahoo People Finder (people.yahoo.com)
 phone/address search
 white pages + individual submissions, corrections
 email search
 largely Yahoo members, contributors, old directories
 useful Advanced Search:
 limit by location, SmartName flexibility
 organization emails
No free web people finder very reliable or comprehensive
 Yahoo Groups (groups.yahoo.com)
support groups, communities
searchable, directory-like browsability
RSS often available in public groups
 Google Groups (groups.google.com)
current groups not very large or active
historic file of Usenet newsgroups to 1981, outdone by blogging
Publications on the Web
 Google Scholar (scholar.google.com)
 scholarly article journal citations - many sources
some free, some fee, some unavailable online
 links to pages which cite a citation
 links to publishers, some holding libraries
integration with many libraries' online journals
 OpenURL links free for libraries who submit list of ejournals
See article in Searcher, v. 13 (Oct 2005), pp. 39-46
 Google Book Search/Google Library (print.google.com)
 full text of publisher-allowed books, snippets of others
 plan to digitize millions of books in public domain
lawsuits
 opt-out solution
 hardly better than Amazon without more o.p. books
 Yahoo Subscriptions beta
 in Advanced Search (fee - can search, compare for free)
 Yahoo promises a full-text book search
Why So Many Databases and Services?
 Competition between search engines
 Most search engines emphasize media
 portals with things to do
 appeal to people's escapist/entertainment needs
 search secondary to clicking, sharing, feeling good
 Google's huge success with search measured in ad profits
 needs to remain in the lead as others innovate with media
 new ways to "make all information freely available"
 Technological advances
 things made possible by cheap technology, inventions, genius
 Google's smartest technology finds ads you are likely to want to view
 PR, "me too," and greed
 search engines get paid if you click ads, find ads in your path
 all media portals and Google's "information" services have ads
Imagine a world with interesting ads lurking everywhere
Exercise 7
 Make your own Cheat Sheet
 Write down seven things from today that you especially want to remember
 Circle the ONE you really want to make time for
Course Evaluation
www.infopeople.org/WS/eval