Search - Computer Science

Search
1
Contents
• Introduction to Search
• Google Account/Gmail
• Results page
• The Google search algorithm
• Preferences
• Forming a query
• Operators
• Other Google search applications
• Other Google applications
2
Introduction
• What search engine(s) do you use?
• What searches have you done lately?
• Google is not necessarily the best search engine
– http://www.nytimes.com/2009/07/09/technology/personaltech/09pogue.html
• However, if you can use Google well, you will know what
to look for in other engines
• For the rest of this lecture, let's get to know Google well,
both its search and related capabilities.
• From now on these lecture notes will disappear from the
screen….
3
Google Account/Gmail
• A few Google facilities are available only to those with a
Google account or Gmail.
– Blogger, YouTube comments, Google Groups, Web History,
Google Docs
• You can sign up for these facilities at
www.google.com/accounts/login.
– You can have an account without having Gmail, using another
email address.
• Pros of Gmail
– Available through any browser, 1GB storage, groups messages by
subject, excellent search, good spam filter, nice add-ins.
• Cons of Gmail
– Lots of ads, possible privacy problems, no self-delivery,
conversations cannot be split up.
4
The Results Page
[Figure: Google results page with three labeled regions: Sponsored (PPC) Links, Search Toolbar, Results]
5
How are Google’s search results ordered?
• Many (200+) factors considered by Google’s Algorithm
• This algorithm is very important to businesses
– Every business wants its page at the top!
– Entire books and courses are devoted to Search Engine Optimization:
the art/science of getting your website to the top of the results
• Algorithm is tweaked by Google daily
• Most important factor in algorithm: Page Rank
– Page Rank is named after Larry Page, one of its inventors. Larry
Page and Sergey Brin founded Google.
• Google blew away the competition because it had Page
Rank; no other search engine did.
6
What is Page Rank?*
• Very roughly, the page rank of a page is the number of
pages that link to that page
– See those pages with the operator link:
• http://www.google.com/intl/en/help/operators.html#link
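The "very rough" definition above can be tried on a toy example. The sketch below just counts inbound links. The page names and the `links` dictionary are made up for illustration, and this is only the slide's approximation, not Google's actual computation.

```python
from collections import Counter

# Hypothetical toy web: each page maps to the pages it links to.
links = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}

def inbound_link_counts(links):
    """Count, for every page, how many distinct other pages link to it."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):   # a source page counts only once
            if target != source:      # ignore links from a page to itself
                counts[target] += 1
    return dict(counts)

counts = inbound_link_counts(links)
for page, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(page, n)
```

In this toy web, c.example has the most pages linking to it, so by the rough definition it would rank highest.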
7
What else is in Google's Algorithm?
• Google will rank "your" page high in the results of a
search for some keywords if:
– The keywords are in prominent places on your page
• Prominent = page title, headings, meta tags, text
• Meta tags are part of the HTML of a page, not viewable by a user
– The pages that point to your page themselves have many links
pointing to them, i.e., they have high Page Rank.
• This is part of the definition of Page Rank
– The keywords are close together on your page
– The keywords are in the anchors of the pages pointing to your
page
• Now we'll move on to learning how to use Google search
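The point on this slide that a link counts for more when it comes from a page with high Page Rank can be sketched with the standard textbook iteration. This is a minimal illustration, not Google's production algorithm. The damping value 0.85, the iteration count, and the toy page names are all assumptions chosen for the example.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Iteratively score pages so that links from high-scoring pages count more."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}   # start with equal scores
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            # A page with no outgoing links spreads its score over every page.
            targets = links.get(page) or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical toy web: "yours" and "theirs" each have exactly one inbound link,
# but "yours" is linked from a page that many other pages point to.
toy_web = {
    "fan1": ["popular"],
    "fan2": ["popular"],
    "fan3": ["popular"],
    "popular": ["yours"],
    "obscure": ["theirs"],
}
ranks = page_rank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:8s} {score:.3f}")
```

Both "yours" and "theirs" have one inbound link, yet "yours" scores higher because its single link comes from a well-linked page.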
8
Search History
• Each browser keeps a history of the pages you have
visited in that browser, on that computer.
– Helpful for autocomplete when you type
– This has nothing to do with Google.
• You can disable or clear that history feature.
• Firefox:
– Clear: Tools/Clear recent history
• Click details and clear only the browsing and download history
– Disable: Tools/Options/Privacy/Clear history when Firefox closes
• Other browsers:
http://www.google.com/support/websearch/bin/answer.py?answer=465
9
Web History
• This is a history of your Google searches.
• You can turn it on when you register your Google
account, and access it at www.google.com/history
• You can search it, remove items from it, see trends in
your Google search history
10
Preferences
• On Google's home page, at the upper right corner, click
settings/search settings
• Note languages, number of results
• Query suggestions
– Uses advertisements as well as your web history
• SearchWiki: Customize your searches!
11
Forming a Query
• What you don’t want: Use the minus sign -
– Ari Shapiro -NPR
– Omits pages that mention NPR, leaving the Ari Shapiros not at NPR
• A phrase: Use quotes “…”
– “Dan Shapiro”
– Omits pages where "Dan" and "Shapiro" appear only separately
• Something you must include: Use a plus +
– Len Shapiro +“Portland State University”
– Shows only the Len Shapiros at PSU
• Show only certain filetypes: filetype:
– Google filetype:ppt
• Look within a certain domain: site:
– Cyberculture site:www.pdx.edu
• Do this and more on the Advanced Search page
– www.google.com/advanced_search
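The query forms on this slide can also be assembled programmatically. The sketch below is one hedged way to do it. The helper names `build_query` and `search_url` are hypothetical, and the only thing assumed about Google is that the query text travels in the `q` parameter of the search URL.

```python
from urllib.parse import urlencode

def build_query(terms, phrase=None, exclude=(), filetype=None, site=None):
    """Assemble a query string from the pieces described on this slide."""
    parts = list(terms)
    if phrase:
        parts.append(f'"{phrase}"')                  # exact phrase in quotes
    parts.extend(f"-{word}" for word in exclude)     # minus sign, no space after it
    if filetype:
        parts.append(f"filetype:{filetype}")         # no space after the colon
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)

def search_url(query):
    """URL-encode the query into the q parameter (assumed URL layout)."""
    return "https://www.google.com/search?" + urlencode({"q": query})

q1 = build_query(["Ari", "Shapiro"], exclude=["NPR"])
q2 = build_query(["Cyberculture"], site="www.pdx.edu")
q3 = build_query(["Google"], filetype="ppt")
print(q1)              # Ari Shapiro -NPR
print(q2)              # Cyberculture site:www.pdx.edu
print(search_url(q3))
```

Keeping the query builder separate from the URL encoder means the same query text can be pasted into the search box by hand or sent through a URL.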
12
Operators on Keyword(s)
• Don't put a space after the colon!
• intitle:"tree removal"
– Displays only sites whose titles include the phrase "tree removal"
• define:personable
– Gives definitions from the web
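To see why the space after the colon matters, here is a small hypothetical parser that splits a query into operator/value pairs and plain keywords. It is only a sketch of how such a parser might behave, not a description of how Google parses queries. It relies on Python's `shlex.split`, which keeps a quoted phrase attached to the text it touches.

```python
import shlex

def parse_query(query):
    """Split a query into (operator, value) pairs and plain keywords."""
    operators, keywords = [], []
    for token in shlex.split(query):     # honors straight double quotes
        name, sep, value = token.partition(":")
        if sep and name.isalpha():
            operators.append((name.lower(), value))
        else:
            keywords.append(token)
    return operators, keywords

good = parse_query('intitle:"tree removal" Portland')
bad = parse_query('intitle: "tree removal" Portland')
print(good)
print(bad)
```

With the space, the operator ends up with an empty value and the phrase is treated as an ordinary keyword, which is the mistake the first bullet warns against.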
13
Operators on URLs
• Don't put a space after the colon!
• cache:www.cs.pdx.edu/~len
– To see an old copy of my web page
• related:www.portlandfoodanddrink.com
– To find similar pages
14
Miscellaneous Operators
• Phonebook:Leonard Shapiro Portland Oregon
• Movie:97223
• Weather:portland,oregon
• Special number searches
– UPS, FedEx, USPS, VIN, ISBN
• www.google.com/intl/en/help/features.html
15
Which Principles does Google embody?
• Data Rules: The value of an application is increased by
the scale and dynamism of the data it manages
• The long tail: Small products/ideas make up the great
majority of all products/ideas.
• Use your Users: Enable users to contribute content to
the application
• The Power of Groups: Apply the wisdom of users to
solve problems
• Enable Community: Enable users to share their
experiences in your application
• Folksonomy, not Taxonomy: Utilize user-generated
tags to classify items, instead of expertly generated
categories
16
Other Google Search Applications
• Important: Use your search skills
• Images
– Tiger -white
• Videos: youtube and more!
• News (location: , source: )
• Shopping
• Groups (author: group: insubject: )
• Scholar
• Finance
• Blogs
17
Other Google Applications
• Maps, www.maps.google.com
– Similar to mapquest or maps.yahoo.com, but has satellite/terrain views
• Earth
– Must download as a separate application
– As if you could fly anywhere and see below you
• Documents & Spreadsheets
– Similar to Microsoft Word & Excel
– Not quite compatible
• E.g., if you load a Word document into Google Docs, it may not display completely
• Google Docs was meant to be a new word processor, not a copy of Word
– http://tinyurl.com/2k32sh
– Documents stored on the Web
– Easy to share with others
– Price is right
• Gmail
– We've seen this before
18
More Google Applications
• Calendar
– Stored on the web
– Easy to share with others
• Google Alerts (under even more)
– Get an email whenever something new matching your keywords appears
• Picasa: Photo editing, sharing
• Directory: directory.google.com
– Another way to search the web
– Oldie but goodie
19
Other search engines
• Wolfram Alpha: www.wolframalpha.com
– Computational knowledge engine
– Try ../examples
• A recent article ranking customer satisfaction with search
engines: http://tinyurl.com/lvnca2
20