Introduction to Google Search Engine

The Google Search Engine
Christy Gavin
Spring, 2009
What is an internet search engine?
A computer program that retrieves data from
the internet.
Google’s Mission
To organize the world’s information and make it universally
accessible and useful.
Google Fast Facts
•Invented by Larry Page and Sergey Brin
•Online 1998
•Named after “googol”
•Verb in the OED, 2006
•Added the one trillionth web address, July 2008
•500 million searchers a year
How does Google find data?
Uses a spider program called GoogleBot
What Does Google Do Once It Arrives At a Site?
How GoogleBot finds your website:
• You submit a URL to Google
• You submit a sitemap to Google
• GoogleBot finds a link to your site
GoogleBot’s crawler retrieves websites
and stores them in Google’s index.
Google’s index works much like the index
found in the back of a book.
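The back-of-the-book comparison can be sketched in a few lines of Python. This is only an illustration of the idea, not Google's actual index: the page URLs and text are hypothetical, and the function names build_index and lookup are made up for this example.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def lookup(index, word):
    """Return the pages listed under a word, sorted for stable output."""
    return sorted(index.get(word.lower(), set()))

# Hypothetical pages standing in for what a crawler retrieved.
pages = {
    "example.com/a": "gender studies introduction",
    "example.com/b": "introduction to search engines",
}
index = build_index(pages)
print(lookup(index, "introduction"))
```

Like a book index, the lookup goes from a word to the places where it appears, so a search does not have to reread every page.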
A Short Introduction to Gender by Rae Connell
Google Web Citation
Google’s Advertising Philosophy
•Search results free of paid ads.
•Avoid flashy banner ads and pop-ups.
Google’s “keyword-targeted text ads.”
You do not see the ad unless your search
relates to the ad’s topic.
Google’s Way:
Sell ads to make money but…
don’t mix ads with users’ search results.
eating disorders
The Secret World of Relevance Ranking
Relevance ranking is the measure of how well the
search results answer the question or search.
How does Google rank
the relevance of search results?
The heart of Google ranking is PageRank.
PageRank works like a voting system.
Page A links to page Z. This is a vote for page Z.
The more pages that link to a page,
the higher that page’s PageRank.
Google also looks at the page that casts the vote.
If page A has received lots of votes, A will increase the
importance of page Z.
So to be included in Google’s top ranked results a page:
• must have lots of votes, and
• must have votes cast by pages that have received many votes of their
own.
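The voting idea can be tried out on a tiny set of pages. The sketch below follows the commonly published form of the PageRank calculation; it is not Google's production code. The damping value of 0.85, the number of iterations, and the page names A, B, C, and Z are assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to (its votes)."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small base score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its own rank among the pages it votes for.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank

# Hypothetical link graph: A and B vote for Z, C votes for A.
links = {"A": ["Z"], "B": ["Z"], "C": ["A"], "Z": []}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this graph Z ranks highest because it has two votes, and A ranks above B because A has received a vote of its own.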
Other ways Google ranks relevance
•Density = frequency of keywords
•Proximity = closeness of keywords
•Prominence = titles, links, tags
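The three measures above can each be written as a small function. This is a simplified sketch, assuming plain word counts; how Google actually computes or weights these signals is not stated here. The function names and the sample text are hypothetical.

```python
def tokenize(text):
    """Split text into lowercase words."""
    return text.lower().split()

def density(text, keyword):
    """Fraction of the words in the text that equal the keyword."""
    words = tokenize(text)
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

def proximity(text, first, second):
    """Smallest distance in words between two keywords, or None if one is absent."""
    words = tokenize(text)
    pos_a = [i for i, w in enumerate(words) if w == first.lower()]
    pos_b = [i for i, w in enumerate(words) if w == second.lower()]
    if not pos_a or not pos_b:
        return None
    return min(abs(a - b) for a in pos_a for b in pos_b)

def prominence(title, keyword):
    """True when the keyword appears in the page title."""
    return keyword.lower() in tokenize(title)

# Hypothetical page text and title.
body = "eating disorders affect many teens and eating habits matter"
title = "Eating Disorders Overview"
print(density(body, "eating"))
print(proximity(body, "eating", "disorders"))
print(prominence(title, "disorders"))
```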
Google’s recipe for retrieving relevant results
is based on a combination of:
•PageRank
•Density of keywords
•Proximity of keywords
•Prominence of keywords
Google’s top results are not necessarily
the top results in terms of
the quality of the website.
1. Enter your topic in Google and Yahoo.
2. Compare the top 5 results in each search engine.
3. Do both search engines retrieve the same websites?
4. Do both search engines provide targeted ads?
Identify the 3 most important keywords (concepts).
Byron Hurt takes pains to say that he is a fan of rap, but over
time, says Mr. Hurt, a 36-year-old filmmaker, ''I began to
become very conflicted about the music I love.'' A new
documentary by Mr. Hurt, ''Rap: Beyond Beats and Rhymes,''
questions the violence of women in much of rap music.
Excerpt from the New York Times
Keeping It Together: The Double Quote
Use the double quote:
________________________________
distinct individual:
“kanye west”
organization:
“american medical society”
company:
“general motors”
quote:
“to be or not to be”
Use double quotes for bound phrases:
“model minority”
“artificial intelligence”
“big bang”
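The difference between a quoted and an unquoted search can be shown with two small matching functions. This is a sketch of the general idea, assuming simple word-by-word matching; the function names and sample sentences are hypothetical and do not describe Google's internals.

```python
def matches_keywords(text, query):
    """True when every query word appears somewhere in the text, in any order."""
    words = text.lower().split()
    return all(w in words for w in query.lower().split())

def matches_phrase(text, phrase):
    """True when the phrase words appear next to each other, in order."""
    words = text.lower().split()
    target = phrase.lower().split()
    if not target:
        return False
    last_start = len(words) - len(target)
    return any(words[i:i + len(target)] == target for i in range(last_start + 1))

# Hypothetical page text.
scattered = "the minority report praised a new model of policing"
bound = "the model minority stereotype"

print(matches_keywords(scattered, "model minority"))
print(matches_phrase(scattered, "model minority"))
print(matches_phrase(bound, "model minority"))
```

Without quotes, the first sentence matches because both words appear somewhere in it. With quotes, only the second sentence matches, because the two words must stay together.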
We model friendship formation as a selection process constrained by individuals'
ability to make friends. Blacks are generally the most cohesive racial category,
although when whites are in the minority, they display stronger selective mixing
than do blacks when blacks are in the minority.
Search Engines and Stop words
Search engines ignore common or overused words:
a, the, of, for, how, who. . .
Keyword phrase:
The way to the school is hard when walking in the rain.
Stored keyword phrase:
* way * * school is hard when walking * * rain.
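The stored phrase above can be reproduced with a short function. This is a sketch only. The stop word list is an assumption: it starts with the words named on the slide and adds "to" and "in", which the slide's example also drops. The name stored_phrase is made up for the example.

```python
# Assumed stop word list: the slide's examples plus "to" and "in".
STOP_WORDS = {"a", "the", "of", "for", "how", "who", "to", "in"}

def stored_phrase(phrase):
    """Replace each stop word with * and keep every other word as typed."""
    kept = []
    for token in phrase.split():
        core = token.rstrip(".,!?")      # word without trailing punctuation
        trailing = token[len(core):]     # punctuation to put back
        if core.lower() in STOP_WORDS:
            kept.append("*" + trailing)
        else:
            kept.append(token)
    return " ".join(kept)

result = stored_phrase("The way to the school is hard when walking in the rain.")
print(result)
```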
To include stop words in your search you can either use:
1. double quotes:
“The way to the school is hard when walking in the rain.”
2. + before each stop word:
+the +who
Avoid using double quotes with topic searches!