The Google Age
The Path: From Search Engines to
Artificial Intelligence
Professor Dr. Eduard Heindl
Topics
 Google's history
 The Google technology
 Why is Google special?
 Why is Google so powerful?
 The future of Google
Eduard Heindl, Heindl Internet AG
Google Stone Age
 1995: Sergey Brin (23) and Larry Page (24) meet
 1996: the BackRub system starts at Stanford University
 $100,000 in support from Andy Bechtolsheim
 September 7, 1998: Google Inc. is founded
 September 21, 1999: the beta label comes off the website
Google's Philosophy
The perfect search engine,
defined by co-founder Larry Page
as something that:
"understands exactly what you mean
and gives you back exactly what you
want."
Life of a Query
Source: http://www.google.com/corporate/query.html
The PageRank
 Google sorts results by PageRank
 The more links point to a document, the higher its rank
 But not all links are equal: the PageRank of the referring page counts too!
 A recursive problem
 "Solving an equation of more than 500 million variables and 2 billion terms" (source: Google)
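The recursion described in the bullets above can be written down as a single equation. The slide gives no formula, so the following is the commonly cited form of PageRank rather than a quotation from the lecture; the symbols are assumptions introduced here: N is the number of pages, M(p_i) the set of pages linking to p_i, L(p_j) the number of outgoing links on p_j, and d a damping factor (0.85 is the value usually quoted).

```latex
PR(p_i) = \frac{1 - d}{N} + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)}
```

Each page's rank appears on both sides of the system (once per page), which is why the slide calls the problem recursive: there is one such equation, and one unknown, per document.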
[Figure: link graph with the nodes A to O, connected by links]
The Link Matrix
     A  B  C  D  E  F  G  H  I  K  L  M  N  O
A    0  0  0  0  0  0  0  1  0  0  0  0  0  0
B    0  0  0  0  0  0  0  0  0  0  0  0  0  0
C    0  0  0  0  1  0  0  0  0  0  0  0  0  0
D    0  0  0  0  0  0  0  0  0  0  0  0  0  0
E    0  0  0  0  0  1  0  0  0  0  0  0  0  0
F    0  0  0  0  1  0  0  0  0  0  0  0  0  1
G    0  0  0  0  0  0  0  0  0  0  0  0  0  0
H    0  2  0  0  0  0  0  0  0  1  0  0  0  0
I    0  0  0  1  0  0  0  1  0  1  0  0  0  0
K    0  0  0  0  0  0  0  0  0  0  0  1  0  0
L    0  0  0  0  0  0  0  0  1  0  0  0  0  0
M    0  0  0  0  0  0  0  1  0  0  1  0  0  0
N    0  0  0  0  0  0  1  0  0  0  0  1  0  0
O    0  0  0  0  0  0  0  0  0  0  0  0  1  0
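As an illustration of how a ranking could be computed from this link matrix, here is a minimal power-iteration sketch. It is not Google's implementation. Several things are assumptions made for the example: that a row is the linking page and a column the page linked to (the slide does not say), that the entry 2 counts as a double-weight link, that pages without outgoing links spread their rank evenly, and the damping factor of 0.85.

```python
# Non-zero entries of the link matrix on the slide (there is no node J).
NODES = list("ABCDEFGHIKLMNO")

# ASSUMPTION: row = linking page, column = page linked to.
LINKS = {
    "A": {"H": 1},
    "C": {"E": 1},
    "E": {"F": 1},
    "F": {"E": 1, "O": 1},
    "H": {"B": 2, "K": 1},
    "I": {"D": 1, "H": 1, "K": 1},
    "K": {"M": 1},
    "L": {"I": 1},
    "M": {"H": 1, "L": 1},
    "N": {"G": 1, "M": 1},
    "O": {"N": 1},
}


def pagerank(nodes, links, damping=0.85, tol=1e-10, max_iter=1000):
    """Power iteration; returns a dict node -> rank whose values sum to 1."""
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Pages without outgoing links spread their rank over all pages.
        dangling = sum(rank[p] for p in nodes if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in nodes}
        for source, targets in links.items():
            total = sum(targets.values())
            for target, weight in targets.items():
                new_rank[target] += damping * rank[source] * weight / total
        change = sum(abs(new_rank[p] - rank[p]) for p in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


if __name__ == "__main__":
    ranks = pagerank(NODES, LINKS)
    for node, value in sorted(ranks.items(), key=lambda item: -item[1]):
        print(f"{node}: {value:.4f}")
```

Under the stated reading of the matrix, pages A and C receive no links at all and therefore end up with the same minimal rank, while pages such as H, which is linked from several others, score higher.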
What is Intelligence
 Know the best source
 Google's technology uses the collective intelligence of the web to determine a page's importance [1]
 There is no human involvement or manipulation of results [1]
 "The ultimate search engine would be smart; it would understand everything in the world," says Page. [2]
[1] http://www.google.com/corporate/tech.html
[2] http://www.aaai.org/AITopics/assets/AIalerts/alert.12.18.02.html
Why is Google special
Domain names of the top 500
 Yahoo
 Go
 Goo
 Gooooal
 Cool
 Room
 Moon
 Wanadoo
 Football
 Book
 Cartoon
 OO (object-oriented)
 Goodday tool
 School
 Choose
 Look
 Category
 Gold
 Gov
 Pogo
 Bingo
 God
 Google
http://www.alexa.com/site/ds/top_500
The Largest Engine
 The computational power of Google: a cluster with 100,000 nodes and 6 petabytes of hard disk storage (the largest computational capacity on earth)*
 Google stores "every" document of the web, more than 30 terabytes
 4,285,199,774 documents. Why?
 2^32 = 4,294,967,296 is the limit of a 32-bit long integer! (changed in 2005)
 More than 1,000 queries per second
 For the first time in history, a company name is used as a global verb: "googeln" (German for "to google")
Footnote: 1 petabyte = 2^50 bytes = 1024 terabytes
* John Markoff in The New York Times, April 13, 2003
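The two numerical claims on this slide are easy to check. The short sketch below does the arithmetic; the constant and function names are chosen for this example only, and "32-bit" is read as an unsigned 32-bit counter.

```python
DOCS_REPORTED = 4_285_199_774  # document count quoted on the slide

UINT32_VALUES = 2 ** 32  # distinct values of an unsigned 32-bit integer
TERABYTE = 2 ** 40       # bytes (binary definition, as in the footnote)
PETABYTE = 2 ** 50       # bytes


def fits_in_uint32(n: int) -> bool:
    """True if n can be stored in an unsigned 32-bit integer."""
    return 0 <= n < UINT32_VALUES


print(UINT32_VALUES)                  # 4294967296
print(fits_in_uint32(DOCS_REPORTED))  # True
print(UINT32_VALUES - DOCS_REPORTED)  # 9767522 identifiers left
print(PETABYTE // TERABYTE)           # 1024
```

The reported document count sits less than ten million below the 32-bit ceiling, which is what makes the slide's explanation plausible.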
Research
 Incredibly fast growth of R&D expenses!
 More than 100 Ph.D.s work at Google, the "industry's most unorthodox portfolio of human capital"*
*NYT, June 6, 2004
Research
Zeitgeist
 Google knows the trends
 Displayed by country
 Displayed by topic
 Statistical value of 200,000,000 queries a day
www.google.com/press/zeitgeist.html
Google News
 A robot reads the newspaper and writes
 Google News sorts the top news from 2,400 sources into different areas
 Search within the news is available
 Second best, just behind the Washington Post, in online journalism (EPpy Award)
The Robot Is Not Perfect
 Car driving, car racing and an injury
A Small Difference
 "Technik" (German for technology) and "Techno"
Google Ads
 Advertising made simple
 Anybody can place an ad related to any word
 Pay per click
 Words with a low click rate are stopped
 Ads that are clicked more often are ranked higher
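The last two rules on this slide can be sketched in a few lines. This is only an illustration of the idea, not Google's actual ad ranking: the `Ad` class, the `rank_ads` function and the cut-off `MIN_CTR` are hypothetical names and values, since the slide gives neither a threshold nor a formula.

```python
from dataclasses import dataclass

MIN_CTR = 0.005  # ASSUMPTION: hypothetical cut-off; the slide names no threshold


@dataclass
class Ad:
    text: str
    impressions: int
    clicks: int

    @property
    def ctr(self) -> float:
        """Click rate: clicks divided by the number of times the ad was shown."""
        return self.clicks / self.impressions if self.impressions else 0.0


def rank_ads(ads, min_ctr=MIN_CTR):
    """Stop ads below min_ctr and order the rest by click rate, best first."""
    active = [ad for ad in ads if ad.ctr >= min_ctr]
    return sorted(active, key=lambda ad: ad.ctr, reverse=True)


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    ads = [Ad("ad A", 1000, 12), Ad("ad B", 1000, 19), Ad("ad C", 1000, 2)]
    for ad in rank_ads(ads):
        print(f"{ad.text}: {ad.ctr:.1%}")
```

With the made-up numbers, "ad C" falls below the assumed cut-off and is stopped, and "ad B" is listed ahead of "ad A".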
Good Ads, Bad Ads
 This ad was clicked by 1.2% of the users!
 And that one received more than 50% more clicks (1.9%)!
 Ad optimization for less money than a pizza
Another Example
 Which ad brought in more visitors?
 0.7%
 2.8%
 The data are highly significant because 3,000 clicks were counted
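The significance claim can be checked with a standard two-proportion z-test. The slide gives only the two click rates and the figure of 3,000 clicks, so the sketch below assumes that the 3,000 clicks are the total for both ads and that both ads were shown equally often; the impression counts derived from that are therefore assumptions, not data from the lecture.

```python
import math


def two_proportion_z(clicks_a, shown_a, clicks_b, shown_b):
    """Return (z, two-sided p-value) for the difference of two click rates."""
    p_a = clicks_a / shown_a
    p_b = clicks_b / shown_b
    pooled = (clicks_a + clicks_b) / (shown_a + shown_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# ASSUMPTION: 3,000 clicks in total, both ads shown equally often.
TOTAL_CLICKS = 3000
CTR_A, CTR_B = 0.007, 0.028

shown = round(TOTAL_CLICKS / (CTR_A + CTR_B))  # impressions per ad
clicks_a = round(CTR_A * shown)
clicks_b = round(CTR_B * shown)

z, p_value = two_proportion_z(clicks_a, shown, clicks_b, shown)
print(f"impressions per ad: {shown}")
print(f"clicks: {clicks_a} vs {clicks_b}")
print(f"z = {z:.1f}, p = {p_value:.3g}")
```

Under these assumptions the z value is far above the usual thresholds (1.96 for the 5% level), which is consistent with the slide's statement that the difference is highly significant.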
Google AdSense
 The easy way to make money
 Pay per click
 Return of up to XX € [1] per thousand visitors
 The ads shown depend on the page content
 Significantly higher click-through rate than classic banner ads
[1] Google does not allow content partners to disclose their income
And so the resolution reads:
the machine has to learn something
Learning
 Learning means: give results and get
better
[Diagram labels: WWW; Search Engine; User]
Search Engine Version 0.1
 Keywords (before 1995)
 FIZ Karlsruhe
 Patent search
[Diagram labels: Search Engine; Content; Librarian; User]
Search Engine, 1st Generation
 Full-text search (~1995)
 Lycos
 AltaVista
[Diagram labels: WWW; Search Engine; User]
Search Engine, 2nd Generation
 New algorithm
 Link structure
 Text clusters
[Diagram labels: WWW; Search Engine; Preprocessing; User]
Search Engine, 3rd Generation
 Text understanding
 Feedback
 Neural algorithm
[Diagram labels: Search Engine; WWW; read; IQ; understand; User]
Forces
[Diagram labels: Search Engine; gives link; User; returns; use link; happy; SEO (Search Engine Optimizer); improve algorithm; use link; New Search Engine; Web page; unfaithful; receives advertisement money; User]
[Diagram labels: Not satisfied; Search Engine; Optimized Cycle; Smart user; improve algorithm; Search System; receive advertisement money; Content distributor]
Queries
 Is the whole world represented within the WWW?
 All documents are a human view of the world
 Many documents are incomplete or mere copies
 It is hard to validate the content by context
 But there is no other huge digital source of knowledge
Information within the Internet
 Multilingual content
 Few languages are relevant; 50% is English
 Data are highly redundant
 Advantage if inconsistent
 Multimedia data (Images, movies)
 needs complex analysis
 Image-text relation
 Allows the system to learn from images
How to Read
 Simple reading
 Problem: many pages use a complex, incoherent structure (tables!)
 Problem: fast-changing content
 The database should use the link structure of the WWW
Understand the World?
 Does the system need background knowledge?
 Can the system learn from user habits?
 Is it necessary to understand the data structure?
 Which algorithm is efficient for learning?
What is Intelligence
 Knowledge for successful actions
 Knowledge processing
 Best use of knowledge
 Knowledge expanded by additional information
 Production of new knowledge
Future
Hard to predict
The Google Wall
[Diagram labels: Supplier; Advertisement; tunnel; Media; AdWords; Google; AdSense; Customer]
Efficient Markets
 Interface to Information
Supplier
Customer
Optimized contribution
Strategic Risk
Darkness within the Internet!
 What happens if:
 Google stops?
 Hacker attack
 Physical attack
 Some countries receive manipulated results
 Censorship
 Results interchanged
 Change of ownership (e.g. Microsoft)
The Stone Age
 An age: when there is a special relationship between man and a material
Information age: there is a special system between man and information. What should we call this epoch?
Google Goes Public
2,718,281,828 shares, price $0.01
DON’T BE EVIL
Don’t be evil. We believe strongly that in the long term, we will be
better served—as shareholders and in all other ways—by a
company that does good things for the world even if we forgo
some short term gains. This is an important aspect of our culture
and is broadly shared within the company.
Risks Related to Our Business and Industry
We face significant competition from Microsoft and Yahoo.
How Much Is Google Worth?
 Approximation (short term)
 200,000,000 search results a day
 Value per result: 5 cents
 Annual return: $3.65 billion
 Maximum value (long term)
 500 million Google users
 Saving 5 minutes a day = $1.00
 Annual return: $182.5 billion
+++ current value +++ about $80 billion +++ stock
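The two estimates on this slide are simple products, and the sketch below reproduces them. It assumes a 365-day year and that the "$1.00" applies per user per day; the function name `annual_value` is introduced here for illustration.

```python
DAYS_PER_YEAR = 365


def annual_value(units_per_day: float, value_per_unit: float) -> float:
    """Yearly value in dollars of a daily quantity with a fixed unit value."""
    return units_per_day * value_per_unit * DAYS_PER_YEAR


# Short term: 200,000,000 search results a day at 5 cents each.
short_term = annual_value(200_000_000, 0.05)

# Long term: 500 million users, each saving 5 minutes (valued at $1.00) a day.
long_term = annual_value(500_000_000, 1.00)

print(f"short term: ${short_term / 1e9:.2f} billion per year")  # 3.65
print(f"long term:  ${long_term / 1e9:.1f} billion per year")   # 182.5
```

Valuing 5 saved minutes at $1.00 corresponds to an hourly rate of $12, which is the implicit assumption behind the long-term figure.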
Additional Reading
 This lecture: heindl.de/google
 google.com/about.html
 google.com/ads
 google.com/adsense
 labs.google.com
 labs.google.com/papers.html
 google.indicateur.com
 searchenginewatch.com
The End