The Google Age
The Path: From Search Engines to Artificial Intelligence
Professor Dr. Eduard Heindl

Topics
- Google's history
- The Google technology
- Why is Google special?
- Why is Google so powerful?
- The future of Google

Eduard Heindl, Heindl Internet AG

Google Stone Age
- 1995: Sergey Brin (23) and Larry Page (24) meet
- 1996: the BackRub system starts at Stanford University
- $100,000 in support from Andy Bechtolsheim
- 7 September 1998: Google Inc. is founded
- 21 September 1999: the beta label comes off the website

Google's Philosophy
The perfect search engine is defined by co-founder Larry Page as something that "understands exactly what you mean and gives you back exactly what you want."

Life of a Query
Source: http://www.google.com/corporate/query.html

The PageRank
- Google sorts by PageRank
- The more links point to a document, the higher its rank
- But not all links are equal: the PageRank of the referring page counts too!
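The ranking rule on the PageRank slide can be illustrated with a short sketch. It is not Google's implementation: the four-page graph `example_links` is hypothetical, and the damping factor of 0.85 and the fixed iteration count are assumptions that do not appear on the slides. Each page repeatedly passes a share of its own rank to the pages it links to, so a link from a highly ranked page is worth more than a link from a weak one.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate PageRank by repeated redistribution of rank.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small base rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical example graph: C is linked from A, B and D.
example_links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
ranks = pagerank(example_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this example page C ranks highest because three pages link to it. Page A has only one incoming link, but that link comes from C, so A ends up ranked above B and D.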
- A recursive problem: "solving an equation of more than 500 million variables and 2 billion terms" (source: Google)

[Figure: link graph of the pages A to O]

The Link Matrix

    A B C D E F G H I K L M N O
A   0 0 0 0 0 0 0 1 0 0 0 0 0 0
B   0 0 0 0 0 0 0 0 0 0 0 0 0 0
C   0 0 0 0 1 0 0 0 0 0 0 0 0 0
D   0 0 0 0 0 0 0 0 0 0 0 0 0 0
E   0 0 0 0 0 1 0 0 0 0 0 0 0 0
F   0 0 0 0 1 0 0 0 0 0 0 0 0 1
G   0 0 0 0 0 0 0 0 0 0 0 0 0 0
H   0 2 0 0 0 0 0 0 0 1 0 0 0 0
I   0 0 0 1 0 0 0 1 0 1 0 0 0 0
K   0 0 0 0 0 0 0 0 0 0 0 1 0 0
L   0 0 0 0 0 0 0 0 1 0 0 0 0 0
M   0 0 0 0 0 0 0 1 0 0 1 0 0 0
N   0 0 0 0 0 0 1 0 0 0 0 1 0 0
O   0 0 0 0 0 0 0 0 0 0 0 0 1 0

What is Intelligence
[Figure: link graph of the pages A to O]
- Know the best source
- Google's technology uses the collective intelligence of the web to determine a page's importance [1]
- There is no human involvement or manipulation of results [1]
- "The ultimate search engine would be smart; it would understand everything in the world," says Page. [2]
[1] http://www.google.com/corporate/tech.html
[2] http://www.aaai.org/AITopics/assets/AIalerts/alert.12.18.02.html

Why is Google special

Domain names of the top 500
- Yahoo, Go, Goo, Gooooal, Cool, Room, Moon, Wanadoo, Football, Book, Cartoon
- OO: Object Oriented
- Goodday, Tool, School, Choose, Look
- Category: Gold, Gov, Pogo, Bingo, God
- Google
http://www.alexa.com/site/ds/top_500

The largest Engine
- The computational power of Google: a cluster with 100,000 nodes using 6 petabytes of hard disk storage (the largest computational capacity on earth) *
- Google stores "every" document of the web, more than 30 terabytes
- 4,285,199,774 documents. Why? 2^32 = 4,294,967,296 is the limit of a long integer! (Changed in 2005.)
- More than 1,000 queries per second
- For the first time in history, a company name is used as a global verb: "googeln" (to google)
Footnote: 1 petabyte = 2^50 bytes = 1024 terabytes
* John Markoff, New York Times, 13 April 2003

Research
- Incredibly fast growth of R&D expenses!
- More than 100 Ph.D.s work at Google, "industry's most unorthodox portfolio of human capital" *
* NYT, June 6, 2004

Research: Zeitgeist
- Google knows the trends
- Displayed by country
- Displayed by topic
- Statistical value of 200,000,000 queries a day
www.google.com/press/zeitgeist.html

Google News
- A robot reads the newspaper and writes
- Google News sorts the top news within different areas using 2,400 sources
- Search within the news is available
- Best online journalism, just behind the Washington Post (EPpy Award)

The Robot is not perfect
- Car driving, a car race and an injury

A small Difference
- Technik (technology) and Techno

Google Ads
- Advertisement made simple
- Everybody can place an advertisement related to any word
- Pay per click
- Words with a low click rate are stopped
- Ads that are clicked more often are ranked higher

Good Ads, bad Ads
- This ad was clicked by 1.2% of the users!
- And that one received 50% more (1.9%)!
- Ad optimization for less money than a pizza

Another example
- Which ad brought more visitors?
- 0.7% versus 2.8%
- The data are highly significant, because 3,000 clicks were counted

Google AdSense
- The easy way to make money
- Pay per click
- Return of up to XX € [1] per thousand visitors
- The ads presented depend on the content
- Significantly higher click-through rate than classic banner ads
[1] Google does not allow content partners to disclose their income

"Und so lautet der Beschluß: Daß die Maschine etwas lernen muß"
And the decision is: the machine has to learn something

Learning
- Learning means: give results and get better
[Diagram: WWW, Search Engine, User]

Search Engine Version 0.1
- Keywords
- Before 1995
- FIZ Karlsruhe
- Patent search
[Diagram: Search Engine, Content, Librarian, User]

Search Engine, 1st Generation
- Full text search
- ~1995
- Lycos, AltaVista
[Diagram: WWW, Search Engine, User]

Search Engine, 2nd Generation
- New algorithm
- Link structure
- Text cluster
[Diagram: WWW, Search Engine, Preprocessing, User]

Search Engine, 3rd Generation
- Text understanding
- Feedback
- Neural algorithm
[Diagram: Search Engine, WWW, read, IQ, understand, User]

Forces
[Diagram labels: the search engine gives a link; the user returns, uses the link, is happy; SEO (Search Engine Optimizer); improve algorithm; use link; new search engine; unfaithful web page; receives advertisement money]

Optimized Cycle
[Diagram labels: user, not satisfied; search engine; smart user; improve algorithm; search system; receives advertisement money; content distributor; queries]

Is the whole world represented within the WWW?
- All documents in the WWW are a human view of the world
- A lot of documents are incomplete or only copies
- It is hard to validate the content by context
- But there is no other huge digital source of knowledge

Information within the Internet
- Multilingual content
  - Few languages are relevant, 50% is English
- Data are highly redundant
  - An advantage if the data are inconsistent
- Multimedia data (images, movies) need complex analysis
- Image-text relation
  - Allows the system to learn from images

How to read
- Simple reading
- Problem: many pages use a complex, incoherent structure (tables!)
- Problem: fast-changing content
- The database should use the link structure of the WWW
[Diagram: read]

Understand the World?
- Does the system need background knowledge?
- Can the system learn from user habits?
- Is it necessary to understand the data structure?
- Which algorithm is efficient for learning?
[Diagram: understand]

What is Intelligence
- Knowledge for successful actions
- Knowledge processing
- Best knowledge usage
- Knowledge expansion by additional information
- New knowledge production
[Diagram: IQ]

Future
- Hard to predict

The Google Wall
[Diagram: Supplier, Advertisement tunnel, Media, AdWords, Google AdSense, Customer]

Efficient Markets
- Interface to information
- Optimized contribution
[Diagram: Supplier, Customer]

Strategic Risk
- Darkness within the Internet! What happens if:
- Google stops?
  - Hacker attack
  - Physical attack
- Some countries receive manipulated results
  - Censorship
  - Results interchanged
- Change of ownership (e.g. Microsoft)

The Stone Age
- An age is named when there is a special relationship between man and a material
- Information age: there is a special system between man and information. What should we call this epoch?

Google goes Public
- 2,718,281,828 shares, price $0.01
- DON'T BE EVIL
  "Don't be evil. We believe strongly that in the long term, we will be better served, as shareholders and in all other ways, by a company that does good things for the world even if we forgo some short term gains. This is an important aspect of our culture and is broadly shared within the company."
- Risks Related to Our Business and Industry
  "We face significant competition from Microsoft and Yahoo."

How much is Google worth
- Approximation (short term)
  - 200,000,000 search results a day
  - Value per result: 5 cents
  - Annual return: $3.6 billion
- Maximum value (long term)
  - 500 million Google users
  - Each saves 5 minutes a day, worth $1.00
  - Annual return: $182.5 billion
- Current stock market value: about $80 billion

Additional Reading
- This lecture: heindl.de/google
- google.com/about.html
- google.com/ads
- google.com/adsense
- labs.google.com
- labs.google.com/papers.html
- google.indicateur.com
- searchenginewatch.com

The End
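
The two estimates on the "How much is Google worth" slide can be checked with a few lines of arithmetic. This is only a sketch: the function name `annual_value_dollars` and the assumption of 365 days per year are introduced here, while the input figures are the ones on the slide.

```python
DAYS_PER_YEAR = 365  # assumption: value accrues every day of the year


def annual_value_dollars(units_per_day, cents_per_unit):
    """Yearly value in dollars of a daily quantity priced in cents."""
    return units_per_day * cents_per_unit * DAYS_PER_YEAR / 100


# Short term: 200,000,000 search results a day at 5 cents each.
short_term = annual_value_dollars(200_000_000, 5)

# Long term: 500 million users, each saving time worth $1.00 a day.
long_term = annual_value_dollars(500_000_000, 100)

print(f"short term: ${short_term / 1e9:.2f} billion per year")
print(f"long term:  ${long_term / 1e9:.2f} billion per year")
```

The short-term figure comes out at $3.65 billion, which the slide gives as $3.6 billion, and the long-term figure matches the slide's $182.5 billion. Valuing 5 minutes at $1.00 corresponds to an hourly rate of $12.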