Database of Intentions
January 15, 2008
Discussion Leader: Professor Shivnath Babu
Scribe: Minerva Thai

READING OVERVIEW

Database of Intentions – introductory chapter of a book by John Battelle
  author lived in Silicon Valley and headed an internet-based company at a time when economic interest in the internet (advertising, e-commerce) was prevalent
  stock market crashed in 2001, which was also the year of the 9/11 incident; Battelle’s company crashed too
  discovered Google’s Zeitgeist and grew interested in its functions

*Discussion leader showed the website and asked for the definition of “zeitgeist”
  definition – the general intellectual, moral and cultural climate of an era (on site)
  leader clicked link to see 2007 Year-End graphs (matched social trends)
    graphs showed months of searches and frequency of searches
    ex. “Virginia Tech” term spiked in April due to the shooting
  leader asked “What is required to generate the graphs?”
    search terms are logged, along with the time and location of each search
    location determined by IP address
    “rising” terms are those searched more frequently than in the previous year
    “falling” terms are those searched less frequently than in the previous year
  Zeitgeist ex. “most searched-for candidates”: how is the graph made?
    input must be given to tell Google which candidates are running
  leader asked “What are the uses of Zeitgeist?”
    marketing – companies can determine what’s popular to base ads on
      ex. Britney Spears became more popular towards the end of 2007
      keep in mind that the data describes the past
    patterns – predictions may be made based on popularity patterns
      ex. similar to noting earthquakes and their patterns of occurrence
  additional comments on the Zeitgeist graphs are possible explanations of spikes

Battelle realized that Google/Zeitgeist was more technologically advanced than the Mac
  decided to write a book about search

*Discussion leader asked “What’s the difference between the terms Internet and Web?”
  Internet – connection of networks and machines globally
  Web – data available on all machines, named for the connections of links/pages

*Student asked “Is there unlimited space on the internet?”
  space is determined by the providing servers and the space on them
  space is measured in kilobytes, megabytes, gigabytes, terabytes, and petabytes
    2¹⁰ bytes (1,024), 2²⁰ bytes, 2³⁰ bytes, 2⁴⁰ bytes, and 2⁵⁰ bytes, respectively
  space can run out, but the cost of storage is falling, so it is easier to expand
  file compression (e.g., WinZip) also lets more data fit in the same space

Battelle shows the sides of Google that most people do not think about

DISCUSSION POINTS

Database of Intentions (DBI)
  “What is it? How does Google get it?”
    database – stored area of information; is the DBI the same as the Zeitgeist?
    refer to Ch. 1, page 6, paragraph 3, sentence 2 for definition
    leader asked “what does ‘every path taken as a result’ mean?”
      indicates the link that is clicked
  “Is it specific to Google?”
    No, any search site has a DBI
    ex. Amazon (saves your purchases, browsing history, etc.)
  “How can Google use it?”
    refer to points used in Ch. 1, page 2, paragraph 3, last few questions
    leader directed class to questions in Ch. 1, page 12, opening paragraph
      ex. Japanese teenagers – unanswerable; search logs do not identify age groups
        guesses can be made based on the search histories of teenagers
      ex. Pop star’s sales – determinable through e-commerce sites
        Amazon may track purchases and frequency
      ex. Suburban moms – unanswerable; search does not mean answer
        guesses can be made based on # of clicks and paths taken
    signing into an account may add to Google’s information on a user
    searches of other media, such as videos/images, contribute data as well
  “How can Google ‘abuse’ it?”
    ex. tracking terrorism by finding out which searches were done where
    ex. user searching AIDS may experience an increase in insurance premium
      Google might pass the search query on to the user’s health insurer
  “What prevents Google from ‘abusing’ it?”
    Electronic Communications Privacy Act protects email from other users’ eyes
    Google “machine” can still read your emails (most apparent through ads)
      ex. Leader’s friend’s wife was leaving for India and sent an email
        an ad about filing for divorce appeared on that same page
  “What technological/social changes/advances make it possible to store/query DBI?”
    cost of storage has decreased over the past 10 years
    there are more users on the internet (higher bandwidth); more homes w/ internet
    services such as YouTube and Facebook have increased internet popularity
    more users trust online companies and the internet

“Search as a problem is about 5% solved” – Udi Manber, CEO of A9.com
  leader asked “Are you happy with how Google is now? What’s wrong with it?”
    Relevance – harder due to spam sites & malicious internet servers
    Skill – ideas must be broken up into specific keywords to generate results
      ex. perfect article for a paper may not contain a searched term
    Not human – unable to process certain concepts/ideas which a person can
      ex. searching “jaguar” the animal results in Jaguar the car

Interesting terms
  Alexandra, Boswell (constant companion and observer) – page 1
  technology vs. media business – pages 3 & 4
  Clickstream (which user clicked on which particular link)
  Patriot Act – page 14
  Turing test (AI test; a passing machine would seem human) – page 16
  directory-based search – page 17
  second generation web applications – pages 7, 11, 14
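
Scribe's illustration (not from the reading or the discussion): the Zeitgeist points in the reading overview, that each search is logged with its term, time, and location, and that "rising" and "falling" terms are those searched more or less often than in the previous year, can be sketched in Python. This is a minimal sketch under stated assumptions, not Google's actual method: the log records, the "older term" entry, and the function names `yearly_counts`, `rising_and_falling`, and `monthly_counts` are all made up for the example.

```python
from collections import Counter
from datetime import datetime

# Made-up sample log (an assumption for illustration only).
# Each record: (search term, time of search, location inferred from IP address).
SEARCH_LOG = [
    ("virginia tech", datetime(2006, 9, 3), "US"),
    ("virginia tech", datetime(2007, 4, 16), "US"),
    ("virginia tech", datetime(2007, 4, 17), "US"),
    ("virginia tech", datetime(2007, 4, 17), "IN"),
    ("older term", datetime(2006, 5, 1), "US"),
    ("older term", datetime(2006, 6, 1), "US"),
    ("older term", datetime(2007, 2, 1), "US"),
]


def yearly_counts(log, year):
    """Count how many times each term was searched in the given year."""
    return Counter(term for term, when, _loc in log if when.year == year)


def rising_and_falling(log, year):
    """Split terms into those searched more or less often than the year before."""
    current = yearly_counts(log, year)
    previous = yearly_counts(log, year - 1)
    terms = set(current) | set(previous)
    # A Counter returns 0 for a missing term, so terms seen in only one year work.
    rising = sorted(t for t in terms if current[t] > previous[t])
    falling = sorted(t for t in terms if current[t] < previous[t])
    return rising, falling


def monthly_counts(log, term, year):
    """Return 12 counts (January to December) for one term in one year."""
    counts = [0] * 12
    for logged_term, when, _loc in log:
        if logged_term == term and when.year == year:
            counts[when.month - 1] += 1
    return counts


rising, falling = rising_and_falling(SEARCH_LOG, 2007)
print("rising:", rising)
print("falling:", falling)
print("monthly:", monthly_counts(SEARCH_LOG, "virginia tech", 2007))
```

With this sample data, "virginia tech" comes out as rising and its monthly counts peak in April, which mirrors the spike the class saw in the 2007 Year-End graph. The location field is carried along but unused here; grouping by it would give the per-region view.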