Homework 4

• Final homework
• Deadline: Sunday, April 20, 11:59 PM
• In this homework you will write a short essay on how
Google can handle new types of data on the Web, such as
video, social networks, data from RFID chips, data about
businesses on Google Local, and maps. More
information is given on the following slides.
• There is no limit on the number of words
• Organization and flow of ideas matter the most
• Google claims that its mission is to organize the world's
information and to make it universally accessible and
useful. Over the past few years, Google has had to deal
with increasing heterogeneity of data on the Web. The
Web now includes many data types apart from HTML text:
– Video
– Images, for example photographs, scanned pages of books,
and satellite images
– Data about people and relationships from social networks
(Facebook.com, del.icio.us)
– Data from Radio Frequency ID (RFID) chips (page 254 in the
textbook)
– Blogs (page 266 in the textbook)
– Data on the Semantic Web
• We saw how Google handles HTML pages with text and
links. In this homework, you should answer the following
three questions for three new types of data:
– What is a good search interface?
– How should crawling and indexing be done?
– How should the search queries be processed?
• The three new types of data you consider should include
(i) video and (ii) data from social networks. You can
choose the third data type. Here are some additional
issues that you can consider for extra credit:
– Ranking algorithms for search results (e.g., PageRank)
– How to make money, e.g., through advertising
– Fraud
– Performance metrics
– Metasearch