Real-time Search

Was There an Engine? Trends and Innovations in Search Engines
Real-time Search
Ariel Frank
Department of Computer Science
Bar-Ilan University
ariel@cs.biu.ac.il
Taly Sharon
Sharon-IT
taly@sharon-it.com
A. Frank & T. Sharon
Search (Engines) Technical Buzzwords
• Real-time (web, search, conversation)
• Multimedia RT (video, visual)
• Mobile (local, modes, diversity)
• Personalized (history/privacy)
• Social (networks, circle, trusted)
• Human (crafted results, Q&A)
• Semantic (NLP, context-aware, personal knowledge)
Contents
• What is “Real-time Search”?
• Real-time Search Engines
• Multimedia Real-time Search Engines
• (Google) Mobile Search
What is “Real-time” … :-?)
http://www.acm.org/crossroads/xrds2-3/gfx/realtime.gif
What is “Real-time Web” (timing)?
• Microblogging (several seconds):
– tweet, status update, post, talk, chat, ...
– e.g., Twitter, FriendFeed, Facebook, MySpace, …
• Social Media Sites (many seconds):
– realize, locate, upload, share, expose, comment, …
– e.g., Digg, Delicious, Flickr, YouTube, Yahoo Answers, …
• Blogosphere (several minutes):
– realize, check, write, post, break news, …
• News Media Sites (many minutes):
– gather, investigate, verify, write, edit, report, …
What is “Real-time Search” (timeliness)?
• Is the emphasis on the “search” in real-time search?
– “…finding the right answer to your question based on what’s
available right now, about the subject you care about right
now. Real-time search is finding the ‘Right Answer, Right
Now.’” (Kimbal Musk)
• Or is it on the “real-time” in real-time search?
– “...looking through material that literally is published in
real-time. In other words, material where there’s practically
no delay between composition and publishing.” (Danny
Sullivan)
– “I want to know what’s happening in the world right now,
I’m interested in gaining access to data that’s been created
(written, on video or in photographs) in the last few seconds,
not data that has simply been made available in the last few
seconds.” (Phil Bradley)
Really “Real-time Browse” (discovery)?!
• Real-time search/browse enables discovery of
what's “hot” right now – the trending topics.
• A key quality measure for real-time search
results is provision of a meaningful summary
of the results.
• Leads to use of hot topics as new search terms.
• Lets users swing through the real-time
information jungle, exploring related concepts
as they emerge.
http://inchoo.net/wp-content/uploads/2009/07/real-time-search.jpg
Unique “Real-time Search” Challenges
• Real-time “Information Explosion” –
– rapid pace and sheer amount of real-time data being
continuously produced and pushed.
– instantaneous filter/index/update of the “firehose feed”.
– need delicate balance of weighing relevance with immediacy
and popularity, to prevent “information overload”.
• Susceptible Authorities –
– dependence on the fact that sources who say that they’re at
the scene of a specific event or on top of things, actually are.
• Signal to Noise Ratio –
– Spam: once a subject becomes a hot topic/trend some
piggyback to get across their particular marketing messages.
– Duplication: much information is replicated, not really new.
– Mining: how to extract meaning of the “real conversation”.
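The balance of relevance, immediacy and popularity named above can be sketched as a single score. This is only an illustrative sketch, not any engine's actual formula: the `Item` fields, the 10-minute half-life and the log damping are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    text: str
    relevance: float    # hypothetical query-match score in [0, 1]
    age_seconds: float  # time since the item was published
    shares: int         # hypothetical popularity signal (e.g., retweets)


def realtime_score(item: Item, half_life: float = 600.0) -> float:
    """Blend relevance, immediacy and popularity into one score."""
    # Immediacy: halves every `half_life` seconds (assumed decay shape).
    immediacy = 0.5 ** (item.age_seconds / half_life)
    # Popularity: log-damped so one viral item cannot swamp the rest.
    popularity = math.log1p(item.shares)
    return item.relevance * immediacy * (1.0 + popularity)


def rank(items, limit=10):
    """Return only the top `limit` items, to curb information overload."""
    return sorted(items, key=realtime_score, reverse=True)[:limit]
```

With equal relevance and no shares, an item published just now outranks one that is ten minutes old, and `limit` caps how much is shown at once.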
Any Real-time Search Engines?!
Classification of RT SEs
• Microblog-search:
– Twitter, FriendFeed, Facebook, CrowdEye, Twazzup,
Topsy, Sency, TweetMeme, Twitscoop, Tinker, dailyRT,
Almost.at, TipTop, TweetGrid, Ellerdale
• Mega-search:
– Collecta, OneRiot, Scoopler, Twingly, Yauba,
SocialMention, Itpints, Thoora, IceRocket, Wowd
• Meta-search:
– Stinky Teddy, LeapFish, Scour, Faxo
• Part-of-Traditional-SEs:
– Google, Bing, Yahoo
Twitter Search Interface
Twitter Example
Twitter “Advanced Search”
Twitter Search Operators
Twitter Is the Future of News?!
• Recent analysis reveals that Twitter is remarkably
effective at spreading "important" information.
• A multi-part analysis of Twitter reveals that it's a
surprisingly interconnected network and an effective
way to filter quality information.
• “...No matter how many followers a user has, the tweet
is likely to reach [an audience of a certain size] once
the user's tweet starts spreading via retweets, that is,
the mechanism of retweet has given every user the
power to spread information broadly [...] Individual
users have the power to dictate which information is
important and should spread by the form of retweet
[...] In a way we are witnessing the emergence of
collective intelligence.” (Kwak et al.)
Collecta Search Interface
Collecta
• Results are continuously updated until the user
pauses the search; this allows for a constant
stream of results from all over the web.
• A preview pane allows the person searching to
look at the article, comment, or news report
without leaving the window.
• Collecta allows for more than one search at a
time in the same window.
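The "update until paused" behaviour can be sketched as a small stream object. This is a hypothetical model, not Collecta's implementation: the class name, and the assumption that results arriving during a pause are held back and shown on resume, are illustrative only.

```python
from collections import deque


class PausableResultStream:
    """Shows incoming results live; holds them back while paused."""

    def __init__(self):
        self.visible = []        # results currently shown, newest first
        self._pending = deque()  # results that arrived while paused
        self.paused = False

    def push(self, result):
        """Called whenever a new result arrives from any source."""
        if self.paused:
            self._pending.append(result)
        else:
            self.visible.insert(0, result)

    def pause(self):
        self.paused = True

    def resume(self):
        """Resume the live view and flush everything that was held back."""
        self.paused = False
        while self._pending:
            self.visible.insert(0, self._pending.popleft())
```

Running several searches in one window would then amount to keeping one such stream per query.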
Collecta “Search Options”
Collecta “Share Search”
Stinky Teddy Search Interface
Stinky Teddy MSE Sources
• Twitter (microblog)
• Collecta (blogs, comments, news)
• OneRiot (real-time pulse of the web)
• Bing (web, news, image)
• Yahoo! (web, news, image)
• VideoSurf (videos with thumbnail previews)
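A meta-search engine has to combine the ranked lists returned by its sources. One simple way to do this, shown here as a sketch only (how Stinky Teddy actually merges results is not stated), is round-robin interleaving with duplicate URLs dropped. The function name and the use of plain URL strings are assumptions.

```python
from itertools import zip_longest


def merge_results(*source_results):
    """Round-robin interleave ranked lists, dropping duplicate URLs."""
    seen = set()
    merged = []
    # zip_longest walks rank 1 of every source, then rank 2, and so on,
    # padding exhausted sources with None.
    for tier in zip_longest(*source_results):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged
```

For example, `merge_results(["a", "b"], ["b", "c", "d"])` yields `["a", "b", "c", "d"]`: each source contributes its best hit first, and the repeated `"b"` appears once.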
Stinky Teddy Example
Google Awareness :-)
Google’s Real-time Search
• Google’s real-time search results are an extension of
Google's Universal search results.
• Triggered algorithmically based on query volume.
• When a topic or keyword reaches a predefined
frequency, Google's algorithm automatically begins
using the real-time search page.
• Real-time search features are based on more than a
dozen new search technologies that enable monitoring
billions of documents and processing
hundreds of millions of real-time
changes each day.
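The idea of switching to real-time results once a query reaches a predefined frequency can be sketched with a sliding-window counter. This is a toy model under stated assumptions, not Google's algorithm: the class name, the threshold of 3 and the 60-second window are all hypothetical.

```python
from collections import defaultdict, deque


class QueryVolumeTrigger:
    """Flags a query as 'real-time' once it is seen often enough recently."""

    def __init__(self, threshold=3, window_seconds=60.0):
        self.threshold = threshold
        self.window = window_seconds
        self._times = defaultdict(deque)  # query -> recent timestamps

    def record(self, query, now):
        """Record one search at time `now`; return True if triggered."""
        times = self._times[query.lower()]
        times.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while times and now - times[0] > self.window:
            times.popleft()
        return len(times) >= self.threshold
```

Three searches for the same term within the window trigger the real-time view; once searches become sparse again, the trigger switches off by itself.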
Google’s Real-time Search Sources
• Shows news headlines, blog headlines, and social
media comments, along with Google's regular results.
• Blogs and news sites that publish regularly are
normally crawled many times per day due to the
frequency of new content added.
• Social media comments currently come from Twitter,
Facebook, FriendFeed, Google Buzz, Jaiku, MySpace,
TwitArmy, and Identi.ca via API data feeds.
• Timestamps are included so that users can choose the
timeliest results.
Google’s “News results”
Sample Signals for Google RT Search
• A sudden spike in the prevalence of a word or
combination of words in a message.
• A message on a commonly discussed topic that
includes unusual phrasing, shifts in language and other
deviations from predicted behavior.
• A Twitter user who attracts many followers, and
whose tweets are often "re-tweeted" by other users,
is assumed to have more authority.
• Facebook users gain authority as their friends multiply,
particularly if those friends also have many friends.
• A message sent from the location of a local event is
more valuable than one sent by someone hundreds of
miles away.
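The first signal, a sudden spike in the prevalence of a word, can be illustrated with a simple baseline comparison. This sketch assumes per-bucket term counts are already available and uses a z-score style test; the function name and the 3-sigma cutoff are assumptions, not Google's method.

```python
import statistics


def is_spike(history, current, min_sigma=3.0):
    """True if `current` count sits far above the recent baseline.

    `history` holds counts of a term in earlier, equal-length time buckets.
    """
    mean = statistics.mean(history)
    # Floor the spread at 1 so flat or near-flat histories neither divide
    # by zero nor overreact to tiny changes.
    spread = max(statistics.pstdev(history), 1.0)
    return (current - mean) / spread >= min_sigma
```

A term that usually appears about five times per bucket and suddenly appears forty times is flagged; a rise from five to six is not.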
Google’s “Latest results”
Google’s “Show options”
Google’s “Twitter Timeline”
Google’s Twitter Replay
Googlism :-)
The “Israeli journalist gag order”?!
Google Suggest!!
• Completes search terms as they are typed into
the search box.
• Generates suggestions for popular “automatic
search completions”.
• Notices increases in the volume of searches for
combinations of certain terms.
• Uses data about the overall popularity of
various searches to help rank the refinements
it offers.
• Can be used as a real-time search radar.
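Popularity-ranked completion can be sketched in a few lines. This is a minimal illustration, not how Google Suggest is built: the `Suggester` class and its linear scan over recorded queries are assumptions chosen for brevity.

```python
from collections import Counter


class Suggester:
    """Prefix completion ranked by how often each query was searched."""

    def __init__(self):
        self.counts = Counter()

    def record(self, query):
        """Count one search for `query` (case-insensitive)."""
        self.counts[query.lower()] += 1

    def suggest(self, prefix, limit=5):
        """Return up to `limit` recorded queries starting with `prefix`."""
        prefix = prefix.lower()
        matches = [(q, n) for q, n in self.counts.items() if q.startswith(prefix)]
        # Most popular first; ties broken alphabetically for stable output.
        matches.sort(key=lambda pair: (-pair[1], pair[0]))
        return [q for q, _ in matches[:limit]]
```

Because the counts change as people search, watching which completions rise for a given prefix is what makes such a tool usable as a "radar".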
Google gets to know “Anat Kamm”?
“Change in Search” for Anat Kamm
Real-time Search?!
7th Eye’s article on Anat Kamm
Relevant Yedioth Ahronoth column
Graffiti: “Google Anat Kamm”
Multimedia Real-time Search Engines
Classification of MM RT SEs
• Image/Photos-search:
– NachoFoto, yfrog, PicFog, IceRocket
• Music/Songs-search:
– Twittify (Twitter+Spotify), Qloud
• Video/Movies-search:
– TwitVid, yfrog, Twitmatic, IceRocket
• Visual-search:
– Spezify, Surchur, TwitterVision
NachoFoto Search Interface
NachoFoto Internals
• Has 4 different types of image search queries:
1. Static search terms (e.g., “apple”, “girl with balloon”)
2. Dynamic keywords whose meaning doesn’t change but whose images
undergo significant change with time (e.g., Olympics)
3. Dynamic keywords whose meaning and images change significantly
with time (e.g., “9/11”)
4. New keywords (e.g., “iPad”).
• Employs 4 main factors to influence its image search results:
1. Freshness factor (i.e., how new is the image?)
2. Image density of a webpage (websites that interlink their photos are
given higher ranking than those who don’t)
3. Inward links (websites that link internally, especially with relevant
anchor text, are given higher ranking)
4. Domain authority (domains with fresh, family-friendly images are
given higher priority).
NachoFoto Example
Twittify
TwitVid Search Interface
TwitVid
• TwitVid, which offers a popular tool for adding video
to Twitter, aggregates all TwitVid and YouTube video
links being shared on Twitter.
• Ranks videos based on relevancy to a search term as
well as popularity and "buzz" on Twitter.
• The newest, most relevant videos are ranked highest.
• Offers a unique video analytics tool that lets
members track their tweeted videos online by the day,
week, month or total number of times viewed.
TwitVid Example
Spezify Search Interface
Spezify Example
Spezify
• Presents results from a large number of
websites in different visual ways.
• Gives a good overview of a subject so as to be
able to find useful information.
• Mixes all media types and makes no distinction
between blogs, videos, microblogs and images.
• Everything communicates and helps build
the bigger picture.
Google Mobile Search
• Sensor-rich smartphones are redefining
what a “query” means.
• Search can be carried out by several new
modes: gesture, voice, location and sight.
• Use your mobile phone's location for “Search
with My Location” or to show “What's
Nearby” on Google Maps.
• Google Goggles is a visual search application
that lets you search for objects using images
rather than words, using your camera phone.
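A location-based query such as "What's Nearby" boils down to filtering places by distance from the phone. The sketch below uses the standard haversine formula; the function names, the `(name, lat, lon)` tuple layout and the 1 km default radius are assumptions for illustration and say nothing about how Google Maps implements the feature.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def whats_nearby(places, lat, lon, radius_km=1.0):
    """Return (name, distance_km) pairs within `radius_km`, nearest first.

    `places` is an iterable of (name, latitude, longitude) tuples.
    """
    hits = [(name, haversine_km(lat, lon, plat, plon)) for name, plat, plon in places]
    return sorted((h for h in hits if h[1] <= radius_km), key=lambda h: h[1])
```

Here the "query" is just the phone's coordinates: no words are typed at all.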
Google’s “Gesture Search”
Google’s “Voice Search”
Google’s “Search with My Location”
"What's Nearby" for Google Maps
Google Goggles Search Options
Google Goggles on Mobile
General References
• Danny Sullivan, What Is Real Time Search? Definitions &
Players, Search Engine Land, 9 July 2009,
http://searchengineland.com/what-is-real-time-search-definitions-players-22172
• Kimbal Musk, RE: What Is Real Time Search? Definitions &
Players, 9 July 2009, http://blog.oneriot.com/content/2009/07/re-whatis-real-time-search-definitions-players/
• Phil Bradley, Search Engines: Real-time Search, Ariadne
Issue 61, 30 October 2009,
http://www.ariadne.ac.uk/issue61/search-engines
• Ron Jones, Real-time Search 101, ClickZ, 19 April 2010,
http://www.clickz.com/3640099
• Mark Drummond, What's the Job of a Real-time Search
Engine?, Search Engine Watch, 29 April 2010,
http://searchenginewatch.com/3640197
• Nicholas Carr, Real-time Search, Technology Review,
May/June 2010, http://www.technologyreview.com/computing/25079/?a=f
Specific References
• Christopher Mims, Why Twitter Is the Future of News,
Technology Review, 30 April 2010,
http://www.technologyreview.com/blog/guest/25128/?nlid=2946&a=f
• Ido Kenan, Google Anat Kamm: Interesting Gag Order
Circumvention Methods, Ido Kenan’s blog, 9 April 2010,
http://www.room404.net/eng/?p=305
• Amit Singhal, Relevance Meets the Real-time Web, Google
Blog, 7 December 2009,
http://googleblog.blogspot.com/2009/12/relevance-meetsreal-time-web.html
• Vic Gundotra, Mobile Search for a New Era: Voice, Location
and Sight, Google Mobile Blog, 7 December 2009,
http://googlemobile.blogspot.com/2009/12/mobile-searchfor-new-era-voice.html
General Videos
• Real-time Search Engines Rush to Fill New Need, WPN Videos,
10 July 2009, http://videos.webpronews.com/2009/07/10/realtime-search-engines-rush-to-fill-new-need/
• John Edet, What's the Latest with the Real time Search
Engines, November 2009, http://vimeo.com/7602248
• Nicholas Carr, Real-time Search, Technology Review,
May/June 2010,
http://www.technologyreview.com/video/?vid=556
• The Search for the Real Conversation, WPN Videos,
20 October 2009,
http://videos.webpronews.com/2009/10/20/the-search-forthe-real-conversation/
• What Is The Real Time Web?, 9 February 2010,
http://www.youtube.com/watch?v=hOlcD506z_Y
Specific Videos
• Google Real-time Search, 7 December 2009,
http://www.youtube.com/watch?v=WRkYmx4A9Do
• Google Mobile App for iPhone with Voice Search,
10 November 2008,
http://www.youtube.com/watch?v=GQ3Glr5Ff28
• Introducing Search with My Location, 11 September 2008,
http://www.youtube.com/watch?v=KMT7Deky9iY
• Real Time Search Event, 8 December 2009,
http://www.youtube.com/watch?v=oXHHkROejik#t=21m05s
• Google Goggles, 6 December 2009,
http://www.youtube.com/watch?v=Hhgfz0zPmH4
• Google Goggles Demo Up Close, 7 December 2009,
http://www.youtube.com/watch?v=GgcE_EQRpdA
Any Real-time Questions :-?)