Visualize Textual Travelogue with Location-Relevant Images

Xin Lu 1, Yanwei Pang 1*, Qiang Hao 1, Lei Zhang 2
1 Tianjin University
* Corresponding Author
2 Microsoft Research Asia
November 3, 2009
Outline
• Motivation & Challenge
• Our Solution
– Framework Overview
– Data Source
– Demo
• Conclusions and Future Work
7/27/2016
What is a travelogue
• What is a travelogue?
– Text/article that records one's travel experience
• Where can we find travelogues?
– Blogs, forums, Web 2.0 communities, etc.
• How do travelogues differ from other text?
– User-generated content (UGC), rather than experts' articles
– Large in amount and booming
– Informative to other tourists
Why We Visualize the Travelogue
• Travelogues are huge knowledge resources
– People learn from others’ experience by reading travelogues online
• Textual Travelogues
– Long and noisy
– May be written in foreign languages
• Travelogue Visualizing
– Highlight the useful information
– Visualize the useful information
Why Travelogue Visualization Is Difficult
• Travelogue De-noising
– Location-oriented
– Context words
• Image Retrieval and Ranking
– Semantic Gap between texts and images
Framework Overview
Location Extraction → Image Retrieval → Similarity Measure → Image Ranking
• Log-based model
• Log-tag model
• Tag-based model
Similarity Measures
• Log Model & Log-Tag Model
– "Log" refers to the travelogue
– Context words, for example:
Great Wall: ancient times; stable; impregnable pass; No. 1 in the world
Sanya Bay: sea sight; beach; sea food
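The slides do not say how log-tag similarity is computed. As a minimal sketch, assuming a plain bag-of-words cosine similarity between a location's context words and an image's tags (the function name and the image tags below are hypothetical, not taken from the talk):

```python
import math
from collections import Counter


def cosine_similarity(words_a, words_b):
    """Cosine similarity between two bags of words (term-frequency vectors)."""
    a, b = Counter(words_a), Counter(words_b)
    # Dot product over the words that appear in both bags.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty bag is treated as unrelated to everything
    return dot / (norm_a * norm_b)


# Context words for Sanya Bay come from the slide; the tags are invented.
context_words = ["sea sight", "beach", "sea food"]
beach_photo_tags = ["beach", "sunset", "sea food"]
wall_photo_tags = ["wall", "mountain"]

print(cosine_similarity(context_words, beach_photo_tags))
print(cosine_similarity(context_words, wall_photo_tags))
```

An image whose tags share words with the travelogue's context scores higher than one that only matches the location name, which is the intuition behind using context words at all.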
Similarity Measures
• Tag-Model
– Tags are also UGC
– De-noise tags
Topic Space
Data Source
• 100K travelogues (collected automatically)
– All written in Chinese
– Downloaded from Ctrip
– GPS data and English names for the 10K most popular locations
• 2500K images (collected automatically)
– Flickr: 950K (plenty of tags)
– Picasa: 300K (few tags)
– *Google: 1200K (includes snippets of the image)
We add 1200K images (including snippets of the image) from Google
Location Extraction → Location Context Words → Image Retrieval → Image Ranking
• Flickr images are retrieved based on location
• Google images are retrieved by “context words + location”, which makes the candidate image sets more relevant to the travelogue
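The two retrieval strategies can be sketched as query construction. This is only an illustration: the function name, the `use_context` flag, and the word order inside each query are assumptions, and no real search API is called.

```python
def build_image_queries(location, context_words, use_context=True):
    """Build text queries for image search.

    With use_context=False the query is the location alone (the Flickr case
    on the slide). With use_context=True each context word is paired with
    the location (the "context words + location" case).
    """
    if not use_context or not context_words:
        return [location]
    return [f"{word} {location}" for word in context_words]


# Location and context words from the slides.
print(build_image_queries("Sanya Bay", ["sea sight", "beach", "sea food"]))
print(build_image_queries("Sanya Bay", ["sea sight", "beach"], use_context=False))
```

Issuing one query per context word keeps each candidate set tied to a specific aspect of the location mentioned in the travelogue.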
http://202.113.2.198
Location Extraction → Location Aspects Mining → Image Retrieval → Image Ranking
Images are ranked based on the following three criteria:
• Image quality
• Log-tag similarity
• Image diversity
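The slides name the three ranking criteria but not how they are combined. One plausible reading is a greedy re-ranking in which each pick trades off quality and similarity against redundancy with the images already chosen. The sketch below assumes that scheme; the weights, the Jaccard-based diversity term, and the candidate data are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two tag sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_images(candidates, k, w_quality=0.3, w_similarity=0.5, w_diversity=0.2):
    """Greedily pick up to k images by a weighted sum of the three criteria."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(img):
            # Redundancy = highest tag overlap with any already-selected image.
            redundancy = max(
                (jaccard(img["tags"], s["tags"]) for s in selected), default=0.0
            )
            return (w_quality * img["quality"]
                    + w_similarity * img["similarity"]
                    + w_diversity * (1.0 - redundancy))
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [img["id"] for img in selected]


# Invented candidates: "b" is a near duplicate of "a".
candidates = [
    {"id": "a", "quality": 0.9, "similarity": 0.80, "tags": {"beach", "sea"}},
    {"id": "b", "quality": 0.9, "similarity": 0.78, "tags": {"beach", "sea"}},
    {"id": "c", "quality": 0.7, "similarity": 0.70, "tags": {"sea food", "market"}},
]
print(rank_images(candidates, k=2))
```

With these numbers the near duplicate "b" is pushed behind "c" once "a" has been selected, even though "b" scores higher on quality and similarity alone.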
Demo
Conclusions and Future Work
• Travelogue visualization benefits ordinary readers
– Travelogues become easier to understand
– People all over the world can draw on others’ experience to plan trips
• Future work
– Further narrow the semantic gap using visual features
– Improve evaluation approaches
LBSN 2009
Nov.3, 2009, Seattle, WA, USA
Thank you!