Visualize Textual Travelogue with Location-Relevant Images

Xin Lu¹, Yanwei Pang¹*, Qiang Hao¹, Lei Zhang²
¹ Tianjin University
² Microsoft Research Asia
* Corresponding author
November 3, 2009

Outline
• Motivation & Challenge
• Our Solution
  – Framework Overview
  – Data Source
  – Demo
• Conclusions and Future Work

7/27/2016 2

What Is a Travelogue
• What is a travelogue?
  – A text or article that records one's travel experience
• Where can you find travelogues?
  – Blogs, forums, Web 2.0 communities, etc.
• How do travelogues differ from other text?
  – User-generated content (UGC) rather than expert-written articles
  – Large in volume and growing fast
  – Informative to other tourists

Why We Visualize the Travelogue
• Travelogues are a huge knowledge resource
  – People learn from others' experiences by reading travelogues online
• Textual travelogues
  – Long and noisy
  – May be written in a foreign language
• Travelogue visualization
  – Highlight the useful information
  – Visualize the useful information

Why Travelogue Visualization Is Difficult
• Travelogue de-noising
  – Location-oriented
  – Context words
• Image retrieval and ranking
  – Semantic gap between text and images

Location Extraction → Image Retrieval → Similarity Measure → Image Ranking
• Log-based model
• Log-tag model
• Tag-based model

Similarity Measures
• Log model & log-tag model
  – "Log" refers to the travelogue
  – Context words
    Great Wall: ancient times; stable; impregnable pass; No. 1 in the world
    Sanya Bay: sea view; beach; seafood

Similarity Measures
• Tag model
  – Tags are also UGC
  – De-noise tags
  (Figure: Topic Space)

Data Source
• 100K travelogues (collected automatically)
  – All written in Chinese
  – Downloaded from Ctrip
  – GPS data and English names for the 10K most popular locations
• 2500K images (collected automatically)
  – Flickr: 950K (many tags)
  – Picasa: 300K (few tags)
  – *Google: 1200K (including a snippet for each image)

We add 1200K images (including a snippet for each image) from Google

Location Extraction → Location Context Words → Image Retrieval → Image Ranking
• Flickr images are retrieved by location
• Google images are retrieved by "context words + location", which makes the candidate image sets more relevant to the travelogue

http://202.113.2.198

Location Extraction → Location Aspects Mining → Image Retrieval → Image Ranking
Images are ranked on three criteria:
• Image quality
• Log-tag similarity
• Image diversity

Demo

Conclusions and Future Work
• Travelogue visualization benefits ordinary readers
  – Travelogues become easier to understand
  – People all over the world can draw on others' experience to plan trips
• Future work
  – Further narrow the semantic gap using visual features
  – Improve evaluation approaches

LBSN 2009, Nov. 3, 2009, Seattle, WA, USA

Thank you!
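
Supplementary sketch: image ranking

The image ranking slide names three criteria (image quality, log-tag similarity, image diversity) but does not say how they are combined. The Python sketch below is one possible reading, not the authors' method. It assumes that log-tag similarity is a cosine similarity between bag-of-words vectors of the travelogue's context words and an image's tags, that each image already carries a quality score in [0, 1], and that diversity is enforced by a greedy re-ranking in the style of maximal marginal relevance. The names `Image`, `cosine`, `rank_images` and all weights are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from math import sqrt


@dataclass
class Image:
    name: str
    tags: list       # user tags (or snippet words) attached to the image
    quality: float   # assumed precomputed quality score in [0, 1]


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_images(context_words, images, k=3, w_quality=0.3, w_sim=0.7, w_div=0.5):
    """Greedily pick k images, trading relevance against redundancy.

    The weights are illustrative assumptions, not values from the slides.
    """
    log_vec = Counter(context_words)
    remaining = list(images)
    ranked = []
    while remaining and len(ranked) < k:
        def score(img):
            img_vec = Counter(img.tags)
            # Relevance: image quality plus log-tag similarity.
            relevance = w_quality * img.quality + w_sim * cosine(log_vec, img_vec)
            # Redundancy: closest match among the images already selected.
            redundancy = max(
                (cosine(img_vec, Counter(s.tags)) for s in ranked), default=0.0
            )
            return relevance - w_div * redundancy

        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked


# Hypothetical data, loosely based on the Sanya Bay context words.
sanya_context = ["beach", "sea", "seafood"]
demo_images = [
    Image("beach_1", ["beach", "sea", "sand"], 0.9),
    Image("beach_2", ["beach", "sea", "sand"], 0.8),
    Image("seafood", ["seafood", "market"], 0.6),
    Image("hotel", ["hotel", "lobby"], 0.5),
]
top_names = [img.name for img in rank_images(sanya_context, demo_images)]
print(top_names)  # -> ['beach_1', 'seafood', 'beach_2']
```

In this toy run the seafood image moves ahead of the second beach photo because the diversity penalty discounts near-duplicates of images already chosen. Setting `w_div=0` would reduce the procedure to a plain sort by relevance.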