Searching and browsing TED Talks like you never imagined

José Luis Redondo García – redondo@eurecom.fr
@peputo
Raphaël Troncy – raphael.troncy@eurecom.fr
@rtroncy
Mariella Sabatino – mariella.sabatino@eurecom.fr
@marSabs
Pasquale Lisena – pasquale.lisena@eurecom.fr
@PasqLisena
Figure: overview of the features added to TED Talks (hot spots, chapters, entities, related TED talks, courses), with year labels 1984, 2006 (.com) and 2014, and the example titles "Understanding Environment: A System Approach", "The Mysterious Field of Engineering Systems" and "Systems Practice: Managing Sustainability".
New Consuming Paradigm
Users overwhelmed with audio-visual content
•  Can the video be divided into meaningful fragments?
•  How can those fragments be properly described?
•  What are the potentially relevant fragments?
•  How can users easily find related documents which complement the video?
20/10/2014 LinkedUp VICI Challenge @ ISWC 2014
HyperTED
Media Fragment support
•  Chapters
•  Hot Spots
Media Fragment annotations
•  Named Entity Extraction
•  Topic Detection
Hyperlinking
•  With TED talks chapters
•  With other educational online resources
Media Fragments
A Media Fragment is a portion of a multimedia resource.
Temporal Fragments: sections along the time dimension of the media resource, with a start and an end point.
http://www.w3.org/TR/media-frags/
MF: Chapters
TED Talks have paragraphs: a human-made subdivision of subtitles
MF: Chapters
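A minimal Python sketch of how a chapter could be addressed as a temporal Media Fragment, assuming the `#t=start,end` syntax (seconds) of the W3C Media Fragments URI specification. The function names and the example.org URL are hypothetical and are not taken from the HyperTED code base.

```python
from urllib.parse import urldefrag


def _seconds(value: float) -> str:
    """Format seconds without trailing zeros, e.g. 30.0 -> '30', 12.5 -> '12.5'."""
    return f"{value:.3f}".rstrip("0").rstrip(".")


def temporal_fragment_uri(media_url: str, start: float, end: float) -> str:
    """Build a temporal fragment URI for the span [start, end) in seconds."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    base, _ = urldefrag(media_url)  # drop any fragment already on the URL
    return f"{base}#t={_seconds(start)},{_seconds(end)}"


def parse_temporal_fragment(uri: str) -> tuple:
    """Read back (start, end); only the plain 't=start,end' form is handled here."""
    _, fragment = urldefrag(uri)
    if not fragment.startswith("t="):
        raise ValueError("no temporal fragment found")
    start_text, end_text = fragment[2:].split(",")
    return float(start_text), float(end_text)


if __name__ == "__main__":
    # Hypothetical chapter boundaries, in seconds.
    for start, end in [(0, 95), (95, 180.5)]:
        print(temporal_fragment_uri("http://example.org/talk.mp4", start, end))
```

Keeping the chapter boundaries in the URI itself means a chapter can be shared or linked like any other Web resource.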
Annotations: Named Entities
“This is Nikita, a security guard from one of the bars in St. Petersburg.”
NER output for this sentence:
•  Nikita: PERSON
•  security guard: FUNCTION
•  St. Petersburg: LOCATION
Category: type in the NER task.
Natural Language Processing (NLP) task → disambiguating URL in a knowledge base, e.g. http://dbpedia.org/resource/Saint_Petersburg.
Example taken from the transcript of https://www.ted.com/talks/2089
NER Extractors
•  Integrates different NER tools available on the Web.
•  Unifies NER extractors in a common output.
Annotations: Named Entities
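A small Python sketch of what unifying several NER extractors in a common output could look like. Everything here is an assumption made for illustration: the two extractors ("A" and "B"), their field names, the category mapping and the confidence values are made up, and duplicates are resolved by simply keeping the most confident annotation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical mapping from extractor-specific types to shared categories.
CATEGORY_MAP = {"Person": "PERSON", "Place": "LOCATION", "LOC": "LOCATION", "Role": "FUNCTION"}


@dataclass(frozen=True)
class Entity:
    surface: str          # text span as it appears in the transcript
    category: str         # shared category, e.g. PERSON
    uri: Optional[str]    # disambiguation URL, if the extractor gave one
    confidence: float     # extractor confidence
    extractor: str        # which tool produced the annotation


def from_extractor_a(item: dict) -> Entity:
    return Entity(item["text"], CATEGORY_MAP.get(item["type"], "THING"),
                  item.get("link"), item["score"], "A")


def from_extractor_b(item: dict) -> Entity:
    return Entity(item["mention"], CATEGORY_MAP.get(item["label"], "THING"),
                  item.get("uri"), item["conf"], "B")


def unify(entities: List[Entity]) -> List[Entity]:
    """Merge duplicates (same surface form and category), keeping the most confident."""
    best: Dict[Tuple[str, str], Entity] = {}
    for entity in entities:
        key = (entity.surface.lower(), entity.category)
        if key not in best or entity.confidence > best[key].confidence:
            best[key] = entity
    return list(best.values())


# Made-up raw outputs for the example sentence about Nikita.
RAW_A = [
    {"text": "Nikita", "type": "Person", "score": 0.8},
    {"text": "St. Petersburg", "type": "Place",
     "link": "http://dbpedia.org/resource/Saint_Petersburg", "score": 0.9},
]
RAW_B = [
    {"mention": "St. Petersburg", "label": "LOC",
     "uri": "http://dbpedia.org/resource/Saint_Petersburg", "conf": 0.7},
    {"mention": "security guard", "label": "Role", "conf": 0.6},
]

if __name__ == "__main__":
    merged = unify([from_extractor_a(i) for i in RAW_A] + [from_extractor_b(i) for i in RAW_B])
    for entity in merged:
        print(entity)
```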
Annotations: Topics
Chapter 3
“I'm wearing a camera, just a simple webcam, a portable, battery-powered projection system with a little mirror. These components communicate to my cell phone in my pocket which acts as the communication and computation device. And in the video here we see my student Pranav Mistry, who's really the genius who's been implementing and designing this whole system...”
Topics detected:
•  Consumer electronics
•  Battery (electricity)
•  Mobile computers
Example taken from the transcript of https://www.ted.com/talks/pattie_maes_demos_the_sixth_sense
MF: Hot Spots
1.  Clustering of consecutive chapters which talk about similar topics and entities
2.  Ordering of those fragments based on annotation relevance (TF-IDF)
3.  Filtering: Hot Spots are fragments whose relative relevance falls under the first quarter of the final score distribution
Figure: chapters grouped into Hot Spot 1 and Hot Spot 2.
MF: Hot Spots
1.  Clustering of consecutive chapters which talk about similar topics and entities
2.  Ordering of those fragments based on annotation relevance (TF-IDF)
3.  Filtering: fragments whose relative relevance falls under the first quarter
MF: Hot Spots
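The three steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the HotSpots implementation: similarity between consecutive chapters is taken to be Jaccard overlap of their annotation labels with an arbitrary threshold, relevance is a raw-count TF-IDF computed over the fragments of a single talk, and "the first quarter" is read as the top 25% of fragments by score. The chapter data is made up.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Chapter = Tuple[float, float, List[str]]  # (start, end, annotation labels)


def jaccard(a: List[str], b: List[str]) -> float:
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if (sa or sb) else 0.0


def cluster_consecutive(chapters: List[Chapter], threshold: float = 0.3) -> List[Dict]:
    """Step 1: merge a chapter into the previous fragment when their labels overlap enough."""
    fragments: List[Dict] = []
    for start, end, labels in chapters:
        if fragments and jaccard(fragments[-1]["labels"], labels) >= threshold:
            fragments[-1]["end"] = end
            fragments[-1]["labels"].extend(labels)
        else:
            fragments.append({"start": start, "end": end, "labels": list(labels)})
    return fragments


def tfidf_scores(fragments: List[Dict]) -> List[float]:
    """Step 2: score each fragment as the sum of count * log(N / document frequency)."""
    n = len(fragments)
    df = Counter(label for f in fragments for label in set(f["labels"]))
    return [
        sum(count * math.log(n / df[label]) for label, count in Counter(f["labels"]).items())
        for f in fragments
    ]


def hot_spots(chapters: List[Chapter], threshold: float = 0.3) -> List[Dict]:
    """Step 3: keep the top quarter of fragments by score (at least one)."""
    fragments = cluster_consecutive(chapters, threshold)
    ranked = sorted(zip(tfidf_scores(fragments), fragments), key=lambda p: p[0], reverse=True)
    keep = max(1, math.ceil(len(ranked) / 4))
    return [fragment for _, fragment in ranked[:keep]]


# Made-up chapters: times in seconds, labels standing in for topics and entities.
CHAPTERS: List[Chapter] = [
    (0, 60, ["camera", "projector"]),
    (60, 120, ["camera", "phone"]),
    (120, 180, ["phone", "music"]),
    (180, 240, ["music", "phone"]),
    (240, 300, ["art"]),
    (300, 360, ["sculpture"]),
]

if __name__ == "__main__":
    for spot in hot_spots(CHAPTERS):
        print(spot["start"], spot["end"], spot["labels"])
```

Raw counts are used for the term frequency so that fragments dense in distinctive annotations rank higher; a length-normalised variant would favour short fragments instead.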
Hyperlink: Indexing TED Talks
Granularity level:
•  Chapter
Features indexed:
•  Topics
•  Entities
•  Time code references (startNPT and endNPT)
•  Extractor confidence
•  Resource identifier
•  Full text transcript
Hyperlink: Indexing TED Talks
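As an illustration of chapter-level indexing, the Python sketch below models one index document per chapter with the features listed above, plus a tiny in-memory inverted index from annotation label to chapter. Only `startNPT` and `endNPT` are names from the slides; the other field names, the example identifier and the values are hypothetical, and the search engine actually used is not assumed here.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Annotation:
    label: str         # topic or entity label
    confidence: float  # extractor confidence


@dataclass
class ChapterDocument:
    resource_id: str   # resource identifier of the chapter
    startNPT: float    # time code references, in seconds
    endNPT: float
    transcript: str    # full text transcript of the chapter
    topics: List[Annotation] = field(default_factory=list)
    entities: List[Annotation] = field(default_factory=list)


def build_inverted_index(docs: List[ChapterDocument]) -> Dict[str, List[str]]:
    """Map each lower-cased topic or entity label to the chapters annotated with it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for doc in docs:
        for annotation in doc.topics + doc.entities:
            key = annotation.label.lower()
            if doc.resource_id not in index[key]:
                index[key].append(doc.resource_id)
    return dict(index)


if __name__ == "__main__":
    # Made-up chapter: identifier, time codes and confidences are illustrative only.
    chapter = ChapterDocument(
        resource_id="http://example.org/talk/1#t=120,185",
        startNPT=120,
        endNPT=185,
        transcript="I'm wearing a camera, just a simple webcam...",
        topics=[Annotation("Consumer electronics", 0.7)],
        entities=[Annotation("Pranav Mistry", 0.9)],
    )
    print(build_inverted_index([chapter]))
```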
Hyperlink: Finding related Courses
Datasets:
•  Open Courseware
•  Open University
Anchors used in search:
•  Entities: too specific.
•  Topics: courses about the same thematic.
Attributes used in search:
•  Title
•  Description
•  Subject, thematic …
Hyperlink: Finding related Courses
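A rough Python sketch of using chapter topics as search anchors against course attributes. It assumes plain substring matching over title, description and subject, and ranks courses by the number of matching topics; the real system may query the datasets quite differently. The course records are made up.

```python
from typing import Dict, List, Tuple

SEARCHED_ATTRIBUTES = ("title", "description", "subject")


def related_courses(topics: List[str], courses: List[Dict[str, str]],
                    limit: int = 5) -> List[Tuple[str, int]]:
    """Rank courses by how many chapter topics occur in their searched attributes."""
    ranked = []
    for course in courses:
        text = " ".join(course.get(attr, "") for attr in SEARCHED_ATTRIBUTES).lower()
        score = sum(1 for topic in topics if topic.lower() in text)
        if score > 0:
            ranked.append((course["title"], score))
    # Highest score first; ties broken alphabetically for a stable result.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked[:limit]


# Hypothetical course records, not taken from Open Courseware or Open University.
COURSES = [
    {"title": "Introduction to Mobile Computers",
     "description": "Design of portable devices.", "subject": "Computing"},
    {"title": "Consumer Electronics Design",
     "description": "From consumer electronics to mobile computers.", "subject": "Engineering"},
    {"title": "Medieval History",
     "description": "Europe from 500 to 1500.", "subject": "History"},
]

if __name__ == "__main__":
    print(related_courses(["Consumer electronics", "Mobile computers"], COURSES))
```

Topics are used as anchors rather than entities because, as noted above, entities tend to be too specific to match course descriptions.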
Architecture
DEMO time
http://linkedtv.eurecom.fr/HyperTED
Conclusions
•  Media Fragment player: annotated fragments
•  Hot Spots detection based on topics and entities
•  Hyperlinking between chapters of TED talks + external education resources
•  Responsive and Web compliant UI
•  Future: formal evaluation of recommended content, better SPARQL queries and content indexing
Thank you!
Source Code:
https://github.com/jluisred/HotSpots
https://github.com/pasqLisena/MediaFragPlayerDemo
José Luis Redondo García – redondo@eurecom.fr
@peputo
Raphaël Troncy – raphael.troncy@eurecom.fr
@rtroncy
Mariella Sabatino – mariella.sabatino@eurecom.fr
@marSabs
Pasquale Lisena – pasquale.lisena@eurecom.fr
@PasqLisena