Exciting Media
Limsoon Wong
Institute for Infocomm Research

Plan
• I will discuss some of the advances in the handling and processing of native media
  – New things that you can do with texts
  – New things that you can do with images
  – New things that you can do with audio
  – New things that you can do with video

New things that you can do with texts

Take search engines to the next level
• Search engines are getting less useful than before
  – too many hits
  – not all relevant
  – not “organized”
“[Hierarchical clustering from BIGontheNet] take Google to a new level” – Yahoo! Finance
“[p-zoom] may have a leg up on… its competitors” – Tech Web News
“I like BigOnTheNet's and Groxis's web search categorization technologies a lot” – Chris Shipley on WashingtonPost.com

Intelligent information extraction, improve safety
• Extract chemical safety information from Material Safety Data Sheets (MSDS)
• Check for conformance to standards
• Benefits
  – OHD: Check 100% of MSDS (currently < 10%) with same manpower
  – Chemical suppliers: Savings in distribution of MSDS as it is online
  – End users of chemicals: Better quality MSDS, improved safety

How is it done? What’s needed to get it done?

New things that you can do with images

Make computers easier to use
• Abstraction of image content allows interpretation and matching in semantic space

Search photos by visual keywords
• Visual query language allows specification of what and where

How is it done? What’s needed to get it done?
• Trained visual keywords for semantic detection and summarisation
• Automatic indexing using such keywords
[Figure: example visual keywords: Faces, Crowd, Buildings, Foliage]

Let machines perceive as we do
• Perceptual visual quality according to characteristics of human vision
[Figure: image pair annotated “Higher PSNR!” and “New perceptual metric perceives correctly.”]

Comparison with MOS (Mean Opinion Score)
• Outperforms metrics in ITU-T VQEG test
• Adoption in video coding improves both efficiency & quality (other systems compromise between the two)

                       PSNR metric   New metric
Pearson Correlation    0.66          0.83         (better accuracy)
Spearman Correlation   0.69          0.81         (better consistency)

Protect authenticity and integrity of data in a robust way
• Third Party Publication: Sign Once, Verify Many Ways

New things that you can do with audio

Let machines listen as we do
[Diagram: PSTN; Automatic speech recognition; Text-to-speech voice; Business Logic; Speech enhancement, Noise reduction; Multilingual voice mining, Speech & dialogue processing; Text categorization, Natural language processing; Semantic outputs]

To build a mobile audio industry … analogous to the graphics industry
• Synthesis-directed analysis of sounds
  – how would you model a lion’s roar?
• Algorithms for synthetic sound generation
• Tools for sound model automation and support
• Cross-platform audio synthesis engine with a small footprint and low compute requirements

Make health monitoring less hazardous
• Mobile and non-invasive health-monitoring devices for home use
• Potentially large market in densely populated areas with few medical facilities & personnel
• Aging population in developed countries means more need for automatic continuous monitoring to lower overall medical costs
• Passive health monitoring devices are less of a health hazard and allow long-term usage
• Sound-based automatic detection & classification of medical anomalies, e.g., long-term fetal heart sound monitoring

New things that you can do with video

Make home videos more fun
• Select video frames
• Cut to music
• Decide on transitions
Turn home video into high quality MTV automatically

Prevent drowning, save lives
• Drowning Early Warning System
  – tracks people in dynamic aquatic conditions
  – intelligently detects water crisis situations

Watch video anywhere, any time, on any device!
[Diagram: Wireless Network]
And improve quality at the same time!

More intelligent CCTV, improve homeland security

Improve sophistication of our media industry
Existing pains
“Intrusive” ads
• pop out during play!
Nonintrusive virtual contents insertion
Non-intrusive insertions
• detects non-play segment
• non-interfering insertion
Tennis TV
• manual
Enhanced tennis TV
• auto-tracking
• super-resolution
Tagged to camera h/w
• costly
• done once at source
Software detection:
• performed any time
• demographic ads
• cheap

How is it done? What’s needed to get it done?
• Robust Scene Modeling and Camera Calibration
  – Given a 2D court model of the 3D scene that the camera is capturing, identify 3D object positions robustly and accurately
• Super-Resolution Image Reconstruction from Video
  – Given a low resolution image sequence of an object far away from the camera, reconstruct a higher resolution image sequence
  – This is essentially an ill-posed problem, but we can apply domain info such as motion, pose, etc., to seek a good solution
• Robust Object and Landmark Detection
  – Real-time
  – Geometric invariance
• Deployment and Application Constraints
  – Real-time

Looking even further...

Media in 2010 according to Institute for the Future

Dimensions of entertainment activities
• The event
• The process
• Popularization of research
• Practice & performance level
• Entertainment spaces
• Entertainment tools
• Sharing & social communication
• Consuming entertainment
• Creating entertainment
• The work of entertainment

Key shifts shaping new entertainment
• Mass to Personal
  – consumers will appropriate mass media tools for their own personal expression
• Packaged to Self-generated
  – consumers will create the entertainment experiences they engage in
• Episodic to Persistent
  – entertainment experience will be ongoing & will have no clear starting and stopping point
• Virtual to Embodied
  – information, images, and experiences will be embedded into physical objects & the physical world

Mass to personal
• E.g., blogging: how the Web evolves from a publishing medium to a personal creativity tool
  – Blogs are written online, using tools accessed through a Web browser, combining the simplicity and immediacy of instant messaging & the broad accessibility of Web sites
  – Most personal blogs are essentially online diaries, sometimes devoted to specific subjects. They are short, updated frequently, full of hyperlinks, & crafted to look like snapshots of what their authors are thinking and creating

Packaged to self-generated
• E.g., fantasy sports leagues
  – In these leagues, individuals act as managers of professional sports teams. They pick players in live drafts & develop rosters for playing their team against other fantasy teams in their league
  – The real experience of fantasy leagues is the interaction among the various player managers: the player trading, email, & competition
The digital world becomes a vibrant place for all sorts of interactions like trading, creation or production, & spectating

Episodic to persistent
• E.g., massively multiplayer role-playing games
  – Sony’s EverQuest has over 400,000 players, 70,000 of whom are online at any given time, slaying dragons, traveling to cities, or trading with each other.
    Individual players acquire property, talents, and social identities over time
  – Players’ actions permanently affect the virtual world: a player’s property will exist even after they’re gone
  – The game runs 24/7 & when players log off, it continues to evolve and grow, driven by the social behavior of the thousands of players who log in each day

Virtual to embodied
• E.g., geocaching, an adventure game played worldwide
  – A founder creates a cache which holds a treasure, & posts its real-world location coordinates on a Web site
  – Seekers retrieve the coordinates of the cache site from the Web & use a GPS device to get within 20 feet of it
  – Then they use clues, hunting skills, & special real-world skills (hiking, scuba diving, etc.) to find the cache
  – Finally, they sign the logbook belonging to the cache, recover the treasure, & leave a treasure for the next seeker

In summary
• The new “entertainment” media in 2012 will be oriented around personal media that
  – are generated by consumers rather than packaged and distributed by providers
  – provide persistent experiences that do not disappear with a switch of a button but linger over the course of daily life
  – are touch points in the physical environment that embody entertainment and set forth a new relationship among consumers, entertainment, and their broader daily life activities

Implications
• Focus on leveraging distinctive points of view
• Co-create continuous experiences with consumers
• Think of ways to target players who personalize mass events
• Develop tools and processes for fusing physical & virtual campaigns
• Focus on promoting user customization
• Design for participants to “sell to their friends”
• Develop products for persistent experiences
• Include digital experience in physical products

Thank you

title: Exciting Media
abstract: I will discuss some of the advances in the handling and processing of native media. In particular, I will look at
- new things that you can do with video
- new things that you can do with audio
- new things that you can do with images
evening of 6th april
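
Supplementary sketch for the “Comparison with MOS” slide: the slide reports Pearson and Spearman correlations between each quality metric and the Mean Opinion Score. The Python below is only an illustration of how such figures are usually computed; it is not the evaluation code behind the slide. It assumes 8-bit pixels (peak value 255) for PSNR, and the function names and the sample scores in the demo are made up for illustration.

```python
import math


def psnr(ref, test, peak=255.0):
    """PSNR in dB between two equal-length pixel sequences (assumes 8-bit peak)."""
    if len(ref) != len(test) or not ref:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def pearson(x, y):
    """Pearson linear correlation: how well the metric predicts MOS values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: how consistently the metric orders the clips."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Hypothetical scores for five clips, invented purely for this demo.
    mos = [1.8, 2.5, 3.1, 3.9, 4.6]
    metric_scores = [24.0, 31.0, 29.5, 35.0, 38.2]
    print("Pearson :", round(pearson(metric_scores, mos), 3))
    print("Spearman:", round(spearman(metric_scores, mos), 3))
```

A metric that tracks human judgement well gives values close to 1 on both measures, which is the sense in which the slide labels the Pearson row “better accuracy” and the Spearman row “better consistency”.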