How “MyLifeBits” Sees Multimedia…
The Relevance for Multimedia When We Record Everything Personal

ACM Multimedia 2004
12 October 2004
Gordon Bell
gbell@microsoft.com
Bay Area Research Center
San Francisco, CA

ACM MM2004: A great time for multi-mediators: audio & video become the principal data-types
• Overall Vision: Everything into or accessed via Cyberspace as we move from an analog to digital world
• Increasing technology dimensions: MHz, bytes, pixels, b/s
  – Massive power in GPU, Bytes at each level, big & little pixels, wireless
• Thresholds enable new computer classes*, capabilities, & converged devices
  *Classes: platform, interface, & network @price level => Apps
• New classes
  – PC rebirth as the personal & home mainframe: ambiance…entertainment
  – PC rebirth for archiving everything: MyLifeBits
  – Phone-PDA|PC-Camera-etc. on body device
  – In and around body devices: Sensecam, DARPA ASSIST Project, BodyMedia
  – Wireless sensor nets… network, interface, & size will create a class
• Challenges…especially to multimedia community

Everything cyberizable will be in Cyberspace and covered by a hierarchy of computers!
[Figure: Fractal Cyberspace: a network of … networks of … platforms. Labels: World; Continent; Region/Intranet; Campus; Home… buildings; Cars… phys. nets; Body]

IP On Everything

Cyberization: interface to all bits and process information
• Coupling to all information and information processors
• Pure bits e.g. printed matter
• Bit tokens e.g. money
• State: places, things, and people
• State: physical networks

Industry’s evolutionary path
[Figure: Goodness versus Time, 2000 to 2012. Labels: Moore’s Law: ¿Que sera sera; Evolution in performance, cost; New systems, classes, & apps; Grand Challengeland]

Hz, Bits, Bytes, Pixels, Bits/Sec and cost determine our future us…

Computer components must all evolve at the same rate
• Amdahl’s law: one instruction per second requires one byte of memory and one bit per second of I/O
• Processor speed has evolved at 60% per year.
• Graphics Processing Unit offers one-time opportunity
• Big bang: 64 bit processors => VM & physical memory.
• Storage has evolved at 60%; now almost 100%
• Wide Area Network speed evolves at 60%
• Local Area Network speed evolved 26-60%
• Grove’s Law: Plain Old Telephone Service (POTS) evolved at 14%! …
  – US: Now stuck <1 Mbps.
  – Wireless. ROW: hundreds Kbps. US & 3rd world: nil

Extrapolation from 1950s: 20-30% growth per year
[Figure: log-scale plot (1, Kilo, Mega, Giga, Tera) versus year, 1947 to 2007. Curves: Storage; Backbone; Processing; Memory; Telephone Service, 17% / year; ??]

CACM 1997 Predictions
[Figure: log-scale plot versus year, 1947 to 2047. Curves: Secondary Memory; Primary Memory; Processing. Scale marks: 1; 10^3 (kilo); 10^6 (mega); 10^9 (giga); 10^12 (tera); 10^15 (peta); 10^18 (exa)]

Power PC Software Ecosystem
• Riding Moore’s Law
• Scale up and out
• 64-bit, 64-way
• Next Generation Secure Computing Base
• Convenience
• Wireless networking
• Always on
• Always with you

Personal Computer in 2005-6
• CPU: 4-6 GHz; 2 cores
• Memory: 2+ GB
• Disk: 0.5 - 1 TB
• GPU: 4x today
• Net: 1Gbps; 54Mbps wireless

2010 Platforms
Columns: DeskTop / Home | Phone … PDA, etc. | Server
• Processor: 50 GHz (3+Pc per chip 12.5 GHz Quad Multi-Core (programmed 50GHz everything) (12+ computing elements)
• Memory: 50 GB | 15 GB | 200 GB+ (NUMA)… TByte servers in lab!!
• Storage: 3 TB | 4 GB flash; 500 GB disk | 5 EB (exabyte, 10^18)
• Display: 30” flat panel | OLED… paper alt.?
• Network

GPUs (Then & Now)
Ardent Titan c1988 | Best card c2004
1 pipe (4 results) | 16 Pixel pipelines
32 MHz | 500 MHz
0.256 GB/s | 35 GB/s (Mem BW)
0.1-0.2 MT/s | 800 MT/s
0.017 Gpixel moves | 8 Gpixels /s fill

Storage Trends
[Figure: storage trend chart with National Storage Roadmap 2000 trend lines labeled "100x/decade = 100%/year" and "~10x/decade = 60%/year". Source: Ed Grochowski, IBM Research Almaden]

The virtuous cycle of bandwidth supply and demand
[Figure: cycle diagram. Labels: Increased Demand; Standards; IP; Create new service; Increase Capacity (circuits & bw); Lower response time. Services: Telnet & FTP; EMAIL; WWW; Audio; Video; Voice!]
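The annual growth rates quoted in the technology-trend slides above compound quickly, and it is easy to misjudge what they imply over a decade. The following is a minimal sketch in Python that converts an annual percentage into a per-decade multiplier; the function name `decade_factor` and the labels in `rates` are illustrative assumptions, not something taken from the talk. By plain compounding, 60% per year works out to roughly 100x per decade and 100% per year to roughly 1000x per decade, so the per-decade labels in the storage roadmap figure should be read with care.

```python
def decade_factor(annual_percent: float, years: int = 10) -> float:
    """Multiplier after `years` of compound growth at `annual_percent` per year."""
    return (1.0 + annual_percent / 100.0) ** years


# Annual growth rates quoted in the slides (percent per year).
# The dictionary keys are descriptive labels chosen for this example.
rates = {
    "POTS (Grove's Law)": 14,
    "LAN, low end": 26,
    "Processor, WAN, storage (historic)": 60,
    "Storage (recent)": 100,
}

for name, pct in rates.items():
    print(f"{name}: {pct}%/year is about {decade_factor(pct):,.1f}x per decade")
```

The same function can be run with `years=1` to recover the annual multiplier, or inverted by taking the tenth root of a per-decade factor to check a quoted rate.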
[Figure label, continued from the virtuous-cycle diagram: Grids]

Bell’s law of computer class formation to cover Cyberspace
• New computer platforms emerge based on chip density evolution
• Computer classes require new platforms, networks, and cyberization
• New apps and content develop around each new class
• Each class becomes a vertically disintegrated industry based on hardware and software standards
• Every decade a new, lower cost class of computers emerges, defined by
  – Computing platform
  – Interface to humans or other parts of world
  – New networking and/or interconnect structure
[Figure: log (people per computer) versus year. Classes: Mainframe; Minicomputer; Workstation; PC; Laptop; PDA; ???. Source: David Culler, UC/Berkeley]

New Role for Computing
[Figure: log (people per computer) versus year. Labels: Number Crunching, Data Storage; productivity, interactive; streaming information to/from physical world. Source: David Culler, UC/Berkeley]

CMOS Trends: miniaturization and more
• Itanium2 (241M): nearly a thousand 8086’s would fit in a modern microprocessor
[Figure: Labels: Actuation; Sensing; Processing & Storage; Communication. Radio block labels as extracted: "I SDQ SD"; PLL; baseband filters; mixer; LNA. Source: David Culler, UC/Berkeley]

Platform, Interface, & Network Computer Class Enablers
• “The Computer” (Mainframe). Platform: tube, core, drum, tape, batch O/S. Interface: direct > batch.
• Mini & Timesharing. Platform: SSI-MSI, disk, timeshare O/S. Interface: terminals via commands. Network: POTS.
• PC/WS. Platform: micro, floppy, disk, bit-map display, mouse, dist’d O/S. Interface: WIMP. Network: LAN.
• Web browser. Platform: PC, scalable servers. Interface: Web, HTML. Network: Internet.

Platform, Interface, & Network Computer Class Enablers (second table)
[Table: rows Platform, Interface, Network. Column headings: Web services; Communicator (Phone <-> PDA evolution?); Home nets; Wireless based monitoring. The cells are interleaved in the extracted text and are reproduced here in extracted order: "Entertainm’t, infrastructure health, monitor minimal Clusters ala Phone/PDA, Multi … Beowulf; grid Gbyte, camera, GPS, body nets sense/effect TV/PC XML converge, Pocket sized Wireless sense/effect www servers, Periphery GPRS, WiFi, web services, monitoring Wired & Web services Lambda-nets networks wireless Corp. nets networks"]

Conclusion: a new era
• New “computer” classes create new industries
• Web services (Grid)
• Virtually unlimited storage enables the lifetime store
• Networked computing takes over home entertainment
• High speed wireless networks
• Smart Personal Objects
• New classes require new breeds of software

PC At An Inflection Point
[Figure: Labels: PCs; Non-PC devices and Internet; TV/AV; Mobile Companions; Consumer PCs; Communications; Automation & Security; Household Management]
The Dawn Of The PC-Plus Era, Not The Post-PC Era… devices aggregate via PCs!!!

Telephone, Television, and Radio… Evolution of media in the home
Yesterday:
• Analog storage
• Separate distribution networks
• Physical space limitations
• Tedious management and manual search
Today:
• Digital storage proliferation (CDs, DVDs, PVRs, MPEG & WMA/V)
• Digital cable, internet radio, analog phone
• Storage limitations & different stores for different stuff
Tomorrow:
• All digital “PC” platforms
• Everything connected (IP)
• Unlimited storage
• Everything in a database
[Figure label: SQL]

[Figure: home A/V wiring diagram. Components: Media Center Computer; Receiver; Set top (Cable/Satellite); Legacy VCR; Legacy Cassette; CD; Redundant DVD; Camera; Mic; Kbd; Mse; 5 speakers; Wfr; L Spkr; Legacy Spkr; Plasma Panel. Links: stereo; 5.1 digital; Video*; comp.; SVHS-wide; IR; Ethernet. *Video = composite or S-video]

Cables/links
• Speaker: 5+1
• Plasma: 2 or 3
• Cable/Enet: 2
• IR: 8
• Stereo: 4
• 5.1 digital: 2
• Comp./S-video: 3
• Plasma panel: 1
• Power: 10
• Kbd/mse: 2
• Monitor II (opt.): 4
• Camera: 2
• Total: 42 – 46
Things: 18+remote

Tivo for Radio

Lifetime Personal Information Stores based on MyLifeBits
Gordon Bell, Jim Gemmell, Roger Lueder

The 1 TB Life
1TB gives 65+ years. 25,000 days at:
• 100 email messages a day (5KB each)
• 100 web pages a day (50KB each)
• 5 scanned pages of paper a day (100KB each)
• 1 book every 10 days (1 MB each)
• 10 photos per day (400 KB JPEG each)
• 8 hours per day of sound, e.g. telephone, voice annotations, and meeting recordings (8 Kb/s)
• 1 new music CD every 10 days (45 min each at 128 Kb/s)
5 years to fill c2004 80 GB drives
Want video? Buy more drives (1 TB/year gets 4 hours/day @ 1.5 Mb/s video)

Everything goes in a database
• You need all the features of a database (Consistency, Indexing, Pivoting, Queries, Speed/scalability, Backup, replication)
• If you don’t use one, you will create one!
• Files are blobs that sync with legacy file system & apps
[Figure label: SQL]

MyLifeBits Software
[Figure: architecture diagram. Components: MyLifeBits store (database, files); MyLifeBits Shell; Browser tool; Screen saver; GPS import & Map display; TV capture tool; Internet TV EPG download tool; SenseCam; Telephone capture tool; PocketPC transfer tool; PocketRadio player; Radio capture & EPG; MAPI interface; Legacy email client; Legacy applications; IM capture; Voice annotation tool; Text annotation tool; Import files]

Memex
As We May Think, Vannevar Bush, 1945
“A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility”
Full-text search, text & audio annotations, and hyperlinks

I am data

Statistics of use

Capture and encoding

I mean everything

WWMX.org

Personal Capture of Content

Steve Mann in Cyberspace c1995

The personal capture space…
• Capture…
  – In passing at personal level
  – Network based
  – Office, including web pages
  – Legacy: Paper, photos, CDs, Video,
  – Real time: Phone, meetings,
  – On board aka on body
  – Especially phone
  – Smart rooms
  – TV
  – Personal carry always devices
• Organization…
  – Automatic …
  – Ease of organizing, annotating, retrieval

On body sense & capture…
• Architecture – Universal or many devices or networked devices with hub
• Connection with any external sensors or networks…
• Purpose – meeting, experience capture, surveillance, memory prosthesis, health, …
• Sensors: camera, GPS/compass, voice | audio, text, stills | movies
• Displays: augmented reality, etc.
• Environment: temperature, light, etc.
• Physiological: BodyMedia (energy expended, acceleration, pulse | heart rate,…)

The A/V/real time data

Future: SenseCam new capture modes/devices
[Figure labels: Deja View; MSR Cambridge SenseCam; Quindi Meeting capture; St. Jude Pacemaker; Body Media]

IP On Everything

“I sensed” Clarkson MIT c2001

Visually impaired UW 2004

Potentially useful trivia – but not normally photographed

Advanced Soldier Sensor Information System and Technology (ASSIST)
The objective is to exploit soldier-worn sensors to augment recall and reporting to enhance situational understanding.
• Demonstrate new capabilities that exploit information captured via soldier-worn sensors.
• Input streams from location, images, audio, and motion sensors, logged and processed for reports and representations.
• Capture – active information capture and voice annotations. …prototype wearable capture units and supporting operational software for processing, logging and retrieval.
• Analysis – passive collection and automated activity/object recognition. …

Personal devices
• Will the notebooks we all know and love to carry take on a much smaller and/or disintegrated form factor?
• Phone + camera, GPS, personal store, “PC”, body area gateway
• Tablet or book?
• General purpose or n special appliances?
• One appliance, one function versus one appliance, multiple functions

OQO & Tiquit

Chameleon: PC/XP & CE phone
256 MB; 20 GB; 800 x 300 pixels; c2001

Smart Personal ObjecTs (SPOT) Services Network
[Figure: network diagram. Labels: MSN Operations Center; Satellite Feed; Dedicated Ku-band; Frame Relay WAN; FM subcarrier broadcast, 12 kbps/radio station; Objects; And more…]
• US: All 50 states; Top 100 MTAs; 219 FM stations; 177M reach
• Canada: Top 12 cities; 24 FM Stations; 12.5M reach

Issues for the Tbyte(s), Lifetime, PC: Killer apps in home & office
1. One dbase for everything (articles, books, conversations, ... financial transactions) …vs. long-term use of hierarchical files.
2. Guarantee that data will live forever! “dear appy” problem
3. Cheap, easy, and data-rich (e.g. time, place) capture:
   – GPS and time everywhere
   – Paper capture has to be as easy as discarding (scanner/shredder)
   – Personal meeting capture...perhaps by the room
   – E-book…e-magazines & journals need to have critical mass!
   – Telephony and audio capture with indexing (telephonic speech-to-text needed)
   – Media Center compatible for entertainment (photos, video, TV, radio)

CARPE: Continuous Archival Recording & Retrieval of Personal Experience
4. Content analysis (critical for photo & video!); doable for text.
5. Annotations/meta-information add ever-increasing value at high cost!
   – Easy annotation for aiding search and it becomes the content
6. Information control: privacy, security, expunge/deniability,…
7. Having to be schizophrenic or have a lobotomy when leaving a “life” or being a part of some other person’s life recording
8. Other “killer apps”: Alzheimer, immortality, surrogate memory?
9. GUI’s to improve use (e.g. time to learn, use, aid in retention)

MyLifeBits Challenge for Multimedia
1. “Handling” picture, audio, & video “content”!
2. Just plain photo content analysis
   – Faces, places, things, scene types, any attribute, etc.
3. Capturing audio accurately and easily
   – All kinds of microphones… just like our ears can
   – Speech-to-text for retrieval
4. Video capture
   – Segmenting content into useful clips
   – Doing more than treating scenes as just pictures i.e. what is happening

Problems: Control, “Amnesia”
Ownership & other “life” bits issues
• Full sharing of bits that are mine
• I created them, OK to copy and distribute
• DRM: purchased for my own use
• The bits “belong” to a corporation or org.
• “OK to look at, but I only own half the bits”
• The bits are the real, untampered bits
Controlling forgetfulness
• Private, do not “demo”
• Expunge forever... “this never happened”
• Codec not found

The “dear appy” problem
Dear Appy,
How committed are you?
Please come back to me,
Lost and forgotten data
Who’s responsible?
• media platform, file, and databases
• evolving standards and formats
• evolving and/or disappearing apps

The End

How to lose at Video Conferencing
1. “Voice quality” must be comparable to the low cost alternative
2. “The call” must be as easy as the low cost alternative
3. Video conferencing must be as ubiquitous as the low cost alternative.
4. Video must increase “presence”

The Content Analysis Problem
1. “Cliplets”: Automatic segmentation of a pile of documents and video into individual documents and scenes.
2. Item typing: Would like a minimal Dublin Core for each item: date, creator, title, source, abstract, and type
3. “Type” classification: articles, letters, memos, etc.
4. Ontology creation for collections

What we need from multi-mediators
1. Fewer data types… we are drowning in new types.
2. Standards.
3. Dear Appy, A general transcoder.
4. Better audio starting with a range of microphones. Essential for speech recognition
5. Names, places, and times for every object in a photo or video!
6. Who said what? Who was playing, what, when? Who wrote it?
7. Composition of stories from content….
8. CODEC (Standards) Hell: 3-5 groups evolving
9. Picture aspirations lead technology
10. Cell phones may be the way to video conf.

MM: past, present, future
• Relation to IEEE on Visualization
• What do we see from Moore’s Law, Bell’s Law & platforms
• Standard intro of platforms and potential one (new portable pmc)
• Difficulty of bringing high quality video to the desktop…Moviemaker.
• Looking back technology… didn’t predict cellphone.
  – Replace the PC as the UI for communication, commerce, entertainment
• 3 levels of a/v: portable, stationary including rooms, caves
  – Also brings up is it personal or does the infrastructure know it all?
  – Interesting thesis: GPU will enable vast computation
• Show my video vs. transcription?? Probably not
• Show the video of acm 97…look at how far we have to go
  – Everything in cyberspace; telepresentation would be in 2047
• a/v as a user data-type (create, produce, use: ambience)
• DUST as a platform. Low data rate. Evolution unclear.
• What’s in the network? Cameras EVERYPLACE!

Technology lessons
• Wilkes: build system as if it will someday be true
• Moore’s and Bell’s Law to predict platforms
• ACM 97 paper missed the cellphone Chameleon
• No technology before its time e.g. tablets, cellphones with high bandwidth,

Multimedia in the large: More than just another A/V file
1. 2 d documents including graphics
2. Pictures
3. Audio: personal & professional voice, music, sounds; radio; telephone;
4. Plain old video: personal & professional
5. Web content that includes all of the above
6. Data streams of all kinds (entertainment, carpe,
   • Environmental monitoring
   • Meetings (sound, video, presentation, notes) monitoring;
   • Continuous a/v monitoring e.g. DejaView, Sensecam… including use for reality TV and multiple users
   • Continuous body monitoring stuff e.g. BodyMedia
   • Record of user behavior
   • Medical imagers..

Multimedia: profitable bets more than just betting against optimists
• 10/93:12/96 VOD. 5 cities. 250K users
• 9/94:+6 mos. 10K units, Microunity’s MM processor (VOD, set-top)
• 96:01 Multimedia Roundtable: Telepresentations will be a well defined app; More people view ACM97 than attend
• 6/96:4/01 50% PCs will have video; 10% of those used
• 3/97:12/00 10K machines communicate @Gbps
• 8/99:12/04 LEP/OLED will outsell LCDs; e-ink outsell LEP/OLED

Dust Networks

LifeLines (Plaisant et al.)
www.cs.umd.edu/hcil/lifelines
University of Maryland

Capturing what you see

ACM Multimedia 2004 call for papers
Multimedia 2004 invites your participation in the premier annual multimedia conference, covering all aspects of multimedia computing: from underlying technologies to applications, theory to practice, and servers to networks to devices.
We especially encourage introduction of novel media such as haptic, smell, sensors, animation, etc.

Technical Program
The technical program will consist of plenary sessions and talks with topics of interest in:
• Multimedia analysis, processing, and retrieval, including multimedia semantics, aesthetics, modeling, fusion, audio/video/multi-modal processing, multimedia content description and indexing, multimedia digital rights management (protection and attribution), content-based retrieval with emphasis on multiple and novel media.
• Multimedia networking and system support, including context-aware multimedia communications, Internet telephony, peer-to-peer streaming, audio/video streaming, multimedia content distribution, wireless multimedia, adaptive support for scalable media, Internet protocols, multimedia servers, operating systems, middleware and QoS.
• Multimedia tools, end-systems, and applications, including new UI metaphors, usable distributed collaboration, authoring, multi-modal interaction and integration, multimedia in e-learning, entertainment, personal media, assisted living, and virtual environments.

Multimedia Analysis, Processing and Retrieval Track
The Multimedia analysis, processing, and retrieval track of ACM Multimedia has always been at the forefront of research in the area of media mining, media processing, and media presentation. We strongly encourage submissions that are:
1. Containing novel and fresh ideas,
2. Questioning existing paradigms/unwritten rules, or
3. Advancing the field by thorough theoretical or experimental analysis
4. Charting new directions (e.g., multimedia sensor networks on distributed platforms)

Original papers are solicited in, but are not limited to, the following technical areas:
Multimedia analysis, processing, and retrieval
• Multimedia content description
• Audio/video/multi-modal processing
• Multimedia semantics modeling
• Multimedia indexing and retrieval
• Digital rights management
• MPEG-7/-21 standards
• Content-based retrieval with emphasis on multiple and novel media
• Media Mining
• Multimedia sensor networks on small/large-scale distributed platforms
• Active media capturing, processing, and rendering from a control angle.

The term multimedia is interpreted in a very broad sense. It encompasses image, audio, video, tactile, and/or olfactory data as well as compound documents such as presentations (e.g., in PowerPoint), word documents, media emails, and web pages.

Multimedia Networking and System Support Track
ACM Multimedia has been a premier annual conference, where researchers, developers, and practitioners from academia and industry present new ideas and future directions, and experience a stimulating synergy in all aspects of multimedia computing.

For the network and system support track in the technical program, we invite submissions on topics including, but not limited to:
• context-aware multimedia communications
• Internet telephony
• peer-to-peer streaming
• broadband audio/video streaming
• multimedia content distribution
• wireless networking for multimedia
• multimedia in ad-hoc networks
• ubiquitous multimedia services
• multimedia synchronization
• multimedia authentication and security
• multimedia server design
• QoS-aware resource allocation

Multimedia Tools, End-systems, and Applications Track
For the applications and tools track in the technical program, we invite submissions on topics including, but not limited to:
- UI metaphors
- usable distributed collaboration
- authoring
- multi-modal interaction and integration
- multimedia in e-learning
- entertainment
- personal media
- assisted living
- virtual environments

Papers should present novel multimedia tools and applications, or a theoretical or empirical contribution that advances our understanding of how to design or implement successful multimedia tools and applications. Submissions should make clear what the contribution is, and how it has been validated. Submissions that present novel applications and tools must place these in the context of state-of-the-art multimedia research, and state clearly what the advancement is compared to previous applications and tools. If the main contribution is the application of multimedia tools and techniques to another field (for example, education, entertainment, security), the submission should
- identify and explain the need or problem in that field, and
- present some proof that the application meets the requirements and solves the problem (e.g. performance comparison or usability evaluation).
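
Worked example for “The 1 TB Life” slide earlier in the talk: the storage budget on that slide can be checked by adding up the per-day figures. The sketch below is illustrative only. It assumes decimal units (1 KB = 1000 bytes, 1 TB = 10^12 bytes) and constant bit rates, and the names `stream_bytes`, `daily`, and the other variables are made up for this example rather than taken from MyLifeBits.

```python
# Decimal storage units (an assumption; the slide does not say which it uses).
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12


def stream_bytes(hours: float, bits_per_second: float) -> float:
    """Bytes produced by a constant-bit-rate stream running for `hours`."""
    return hours * 3600 * bits_per_second / 8


# Bytes per day for each item listed on the slide.
daily = {
    "email": 100 * 5 * KB,                      # 100 messages at 5 KB
    "web pages": 100 * 50 * KB,                 # 100 pages at 50 KB
    "scanned pages": 5 * 100 * KB,              # 5 pages at 100 KB
    "books": 1 * MB / 10,                       # one 1 MB book every 10 days
    "photos": 10 * 400 * KB,                    # 10 JPEGs at 400 KB
    "sound": stream_bytes(8, 8_000),            # 8 hours at 8 Kb/s
    "music": stream_bytes(0.75, 128_000) / 10,  # one 45 min CD every 10 days
}

per_day = sum(daily.values())
lifetime = per_day * 25_000                     # 25,000 days
years_to_fill_80gb = 80 * GB / per_day / 365
video_per_year = stream_bytes(4, 1_500_000) * 365  # 4 hours/day at 1.5 Mb/s

print(f"per day: {per_day / MB:.2f} MB")
print(f"25,000 days: {lifetime / TB:.2f} TB")
print(f"one 80 GB drive lasts: {years_to_fill_80gb:.1f} years")
print(f"video: {video_per_year / TB:.2f} TB/year")
```

Under these assumptions the items total a little over 43 MB per day, which is about 1.08 TB over 25,000 days and roughly five years per 80 GB drive, in line with the slide. Recorded sound is the largest single contributor, at about two thirds of the daily total.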