How “MyLifeBits” Sees Multimedia… The Relevance for Multimedia When We Record Everything Personal

ACM Multimedia 2004
12 October 2004
Gordon Bell
gbell@microsoft.com
Bay Area Research Center
San Francisco, CA
ACM MM2004: A great time for multi-mediators:
audio & video become the principal data-types


Overall Vision: Everything into or accessed via Cyberspace as
we move from an analog to digital world
Increasing technology dimensions: MHz, bytes, pixels, b/s



Massive power in GPU, Bytes at each level, big & little pixels, wireless
Thresholds enable new computer classes*, capabilities, &
converged devices
*Classes: platform, interface, & network @price level => Apps
New classes




PC rebirth as the personal & home mainframe: ambiance…entertainment
PC rebirth for archiving everything MyLifeBits
Phone-PDA|PC-Camera-etc. on body device
In and around body devices





SenseCam
DARPA ASSIST Project
BodyMedia
Wireless sensor nets… network, interface, & size will create a class
Challenges…especially to multimedia community
Everything cyberizable will be
in Cyberspace and covered by
a hierarchy of computers!
[Diagram: nested networks at every scale: World, Continent, Region/Intranet, Campus, Home… buildings, Cars… phys. nets, Body]
Fractal Cyberspace: a network
of … networks of … platforms
IP On Everything
Cyberization: interface to all
bits and process information
Coupling to all information and information
processors
• Pure bits e.g. printed matter
• Bit tokens e.g. money
• State: places, things, and people
• State: physical networks

Industry’s evolutionary path
Moore’s Law: ¿Que sera sera
[Chart: Goodness vs. Time, 2000 to 2012: evolution in performance and cost, then new systems, classes, & apps, then Grand Challengeland]
Hz, Bits, Bytes, Pixels, Bits/Sec
and cost determine our future us…
Computer components must all
evolve at the same rate


Amdahl’s law: one instruction per second requires
one byte of memory and one bit per second of I/O
Processor speed has evolved at 60% per year.







Graphics Processing Unit offers one-time opportunity
Big bang: 64 bit processors => VM & physical memory.
Storage has evolved at 60%; now almost 100%
Wide Area Network speed evolves at 60%
Local Area Network speed evolved 26-60%
Grove’s Law: Plain Old Telephone Service (POTS)
evolved at 14%! … US: Now stuck <1 Mbps.
Wireless. ROW: hundreds Kbps. US & 3rd world: nil
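The annual growth rates quoted above compound quickly, so it can help to convert them into a per-decade multiplier and a doubling time. The sketch below is illustrative only: the function names are made up for this example, the rates are the ones listed on the slide, and `amdahl_balance` simply restates the one-byte, one-bit-per-second rule as arithmetic.

```python
import math


def decade_factor(annual_rate: float) -> float:
    """Total multiplier after 10 years of compound growth at annual_rate."""
    return (1.0 + annual_rate) ** 10


def doubling_time_years(annual_rate: float) -> float:
    """Years needed to double at a constant compound annual_rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


def amdahl_balance(instructions_per_second: float) -> dict:
    """Memory and I/O implied by one byte and one bit/s per instruction/s."""
    return {
        "memory_bytes": instructions_per_second,
        "io_bits_per_second": instructions_per_second,
    }


if __name__ == "__main__":
    # Rates taken from the slide; the labels are shorthand for this example.
    for label, rate in [("60% per year", 0.60),
                        ("100% per year", 1.00),
                        ("26% per year", 0.26),
                        ("14% per year", 0.14)]:
        print(f"{label}: x{decade_factor(rate):.1f} per decade, "
              f"doubles every {doubling_time_years(rate):.2f} years")
    print(amdahl_balance(1e9))
```

Under these assumptions, 60% per year compounds to roughly 110x per decade, while 14% per year gives less than 4x over the same period, which is the gap the slide is pointing at.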
Extrapolation from 1950s:
20-30% growth per year
[Chart: Storage, Backbone, Processing, and Memory on a log scale (1, Kilo, Mega, Giga, Tera) vs. year, 1947 to 2007; Telephone Service grows at 17% / year]
CACM 1997 Predictions
[Chart: Processing, Primary Memory, and Secondary Memory on a log scale vs. year, 1947 to 2047; scale marks at 1, 10^3 (kilo), 10^6 (mega), 10^9 (giga), 10^12 (tera), 10^15 (peta), 10^18 (exa)]
Power




PC Software
Ecosystem
Riding Moore’s Law
Scale up and out
64-bit, 64-way
Next Generation
Secure Computing
Base
Convenience



Wireless networking
Always on
Always with you
Personal Computer
in 2005-6
CPU: 4-6 GHz; 2 cores
Memory: 2+ GB
Disk: 0.5 - 1 TB
GPU: 4x today
Net: 1Gbps; 54Mbps wireless
2010 Platforms
DeskTop / Phone …
Home
PDA, etc.
Server
Processor 50 GHz
(3+Pc per
chip
12.5 GHz
Quad Multi-Core
(programmed 50GHz
everything)
(12+ computing
elements)
Memory
50 GB.
15 GB
200 GB+ (NUMA)…
TByte servers in lab!!
Storage
3 TB
4 GB flash;
500 GB disk
5 EB (exabyte, 10^18)
Display
30” flat
panel
OLED…
paper alt.?
Network
GPUs (Then & Now)
Ardent Titan c1988      Best card c2004
1 pipe (4 results)      16 pixel pipelines
32 MHz                  500 MHz
0.256 GB/s              35 GB/s (Mem BW)
0.1-0.2 MT/s            800 MT/s
0.017 Gpixel moves      8 Gpixels/s fill
National Storage Roadmap
2000
100x/decade
=100%/year
~10x/decade = 60%/year
Storage Trends
Source: Ed Grochowski, IBM Research Almaden
The virtuous cycle of bandwidth
supply and demand
Increased
Demand
Standards
IP
Create new
service
Telnet & FTP
EMAIL
Increase Capacity
(circuits & bw)
Lower
response time
WWW
Audio
Video
Voice!
Grids
Bell’s law of computer class
formation to cover Cyberspace




New computer platforms emerge based
on chip density evolution
Computer classes require new
platforms, networks, and cyberization
New apps and content develop around
each new class
Each class becomes a vertically
disintegrated industry based on
hardware and software standards
Every decade a new, lower cost class
of computers emerges, defined by
Computing platform
Interface to humans or other parts of world
New networking and/or interconnect structure
[Chart: log (people per computer) vs. year, with classes Mainframe, Minicomputer, Workstation, PC, Laptop, PDA, ???. Source: David Culler UC/Berkeley]
New Role for Computing
[Chart: log (people per computer) vs. year, with roles labeled Number Crunching, Data Storage, productivity, interactive, and streaming information to/from physical world. Source: David Culler UC/Berkeley]
CMOS Trends: miniaturization
and more
Itanium2 (241M )
nearly a thousand 8086’s
would fit in a modern
microprocessor
[Figure: one chip integrating Actuation, Sensing, Processing & Storage, and Communication; radio blocks labeled PLL, baseband filters, mixer, LNA]
David Culler UC/Berkeley
Platform, Interface, & Network
Computer Class Enablers
            “The Computer” Mainframe   Mini & Timesharing   PC/WS                      Web browser
Network                                POTS                 LAN                        Internet
Interface   direct > batch             terminals via        WIMP                       Web, HTML
                                       commands
Platform    tube, core, drum, tape,    SSI-MSI, disk,       micro, floppy, disk,       PC, scalable servers
            batch O/S                  timeshare O/S        bit-map display, mouse,
                                                            dist’d O/S
Network
Interface
Platform
Platform, Interface, & Network
Computer Class Enablers
Web services Communicator Home nets
Wireless
(Phone <-> PDA
based
monitoring
evolution?) Entertainm’t,
infrastructure
health, monitor minimal
Clusters ala Phone/PDA, Multi
…
Beowulf; grid Gbyte, camera,
GPS, body nets
sense/effect
TV/PC
XML
converge,
Pocket sized
Wireless
sense/effect
www servers,
Periphery
GPRS, WiFi,
web services,
monitoring
Wired &
Web
services
Lambda-nets
networks
wireless
Corp. nets
networks
Conclusion: a new era

New “computer” classes create new industries
Web services (Grid)
• Virtually unlimited storage enables the lifetime store
• Networked computing takes over home entertainment
• High speed wireless networks
• Smart Personal Objects


New classes require new breeds of software
PC At An Inflection Point
PCs
Non-PC
devices and Internet
TV/AV
Mobile
Companions
Consumer
PCs
The Dawn Of The PC-Plus Era,
Not The Post-PC Era…
devices aggregate via PCs!!!
Communications
Automation
& Security
Household
Management
Telephone, Television,
and Radio…
Evolution of media in the home
Yesterday:
• Analog storage
• Separate distribution networks
• Physical space limitations
• Tedious management and manual search
Today:
• Digital storage proliferation (CDs, DVDs, PVRs, MPEG & WMA/V)
• Digital cable, internet radio, analog phone
• Storage limitations & different stores for different stuff
Tomorrow:
• All digital
• “PC” platforms
• Everything connected (IP)
• Unlimited storage
• Everything in a database
SQL
[Diagram: home A/V wiring around a Media Center Computer: 5 speakers plus woofer, receiver, CD, DVD, cassette, legacy VCR, set-top boxes (cable/satellite), Ethernet, camera, mic, keyboard and mouse; links labeled stereo, 5.1 digital, IR, SVHS-wide, and Video*]
Cables/links
Speaker 5+1
Plasma 2 or 3
Cable/Enet 2
IR 8
Stereo 4
5.1 digital 2
Comp./S-video 3
Plasma panel 1
Power 10
Kbd/mse 2
Monitor II (opt.) 4
Camera 2
Total 42 – 46
Things 18+remote
Video*
Plasma Panel
*Video = composite or S-video
Tivo for Radio
Lifetime Personal Information Stores
based on MyLifeBits
Gordon Bell, Jim Gemmell, Roger Lueder
The 1 TB Life

1TB gives 65+ years. 25,000 days at:









100 email messages a day (5KB each)
100 web pages a day (50KB each)
5 scanned pages of paper a day (100KB each)
1 book every 10 days (1 MB each)
10 photos per day (400 KB JPEG each)
8 hours per day of sound - e.g. telephone,
voice annotations, and meeting recordings (8 Kb/s)
1 new music CD every 10 days (45 min each at 128 Kb/s)
5 years to fill c2004 80 GB drives
Want video? Buy more drives (1 TB/year gets
4 hours/day @ 1.5 Mb/s video)
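The budget above can be checked with a few lines of arithmetic. This is a rough sketch, assuming decimal units (1 KB = 1,000 bytes) and a 365-day year; the dictionary keys and helper name are invented for the example, while the per-item sizes and rates are the ones on the slide.

```python
KB, MB, GB, TB = 1e3, 1e6, 1e9, 1e12  # decimal units (an assumption)


def stream_bytes(hours: float, bits_per_second: float) -> float:
    """Bytes produced by a constant-bit-rate stream of the given length."""
    return hours * 3600 * bits_per_second / 8


# Bytes captured per day, item by item, using the slide's figures.
DAILY_BYTES = {
    "email": 100 * 5 * KB,
    "web pages": 100 * 50 * KB,
    "scanned pages": 5 * 100 * KB,
    "books": 1 * MB / 10,                         # one 1 MB book every 10 days
    "photos": 10 * 400 * KB,
    "sound": stream_bytes(8, 8e3),                # 8 hours at 8 Kb/s
    "music CDs": stream_bytes(0.75, 128e3) / 10,  # 45 min at 128 Kb/s, every 10 days
}

daily_total = sum(DAILY_BYTES.values())
lifetime_tb = daily_total * 25_000 / TB
years_to_fill_80gb = 80 * GB / daily_total / 365
video_tb_per_year = stream_bytes(4, 1.5e6) * 365 / TB

if __name__ == "__main__":
    print(f"daily total: {daily_total / MB:.2f} MB")
    print(f"25,000 days: {lifetime_tb:.2f} TB")
    print(f"80 GB drive lasts: {years_to_fill_80gb:.1f} years")
    print(f"4 h/day of 1.5 Mb/s video: {video_tb_per_year:.2f} TB/year")
```

With these assumptions the daily total comes to about 43 MB, 25,000 days to about 1.08 TB, an 80 GB drive lasts about 5 years, and the video stream adds just under 1 TB per year, in line with the slide. Sound is by far the largest non-video item.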
Everything goes in a database

You need all the features of a database
(Consistency, Indexing, Pivoting, Queries, Speed/scalability, Backup, replication)


If you don’t use one, you will create one!
Files are blobs that sync with legacy file system & apps
SQL
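As a minimal sketch of the "everything goes in a database" idea, the example below stores items as blobs with a little metadata and attaches annotations to them. It uses Python's built-in `sqlite3` module purely for illustration; the table names, columns, and helper functions are hypothetical and are not the MyLifeBits schema.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create (or open) a toy personal store with items and annotations."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS items (
            id      INTEGER PRIMARY KEY,
            kind    TEXT NOT NULL,      -- e.g. 'photo', 'email'
            title   TEXT,
            created TEXT NOT NULL,      -- ISO 8601 timestamp
            content BLOB                -- the file itself, stored as a blob
        );
        CREATE TABLE IF NOT EXISTS annotations (
            item_id INTEGER NOT NULL REFERENCES items(id),
            note    TEXT NOT NULL
        );
        CREATE INDEX IF NOT EXISTS idx_items_kind_created
            ON items(kind, created);
    """)
    return conn


def add_item(conn, kind, title, created, content) -> int:
    """Insert one item and return its row id."""
    cur = conn.execute(
        "INSERT INTO items (kind, title, created, content) VALUES (?, ?, ?, ?)",
        (kind, title, created, content),
    )
    return cur.lastrowid


def annotate(conn, item_id, note) -> None:
    """Attach a text annotation to an existing item."""
    conn.execute("INSERT INTO annotations (item_id, note) VALUES (?, ?)",
                 (item_id, note))


def find(conn, kind, since):
    """Return (title, note) pairs for items of one kind created since a date."""
    rows = conn.execute(
        "SELECT i.title, a.note FROM items i "
        "LEFT JOIN annotations a ON a.item_id = i.id "
        "WHERE i.kind = ? AND i.created >= ? ORDER BY i.created",
        (kind, since),
    )
    return rows.fetchall()
```

Indexing on kind and creation time is one way to get the pivoting and query behavior the slide asks for; a real store would also need the backup, replication, and file-system sync features listed above.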
MyLifeBits Software
[Diagram: the MyLifeBits store (database plus files) and the tools around it]
GPS import & Map display
TV capture tool
SenseCam
Telephone capture tool
Internet
TV EPG download tool
Browser tool
MyLifeBits Shell
Screen saver
PocketPC transfer tool
PocketRadio player
Radio capture & EPG
MAPI interface
Legacy email client
Legacy applications
IM capture
Voice annotation tool
Text annotation tool
Import files
Memex
As We May Think, Vannevar Bush, 1945
“A memex is a device in which an individual stores all
his books, records, and communications, and which
is mechanized so that it may be consulted with
exceeding speed and flexibility”
• Full-text search, text & audio annotations, and hyperlinks
I am data
Statistics of use
Capture and encoding
I mean everything
WWMX.org
Personal Capture of Content in Cyberspace
Steve Mann c1995
The personal capture space…

Capture…

In passing at personal level





Network based





Office, including web pages
Legacy: Paper, photos, CDs, Video,
Real time: Phone, meetings,
On board aka on body
Especially phone
Smart rooms
TV
Personal carry always devices
Organization…

Automatic … Ease of organizing, annotating, retrieval
On body sense & capture…

Architecture –







Universal or many devices or networked devices with hub
Connection with any external sensors or networks…
Purpose – meeting, experience capture, surveillance,
memory prosthesis, health, …
Sensors: camera, GPS/compass, voice | audio, text,
stills | movies
Displays: augmented reality, etc.
Environment: temperature, light, etc.
Physiological: BodyMedia (energy expended, acceleration,
pulse | heart rate,…)
The A/V/real time data Future:
SenseCam
new capture modes/devices
Deja View
MSR Cambridge
SenseCam
Quindi Meeting capture
St. Jude
Pacemaker
Body Media
IP On Everything
“I sensed”
Clarkson MIT c2001
Visually impaired
UW 2004
Potentially useful trivia – but not
normally photographed
Advanced Soldier Sensor Information
System and Technology (ASSIST)




The objective is to exploit soldier-worn sensors to
augment recall and reporting to enhance situational
understanding.
Demonstrate new capabilities that exploit information
captured via soldier-worn sensors. Input streams from
location, images, audio, and motion sensors -- logged
and processed for reports and representations.
Capture – active information capture and voice
annotations. …prototype wearable capture units and
supporting operational software for processing, logging
and retrieval.
Analysis – passive collection and automated
activity/object recognition. …
Personal devices

Will the notebooks we all know and love to
carry take on a much smaller and/or
disintegrated form factor?

Phone+ camera, GPS, personal store, “PC”,
body area gateway
Tablet or book?
• General purpose or n special appliances?

One appliance, one function versus
one appliance, multiple functions
OQO & Tiqit
Chameleon: PC/XP & CE phone
256 MB; 20 GB; 800 x 300 pixels; c2001
Smart Personal ObjecTs (SPOT)
Services
Network
MSN Operations Center
Dedicated Ku-band Satellite Feed
12 kbps/radio station
FM subcarrier broadcast
Frame Relay
WAN
And more…
US:
All 50 states
Top 100 MTAs
219 FM stations
177M reach
Canada:
Top 12 cities
24 FM Stations
12.5M reach
Objects
Issues for the Tbyte(s), Lifetime, PC:
Killer apps in home & office
1. One dbase for everything (articles, books, conversations, ... financial transactions) …vs. long-term use of hierarchical files.
2. Guarantee that data will live forever! “dear appy” problem
3. Cheap, easy, and data-rich (e.g. time, place) capture:
   - GPS and time everywhere
   - Paper capture has to be as easy as discarding (scanner/shredder)
   - Personal meeting capture...perhaps by the room
   - E-book…e-magazines & journals need to have critical mass!
   - Telephony and audio capture with indexing (telephonic speech-to-text needed)
   - Media Center compatible for entertainment (photos, video, TV, radio)
   - CARPE: Continuous Archival Recording & Retrieval of Personal Experience
4. Content analysis (critical for photo & video!); doable for text.
5. Annotations/meta-information add ever-increasing value at high cost!
6. Easy annotation for aiding search and it becomes the content
7. Information control: privacy, security, expunge/deniability,…
   - Having to be schizophrenic or have a lobotomy when leaving a “life” or being a part of some other person’s life recording
8. Other “killer apps”: Alzheimer, immortality, surrogate memory?
9. GUI’s to improve use (e.g. time to learn, use, aid in retention)
MyLifeBits Challenge for Multimedia
1. “Handling” picture, audio, & video “content”!
2. Just plain photo content analysis
   - Faces, places, things, scene types, any attribute, etc.
3. Capturing audio accurately and easily
   - All kinds of microphones… just like our ears can
   - Speech-to-text for retrieval
4. Video capture
   - Segmenting content into useful clips
   - Doing more than treating scenes as just pictures i.e. what is happening
Problems: Control, “Amnesia”
Ownership & other “life” bits issues

Full sharing of bits that are mine
I created them, OK to copy and distribute
• DRM: purchased for my own use

The bits “belong” to a corporation or org.
• “OK to look at, but I only own half the bits”
• The bits are the real, untampered bits
• Controlling forgetfulness



Private, do not “demo”
Expunge forever... “this never happened”
Codec not found
The “dear appy” problem
Dear Appy,
How committed are you?
Please come back to me,
Lost and forgotten data

Who’s responsible?
• media
• platform, file, and databases
• evolving standards and formats
• evolving and/or disappearing apps
The End
How to lose at Video Conferencing
1. “Voice quality” must be comparable to the low cost alternative
2. “The call” must be as easy as the low cost alternative
3. Video conferencing must be as ubiquitous as the low cost alternative.
4. Video must increase “presence”
The Content Analysis Problem
1. “Cliplets”: Automatic segmentation of a pile of documents and video into individual documents and scenes.
2. Item typing: Would like a minimal Dublin Core for each item: date, creator, title, source, abstract, and type
3. “Type” classification: articles, letters, memos, etc.
4. Ontology creation for collections
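The "minimal Dublin Core" record described above can be sketched as a small data structure with the six fields the slide names. The class and method names here are hypothetical, chosen only for this illustration; the field list is taken directly from the slide.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class MinimalRecord:
    """Six per-item fields named on the slide; all optional until filled in."""
    date: Optional[str] = None
    creator: Optional[str] = None
    title: Optional[str] = None
    source: Optional[str] = None
    abstract: Optional[str] = None
    type: Optional[str] = None

    def missing_fields(self) -> List[str]:
        """Names of fields still empty, i.e. work left for content analysis."""
        return [name for name, value in asdict(self).items() if not value]


if __name__ == "__main__":
    record = MinimalRecord(date="2004-10-12", creator="Gordon Bell",
                           title="How MyLifeBits Sees Multimedia",
                           type="presentation")
    print(record.missing_fields())
```

Listing the empty fields makes the content analysis problem concrete: capture tools can usually fill in date and creator, while abstract and type are the fields that tend to need automatic analysis or manual annotation.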
What we need from multi-mediators
1. Fewer data types… we are drowning in new types. Standards.
2. Dear Appy, A general transcoder.
3. Better audio starting with a range of microphones. Essential for speech recognition
4. Names, places, and times for every object in a photo or video!
5. Who said what?
6. Who was playing, what, when? Who wrote it?
7. Composition of stories from content….
8. CODEC (Standards) Hell: 3-5 groups evolving
9. Picture aspirations lead technology
10. Cell phones may be the way to video conf.
MM: past, present, future
• Relation to IEEE on Visualization
• What do we see from Moore’s Law, Bell’s Law & platforms
• Standard intro of platforms and potential one (new portable pmc)
• Difficulty of bringing high quality video to the desktop…Moviemaker.
• Looking back technology… didn’t predict cellphone.
– Replace the PC as the UI for communication, commerce, entertainment
• 3 levels of a/v: portable, stationary including rooms, caves
– Also brings up is it personal or does the infrastructure know it all?
– Interesting thesis: GPU will enable vast computation
• Show my video vs. transcription?? Probably not
• Show the video of acm 97…look at how far we have to go
– Everything in cyberspace; telepresentation would be in 2047
• a/v as a user data-type (create, produce, use: ambience)
• DUST as a platform. Low data rate. Evolution unclear.
• What’s in the network? Cameras EVERYPLACE!
Technology lessons
• Wilkes build system as if it will someday be true
• Moore’s and Bell’s Law to predict platforms
• ACM 97 paper missed the cellphone Chameleon
• No technology before its time e.g. tablets, cellphones with high bandwidth,
Multimedia in the large:
More than just another A/V file
1. 2-D documents including graphics
2. Pictures
3. Audio: personal & professional voice, music, sounds;
radio; telephone;
4. Plain old video: personal & professional
5. Web content that includes all of the above
6. Data streams of all kinds (entertainment, carpe,
• Environmental monitoring
• Meetings (sound, video, presentation, notes) monitoring;
• Continuous a/v monitoring e.g. DejaView, SenseCam… including use for reality TV and multiple users
• Continuous body monitoring stuff e.g. BodyMedia
• Record of user behavior
• Medical imagers..
Multimedia: profitable bets
more than just betting against optimists
10/93:12/96 VOD. 5 cities. 250K users
9/94:+6 mos.10K units, Microunity’s MM processor (VOD, set-top)
96:01 Multimedia Roundtable
Telepresentations will be a well defined app
More people view ACM97 than attend
6/96:4/01 50% PCs will have video; 10% of those used
3/97:12/00 10K machines communicate @Gbps
8/99:12/04 LEP/OLED will outsell LCDs; e-ink outsell LEP/OLED
Dust Networks
LifeLines (Plaisant et al.)
www.cs.umd.edu/hcil/lifelines
University of Maryland
Capturing what you see
ACM Multimedia 2004 call for papers
Multimedia 2004 invites your participation in the premier annual multimedia conference,
covering all aspects of multimedia computing: from underlying technologies to applications,
theory to practice, and servers to networks to devices. We especially encourage
introduction of novel media such as haptic, smell, sensors, animation, etc.
Technical Program
The technical program will consist of plenary sessions and talks with topics of interest in:
• Multimedia analysis, processing, and retrieval, including multimedia semantics,
aesthetics, modeling, fusion, audio/video/multi-modal processing, multimedia content
description and indexing, multimedia digital rights management (protection and attribution),
content-based retrieval with emphasis on multiple and novel media.
• Multimedia networking and system support, including context-aware multimedia
communications, Internet telephony, peer-to-peer streaming, audio/video streaming,
multimedia content distribution, wireless multimedia, adaptive support for scalable media,
Internet protocols, multimedia servers, operating systems, middleware and QoS.
• Multimedia tools, end-systems, and applications, including new UI metaphors, usable
distributed collaboration, authoring, multi-modal interaction and integration, multimedia in e-learning, entertainment, personal media, assisted living, and virtual environments.
Multimedia Analysis, Processing and Retrieval Track
The Multimedia analysis, processing, and retrieval track of ACM Multimedia has always
been at the forefront of research in the area of media mining, media processing, and
media presentation. We are highly encouraging submissions in these areas:
1. Containing novel and fresh ideas,
2. Questioning existing paradigms/unwritten rules, or
3. Advancing the field by thorough theoretical or experimental analysis
4. Chartering into new directions (e.g., multimedia sensor networks on distributed
platforms)
Original papers are solicited in, but are not limited to the following technical
areas:
• Multimedia analysis, processing, and retrieval
• Multimedia content description
• Audio/video/multi-modal processing
• Multimedia semantics modeling
• Multimedia indexing and retrieval
• Digital rights management
• MPEG-7/-21 standards
• Content-based retrieval with emphasis on multiple and novel media
• Media Mining
• Multimedia sensor networks on small/large-scale distributed platforms
• Active media capturing, processing, and rendering from a control angle.
The term multimedia is interpreted in a very broad sense. It encompasses image, audio,
video, tactile, and/or olfactory data as well as compound documents such as
presentations (e.g., in PowerPoint), word documents, media emails, and web pages.
Multimedia Networking and System Support Track
ACM Multimedia has been a premier annual conference, where researchers,
developers, and practitioners from academia and industry present new ideas
and future directions, and experience a stimulating synergy in all aspects of
multimedia computing.
For the network and system support track in the technical program, we invite
submissions in the topics below, but not limited to:
• context-aware multimedia communications
• Internet telephony
• peer-to-peer streaming
• broadband audio/video streaming
• multimedia content distribution
• wireless networking for multimedia
• multimedia in ad-hoc networks
• ubiquitous multimedia services
• multimedia synchronization
• multimedia authentication and security
• multimedia server design
• QoS-aware resource allocation
Multimedia Tools, End-systems, and Applications Track
For the applications and tools track in the technical program, we invite
submissions in the topics below, but not limited to:
- UI metaphors
- usable distributed collaboration
- authoring
- multi-modal interaction and integration
- multimedia in e-learning
- entertainment
- personal media
- assisted living
- virtual environments
Papers should present novel multimedia tools and applications, or a theoretical
or empirical contribution that advances our understanding of how to design
or implement successful multimedia tools and applications. Submission should
make clear what the contribution is, and how it has been validated.
Submissions that present novel applications and tools must place these in the
context of state-of-the-art multimedia research, and state clearly what the
advancement is compared to previous applications and tools.
If the main contribution is the application of multimedia tools and techniques to
another field (for example, education, entertainment, security), the submission
should
- identify and explain the need or problem in that field, and
- present some proof that the application meets the requirements and solves
the problem (e.g. performance comparison or usability evaluation).