Advanced Video Capabilities of HD DVD-Video

Kilroy Hughes
Digital Media Architect
Microsoft Corporation
Contents
• What is HD DVD-Video Format?
• Video Capabilities
• Video/Graphics Layout Model (2D)
• Video/Graphics Composition Model (3D)
• Presentation and Synchronization Model (4D)
• Programming
• Animation
• Resource Management
• Output
• Conclusion
What is HD DVD-Video?
• “HD DVD-Video Format” is an APPLICATION format (i.e.
content format) defined by the DVD Forum for use on
various storage media
• The HD DVD-Video Application format is currently
specified for use on:
– HD DVD-ROM discs (blue laser, 15 – 60GB)
– DVD-ROM discs (red laser, 4.7 – 16.8GB)
– R/W storage (flash memory, hard disk, etc.)
• The format can combine video, audio, text, and graphics
from optical disc, internal player storage, local area
network storage, and program streams from the Web
into realtime interactive video presentations
Advanced Video Capabilities
• Simultaneous presentation of:
– 1 stream of up to 1920x1080P30 HD video (MPEG-2, H.264, or SMPTE 421M)
– 1 stream of up to 720x576P30 video (SD required, HD optional)
– 3 streams of up to 8 channel audio; streams can be from different sources
– 1 stream of text and graphics Subtitles, or bitmap Subpictures
– 16 Applications with programmed text, images, drawing, and animation
– A graphical cursor controlled by a pointing device (e.g. joystick, mouse, trackball,
pressure pad, etc.)
• Z-order and alpha blend of graphics objects, and alpha blending of graphics
and video planes
• Independent scaling, clipping, and positioning of all video and graphics
objects
• Property animation (i.e. object size, position, transparency, color, etc. can
be changed over time)
• Frame accurate composition and animation based on timecodes derived
from video time (video frame/position) or Application time (elapsed time)
• Network support that enables updating presentations on optical disc with
new content and programming that can be downloaded and streamed from
the Web (e.g. new subtitles, new commentary, new movie trailers, new
menus, new storyboard guide, new video games, etc.)
2D Video/Graphics Layout Model
Figure: 2D layout of the Canvas, an Application Region, and the Full Screen Display Aperture
• The Canvas coordinate space and the Application coordinates each have their origin (0, 0)
at the upper left (all origins upper left), and coordinates extend to +2^31 - 1
• The Full Screen Display Aperture size is author specified (e.g. 1920x1080, 1280x720)
• Text and video outside the Aperture are invisible; the figure labels the points (-200, 1000)
and (2220, 1000), which lie outside an Aperture whose lower right corner is at (1920, 1080)
• Note: Only App text/graphics inside both the App Region and the Aperture are visible
• Example of a video object with portions outside the visible Aperture
(to pan right, the video object position would be animated left, etc.)
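The visibility rule above amounts to a rectangle intersection. The following Python sketch illustrates it under stated assumptions: the `Rect` type, the function names, and the sample coordinates are hypothetical, and all rectangles are assumed to be already expressed in Canvas coordinates.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    """Axis-aligned rectangle; right and bottom edges are exclusive."""
    left: int
    top: int
    right: int
    bottom: int


def intersect(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the overlap of two rectangles, or None if they do not overlap."""
    left, top = max(a.left, b.left), max(a.top, b.top)
    right, bottom = min(a.right, b.right), min(a.bottom, b.bottom)
    if left >= right or top >= bottom:
        return None
    return Rect(left, top, right, bottom)


def visible_part(obj: Rect, app_region: Rect, aperture: Rect) -> Optional[Rect]:
    """Clip an object first to its Application Region, then to the Aperture."""
    clipped = intersect(obj, app_region)
    return intersect(clipped, aperture) if clipped else None


# Hypothetical values: a 1920x1080 Aperture and an object wider than it.
aperture = Rect(0, 0, 1920, 1080)
app_region = Rect(0, 0, 2500, 1080)
wide_object = Rect(-200, 0, 2220, 1000)
visible = visible_part(wide_object, app_region, aperture)
```

Animating `wide_object.left` toward more negative values would slide a different part of the object into the Aperture, which is the "pan right" effect the slide describes.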
3D Multi-Plane Composition Model
Figure: the presentation planes stacked along the Z-axis, listed from the viewer's Point of View to the back:
– Cursor
– Interactive Graphics (Application and Object Z-order)
– Subtitles
– Secondary Video (Alpha key & Rect)
– Primary Video and background (Opaque)
The label "Object opacity style" appears three times in the figure.
3D Multi-Application and Object Composition Model
Figure: the Interactive Graphics Plane holds z-ordered Applications (Z=0, Z=1, ... Z=N)
• Each Application has an Application Region, a 3D Region that contains its text and
graphics objects
• Objects in an App are z-ordered: z=0.0, z=0.1, ... z=0.n in Application 0,
z=1.0, z=1.1, ... z=1.n in Application 1, through z=N.0, z=N.1, ... z=N.n in Application N
• Painter’s algorithm draws objects from back to front, from z=N.n to z=0.0, with
“Source Over” mixing
Application and Object Z-orders can be dynamically changed by programming
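A minimal Python sketch of back-to-front drawing with "Source Over" mixing, for a single pixel. The tuple layout `(app_z, object_z, rgba)`, the non-premultiplied float colors, and the function names are assumptions made for illustration; they are not taken from the format specification.

```python
def source_over(src, dst):
    """Porter-Duff 'source over' for non-premultiplied (r, g, b, a) floats in 0..1."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1.0 - sa)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)

    def mix(s, d):
        return (s * sa + d * da * (1.0 - sa)) / out_a

    return (mix(sr, dr), mix(sg, dg), mix(sb, db), out_a)


def composite_pixel(objects, background=(0.0, 0.0, 0.0, 0.0)):
    """Paint objects from the largest z (back) to z=0.0 (front).

    Each object is (app_z, object_z, rgba); the pair (app_z, object_z)
    plays the role of the slide's z=N.n notation.
    """
    result = background
    back_to_front = sorted(objects, key=lambda o: (o[0], o[1]), reverse=True)
    for _app_z, _obj_z, rgba in back_to_front:
        result = source_over(rgba, result)
    return result


# An opaque blue object in Application 1 behind a half-transparent red one in Application 0.
pixel = composite_pixel([
    (0, 0, (1.0, 0.0, 0.0, 0.5)),
    (1, 0, (0.0, 0.0, 1.0, 1.0)),
])
```

Because the order is recomputed from the z values on every call, changing an Application or object z-order by programming only requires changing the numbers passed in.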
Video Keying and Blending
• The Primary Video Plane is opaque, and any area not filled
with video will show a designated background color
• The Secondary Video Plane can be “luma” and chroma
keyed, can have transparent objects called “clear
rectangles”, and can set an Opacity style property (alpha
value) for the entire video object
– “Luma key” treats author designated sub-black pixels as
transparent to the Primary Video below it (intended for
professionally pre-produced blue screen or rotoscoped mattes)
– “Chroma key” allows authors to designate a transparent color
range, with the caveat that color quantization and block transforms
used in video compression may result in rough edges and
unintended areas of opacity or transparency (may be appropriate
for “live video” overlay)
– A video alpha channel for alpha per pixel is not supported
• “Clear Rectangles” are layout objects defined in Graphics
Plane Applications that “cut a hole” through any graphics
objects in the same area and reveal either the Primary or
Secondary Video beneath as designated
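The keying rules above can be read as a per-pixel opacity decision for the Secondary Video Plane. The Python sketch below is illustrative only: it assumes 8-bit Y'CbCr pixels, a luma threshold below nominal black (16), and a chroma key expressed as Cb and Cr ranges. The parameter names and threshold values are hypothetical, and the format may define the key ranges differently.

```python
def secondary_video_alpha(y, cb, cr, *, luma_key=None, chroma_range=None, opacity=1.0):
    """Return the alpha (0.0 to 1.0) of one Secondary Video pixel.

    luma_key:     pixels with luma at or below this sub-black value are transparent.
    chroma_range: ((cb_lo, cb_hi), (cr_lo, cr_hi)); pixels inside both ranges
                  are transparent.
    opacity:      Opacity style value applied to the whole video object.
    """
    if luma_key is not None and y <= luma_key:
        return 0.0
    if chroma_range is not None:
        (cb_lo, cb_hi), (cr_lo, cr_hi) = chroma_range
        if cb_lo <= cb <= cb_hi and cr_lo <= cr <= cr_hi:
            return 0.0
    return opacity


# Hypothetical key settings for illustration.
matte_pixel = secondary_video_alpha(10, 128, 128, luma_key=15)
picture_pixel = secondary_video_alpha(100, 128, 128, luma_key=15, opacity=0.5)
```

The result is either fully transparent or the single object-wide opacity, which reflects the slide's point that there is no per-pixel video alpha channel.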
Primary and Secondary Clear Rectangles
Figure: planes from front to back: Graphics Plane (containing a Secondary Clear Rect
and a Primary Clear Rect), Secondary Video, Primary Video
Example screens:
– Primary Video Plane
– Secondary Video Overlay (Not Keyed)
– Secondary Video with Chroma Key
– An Image in the Graphics Plane Overlaying Primary Video
– Example of a “Clear Rectangle” Punching Through Graphics to Video
– Secondary Video with Clear Rectangle to Secondary Video Plane
– Secondary Video with Clear Rectangle to Primary Video Plane
Presentation and Synchronization Model
• HD DVD-Video uses an XML presentation language referred to as
“iHD” for frame accurate video and graphics presentation and
animation
• A “Title Timeline” is specified for each presentation sequence (a
Title); and Video Clips, Audio Clips, Subtitles, Applications, and
Resources are laid out in sequences on that timeline and called
Tracks
• Multiple Titles can be combined in a Playlist, which contains all the
valid content and playback sequences defined for a disc and its
associated downloaded and streamed content
• iHD Applications use a timing language that can reference the
timecode of a Title, which is synchronized to a frame of video or
audio on each Track, so iHD Applications can create deterministic,
frame accurate, interactive graphics and video presentations
• Simple interactive video applications without interactive graphics
can be created with only a Playlist, video and audio Program
Streams, and Time Map indexes for those Program Streams
Playlists
• Typically multiple Titles in a Playlist
• Each Title has its own timeline and Title:Timecode
• Video Clips sequence to form Video Tracks
• Audio Clips sequence to form Audio Tracks
• Subtitle Segments sequence to form Subtitle Tracks
• Application Segments sequence to form Application Tracks
• Application Resource Tracks span one Application
• Title Resource Tracks span multiple Applications
• Playlist Applications and Resources span multiple Titles
• Playlists also specify:
– Configuration information such as Aperture size
– Navigation mapping of Tracks for remote controls
– Media attributes that identify codec, resolution, active area, source
frame rate, number of audio channels, nominal bitrate, etc.
Playlist Title with 3 Video Clips
Figure: a Title Timeline with chapter marks Ch1, Ch2, Ch3, and End
– Video Track: Video Clip 1, Video Clip 2, Video Clip 3
– Audio Track: Audio Clip 1, Audio Clip 2, Audio Clip 3
• Program Stream “Clips” can be segments of the same or different files;
they are combined on the Title timeline and “spliced” on playback
• Three Time Map files, TMAP (File 1.MAP), TMAP (File 2.MAP), and TMAP (File 3.MAP),
provide timecode to byte offset indexes for three video files:
– P-storage A/V (File 1.EVOB)
– Disc A/V (File 2.EVOB)
– Web A/V (File 3.EVOB)
• File/byte offsets are used to play Program Streams from files or over the HTTP protocol
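A Time Map lookup can be sketched as a sorted index search. In the Python below, the `TimeMap` and `Clip` classes, their field names, and the sample index values are hypothetical; the actual TMAP file layout is not described in these slides.

```python
import bisect
from dataclasses import dataclass


@dataclass
class TimeMap:
    times: list    # ascending timecodes (seconds) of index entries within one file
    offsets: list  # byte offset of each index entry

    def byte_offset(self, file_time: float) -> int:
        """Return the byte offset of the last index entry at or before file_time."""
        i = bisect.bisect_right(self.times, file_time) - 1
        if i < 0:
            raise ValueError("time precedes the first index entry")
        return self.offsets[i]


@dataclass
class Clip:
    title_begin: float  # position of the clip on the Title Timeline
    title_end: float
    source: str         # file name or URL of the Program Stream
    clip_begin: float   # time within the source file where the clip starts
    tmap: TimeMap


def locate(clips, title_time: float):
    """Map a Title Timeline position to (source, byte offset)."""
    for clip in clips:
        if clip.title_begin <= title_time < clip.title_end:
            file_time = clip.clip_begin + (title_time - clip.title_begin)
            return clip.source, clip.tmap.byte_offset(file_time)
    raise LookupError("no clip at this title time")


# Made-up index entries for two of the files named on the slide.
tmap1 = TimeMap([0.0, 0.5, 1.0], [0, 400_000, 800_000])
tmap2 = TimeMap([0.0, 0.5, 1.0], [0, 300_000, 650_000])
clips = [
    Clip(0.0, 10.0, "File 1.EVOB", 0.0, tmap1),
    Clip(10.0, 20.0, "File 2.EVOB", 0.5, tmap2),
]
source, offset = locate(clips, 10.2)
```

The same `(source, offset)` pair could drive either a local file seek or a byte-range request over HTTP, which is why the index is expressed in bytes rather than frames.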
Playlist with Secondary Video
Figure: a Title Timeline with chapter marks Ch1, Ch2, Ch3, and End, and three groups of Tracks:
– Main Video: Video Clips 1, 2, 3 and Audio Clips 1, 2, 3
– Sub Video: Video Clips 1, 2, 3 and Audio Clips 1, 2, 3
– Application: Menu App 1, Tablet PC App 2, Tracking App 3, with App 1 Resources,
App 2 Resources, and App 3 Resources
4D layout of content that can be shown in the Primary Video Plane, Secondary Video
Plane, and Graphics Plane, with additional control by iHD Application programs
Playlist Resource Management
Figure: a Title Timeline with chapter marks Ch1, Ch2, Ch3, and End, and four groups of Tracks:
– Main Video: Video Clips 1, 2, 3 and Audio Clips 1, 2, 3
– Sub Video: Video Clips 1, 2, 3 and Audio Clips 1, 2, 3
– App: Menu App 1, Tablet PC App 2, Tracking App 3
– Resources: App 1 Resources, App 2 Resources, App 3 Resources
The Resource Track on the bottom schedules loading and unloading of all
required Application files into a 64MB File Cache so they are instantly accessible
to the user during any portion of the Title when that App is “valid”
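One way to picture the Resource Track is as a schedule of load and unload events whose running total must stay within the File Cache. The Python sketch below checks that; the tuple layout, the function names, and the interpretation of 64MB as 64 x 1024 x 1024 bytes are assumptions made for illustration.

```python
FILE_CACHE_BYTES = 64 * 1024 * 1024  # assumed meaning of "64MB"


def peak_cache_usage(resources):
    """Return the peak number of bytes resident in the cache.

    Each resource is (name, size_bytes, load_at, unload_at), with times in
    seconds on the Title Timeline.
    """
    events = []
    for _name, size, load_at, unload_at in resources:
        events.append((load_at, 1, size))      # kind 1: load
        events.append((unload_at, 0, -size))   # kind 0: unload, sorts first at equal times
    events.sort()
    used = peak = 0
    for _time, _kind, delta in events:
        used += delta
        peak = max(peak, used)
    return peak


def fits_file_cache(resources) -> bool:
    return peak_cache_usage(resources) <= FILE_CACHE_BYTES


MB = 1024 * 1024
sequential = [("App 1 Resources", 40 * MB, 0, 100), ("App 2 Resources", 30 * MB, 100, 200)]
overlapping = [("App 1 Resources", 40 * MB, 0, 100), ("App 2 Resources", 30 * MB, 50, 200)]
```

In this sketch `sequential` fits because App 1's files are unloaded before App 2's are loaded, while `overlapping` would need 70MB at once.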
iHD Programming
• Optimized mix of Declarative and Procedural languages
• Declarative Markup language handles most presentation needs with simple
tags and reliable, realtime performance using native code and hardware
• Compact ECMAScript Procedural language provides full programmability,
through content and player APIs, author handled events and state machine
iHD XML and ECMAScript Language
Figure: block diagram of the iHD programming model
– Languages: Markup, Style, Timing, Script
– Advanced Content Files (Playlist, Manifest, Markup, Script, Resources)
– Content Object Model: Image, text, etc. Objects; Video, Audio, etc. Objects
– System Object Model: Playlist, App, etc. Objects; Network, Player, etc. Objects
Animation
• Property animation
– Any object (graphics, text, drawing, video) can change its properties over time in response to
simple markup statements
– Properties include position, size, opacity, color, z-order, etc.
• Bitmap animation
– Bitmap animations are a sequence of images that capture a pre-rendered animation.
– Playback can use a timed sequence of PNG or JPG image files (good for frame accuracy,
trick modes, such as reverse play, etc.); or a single MNG file.
• Cell animation
– Cell animation combines bitmap or property animated objects with separate backgrounds.
Performance is improved because the entire frame does not have to be stored and redrawn
each frame, and it is more flexible because animated foreground objects can be added,
removed, and controlled by programming and user input.
• Animation can be synchronized to the Title Clock, Application Clock, or Page Clock
– If an animation is synchronized to the Title Clock, it will pause when video pauses, jump to a
timecoded animation frame or state when the video jumps to that timecode, play slow when
the video plays slow, etc. One thing this enables is “video tracking hotspots”, which are
graphics or interactive regions superimposed over “objects” in the video, such as adding a
halo to a person who is walking around, appearing and disappearing from the video.
– If an animation is synchronized to the Application Clock, it will continue to run or loop
regardless of video playback
– If an animation is synchronized to an Application “page”, it can be run each time the page is
loaded; for instance to do a menu build, or “fly in” a video image
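The difference between Title Clock and Application Clock synchronization can be sketched by treating an animated property as a pure function of whichever clock it is bound to. This Python example is illustrative only: the `Clock` class, the linear keyframe interpolation, and the sample values are assumptions, and it does not reproduce iHD timing markup.

```python
class Clock:
    """A clock that advances at `rate` times wall time (0.0 means paused)."""

    def __init__(self, rate: float = 1.0):
        self.time = 0.0
        self.rate = rate

    def tick(self, wall_seconds: float) -> None:
        self.time += wall_seconds * self.rate

    def seek(self, t: float) -> None:
        self.time = t


def animate(keyframes, t: float) -> float:
    """Linearly interpolate a property from (time, value) keyframes, clamped at both ends.

    Keyframe times are assumed to be strictly increasing.
    """
    if t <= keyframes[0][0]:
        return keyframes[0][1]
    for (t0, v0), (t1, v1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return keyframes[-1][1]


halo_x = [(0.0, 100.0), (2.0, 500.0)]  # hypothetical x position of a tracking halo
title_clock, app_clock = Clock(), Clock()

for clock in (title_clock, app_clock):
    clock.tick(1.0)                    # one second of normal playback
title_clock.rate = 0.0                 # the user pauses the video
for clock in (title_clock, app_clock):
    clock.tick(0.5)

x_on_title_clock = animate(halo_x, title_clock.time)  # frozen with the video
x_on_app_clock = animate(halo_x, app_clock.time)      # keeps moving
```

Because the property is recomputed from the clock value, a jump in the video (here `title_clock.seek(...)`) moves a Title Clock animation directly to the matching state without replaying intermediate frames.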
Audio/Video Output Synchronization
• Most “DVD” video is 24 frame per second progressive source, such
as movies and episodic television
• HD DVD-Video perpetuates the practice of encoding 24P source as
30i by adding repeat field flags to generate 60Hz timing and
(optionally) 3:2 pulldown
• The HD DVD-V system is capable of ignoring the repeat flags and
outputting pure 24 frame per second video, text, and graphics over
HDMI … but
• The current consumer electronics industry direction is to apply 3:2
pulldown and convert to 60 fields per second somewhere in the
display pipeline in order to generate a raster signal for analog
connections to CRT displays
• It is very important that new HD displays and their HDMI inputs
support 1080P24 input mode. Scaling and refresh should be
handled in the display with methods appropriate for its particular
display technology (which will rarely be CRT), and not add an extra
step of inverse telecine detection, deinterlacing, scaling, and filtering
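The repeat field mechanism described above can be sketched as follows. This Python example assumes the common cadence in which every other 24P frame carries a repeat flag, so frames alternately produce 3 and 2 fields; the function name and the `(frame, parity)` tuples are hypothetical.

```python
def pulldown_3_2(frames):
    """Expand progressive frames into a 3:2 field sequence.

    Returns a list of (frame, parity) where parity is "T" (top) or "B" (bottom).
    Frames at even positions are treated as having a repeat field flag set.
    """
    fields = []
    top_first = True
    for index, frame in enumerate(frames):
        repeat_first_field = index % 2 == 0
        count = 3 if repeat_first_field else 2
        parity = "T" if top_first else "B"
        for _ in range(count):
            fields.append((frame, parity))
            parity = "B" if parity == "T" else "T"
        if repeat_first_field:
            top_first = not top_first  # an odd field count flips the starting parity
    return fields


cadence = pulldown_3_2(["A", "B", "C", "D"])
one_second = pulldown_3_2(list(range(24)))
```

Four frames become ten fields, so 24 frames per second become 60 fields per second. A player that ignores the flags simply outputs the original frames, which is the pure 24 frame per second output the slide refers to.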
The 50Hz/60Hz “Problem”
• The legacy solution of a +4% speed shift from 24Hz to
25Hz no longer works with compressed digital audio
outputs (and was never really satisfactory)
• HD DVD-V format requires that video be encoded at
either 50Hz or 60Hz, so most content will be 24P
encoded with 60Hz timing
• Europe’s “HD Ready” logo indicates a display will handle
both 50Hz and 60Hz HDMI input, but what about 24Hz?
• Unless Europe (and other 50Hz regions) require 24Hz
on HDMI displays, the options are:
– Wait for a format converted 50Hz version of each disc
– Watch the 60Hz version at 30i with 3:2 pulldown
– Speed shift 24P to 25P and watch at 50Hz with pitch shifted
uncompressed audio over HDMI
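The size of the speed shift in the last option is simple arithmetic. The Python sketch below works it out, assuming exactly 24 and 25 frames per second; the function names and the 120 minute example are hypothetical.

```python
import math


def speed_shift(source_fps: float = 24.0, target_fps: float = 25.0):
    """Return (speed-up in percent, pitch rise in semitones) for a frame rate shift."""
    ratio = target_fps / source_fps
    return (ratio - 1.0) * 100.0, 12.0 * math.log2(ratio)


def shifted_runtime(minutes: float, source_fps: float = 24.0, target_fps: float = 25.0) -> float:
    """Running time after playing source_fps material at target_fps."""
    return minutes * source_fps / target_fps


percent, semitones = speed_shift()
runtime = shifted_runtime(120.0)
```

Playing 24P material at 25P is a speed-up of about 4.2%, raises audio pitch by about 0.7 of a semitone unless it is corrected, and shortens a 120 minute feature to roughly 115 minutes.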
The Interlace “Problem”
• Most new DVD players and displays today support 480P
over analog component interfaces at various refresh
rates (e.g. 72Hz refresh)
• But, the encoded video has reduced vertical resolution
intended to reduce flicker on interlaced CRT displays
(done by CCD sensors that mix adjacent “scan” lines,
optical filters, FIR filters on resampling, etc.)
• Deinterlace chips can’t restore the vertical resolution that
was thrown away (a separate issue from the number of
scan lines)
• The industry needs to change this production and
display model for HD DVD-V and BD!!!
– Encode 1080P24 video at full vertical resolution to enable
full resolution progressive display
– Players must apply anti-alias and interlace filtering if they
subsample and sequentially output 540 line fields for 1080i30
signal output (also applies to generated text and graphics)
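The last point can be illustrated with a small Python sketch: a full resolution progressive frame is vertically low-pass filtered only at the moment it is split into fields. The [1, 2, 1]/4 kernel, the function names, and the tiny test frame are assumptions chosen for illustration; a real player would use a filter suited to its scaler.

```python
def vertical_lowpass(frame):
    """Apply a [1, 2, 1]/4 vertical filter; rows beyond the edges repeat the edge row."""
    height = len(frame)
    filtered = []
    for y in range(height):
        above = frame[max(y - 1, 0)]
        below = frame[min(y + 1, height - 1)]
        filtered.append([(a + 2 * c + b) / 4 for a, c, b in zip(above, frame[y], below)])
    return filtered


def split_fields(frame):
    """Return (top field, bottom field): even-numbered and odd-numbered rows."""
    return frame[0::2], frame[1::2]


# Worst case for interlace flicker: detail that alternates on every line.
frame = [[0], [255], [0], [255]]
raw_top, raw_bottom = split_fields(frame)
soft_top, soft_bottom = split_fields(vertical_lowpass(frame))
```

Without filtering, the two fields are all black and all white and the picture would flicker at the field rate; with filtering the fields are much closer in level. Keeping this step in the player's interlaced output path is what lets the disc carry full vertical resolution for progressive displays.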
Take Aways on HD DVD-Video
• XML Playlists accomplish “on the fly” editing and mixing
in the player like EDLs or AAF on video editing
workstations
• Players include an HD video and graphics “blender” that
alpha blends multiple planes of video, graphics, and text
in realtime with frame and pixel accuracy
• Resources from various storage and network sources
are marshaled and managed for realtime presentations
that can be interactively navigated by users
• Advanced audio and video codecs provide state of the
art quality and efficiency including 1080P video and
mathematically lossless 8 channel audio
• Programmable and network updatable user experiences
create new entertainment possibilities that combine the
flexibility of the Web with the high quality and reliable
consumer experience of DVD-Video
Thank You