IP Multicast in Digital Television Transmission Infrastructure

by

Kirimania Murithi

Submitted to the Department of Electrical Engineering and Computer Science
in partial fulfillment of the requirements for the degrees of
Bachelor of Science in Electrical Engineering and Computer Science
and
Master of Engineering in Electrical Engineering and Computer Science
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY

May 2001

© Kirimania Murithi, MMI. All rights reserved.

The author hereby grants to MIT permission to reproduce and distribute publicly
paper and electronic copies of this thesis document in whole or in part.
Author ............................................................
          Department of Electrical Engineering and Computer Science
          May 26, 2001

Certified by ........................................................
          V. Michael Bove, Jr.
          Principal Research Scientist
          Thesis Supervisor

Accepted by ........................................................
          Arthur C. Smith
          Chairman, Department Committee on Graduate Students
IP Multicast in Digital Television Transmission Infrastructure
by
Kirimania Murithi
Submitted to the Department of Electrical Engineering and Computer Science
on May 26, 2001, in partial fulfillment of the
requirements for the degrees of
Bachelor of Science in Electrical Engineering and Computer Science
and
Master of Engineering in Electrical Engineering and Computer Science
Abstract
Simultaneous access to popular data on the Internet calls for IP multicast protocols. The digital television (DTV) transmission infrastructure has not been sufficiently utilized as a means of IP multicast, despite the congestion problems that face implementations of IP multicast applications over the Internet. Due to the nature of DTV transmission and coding schemes, significant portions of its channel bandwidth end up unused. This unused bandwidth can be leveraged for DTV IP multicast, that is, broadcasting Internet content that has been detected to be in high demand. In this thesis, DTV channel coding and compression schemes are explored and analyzed in depth, leading to the implementation of an IP multicast protocol that encodes packetized Internet data into the unused spectrum of a DTV transmission channel.

Thesis Supervisor: V. Michael Bove, Jr.
Title: Principal Research Scientist
Acknowledgments
Dr. V. Michael Bove, Jr. - Principal Research Scientist, MIT Media Lab
Prof. William F. Schreiber - Professor of Elec. Eng., Emeritus, Senior Lecturer
Prof. George C. Verghese - Professor of Elec. Eng. & Computer Science
Everest Huang - MIT Elec. Eng. PhD Candidate
Dwaine Clarke - MIT Computer Science Master's Candidate
Contents
1 Overview

2 Introduction
  2.1 Motivation for DTV IP Multicast
  2.2 Problem Description
  2.3 Problem Statement
  2.4 The Objective and General System Model

3 History & Background Information
  3.1 Streaming Media and the Internet
  3.2 IP Multicast over the Internet
  3.3 DTV Transmission Infrastructure Standards Development
    3.3.1 Conventional Analog Television Standards
    3.3.2 The Advanced Television Systems Committee (ATSC)
    3.3.3 The Digital Video Broadcasting (DVB) Project
    3.3.4 Digital Broadcast Schemes and Modulation Formats
    3.3.5 Moving Picture Experts Group (MPEG) Standards
    3.3.6 Dolby Digital
  3.4 Transition from Analog to DTV Transmission
  3.5 The PC/Internet-enabled Devices in DTV Transmission
  3.6 DTV Encoders and Developments in DTV IP Multicast

4 Theoretical Analysis
  4.1 Internet Transport versus DTV IP Multicast
  4.2 The Structure of an MPEG-2 Bit-Stream
  4.3 Fundamentals of MPEG-2 Video Compression Algorithms
  4.4 The MPEG-2 Video Coding Techniques
    4.4.1 Intraframe Coding Techniques - Transform Domain Coding - DCT
    4.4.2 Interframe Coding Techniques - Motion Compensated Prediction
  4.5 Coding of Bit-Streams - CBR versus VBR

5 Experimentation Procedures & Implementation Details
  5.1 DTV Transmission Channel Characterization
    5.1.1 MSSG Encoder Model
    5.1.2 MSSG Decoder Model
  5.2 Analysis of the MPEG-2 Video Transport Stream
  5.3 Internet IP Data Injecting/Extraction Protocols
    5.3.1 IP Internet Data Encoding (Injecting) Protocol
    5.3.2 IP Internet Data Extraction Protocol
  5.4 Results Analysis and Discussion

6 Conclusion

7 Recommendations
  7.1 Limitations
  7.2 Future Work

A Media & Streaming File Formats
  A.1 Media File Formats
  A.2 Streaming File Formats

B MPEG-2 Bit-Stream Codec Model
  B.1 MPEG-2 Codec Parameter File
  B.2 Encoder Usage of MSSG Software
  B.3 Decoder Usage of MSSG Software

C MPEG-2 Bit-Stream Data Analysis

D IP Data Injecting Protocol

E IP Data Extraction Protocol

Bibliography
List of Figures

4-1 CBR Multiplexing of Data
4-2 VBR Multiplexing of Data
5-1 Varying sizes (in bytes) of a 5-minute-long encoded MPEG-2 video/image frame sequence
5-2 A histogram of the sizes of encoded MPEG-2 video/image frames in a 5-minute-long bitstream
5-3 Varying sizes (in bytes) of the first 100 encoded video/image frames in an MPEG-2 sequence
List of Tables

A.1 Media File Formats
A.2 Streaming File Formats
Chapter 1
Overview
In general, multicast is the delivery of data simultaneously to one or more destinations using a single, local transmission operation [2]. Internet Protocol (IP) multicast involves the delivery of IP data, that is, Internet data that is packetized. The two forms of IP multicast explored are Internet IP multicast and Digital Television (DTV) IP multicast. Internet IP multicast is the transmission of data packets - usually audio and video streams - to multiple users simultaneously via the Internet infrastructure [16]. On the other hand, DTV IP multicast is the transmission of similar data via the digital television transmission infrastructure. This transmission scheme is similar to that used for radio and TV programs over the airwaves.
The DTV transmission infrastructure has not been sufficiently utilized as a means of IP multicast, despite the congestion problems that face implementations of Internet IP multicast. Internet congestion is on the rise, mainly due to downloading and streaming of large data content, at times simultaneously. Attempts at using the Internet for large-audience and real-time viewing of content have resulted in poor response times and network overloading. Simultaneous access to popular data on the Internet calls for IP multicast protocols. However, implementations of IP multicast applications over the Internet are not cost-effective and do not solve the Internet congestion problem [10]. Instead, they lead to more congestion. Ultimately, much larger margins for peak data traffic capacity must be incorporated into the requirements for the Internet infrastructure.
There is a need for a new solution. Due to the nature of DTV transmission and coding schemes, significant portions of the DTV channel bandwidth end up unused. Thus, broadcasters and service providers can take advantage of the unused bandwidth of their DTV channels to broadcast Internet content that has been detected to be in large demand, hence a prime candidate for IP multicast. In this thesis, DTV channel coding and compression schemes are explored and analyzed in depth, leading to the implementation of an IP multicast protocol that encodes packetized Internet data into the unused spectrum of a digital television transmission channel. The Internet data types used, without loss of generality, consisted mainly of streaming media (please refer to appendix A).
Chapter two of the thesis lays out the introduction to DTV IP multicast. It describes the motivation behind DTV multicast research and implementation, the objective and the problem that was targeted, and the general system model that was developed. Chapter three contains the history and background information relevant to DTV IP multicast. It provides history and background information on streaming media on the Internet, IP multicast over the Internet, and the standards that have been developed for the DTV transmission infrastructure. Further, it explains the need for the transition from analog transmission to DTV transmission, the role of the PC and other Internet-enabled devices in DTV transmission, and the current developments in DTV transmission encoders and IP multicast applications.
Chapter four explores the theoretical analysis behind the research in DTV IP multicast and the
design and implementation that was adopted. In this context, the Internet transport infrastructure
is analyzed and compared to DTV IP multicast. The structure of the MPEG-2 bitstream is also
described, as well as the fundamentals of MPEG-2 video compression algorithms. Additionally, the
MPEG-2 video, audio, and data streams coding techniques are presented and analyzed.
Chapter five describes the experimentation procedures and implementation details. In this chapter, the characterization of a DTV channel using an MPEG-2 transport stream is described, and the results and analysis of this channel are presented. The design and implementation of real-time data injecting protocols are also described. Chapter six is the conclusion, and chapter seven contains the recommendations, presenting both the limitations that were encountered and the avenues for future research in this field.
Chapter 2
Introduction
2.1
Motivation for DTV IP Multicast
Internet Protocol (IP) traffic and the number of users accessing the Internet continue to grow at
exponential-like rates. Additionally, large numbers of users continue to seek broadband access as
well as access to the same content from popular web sites. As its popularity increases, so has Internet
usage changed from predominantly basic Web browsing and e-mail to such applications as online
shopping, online trading, voice chat, interactive games, and streaming media. Currently, there are approximately 200 million worldwide users of the Internet, and of these there may typically be 26 million users accessing the Internet at a given time [4]. The number of total and simultaneous Internet users continues to grow, which results in increasing traffic demand on the Internet infrastructure and access facilities, leading to Internet congestion. In addition, because of the recent increases and improvements in content compression, creation, transport, distribution and editing tools (integration of video, audio, image and data) for the Web, higher access data rates are needed per end user, which also increases demand for higher transport capacity [6].
Implementation of service enhancements on the Internet in the form of current facility upgrades
requires significant capital outlays and other resources. The spread of the Internet tends to be
limited by the current network fabric and the bottlenecks that develop at the most popular or heavily
trafficked points of the network. With increasing numbers of subscribers, these bottlenecks will be
a more difficult issue to address. Various forms of IP multicast applications over the Internet have
subsequently been designed and implemented in the last few years. However, these applications are
not cost effective, and do not solve the Internet congestion problem [10]. Instead, they lead to more
congestion. Ultimately, much larger margins for peak data traffic capacity must be incorporated
into the requirements for the Internet infrastructure.
Hence, simultaneous downloads and streaming of large data content call for a new solution that implements IP multicast without increasing or complicating the traffic on the Internet. This solution involves the use of the Digital Television (DTV) transmission infrastructure to deliver data to a PC or a digitally enabled set-top box that has a digital signal receiver. A reliable IP multicast protocol that takes full advantage of the fast delivery system of DTV channels reduces Internet congestion, provides an alternative avenue for the transmission of large data content that is in high demand, and finally, leads to efficient and economic use of the underutilized DTV spectrum.
2.2
Problem Description
A digital television signal is transmitted over the same general set of frequencies used by analog television broadcasts, but instead of continuous analog components carrying video and audio information, there is a single, high-speed bitstream [9]. This bitstream is a combination of encoded video, encoded audio, and system data (e.g., program guides). A digital video signal is created by digitizing the image to be transmitted into a frame of pixels, then reducing the number of bits needed to represent the image using a compression method sanctioned by the Moving Picture Experts Group, known as MPEG-2. Further, a digital audio signal is created by digitizing the sound to be transmitted and compressing the number of bits needed to represent the signal using a compression method known as Dolby Digital.
In the US, the FCC mandates that by the year 2003 all broadcasts must be digital, and analog TV is scheduled to end in 2006 [6]. Consequently, the FCC is in the process of implementing and allocating DTV frequencies for over-the-air, otherwise known as 'terrestrial', transmission. In the US alone, there are 68 'terrestrial' broadcast channels. Today, most US homes can receive a digital signal, although few have digital receivers. Despite the fact that nearly two-thirds of American homes have cable service, more than half of all receivers use antennas [12].
Each allocated DTV channel has a capacity of 19.4 megabits per second (Mbps) for delivering content to viewers, but far less than that amount is needed for the digital broadcast bitstream. Typically, high definition TV (HDTV) uses 16 Mbps for its broadcast bitstream, while standard definition TV (SDTV) uses 4 Mbps. In addition, empty packets that are inserted into the bitstream to synchronize the transmission timing may waste up to 10% extra bandwidth [14].
All of the unused bandwidth provides an opportunity for broadcasters to capture millions of
dollars in revenue from new data services, in particular, those that implement IP multicast over
DTV channels. Broadcasters can offer viewers enhanced TV programming and delivery as well as
new subscription-based information and entertainment services that integrate Web content delivery
with their traditional TV programming. The enormous potential of IP multicast rests with DTV
channels where data is delivered just once but to many recipients. This process allows for download
of vast amounts of data to communities of users over a high-speed data link, that is, the unused spectrum on a channel.
Examples of how IP multicast in this context may be used are corporate information to multiple
locations (e.g., price information, promotional media to supermarket or other retail chains), and
community information (e.g., curriculum-related media to schools and colleges). With the large data capacity offered by IP multicast over DTV channels, it is entirely feasible that whole video content and sound programs could be downloaded fast and efficiently, and stored at the locations where they are needed, without Internet overload. Hard disk drives and other memory devices are increasing in capacity to tens, possibly hundreds, of gigabytes, while their prices are falling fast.
Unlike DTV terrestrial channels where the currently unused bandwidth can be leveraged, in
digital cable channels, most of the bandwidth is already in use, sometimes in a wasteful manner [6].
Indeed, cable bandwidth is now in short supply due to the complicated ways in which cable operators
allocate their bandwidth. Additionally, despite the fact that cable channels have more bandwidth
at their disposal in comparison to DTV terrestrial channels, wireless transmission has a huge added
advantage in that it does not entail the need for cable drops and other physical connection points.
Also, other wireless devices such as cellular phones and mobile computers can be integrated into the
transmission implementation.
2.3
Problem Statement
New technologies that reclaim otherwise wasted bandwidth in DTV transmission infrastructure need
to be developed. Unfortunately, the current implementations that have been adopted do not take full
advantage of all the unused bandwidth of the DTV channels [7]. The process of reclaiming the unused
bandwidth in the DTV transmission infrastructure in part involves the design of new applications
that inject Internet TCP/IP data into the available spaces of the transmission channel. Internet
data that does not need to be sent in real time can be inserted into a compressed digital bitstream on
a budgeted DTV channel on an as-available basis. This process is possible because the MPEG-2
compression scheme results in a very "bursty" output as a result of the time-varying information
content of the video signal being encoded. Additionally, real-time data, such as streaming Web
video or Web radio, can also be included in the implementation by developing effective buffering
techniques at the receiving end.
Nevertheless, the distinction between data broadcasting (one sender to many receivers) and one-to-one transmission, as needed in telephone networks and specific Internet data exchanges, must be kept in mind. For example, combining Web access with watching TV programs creates a problem in that everyone watches the same program but wants different data from the Internet. Assuming that the data being transmitted during an IP multicast is common data that everyone is trying to download (e.g., corporate information to multiple locations, community information for schools), this process would not be a problem. Otherwise, customized IP multicast is a possible alternative.
In customized IP multicast, the source sends all the data to all the receivers, but arranges that each receiver only sees the data that it requested. Internet-intensive content can also be transmitted and cached closer to the end-users' locations by developing Web caching technologies. It is also appropriate to consider a scenario where a significant proportion, or all, of the available bandwidth is allocated to a particular receiver on a time-division basis to facilitate a very rapid download/transmission of an item before serving the next request in a similar manner. This process is possible through encapsulation, whereby an IP encapsulator is configured to allocate available bandwidth to any one receiver in the desired manner [6].
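The customized-multicast idea can be sketched in a few lines. The record layout and `content_id` tag here are hypothetical illustrations, not part of any DTV standard: the source tags every record, broadcasts all of them, and each receiver silently discards everything it did not request.

```python
def filter_for_receiver(records, subscribed_ids):
    """Keep only the records this receiver registered interest in.

    records        -- iterable of (content_id, payload) pairs, as broadcast
    subscribed_ids -- set of content ids this receiver requested
    """
    return [payload for content_id, payload in records
            if content_id in subscribed_ids]

# The same broadcast serves every receiver; filtering happens locally.
broadcast = [(1, b"price table"), (2, b"curriculum video"), (1, b"promo clip")]
print(filter_for_receiver(broadcast, {1}))  # this receiver asked only for id 1
```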
2.4
The Objective and General System Model
The objective is twofold: first, to implement IP multicast protocols that enable proficient spectrum sharing of DTV content and Internet data, while at the same time enhancing DTV spectrum efficiency; second, to provide an alternative route for Internet streaming media data other than the Internet transport infrastructure, thereby reducing the Internet congestion mainly caused by multiple simultaneous downloads of popular data. The implementation employed is based on a PC DTV viewing environment, although it could be adopted for digital set-top boxes and digital TV sets that support software. The results from the implementation illustrate that the transmission speed of popular Internet data to PC viewers can be significantly increased using the DTV transmission infrastructure. The approach taken in the implementation aims to demonstrate that an economic model, whereby broadcasters exchange unused bandwidth with content providers for revenue, can be realized.
To meet the objectives stated above, DTV channel coding and compression schemes are studied
and analyzed in depth. A digital transmission channel is modeled using an MPEG-2 encoder, a
broadcast quality MPEG-2 bitstream from a typical broadcast station, and an MPEG-2 decoder to
convert the encoded bitstream into a sequence of video frames. The traffic characteristics of the
MPEG-2 bitstream generated by the encoder are studied and analyzed. The channel transmission
budget is estimated based on the current broadcasting standards. A protocol for packaging Internet
data (IP packets) into the unused spectrum of the channel on an as-available basis without corrupting the bitstream content is implemented. Likewise, another protocol for extracting the injected data from the bitstream and implementing the buffering at the client to hold the extracted data is also developed.
Finally, the data rate delivered using DTV IP multicast is estimated and compared to the data rates of current Internet IP multicast implementations for similar data types. This process is accomplished by extrapolating the results obtained from the experimentation to reflect a typical IP multicast to multiple clients in a DTV transmission infrastructure, and comparing the observations against those obtained from an Internet-based IP multicast session. Several data types from the Internet are accounted for in the design and implementation of the DTV IP multicast protocols, e.g., streaming Web video, streaming MP3 data, and streaming Web radio data (please refer to appendix A). Prior to being injected into the DTV bitstream at the transmission end, the Internet data is packetized (i.e., broken down into packets). At the receiving end, these packets are extracted and buffered to reconstruct the transmitted Internet data. In conducting the experiments, MPEG-2 bitstreams for both SDTV and HDTV transmission were considered.
Chapter 3
History & Background Information
3.1
Streaming Media and the Internet
Streaming media has been around for a few years, starting with RealNetworks' (Progressive Networks') streaming audio in the mid-1990s, followed by streaming video a few years later [22]. Streaming audio/video accentuates Web page information with video and voice but does not necessarily require large storage resources at the receiver. Unlike a TV broadcast, which has a dedicated local 6 MHz (19.4 Mbps) channel, the quality (bit rate) of streaming video is limited by the available local and long-haul network transport capacity. Further, data packet transport on the Internet is currently on a best-effort basis; therefore, the quality of the streaming media can vary with traffic during peak and off-peak hours. Additionally, most local access is provided via modems with peak rates in the range of 28.8 Kbps to 56 Kbps, which produce a "jerky" effect combined with intermittent voice during peak usage. A smaller video window with limited motion is also typical [22].
With the availability of cable modems and Digital Subscriber Line (xDSL) access at data rates
of 256 Kbps to 6 Mbps, which are much higher than Plain Old Telephone Service (POTS) modems,
broadband streaming media at 100 Kbps to 1 Mbps can readily be supported. This technology opens
the door for Internet content with almost TV-like quality and stereophonic sound while maintaining
a reasonable window size on the computer monitor screen. The issue remaining here is the transport
of such content over the Internet. With the rise in subscription by multiple users to interactive
broadband services that potentially lead to simultaneous downloads of large data content, the Internet infrastructure must be greatly expanded to support the transport of broadband streaming
media to multiple simultaneous users. As a result, there is an immense requirement for much more
transport capacity on the Internet transport infrastructure.
Another significant trend is the decreasing cost per megabyte of hard disk storage, from $2 to $0.03 over the last five years [4]. A typical hard drive (> 10 GB) today can economically store content comparable to that of several CDs or a few DVDs. With many simultaneous users and multiple downloads, content distribution is capable of creating traffic congestion even with an OC-192 fiber capacity (9.95 Gbps) on the Internet backbone. As an example, delivering a 1 GB file to 10,000 users in one hour via unicast (point-to-point transmission) over the Internet would require backbone capacity on the order of 24 Gbps. To efficiently use the Internet backbone capacity and serve multiple users, large files and streaming media must be transmitted via IP multicast.
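The backbone figure quoted above follows from simple arithmetic (taking 1 GB as 2^30 bytes):

```python
file_bits = 2**30 * 8        # one 1 GB file, in bits
users = 10_000
window_seconds = 3600        # deliver to every user within one hour

unicast_gbps = file_bits * users / window_seconds / 1e9   # ~23.9 Gbps
oc192_gbps = 9.95
print(f"required backbone capacity: {unicast_gbps:.1f} Gbps "
      f"(~{unicast_gbps / oc192_gbps:.1f} OC-192 links)")
```

A single multicast transmission of the same file over the same hour, by contrast, needs only about 2.4 Mbps at the source.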
3.2
IP Multicast over the Internet
The Internet consists of a network of many routers that use source-to-destination IP address routing
and primarily supports unicast (point-to-point) traffic. To deliver a large file or to stream packets
at a high data rate from a source to many receiving destinations, transmission must be repeated for
each user, even when there is more than one end user destination at the far end of the routing path.
This process is highly inefficient: it requires multiple transmissions of the content at the source and also results in large amounts of redundant traffic on the adjacent paths from the source. As a result, less capacity is available for other unicast traffic, and congestion occurs with content that is in high demand.
To eliminate the duplicate transmissions associated with unicast, Internet IP multicast was introduced. Internet IP multicast employs multicast-enabled routers (mrouters) [16]. This form of transmission relies on the mrouters to route multiple copies of packets to the appropriate distribution paths (one copy per path) leading to the intended end users. The routing intelligence at the mrouter is accomplished by having each end user register to the multicast session address at their closest mrouter. The registration process ripples backwards from mrouter to mrouter until it reaches the mrouter closest to the source. This backward rippling forms a multicast distribution tree that guides the transmission from the source. At the end of the multicast session, the distribution tree gets torn down by the mrouters, and in effect end-user registrations are dropped until a new multicast session is initiated.
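On an end host, the registration step corresponds to joining a multicast group, which makes the kernel emit an IGMP membership report toward the nearest mrouter. A minimal sketch using the standard sockets API (the group address and port are arbitrary examples; hosts without a multicast-capable route will refuse the join):

```python
import socket
import struct

GROUP = "239.1.2.3"   # example administratively scoped session address
PORT = 5004

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", PORT))

# struct ip_mreq: group address + local interface (0.0.0.0 lets the kernel pick)
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
try:
    # Joining triggers the IGMP report; the closest mrouter grafts this
    # host onto the session's distribution tree.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    # ... sock.recvfrom(1500) would now yield packets sent to the group ...
    # Leaving the group at the end of the session drops the registration.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP, mreq)
except OSError:
    pass  # no multicast route available on this host
sock.close()
```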
Due to the complexity of the many different topological configurations of end users that Internet IP multicast has to support, various protocols for setting up mrouter networks for IP multicast have been developed (e.g., DVMRP, MOSPF, PIM, MSDP, BGMP). These different protocols are designed to cover
dense and sparse network operations, to prune and graft distribution trees, to administer session
log on/off, and to locate the multicast sources [10]. Nevertheless, the process of providing a broad-based IP multicast capability for many end users located anywhere on the Internet, and at any time, requires that all unicast routers be upgraded to mrouters. The use of these complex protocols can prove to be difficult when applied on a large scale as well. In addition, the cost of replacing the existing routers with mrouters before their full capital depreciation cannot be readily justified for small multicast sessions, since these sessions can be conducted with unicast routing at less expense.
To support data transmission during Internet IP multicast sessions via mrouters, several high
level protocols, mostly real-time oriented, have been implemented [17]. These include the Real-time
Transport Protocol (RTP), the Real-time Control Protocol (RTCP) that works in conjunction with
RTP, the Resource Reservation Protocol (RSVP), and the Real-time Streaming Protocol (RTSP).
These protocols, at different levels of maturity, have already been used in implementations of Internet
IP multicast applications such as Multicast Backbone (MBONE).
The MBONE, sometimes called the Multicast Internet, is a virtual network layered on top of the
physical Internet to support routing of IP Multicast packets over the Internet [19]. Set up in 1994
as an experimental and volunteer effort, the MBONE originated in an effort to multicast audio and
video from the Internet Engineering Task Force (IETF) meetings. Since most IP servers and routers
on the Internet do not have the mrouters capabilities, the MBONE was designed to form a network
within the Internet that could transmit IP multicasts via all kinds of routers.
During an MBONE IP multicast session, tunneling is used to forward multicast packets through
routers on the network that are not designed to handle multicast, i.e., the non-mrouters. In the
tunneling procedure, an MBONE router that is sending a packet to another MBONE router through
a non-MBONE part of the network must encapsulate the multicast packet as a unicast packet
so that the non-MBONE routers can transmit it [19]. The receiving MBONE router decapsulates the unicast packet upon reception and forwards it appropriately. This process complicates the Internet traffic and does not save significant bandwidth. Consequently, the current implementation of the MBONE application has a channel bandwidth of about 500 Kbps.
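The tunneling step amounts to prepending an outer unicast IPv4 header (protocol number 4, "IP-in-IP", per RFC 2003) to the multicast packet. A simplified sketch; the header checksum is left at zero here, which a real tunnel endpoint must compute:

```python
import socket
import struct

IPIP_PROTO = 4  # IANA protocol number for IP-in-IP encapsulation

def encapsulate(multicast_packet, tunnel_src, tunnel_dst):
    """Wrap a multicast IP packet in a unicast IPv4 header so that
    non-multicast routers between two MBONE routers will forward it."""
    total_len = 20 + len(multicast_packet)
    outer = struct.pack("!BBHHHBBH4s4s",
                        (4 << 4) | 5,      # version 4, header length 5 words
                        0,                 # type of service
                        total_len,
                        0, 0,              # identification, flags/fragment
                        64,                # TTL
                        IPIP_PROTO,
                        0,                 # checksum (omitted in this sketch)
                        socket.inet_aton(tunnel_src),
                        socket.inet_aton(tunnel_dst))
    return outer + multicast_packet

def decapsulate(outer_packet):
    """Strip the outer unicast header to recover the multicast packet."""
    header_len = (outer_packet[0] & 0x0F) * 4
    return outer_packet[header_len:]
```

Because every tunneled packet pays this extra 20-byte header and still traverses the unicast path, tunneling saves no backbone bandwidth, which is consistent with the MBONE's modest ~500 Kbps channel.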
It is therefore important to explore IP multicast over the DTV transmission infrastructure. However, as an emerging digital field, this infrastructure is beset by myriad ideologies that have resulted in competing standards and, at times, conflicting technologies, some of which may have to be reconciled to guarantee success in the transition from analog to digital transmission. The US DTV standards and the European standards are primed to dominate the DTV industry.
3.3
DTV Transmission Infrastructure Standards Development
Television broadcasts began in the United States in 1939 with the National Broadcasting Company (NBC). The Federal Communications Commission (FCC) set the first American standards for analog broadcast television in 1941. In 1953, the National Television System Committee (NTSC) set the standards for color television broadcasts in the United States [8]. The NTSC standards are also used in Japan. In the US, the Advanced Television Systems Committee (ATSC) develops standards for DTV transmission, while the Digital Video Broadcasting (DVB) Project sets the standards in Europe [8].
3.3.1
Conventional Analog Television Standards
There are three major analog television standards in the world: the US National Television System Committee (NTSC) standard, the European Phase Alternation Line (PAL) standard, and the French Sequential Couleur Avec Memoire (SECAM) standard [5].
The National Television Standards Committee (NTSC)
In 1953, the NTSC was responsible for developing a set of standard protocols for TV broadcast
transmission and reception in the United States. The NTSC standards have not changed significantly
since their inception, except for the addition of new parameters for color signals [5]. NTSC signals
are not directly compatible with computer systems, hence, the need for adapters in the computer
environment.
An NTSC TV image has 525 horizontal lines per frame (complete screen image). These lines are
scanned from left to right, and from top to bottom. In the scanning, every other line is skipped,
therefore, it takes two screen scans to complete a frame: one scan for the odd-numbered horizontal
lines, and another scan for the even-numbered lines. Each half-frame screen scan takes approximately
1/60 of a second; hence, a complete frame is scanned every 1/30 second. This alternate-line scanning
system is known as interlacing.
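The interlaced-scan arithmetic above can be checked directly; the constants come straight from the figures in the text:

```python
# NTSC interlaced scan arithmetic from the text: 525 lines per frame,
# two interleaved field scans per frame, ~60 fields (half-frames) per second.
LINES_PER_FRAME = 525
FIELDS_PER_SECOND = 60          # each field scans every other line
FIELDS_PER_FRAME = 2

frames_per_second = FIELDS_PER_SECOND / FIELDS_PER_FRAME    # 30 full frames/s
lines_per_field = LINES_PER_FRAME / FIELDS_PER_FRAME        # 262.5 lines
line_time_us = 1e6 / (frames_per_second * LINES_PER_FRAME)  # ~63.5 us per line

print(frames_per_second, lines_per_field, round(line_time_us, 1))
```

(The exact NTSC field rate is 59.94 Hz for color broadcasts; the nominal 60 Hz figure used in the text is kept here for simplicity.)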
Phase Alternation Line (PAL)
PAL is the analog television display standard that is used in Europe and certain other parts of the
world. PAL scans the cathode ray tube horizontally 625 times to form the video image.
Sequential Couleur Avec Memoire (SECAM)
SECAM analog TV display technology is the standard in France, and the countries of the former
Soviet Union. Like PAL, SECAM scans the cathode ray tube horizontally 625 times to form the
video image.
3.3.2 The Advanced Television Systems Committee (ATSC)
In 1987, the FCC formed the Advisory Committee on Advanced Television Service (ACATS) whose
purpose was to advise the FCC on the development of advanced television (ATV). ACATS decided
not to consider further improvements on the NTSC analog television system but instead to concentrate solely on DTV - an all-digital television system [6]. Hence, ATSC, a standards organization
created in 1982 by companies in the television industry, embarked on promoting the establishment
of technical standards for all aspects of ATV systems. Based in Washington, D.C., ATSC has an
international membership of over 200 organizations (up from an original 25) that includes broadcasters, motion picture companies, telecommunications carriers, cable TV programmers, consumer
electronics manufacturers, and computer hardware and software companies.
The ATSC standards specify technologies for transport, format, compression, and transmission
of DTV in the U.S. The main ATSC standards for DTV are 8-Level Vestigial Sideband (8-VSB)
modulation format, MPEG-2 standards for video signal compression, and Dolby Digital for audio
signal coding. The ATSC is finalizing DTV standards for data broadcasting and interactive services.
ATSC standards have sparked controversy in the DTV industry in the US. First, cable companies
have not yet determined how to efficiently integrate ATSC standards into their television systems
because cable systems use different modulation formats from those enacted by ATSC. One of the
modulation formats used by cable systems is the Quadrature Amplitude Modulation (QAM) scheme,
whose features allow cable operators to encode many programs into their cable spectrum [6]. However, the "must carry" rule enforced by the FCC demands that cable operators carry local broadcast programs alongside their own content. Additionally, satellite transmissions use yet another set of modulation formats, mostly the Quadrature Phase Shift Keying (QPSK) scheme [9]. Satellite systems are also subject to the "must carry" rule in the US.
Second, the current DTV standards controversy is also compounded by competition from European
standards. In Europe, Digital Video Broadcasting (DVB) sets the standards for terrestrial broadcast.
DVB uses Coded Orthogonal Frequency Division Multiplexing (COFDM) as the modulation scheme
and MPEG-2 standards for both audio and video encoding. Nevertheless, ATSC DTV standards
will likely dominate North America, Mexico, Japan and Korea [6].
3.3.3 The Digital Video Broadcasting (DVB) Project
In the early 1990s, European broadcasters, consumer equipment manufacturers, and regulatory bodies formed the European Launching Group (ELG) to discuss introducing DTV throughout Europe.
The Digital Video Broadcasting (DVB) project was created from the ELG membership in 1993 [22].
A fundamental decision of the DVB project was the use of Coded Orthogonal Frequency Division
Multiplexing (COFDM) as a modulation format and the selection of MPEG-2 for compression of
both audio and video signals. DVB is reputed for its robust transmission that opens the possibilities
of providing crystal-clear television programming to television sets in buses, cars, trains, and even
hand-held televisions.
3.3.4 Digital Broadcast Schemes and Modulation Formats
The two most dominant modulation formats in DTV transmission are the 8-Level Vestigial Sideband (8-VSB) and the Coded Orthogonal Frequency Division Multiplexing (COFDM). 8-VSB is a
standard radio frequency (RF) modulation format chosen by ATSC for the transmission of digital
television to consumers in the United States and other adopting countries. Countries in Europe
(and others under DVB project) have adopted an alternative format, the COFDM [22].
The ATSC 8-Level Vestigial Sideband (8-VSB)
The ATSC 8-VSB system uses a layered digital system architecture consisting of a picture layer that
supports a number of different video formats; a compression layer that transforms the raw video and
audio samples into a coded bitstream; and a radio frequency (RF) modulation/transmission layer
[5]. The ATSC 8-VSB system is a single carrier frequency technology that employs vestigial sideband
(VSB) modulation similar to that used by conventional analog television. The transmission layer
modulates a serial bitstream into a signal that can be transmitted over a 6 MHz television channel.
The ATSC 8-VSB system transmits data using trellis coding with 8 discrete
levels of signal amplitude [5].
Complex coding techniques and adaptive equalization are used to
make reception of the transmitted data more robust to propagation impairments such as multipath
(strong static signal reflections), noise and interference [21].
The 6 MHz ATSC 8-VSB system
transmits data at a rate of 19.4 Mbps. There is also a 16-VSB mode that has 16 discrete amplitude
levels and supports up to 38.57 Mbps of data on a 6 MHz channel. 8-VSB is considered effective
for the simultaneous transmission of more than one DTV program (statistical multiplexing) and the
broadcasting of data alongside a television program (datacasting) because it supports large data
payloads.
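The payload figures above can be reconciled with simple back-of-the-envelope arithmetic. The ~10.76 Msymbols/s symbol rate and the rate-2/3 trellis code used below are published ATSC parameters assumed here; the text itself gives only the final payload rates:

```python
import math

# Back-of-the-envelope view of the 8-VSB payload figures quoted above.
# The ~10.76 Msymbols/s symbol rate and the rate-2/3 trellis code are
# assumed ATSC parameters; the text gives only the payloads.
SYMBOL_RATE = 10.76e6            # symbols per second in a 6 MHz channel

bits_per_symbol_8vsb = math.log2(8)   # 3 bits per 8-level symbol
bits_per_symbol_16vsb = math.log2(16) # 4 bits per 16-level symbol

raw_rate = SYMBOL_RATE * bits_per_symbol_8vsb   # ~32.3 Mbps before coding
trellis_rate = raw_rate * 2 / 3                 # after rate-2/3 trellis coding
# Reed-Solomon and sync overhead reduce this further to the ~19.4 Mbps payload.
print(round(raw_rate / 1e6, 1), round(trellis_rate / 1e6, 2))
```

The 16-VSB mode roughly doubles the payload because it carries more bits per symbol and omits the trellis coding, at the cost of reduced robustness.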
DVB-T Coded Orthogonal Frequency Division Multiplexing (COFDM)
The DVB-T COFDM system is based on the European Terrestrial Digital Video Broadcasting (DVB-T) standards. In contrast to VSB, the DVB-T COFDM system is a multi-carrier technology. The principle
of COFDM is to break a single data stream into many parallel, lower rate data streams and then use
many sub-carriers to transmit these lower rate streams of data simultaneously [5]. To ensure that
the sub-carriers do not interfere with one another, the frequency spacing between them is carefully
chosen so that the sub-carriers are mathematically orthogonal to one another [21]. The individual
sub-carriers are typically modulated using a form of either quadrature amplitude modulation (QAM)
or quadrature phase shift keying (QPSK) [21].
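The orthogonality property described above can be verified numerically: sub-carriers spaced by an integer number of cycles over the symbol period have a zero inner product over that period. The symbol length and sub-carrier indices below are illustrative:

```python
import cmath

# Numerical check of the sub-carrier orthogonality principle: complex
# exponentials with integer cycle counts over N samples have zero inner
# product over one symbol period unless they are the same carrier.
N = 64  # samples per OFDM symbol (illustrative size)

def subcarrier(k):
    return [cmath.exp(2j * cmath.pi * k * n / N) for n in range(N)]

def inner_product(a, b):
    return sum(x * y.conjugate() for x, y in zip(a, b))

same = inner_product(subcarrier(3), subcarrier(3))      # = N (not zero)
different = inner_product(subcarrier(3), subcarrier(7)) # ~ 0: orthogonal

print(abs(same), abs(different))
```

This zero inner product is what lets the receiver separate the sub-carriers even though their spectra overlap.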
The multi-carrier design of COFDM makes it resistant to transmission channel impairments,
such as multipath propagation, narrowband interference, and frequency-selective fading [5]. COFDM
avoids interference from multipath echoes by increasing the length of the signal samples so that it
is greater than the temporal spread of the multipath, and by applying a guard interval between
data symbols where the receiver does not look for information. Guard intervals are designed such
that most multipath echoes arrive within the guard period, and therefore do not interfere with
the reception of data symbols.
Further, because information is spread among many carriers, if
narrowband interference or fading occurs, only a small amount of information is lost.
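The guard-interval reasoning above reduces to a simple delay comparison. The symbol duration and guard fraction below are illustrative assumptions, not parameters of any particular DVB-T mode:

```python
# Guard-interval reasoning from the text: an echo causes no inter-symbol
# interference if its delay falls within the guard period. Numbers here are
# illustrative assumptions, not taken from a specific DVB-T mode.
symbol_duration_us = 224.0        # useful symbol period (assumed)
guard_fraction = 1 / 4            # guard interval as a fraction of the symbol
guard_us = symbol_duration_us * guard_fraction   # 56 us of guard time

def echo_is_harmless(echo_delay_us):
    """True if a multipath echo arrives within the guard interval."""
    return echo_delay_us <= guard_us

print(echo_is_harmless(40.0), echo_is_harmless(80.0))
```

Lengthening the symbol (or the guard fraction) tolerates longer echoes, but every microsecond of guard time is bandwidth not carrying data, which is part of COFDM's throughput penalty discussed below.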
ATSC 8-VSB versus DVB-T COFDM
Each system has its unique advantages and disadvantages. The ATSC 8-VSB system, in general, has
a higher data rate capability, has better threshold or carrier-to-noise (C/N) performance, requires less
transmitter power for equivalent coverage, and is more robust to impulse and phase noise [5]. On the
other hand, the DVB-T COFDM system has better performance in both dynamic and high level (up
to 0 dB) long delay static multipath situations [6]. The COFDM system may also offer advantages
for single frequency networks and mobile reception. A single frequency network is a network of
several stations that broadcast the same signal simultaneously using multiple transmitters.
The data throughput of COFDM DTV operation in a 6 MHz channel is less than the 19.4
Mbps provided by 8-VSB operation. Tests indicate that for a 6 MHz channel, COFDM provides a
useable data rate of 18.66 Mbps, about 5 percent less than 8-VSB [5]. While a 5 percent data rate
difference is relatively small, it has some impact on the ability to provide certain high definition
television programming. This difference makes 8-VSB more suitable for data applications, including
emerging broadband services, as well as more appropriate for HDTV programming. The higher data
capacity also enables 8-VSB to provide other services more efficiently, such as multi-channel video
and ancillary data. In addition, 8-VSB system operation is significantly more cost effective for DTV
transmission [5]. A COFDM station's construction costs would be higher because of the need for a
more powerful transmitter, heavier antenna and transmission lines, and possibly a stronger tower.
The cost of operating the station would also increase because more electric power would be used.
Additional power would be needed because COFDM has a higher C/N threshold than 8-VSB
and it also operates with a higher peak-to-average signal power ratio than 8-VSB. As such, a COFDM station would require substantially more power than 8-VSB (6 dB, or four times, more power) to
serve the same area [5]. Broadcasters using COFDM would therefore be faced with losing substantial
coverage, or incurring significantly higher costs for more powerful transmitters and additional electric
power for operation. 8-VSB also exhibits more immunity to impulse noise than COFDM. Impulse
noise occurs particularly in the VHF band and the lower portion of the UHF band. There have been
significant problems of interference from impulse noise (RF noise from vacuum cleaners, hair dryers,
light dimmers, power lines, etc.) to COFDM service in Great Britain [5]. COFDM is 8 dB more
susceptible to impulse noise commonly found in consumer homes.
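The power figures quoted above follow from standard decibel arithmetic, which can be checked directly:

```python
# Decibel arithmetic behind the figures above: a 6 dB difference in required
# carrier-to-noise ratio corresponds to roughly a 4x power ratio, and the
# 8 dB impulse-noise margin to roughly 6.3x.
def db_to_power_ratio(db):
    return 10 ** (db / 10)

ratio_6db = db_to_power_ratio(6)   # ~3.98, i.e. about four times the power
ratio_8db = db_to_power_ratio(8)   # ~6.31x
print(round(ratio_6db, 2), round(ratio_8db, 2))
```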
In theory, 8-VSB and COFDM systems should be able to perform nearly the same in providing
service where there is static multipath, but COFDM can generally be expected to perform better
in situations where there is dynamic multipath, e.g., in mobile operations. With 8-VSB, multipath
reflection, or ghosting, is processed through an adaptive equalizer and more complex equalizers are
needed to handle stronger reflections and longer intervals of reflection. As a single carrier system,
VSB has a higher symbol rate relative to multi-carrier systems such as COFDM. When a signal is transmitted, it encounters obstructions such as buildings that scatter it and cause it to take multiple paths to reach its final destination, the receiver. VSB data symbols might not be long
enough to withstand multipath echoes without complex adaptive equalization techniques.
COFDM's better performance in multipath situations makes it attractive for mobile television
viewing. Indeed, COFDM is ideal for Europe because stations in Europe transmit the same signal
100 percent of the time across many borders using single frequency networks [8]. However, COFDM's
benefits for large single frequency network operation and mobile service may be inconsistent with the
current structure of broadcasting in the United States and other ATSC-compliant countries. In these
countries, different programs along with local advertising are broadcast at different times throughout
the day depending upon geographic location. Therefore, in order to replicate existing NTSC service
with COFDM, it might be necessary to revisit the DTV Table of Allotments [5]. Early evaluations
of VSB indicated that it does not support mobile television viewing. As a result, VSB equipment
manufacturers are developing ways to solve this problem and that of multipath conditions. It is
expected that devices such as internal antennas will help in overcoming these limitations [8].
It is clear that DTV modulation techniques need to be enhanced, resolved and reconciled for the
success of DTV transmission. Nevertheless, the research, the ideas, the design and implementation,
and the analysis presented in this thesis are not limited to any particular modulation format or
encoding techniques. The approach employed can be adopted for a variety of transmission schemes.
3.3.5 Moving Picture Experts Group (MPEG) Standards
The MPEG group was established to develop a common format for coding and storing digital video
and associated audio information [7]. The MPEG standards are an evolving set of standards for video
and audio compression developed by MPEG. The MPEG group completed the first phase of its work
in 1991 with the deployment of MPEG-1 standard. MPEG-1 was designed for coding progressive
video at a transmission rate of about 1.5 Mbps. It was designed specifically for multimedia CD-ROM (compact disc) applications [13].
MPEG-1 audio layer-3 (MP3) also evolved from early
MPEG work.
In response to a need for greater input format flexibility, higher data rates and better error
resilience, the MPEG-2 standard was developed [7]. MPEG-2 was designed for coding interlaced images
at a transmission rate of above 4 Mbps. MPEG-2 is used for digital television broadcast, and digital
versatile disk (DVD) content. An MPEG-2 player is backward compatible, meaning it can also
handle MPEG-1 data [6].
A proposed MPEG-3 standard, intended for HDTV, was merged with the MPEG-2 standard
when it became clear that the MPEG-2 standard met the HDTV requirements. An MPEG-4 standard is in the final stages of development and release. It is a much more ambitious standard and
addresses speech and video synthesis, fractal geometry, computer visualization, and an artificial intelligence (AI) approach for reconstructing images [6]. An MPEG-7 standard is also now being discussed, but is still in its conceptual stage.
Moving Picture Experts Group standard 2 (MPEG-2)
The MPEG-2 encoding standard is the accepted compression technique for all sorts of new products
and services that come with DTV transmission - from satellite broadcasting to DVD to the new DTV
transmission [6]. MPEG-2 video compression exploits spatial and temporal redundancies occurring
in video [7]. Spatial redundancy is exploited by simply coding each frame separately with a technique
referred to as Intraframe coding [13]. Additional compression can be achieved by taking advantage
of the fact that consecutive frames are often almost identical. This temporal compression, which has
potential for major reduction over simply encoding each frame separately, is referred to as Interframe
coding [7].
In Intraframe coding, a frequency-based transform (discrete cosine transform - DCT) algorithm
is used to explore spatial correlations between nearby pixels within an image frame [1].
On the other hand, a motion-compensated prediction algorithm is used in Interframe coding. In this case,
the differences between a frame and its preceding frame are calculated and only those differences
are encoded [7]. In the simplest form of Interframe coding, the Intraframe technique is used to code
the differences between two successive frames.
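The Intraframe transform step above can be illustrated with a textbook 2-D DCT-II on an 8x8 block; this is only the transform stage, not the full MPEG-2 coding chain:

```python
import math

# Sketch of the Intraframe idea: a 2-D DCT concentrates a smooth block's
# energy into few low-frequency coefficients. Textbook DCT-II only; not the
# complete MPEG-2 coding chain (no quantization or entropy coding).
N = 8  # MPEG-2 operates on 8x8 pixel blocks

def dct2(block):
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    for x in range(N) for y in range(N))
            out[u][v] = c(u) * c(v) * s
    return out

flat = [[128.0] * N for _ in range(N)]   # a perfectly uniform block
coeffs = dct2(flat)
# All energy lands in the DC coefficient; every AC coefficient is ~0.
print(round(coeffs[0][0], 1))
```

For this uniform block, only the DC coefficient is nonzero; for natural images most energy still clusters in the low-frequency corner, which is what makes the subsequent quantization step effective.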
Additional techniques used in MPEG-2 compression include quantization, bi-directional prediction, and Huffman coding. Quantization, also known as "lossy" compression, is a technique for selectively discarding information that can be acceptably lost from visual content without affecting how the human eye perceives the image [13]. In bi-directional prediction, some frames
are predicted from the content of the frames that immediately precede and immediately follow them.
Huffman coding compression technique uses code tables based on statistics about the encoded data
[6].
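The Huffman step above can be sketched with a standard code-table construction; this is the generic algorithm, not the exact variable-length-code tables defined by MPEG-2:

```python
import heapq
from collections import Counter

# Sketch of Huffman coding: frequent symbols get shorter codes, built from
# a statistics table as described above. Generic algorithm, not the exact
# MPEG-2 variable-length-code tables.
def huffman_codes(freqs):
    # Heap entries: (frequency, tie-breaker, {symbol: code-so-far})
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

freqs = Counter("aaaaaaabbbccd")      # skewed symbol statistics
codes = huffman_codes(freqs)
print(sorted(codes.items()))
```

The resulting codes are prefix-free, so the decoder can split the bitstream back into symbols without separators, and the most frequent symbol receives the shortest code.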
The ultimate goal of the MPEG-2 compression technique is bit-rate reduction for storage and transmission of data by exploiting redundancies within that data. By using MPEG-2 coding, broadcasters can transmit digital signals using existing terrestrial, cable, and satellite systems. SDTV and
HDTV digital television formats use MPEG-2 compression technique. SDTV's picture and sound
quality is similar to that of a DVD. On the other hand, HDTV programming presents five times as
much information as SDTV, resulting in cinema-quality programming.
3.3.6 Dolby Digital
Dolby Digital is a digital audio coding technique that reduces the amount of data needed to produce
high quality sound by taking advantage of how the human ear processes sound [6]. The fewer the bits
used to represent an audio signal, the greater the coding noise; therefore, effective data rate reduction
also involves audio noise reduction. Dolby Digital supports a five-channel audio transmission system
and a low-frequency subwoofer channel for full surround sound. In the consumer electronics industry and in ATSC-compliant countries, the Dolby Digital soundtrack is the standard audio format for DVD, SDTV, and HDTV, and is also used for digital cable and satellite transmissions.
Dolby Digital coding takes maximum advantage of human auditory masking by dividing the
audio spectrum of each channel into narrow frequency bands of different sizes that are optimized
with respect to the frequency selectivity of human hearing. When the coding noise is close to the
frequency of an audio signal, the audio signal masks the noise such that the human ear hears only
the intended audio signal [6]. This property makes it possible to sharply filter out the coding noise
by forcing it to stay very close in frequency to the frequency components of the audio signal being
coded. Sometimes the coding noise cannot be masked because it is not in the same frequency as an
audio signal. In such cases, the noise must be reduced or eliminated to preserve the sound quality
of the original signal. Masking, reducing, or eliminating the noise can reduce the amount of data in
the audio signal to 1/10 of the data on a compact disk (CD).
3.4 Transition from Analog to DTV Transmission
The FCC set deadlines for stations in the US to complete the DTV transition process. Commercial
television stations must complete construction of DTV facilities by 2002 and public television stations
must complete their DTV facilities by 2003. The FCC's schedule for transition to DTV proposes
that everyone in the US should have access to DTV by 2002, although analog transmissions will
continue for some time after that year. After the switch to digital has been completed, regular
television sets will need converters to receive broadcasts. To make the transition to DTV a smooth
one, in the Telecommunications Act of 1996 the FCC allotted to each existing broadcaster an additional
6 MHz channel for digital transmissions so that broadcasters can continue to send out both analog
and digital transmissions simultaneously during the transition period. This procedure is known as
simulcasting.
Broadcasters must make the transition from analog to digital transmission for several reasons.
The key benefit is the high transport efficiency of a digital format broadcast. Digital compression
packs five or more times as many channels in a given distribution-network bandwidth [11] relative
to analog transmission. This added advantage in turn increases the broadcaster's revenue potential
by delivering more content to the end-users.
Second, all other related segments of the telecommunications industry that are either direct or
indirect competitors to the terrestrial broadcasting have made, or are in the process of making
the transition to digital. These competitors include commercial wireless service providers, such
as cellular and mobile computing services; wired services, such as digital subscriber line (DSL) and cable
television systems; and direct broadcast satellites. For the broadcasters to remain competitive and
improve their services, they must make the transition.
Third, the advantages of using digital techniques relative to analog for representing, storing,
processing and transmitting signals are overwhelming [11].
Digital signals are more robust than analog signals. When transmission errors occur, they can be detected and corrected.
In addition, digital signals can be encrypted with ease. They can also be manipulated and processed
using modern computer techniques and as such, take advantage of the greater processing power and
falling costs of computers. Additionally, different types of signals or services can be multiplexed or
provided on a common transmission facility with ease in digital form.
Finally, a successful transition of television broadcast from analog to digital will free up spectrum
for other uses as determined by the marketplace [11]. It is possible to do much more with the current
6 MHz channel than what today's analog SDTV provides. With digital technology, we can continue
to have traditional broadcast services as well as exciting new broadcaster-provided services that
include HDTV, multiple streams of SDTV (statistical multiplexing), and new datacasting services
such as DTV IP multicast.
3.5 The PC/Internet-enabled Devices in DTV Transmission
The inception of DTV creates new opportunities for the PC industry and other Internet-enabled
devices. Trends in the current DTV market dictate that the PC and other Internet-enabled devices
will play a major role in the future of digital broadcasting [9]. Today, the visual computing PC
has the processing power and the flexibility to receive digital information of all kinds, including
digital broadcast. It also stores, processes and provides highly visual information to the viewer.
The PC model as a vessel for digital broadcasting presents an interactive medium for DTV transmission. In most households today, the PC is the only programmable device that is connected to
the Internet. Nevertheless, other household devices such as PDAs and mobile computers are quickly
being transformed into Internet access devices, and in the process providing another avenue for DTV
transmission.
As the PC market grows and its visual computing performance increases, media viewership trends show that Internet viewing hours are increasing, while traditional TV viewing is declining [4].
The new digital medium presents an opportunity for broadcasters and content
providers to offer new data services alongside interactive programming to the PC-viewing audience.
Currently, most PC viewership is conducted via Web streaming, at the expense of increasingly overcrowding the Internet. DTV transmission infrastructure facilitates better viewership without overburdening the Internet. Additionally, it provides an opportunity to reduce Internet congestion
by providing another route for streaming media data and large downloads.
PCs are getting cheaper and becoming easier to use. A PC with a large screen display, equipped
for DTV reception, is likely to cost far less than a digital television set, and in addition offers increased
benefits such as Internet access and other interactive features. Most consumer digital television sets
are being introduced at a price between $5,000 and $10,000. A well-equipped PC, with a DTV
receiver and a large screen monitor costs less than half as much, yet offers more functionality [3].
The rate of technology change of the PC and other handheld household devices supersedes that of
the mostly passive TV [4]. Thus, new communication technologies are likely to be easily integrated
into these devices faster than in TVs; in part also due to the shorter average lifetime of PCs and
other handheld household devices relative to TVs.
Most major companies in the PC and TV industry support PC Theater [3]. For example, Compaq
Computer and Intel have proposed the PC Theater initiative, which would establish "plug and play"
standards that would let audio/video devices and PC devices work together. Dell Computer
is currently pre-installing DTV receivers (PCI-bus receiver-cards) in their products to make them
more attractive to end-users. Digital TV tuners that let PCs pick up TV signals cost about $150.
Since DTV receivers are projected to fall in prices, the use of PCs equipped with digital receivers is
projected to be on the rise in the US.
As the PC and digital TV convergence begins, the computer industry hopes to dominate by
quickly producing and selling many reasonably priced, TV ready PCs. PC makers estimate that
they can have 40 million digital machines in homes by the end of 2001 [3]. By contrast, studies from
Forrester Research indicate that the TV industry may not sell 20 million digital sets by then. Also,
many consumers may decide to buy converter boxes and keep their analog TVs for a while rather
than purchase new digital TVs. These owners who do not want to buy expensive digital sets will
be able to receive digital format broadcast programming but they will not benefit from digital TV's
high quality pictures and sound [3].
IP multicast over DTV transmission infrastructure presents an attractive avenue for the transmission of large Internet data to the PC Theater environment. Internet data that does not need to
be sent in real time can be inserted in a compressed video stream on an as-available time basis. This
process can also be extended to real-time data like streaming Web video or Web radio by developing
effective buffering techniques at the client. The delivery capacity of DTV IP multicast "blows away
any broadband on the horizon".
3.6 DTV Encoders and Developments in DTV IP Multicast
Most commercially available MPEG-2 transmission encoders use Constant Bit-Rate (CBR) mode
of transmission [7]. CBR is a uniform transmission rate, which means that for varying information
content, this transmission mode does not use up all the available bandwidth in a channel.
The design of the CBR mode of transmission guarantees adequate bandwidth for peak data
rates during real-time transmission; hence its popularity in real-time voice and video traffic. In
Asynchronous Transfer Mode (ATM), for example, CBR guarantees bandwidth for the peak cell
rate of the application [2]. However, the individual video frames that are encoded using MPEG-2 compression techniques contain drastically varying amounts of information, resulting in wildly
varying encoding requirements from frame to frame in order to maximize bandwidth efficiency.
With CBR encoders, however, the output rate must be set high enough to achieve a minimum quality for the video frame with the most information, that is, the frame from the most difficult scene to encode.
In order to achieve maximum bandwidth efficiency for bursty data traffic, such as MPEG-2
bitstreams, the Variable Bit Rate (VBR) mode of transmission is more appropriate. However, because TV broadcast must occur in real time, implementations of VBR that try to leverage the unused bandwidth in DTV channels result in undesirable delays of the transmitted information and complicated buffering at the transmitter. As a result, VBR encoding is common in storage applications such as DVDs, but not in transmission applications. However, VBR encoding can be adapted so that bandwidth in DTV channels that would otherwise be idle is used for the transmission of other data services. VBR transmission thus leverages more of the unused bandwidth of DTV channels than the current CBR transmission.
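The bandwidth argument above can be made concrete with a small calculation: provisioning for the worst-case frame under CBR leaves capacity idle during easier frames. The frame sizes below are arbitrary illustrative numbers:

```python
# Sketch of the bandwidth argument above: with CBR the channel must be
# provisioned for the worst-case frame, so easier frames leave bandwidth
# idle; a VBR scheme could hand that slack to other data services.
# Frame sizes below are arbitrary illustrative numbers (bits per frame).
frame_sizes = [90_000, 30_000, 35_000, 28_000, 40_000, 95_000, 33_000]

cbr_slot = max(frame_sizes)                 # sized for the hardest frame
cbr_total = cbr_slot * len(frame_sizes)     # capacity actually reserved
video_total = sum(frame_sizes)              # capacity actually needed
idle = cbr_total - video_total              # reclaimable for IP multicast

print(idle, round(idle / cbr_total, 2))
```

In this toy example nearly half the reserved capacity goes unused; the exact fraction depends on how bursty the video content is, but it is precisely this slack that DTV IP multicast proposes to fill.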
Chapter 4
Theoretical Analysis
4.1 Internet Transport versus DTV IP Multicast
The Transmission Control Protocol (TCP/IP) works very well with one-to-one transmission traffic on an Internet transport system that is not over-congested [18]. This traffic, e.g., e-mail exchanges, does not usually involve the transfer of large data to multiple users, at times simultaneously, as is the case with Internet media streaming. Multimedia traffic, which comprises a significant portion of potential IP multicast traffic, possesses different characteristics and hence requires different protocols to provide the necessary services without over-burdening the Internet transport system. Multimedia applications can generally forego the complexity of TCP/IP and use instead a simpler transport framework. Most playback algorithms can tolerate missing data much
better than lengthy delays caused by retransmissions that are common in TCP/IP protocols, and
also, they do not require guaranteed in-sequence delivery [18].
Large data content downloads involve online delivery of streaming media (video/audio), shareware applications, software upgrades, and electronic documentation. Although the TCP/IP protocols ensure that downloads are 100% accurate by their transmission control from router to router,
they cannot reach many simultaneous users without replicating each transmission
[18]. If there are 100 users online requesting the same content, that content has to be downloaded
100 times. To reach many users simultaneously with minimum transport capacity, the use of IP
multicast data transfer is preferred.
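The 100-user example above reduces to simple arithmetic: unicast delivery scales linearly with the audience, while a single multicast stream does not. The content size is an illustrative assumption:

```python
# The 100-user example above, in arithmetic form: unicast delivery scales
# linearly with the audience, while a single multicast stream does not.
CONTENT_MB = 50          # size of the requested content (illustrative)
USERS = 100

unicast_traffic = CONTENT_MB * USERS   # one copy per requesting user
multicast_traffic = CONTENT_MB         # one copy shared by all users

print(unicast_traffic, multicast_traffic)
```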
Unfortunately, simple Internet IP multicast leads to more Internet congestion. In addition, it
uses the User Datagram Protocol (UDP/IP), which unlike TCP/IP does not support retransmission between routers for error correction [16]. TCP/IP is not applicable in IP multicast
implementation because it is a point-to-point protocol that has a return path requirement for acknowledgement [16].
UDP/IP provides only a best effort transmission with error packets being
dropped instead of being retransmitted. On transmission paths with either significant traffic congestion or high error rates, file corruption, and to some extent corruption of streaming media, is highly probable [18].
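A UDP/IP multicast sender of the kind described above can be sketched with the standard socket API. This only prepares the socket; the group address and port are illustrative choices, and the actual send is shown but not executed:

```python
import socket

# Sketch of a UDP/IP multicast sender using the standard socket API.
# The group address and port below are illustrative.
GROUP, PORT = "224.1.1.1", 5004

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Keep datagrams from leaking beyond the local network segment.
sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)

# A real sender would now stream datagrams best-effort, with no
# retransmission of lost packets:
#   sock.sendto(payload_bytes, (GROUP, PORT))
print(sock.getsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL))
```

Note the contrast with TCP: there is no connection, no acknowledgement path, and no retransmission; a lost datagram is simply gone, which is why the buffering and error-handling burden moves to the application layer.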
Using IP multicast over DTV transmission infrastructure makes it possible to create services
that can be simultaneously distributed over the DTV spectrum and the Internet. Further, the use
of DTV IP multicast provides the broadcasters and content providers with a simple solution to use
standard computer applications to access data transmitted to computer hosted receivers. DTV IP
multicast also provides the broadcasting system with a simple mechanism for data transport using
optimized application layer protocols for streaming media as well as file transfer.
DTV IP multicast operates above the transport layer of the encoded MPEG-2 stream, but the
physical nature of the underlying transmission infrastructure is transparent to its implementation.
An important challenge facing DTV service designers lies in devising data services that operate in the
most common, broadcast only environment and scale up in user-level functions with the increased
capability of the underlying DTV infrastructure [11].
4.2 The Structure of an MPEG-2 Bit-Stream
An MPEG-2 bitstream contains three major frame types: Intracoded (I) frames, which are self-contained still pictures; Predictive (P) frames, which are block-by-block differences with the previous frame; and Bi-directional (B) frames, which are the differences with the previous and next frames [7]. While the I-frames are intracoded, the P-frames and the B-frames are intercoded. The I-frames
must therefore appear regularly in the stream since they are needed to decode subsequent intercoded
frames, that is the P-frames and the B-frames. Indeed, the decoding process cannot begin until an
I-frame is received. An I-frame is usually inserted into a stream approximately every half second.
P-frames and B-frames use macroblocks to code interframe differences. A macroblock is composed
of 16x16 pixels in the luminance space and 8x8 pixels in the chrominance space for the simplest
color format [7]. A macroblock is encoded by searching the previous frame or the next frame for the
closest match. In a frame with a fixed background and a moving foreground object, for example,
the foreground object can be represented by macroblocks from the previous frame and an offset that
represents the motion. While the P-frames only require the past frames as a reference to code the
differences, the B-frames require both past and future frames. The information content of the three
frame types in an MPEG-2 bitstream varies greatly. The I-frames have the largest sizes, while the
P-frames tend to have larger sizes relative to the B-frames.
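The decoding dependencies described above can be sketched in a few lines of Python. The 15-frame group-of-pictures (GOP) pattern and the anchor-frame rules below are typical illustrative values, not figures taken from this thesis:

```python
# Illustrative sketch: decoding dependencies of I/P/B frames in a typical
# MPEG-2 group of pictures. A 15-frame GOP is a common choice (roughly one
# I-frame every half second at 30 frames/second). B-frames at the end of a
# GOP would also reference the next GOP's I-frame; that case is omitted here.
GOP = "IBBPBBPBBPBBPBB"

def references(gop, idx):
    """Return the indices of the frames needed to decode frame `idx`."""
    kind = gop[idx]
    if kind == "I":
        return []                                    # intracoded: self-contained
    anchors = [i for i, k in enumerate(gop) if k in "IP"]
    if kind == "P":
        return [max(i for i in anchors if i < idx)]  # previous anchor only
    past = max(i for i in anchors if i < idx)        # B-frame: past anchor...
    future = [i for i in anchors if i > idx]
    return [past] + future[:1]                       # ...and the next anchor

print(references(GOP, 0))   # → []
print(references(GOP, 1))   # → [0, 3]
print(references(GOP, 3))   # → [0]
```

This makes concrete why an I-frame must appear regularly: until index 0 is decoded, no P-frame or B-frame in the GOP can be reconstructed.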
4.3
Fundamentals of MPEG-2 Video Compression Algorithms
The ultimate goal of video compression algorithms is the bit-rate reduction of data for transmission
and storage by exploiting redundancies in the data content and encoding a "minimum set" of information. The performance of video compression techniques depends on the amount of redundancy
contained in the image data as well as on the actual compression techniques used for coding. With
practical coding schemes, a trade-off between coding performance (high compression with sufficient
quality) and implementation complexity is targeted. For the development of the MPEG-2 compression algorithms, the most important consideration was the capabilities of current and future technologies, as guided by the existing standards [13].
Depending on the application's requirements, the compression of data may be "lossless" or "lossy".
The aim of "lossless" coding is to reduce video data for transmission and storage while retaining the
quality of the original images prior to encoding. In contrast, the aim of "lossy" coding techniques - which is more relevant to the applications envisioned by the MPEG-2 video standards - is to meet a given
target bit-rate for transmission and storage [13]. DTV transport infrastructure demands applications
that work efficiently with communication channels that have low or constrained bandwidth. In these
applications, high video compression is achieved by degrading the video quality such that the decoded
image quality is reduced (to an acceptable level) compared to the quality of the original images prior
to encoding. The smaller the capacity of the channel, the higher the necessary compression of the
video data. The ultimate aim of lossy coding techniques is to optimize image quality for a given
target bit-rate subject to an optimization criterion.
4.4
The MPEG-2 Video Coding Techniques
The MPEG-2 digital video coding techniques are statistical in nature. Video sequences usually
contain statistical redundancies in both temporal and spatial directions [7]. The basic statistical
property upon which MPEG-2 compression techniques rely is the inter-pixel correlation, including
the assumption of simple correlated motion between consecutive frames. Thus, it is assumed that
the value of a particular image pixel can be predicted from nearby pixels within the same frame
(using Intraframe coding techniques) or from pixels of a nearby frame (using Interframe techniques).
It is clear that in some circumstances, e.g., during scene changes of a video sequence, the temporal
correlation between pixels in nearby frames is small or even vanishes to an extent that the video
scene resembles a collection of uncorrelated still images. In such cases, Intraframe coding techniques
are appropriate to explore spatial correlation in order to achieve efficient data compression.
The MPEG-2 compression algorithms employ Discrete Cosine Transform (DCT) coding techniques on image blocks of 8x8 pixels for Intraframe coding [1]. However, if the correlation between
pixels in nearby frames is high, i.e., in cases where two consecutive frames have similar or identical
content, it is desirable to use Interframe coding techniques that employ temporal prediction (motion
compensated prediction between frames). In MPEG-2 video coding schemes, an adaptive combination of both temporal motion compensated prediction followed by transform coding of the remaining
spatial information is used to achieve high data compression.
4.4.1
Intraframe Coding Techniques - Transform Domain Coding - DCT
The term Intraframe coding refers to the fact that the various compression techniques are performed
relative to the information that is contained only within the current frame, but not relative to any
other frame in the video sequence. In other words, no temporal processing is performed outside of
the current picture or frame. The basic processing block of Intraframe coding is the Discrete Cosine
Transform (DCT). In general, neighboring pixels within an image tend to be highly correlated. As
such, it is desirable to use an invertible transform that concentrates the signal energy into a few decorrelated
parameters.
The DCT is near optimal for a large class of images in decomposition of the signal into underlying
spatial frequencies. In Fourier analysis, a signal is decomposed into weighted sums of orthogonal
sines and cosines that can be added together to reproduce the original signal. Besides decorrelation
of signal data, the other important property of the DCT is its efficient energy concentration. In this
manner, the sharp time domain discontinuities are eliminated, allowing the energy to be concentrated
more towards the lower end of the frequency spectrum.
Consequently, Transform Domain Coding is a very popular compression method for still image
coding and video coding. Once the Intraframe image content has been decorrelated, the transform
coefficients, rather than the original pixels of the image are encoded [13]. In this process, the input
images are split into disjoint blocks of pixels, b, of size NxN pixels.
The transformation can be
represented as a matrix operation using an NxN transformation matrix, A, to obtain the NxN
transform coefficients, c, i.e.

c = Ab

The transformation is reversible, hence the original NxN block of pixels, b, can be reconstructed
from c using the inverse transformation, i.e.

b = A^(-1)c

where, for an orthonormal transform such as the DCT, A^(-1) = A^T.
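The forward and inverse relations above can be verified numerically. This minimal sketch (assuming the 1-D case, with b an 8-pixel vector rather than a full NxN block) builds the orthonormal DCT-II matrix with numpy:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II transformation matrix A (NxN)."""
    A = np.zeros((n, n))
    for k in range(n):
        scale = np.sqrt(1.0 / n) if k == 0 else np.sqrt(2.0 / n)
        for i in range(n):
            A[k, i] = scale * np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    return A

A = dct_matrix(8)
b = np.random.default_rng(0).integers(0, 256, size=8).astype(float)

c = A @ b            # forward transform: c = Ab
b_rec = A.T @ c      # inverse: A is orthonormal, so A^(-1) = A^T

assert np.allclose(A @ A.T, np.eye(8))   # orthonormality of A
assert np.allclose(b_rec, b)             # perfect reconstruction
```

Because A is orthonormal, the inverse transform is simply the transpose, which is what makes the block-based DCT cheap to invert at the decoder.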
Among many possible alternatives, the DCT applied to small image blocks of usually 8x8 pixels has become the most successful transform for still image and video
coding [13]. On top of their high de-correlation performance, DCT based implementations are also
used in most image and video coding standards due to the availability of fast DCT algorithms
suitable for real time implementations.
The major objective of the DCT based algorithms is to make as many transform coefficients
as possible small enough that they do not need to be coded for transmission [1]. At the
same time, it is desirable to minimize statistical dependencies between coefficients with the aim of
reducing the number of bits needed to encode the remaining coefficients. Coefficients with small
variances (the variability of the coefficients as averaged over a large number of frames) are less
significant for the reconstruction of the image blocks than coefficients with large variances. On average,
only a small number of DCT coefficients need to be transmitted to the receiver to obtain a useful
approximate reconstruction of the image blocks [1]. Further, the most significant DCT coefficients
are concentrated around the lowest-order coefficients, and their significance decays with
increasing distance from the DC coefficient. This implies that higher DCT coefficients are less important
for reconstruction than lower DCT coefficients.
The DCT is closely related to Discrete Fourier Transform (DFT) [20], therefore, DCT coefficients
can be given a frequency interpretation close to that of DFT. Thus, low DCT coefficients relate to
low spatial frequencies within image blocks and high DCT coefficients to high frequencies. This
property is used in DCT based coding schemes to remove redundancies contained in the image
data based on human visual system criteria. The human viewer is more sensitive to reconstruction
errors related to low spatial frequencies than to high frequencies. Therefore, a frequency adaptive
weighting (quantization) of the coefficients according to the human visual perception (perceptual
quantization) is often employed to improve the visual quality of the decoded images for a given bit
rate [13].
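Frequency adaptive (perceptual) quantization can be sketched as follows. The weighting matrix W here is illustrative only, not the MPEG-2 default intra quantizer matrix; it simply makes the quantizer step grow with distance from the DC coefficient, where the eye is least sensitive:

```python
import numpy as np

# Illustrative weighting matrix (NOT the MPEG-2 default intra matrix):
# quantizer step grows with spatial frequency, i.e. with distance from
# the DC coefficient at position (0, 0).
n = 8
W = 8 + 4 * np.add.outer(np.arange(n), np.arange(n)).astype(float)

rng = np.random.default_rng(1)
coeffs = rng.normal(0.0, 50.0, (n, n))   # synthetic DCT coefficients
coeffs[0, 0] = 900.0                     # the DC coefficient usually dominates

quantized = np.round(coeffs / W)         # coarse steps at high frequencies
reconstructed = quantized * W            # what the decoder recovers

# Quantization error is bounded by half a step, so low-frequency content
# (small W) is preserved far more accurately than high-frequency content.
assert np.all(np.abs(reconstructed - coeffs) <= W / 2 + 1e-9)
```

Many high-frequency coefficients round to zero under the large steps, which is exactly what allows them to be skipped during transmission.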
4.4.2
Interframe Coding Techniques - Motion Compensated Prediction
The previously discussed Intraframe coding technique is limited to processing the video signal on a
spatial basis, relative only to information within the current video frame. Considerably more compression efficiency can be obtained, however, if the inherent temporal, or time-based, redundancies
are exploited as well. Temporal processing to exploit this redundancy uses a technique known as
Motion Compensated Prediction, which relies on motion estimation. Starting with an Intraframe (I-frame), the encoder can forward predict a future frame. This frame is commonly referred to as a
P-frame, and it may also be predicted from other P-frames, although only in a forward time manner.
Each P-frame in this sequence is predicted from the frame immediately preceding it, whether it is an
I-frame or a P-frame. Unlike P-frames, I-frames are coded spatially with no reference to any other
frame in the sequence.
The encoder also has the option of using a combination of forward and backward interpolated
prediction. These frames are commonly referred to as bi-directional interpolated prediction frames,
or just B-frames. The B-frames are coded based on a forward prediction from a previous I-frame
or P-frame, as well as a backward prediction from a succeeding I-frame or P-frame. The main
advantage of the usage of B-frames is coding efficiency. In most cases, B-frames will result in fewer
bits being coded overall. Quality can also be improved in the case of moving objects that reveal
hidden areas within a video sequence. Backward prediction in this case allows the encoder to make
more intelligent decisions on how to encode the video within these areas. Also, since B-frames are not
used to predict future frames, errors generated will not propagate further within the sequence.
As a result, the majority of the frame-types in an MPEG-2 bitstream are B-frames.
The temporal prediction technique used in MPEG-2 video is based on motion estimation. The
basic premise of motion estimation is that in most cases, consecutive video frames will be similar
except for changes induced by objects moving within the frames. In the trivial case of zero motion
between frames (and no other differences caused by noise), it is easy for the encoder to efficiently
predict the current frame as a duplicate of the prediction frame. When this is done, the only information necessary to transmit to the decoder becomes the syntactic overhead necessary to reconstruct
the picture from the original reference frame. When there is motion in the images, the process is
not as simple. Still, in such a case, the prediction of the actual video frame is given by a motion
compensated prediction from a previously coded frame.
Motion compensated prediction compression algorithms are a powerful tool in reducing temporal
redundancies between frames, and are therefore used extensively in MPEG-2 video coding standards
[13]. In these algorithms, video frames are usually separated into macroblocks of 16x16 pixels and
a single motion vector is used to estimate and code each of these blocks for transmission. Only the
difference between the original images and the motion compensated prediction images is transmitted.
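The block-matching search described above can be sketched as follows. The 16x16 block size matches the macroblock size in the text, while the exhaustive +/-8 pixel search range and the sum-of-absolute-differences (SAD) cost are illustrative assumptions, and the frame pair is synthetic:

```python
import numpy as np

def motion_estimate(ref, cur, by, bx, bsize=16, search=8):
    """Exhaustive block matching: find the offset into the reference frame
    that best predicts the current 16x16 macroblock (minimum SAD)."""
    block = cur[by:by + bsize, bx:bx + bsize]
    best, best_sad = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + bsize > ref.shape[0] or x + bsize > ref.shape[1]:
                continue  # candidate block falls outside the reference frame
            cand = ref[y:y + bsize, x:x + bsize]
            sad = np.abs(block.astype(int) - cand.astype(int)).sum()
            if sad < best_sad:
                best_sad, best = sad, (dy, dx)
    return best, best_sad

# Synthetic frame pair: the current frame is the reference shifted by (2, 3).
rng = np.random.default_rng(2)
ref = rng.integers(0, 256, (64, 64)).astype(np.uint8)
cur = np.roll(ref, shift=(2, 3), axis=(0, 1))

(dy, dx), sad = motion_estimate(ref, cur, by=24, bx=24)
# The recovered vector points back into the reference, and only this vector
# (plus the zero residual) would need to be transmitted for the block.
```

For the synthetic shift the search recovers the motion exactly, so the residual (the "difference" the text refers to) is zero; real video yields a small but nonzero residual that is then DCT-coded.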
4.5
Coding of Bit-Streams - CBR versus VBR
The MPEG-2 encoding standards define methods for multiplexing one or more audio, video, or
optional data streams. Each of these streams is packetized and then multiplexed to form a single
output stream for transmission [7]. Most commercially available MPEG-2 transmission encoders use
Constant Bit-Rate (CBR) mode of transmission [7].
As figure 4-1 illustrates, multiplexing of different data streams in a CBR mode of transmission
is straightforward - the channel bandwidth is apportioned according to the maximum requirement
of each stream. This process, however, does not use up all the available bandwidth, because the
individual video frames that are encoded contain drastically varying amounts of information. With
CBR encoders, the bandwidth apportioned to each stream must be large enough to transmit enough
data to achieve an acceptable quality of images from the frames with the most information, that is,
the I-frames.
[Figure 4-1: CBR Multiplexing of Data. Plot of bandwidth vs. time, showing the total bandwidth, the budgeted bandwidth for DTV transmission, and the available bandwidth for data.]

[Figure 4-2: VBR Multiplexing of Data. Same axes as figure 4-1, with the budgeted bandwidth for DTV transmission varying over time.]
On the other hand, figure 4-2 illustrates another form of MPEG-2 transmission encoders, the
Variable Bit-Rate (VBR) mode of transmission. This mode of transmission is commonly used in
storage applications such as in DVDs, but not in the current real-time transmission applications
because of potential delays and buffering requirements at the transmitter. However, an MPEG-2
bitstream is varying in nature because it is composed of three frame-types that vary greatly in sizes,
that is, I, P, and B frames. Although VBR mode of transmission presents a challenge in multiplexing
of different data streams, it offers a better opportunity to use up all the bandwidth available in a
DTV transmission channel. Given a channel budget, other forms of useful data can be injected into
the unused spaces of the bandwidth that are available due to the varying nature of the MPEG-2
bitstream. The injected data is then transmitted alongside this bitstream and extracted at the
receiver.
Chapter 5
Experimentation Procedures &
Implementation Details
5.1
DTV Transmission Channel Characterization
In the simulation of a DTV transmission channel, a broadcast quality MPEG-2 video bitstream was
used. Both SDTV and HDTV bitstreams were considered for experimentation, but only the analysis
and results of the latter are presented. The same strategy that was employed in the experimentation
procedures for the HDTV bitstream can easily be adapted for an SDTV bitstream. The HDTV bitstream
that was used lasted for about 5 minutes, which, although not a very long time, was sufficient for the
purposes of this research. This time limit was basically a consequence of the enormous computer
hard-disk space requirements associated with HDTV video frames.
The characteristics of this bitstream were analyzed with the help of a modified version of the
MPEG Software Simulation Group (MSSG) software [15]. The MSSG develops MPEG-2 software
with the purpose of providing aid in understanding the various algorithms that comprise an MPEG-2 encoder and decoder, and in the process, giving a sample implementation based on advanced
encoding models. This software project, mostly useful only for academic research, is still under ongoing development. The MSSG software can simply be classified as an MPEG-2 bitstream encoder
and decoder. The encoder and the decoder were verified using a set of verification pictures, a small
bitstream, and a Unix shell script to automatically test the outputs of the encoder and the decoder.
5.1.1
MSSG Encoder Model
The MSSG MPEG-2 encoder software converts an ordered set of uncompressed input video frames
into a compressed and coded bitstream sequence compliant with MPEG-2 compression standards.
With various modifications (please refer to appendix B), this software was adapted to NTSC transmission encoding standards (6 MHz channel, 8-VSB encoding, 30 frames/second) to generate the
results that are presented. Although not presented, PAL transmission and encoding standards (8
MHz channel, COFDM encoding, 25 frames/second) were also considered. The changes made in
this software package included adjusting the parameters for compression and transmission rates to
match the standards employed in NTSC transmission. Some tools that were initially designed for
this software also assumed a CBR mode of transmission. Those tools were replaced with VBR mode
of transmission tools.
5.1.2
MSSG Decoder Model
The MSSG MPEG-2 decoder software converts a video bitstream into an ordered set of uncompressed
video frames. Just like the encoder, there were modifications made to this part of the software to
make it user friendly and consistent with NTSC transmission standards. In order to verify and
observe the MPEG-2 bitstream characteristics and properties prior to the decoding process, various
aspects of this software were modified accordingly. Other changes made to this software also included
modifications that were necessary to make it work alongside the IP data extracting protocol which
is discussed later.
5.2
Analysis of the MPEG-2 Video Transport Stream
The varying nature of an MPEG-2 video transport stream is clearly illustrated in figure 5-1. This
figure represents the varying sizes (in bytes) of the video/image frames in a coded MPEG-2 bitstream.
The HDTV MPEG-2 bitstream that was used to generate this data lasted for a period
of about 5 minutes. In this bitstream sequence, the maximum size of the encoded video frames
was 46,440 bytes, while the minimum was 2,848 bytes. This is indeed a huge variation (a range
of about 43,592 bytes), but it is consistent with the theoretical analysis and expectations
(please refer to chapter 4). From this observation, it is clear that the CBR mode of transmission
(without the undesirable buffering and delays at the transmitter) does not use up all the available
bandwidth, because the individual video frames that are encoded contain drastically varying amounts
of information. With CBR encoding, the bandwidth apportioned to the bitstream must be large
enough to transmit enough data to achieve an acceptable quality of images from the video frames
with the most information, that is, the I-frames.
The MPEG-2 bitstream is drastically varying in nature because it is composed of three frame-types that vary greatly in sizes, that is, the I, the P, and the B frames. The B-frames, coded based
on a forward prediction from a previous I-frame or P-frame, as well as a backward prediction from
a succeeding I-frame or P-frame, are extensively used in MPEG-2 bitstreams because of their coding
efficiency in terms of space and quality. In most cases, B-frames result in the fewest bits being coded
[Figure 5-1: Varying sizes (in bytes) of a 5-minute-long encoded MPEG-2 video/image frames sequence. Axes: frame size in bytes (up to about 4.5x10^4) vs. video/image frame index (0 to 10,000).]
[Figure 5-2: A histogram of the sizes of encoded MPEG-2 video/image frames in a 5-minute-long bitstream. Axes: number of video/image frames (0 to 3,000) vs. frame size in bytes (0 to 5x10^4).]
overall, followed by the P-frames, while the I-frames require the most bits. As a result, the majority
of the frame-types in an MPEG-2 bitstream are B-frames, while the fewest are the I-frames.
Figure 5-2 illustrates this phenomenon. This figure, generated using the MATLAB
script in appendix C, is a histogram of the sizes of the encoded video frames in the 5-minute-long
MPEG-2 bitstream described above. The total number of video/image frames in this bitstream
sequence was about 10,500.
In figure 5-2, the first Gaussian distribution, representing the majority of the video frames, corresponds to
the B-frames in the bitstream sequence. The middle Gaussian distribution corresponds to
the P-frames, while the third and smallest distribution corresponds to the I-frames. This observation is also consistent with the theoretical analysis and expectations of the structure of an MPEG-2
bitstream.
Further, using the MATLAB "step" function, as illustrated in the script in appendix C, figure 5-3 was generated.

[Figure 5-3: Varying sizes (in bytes) of the first 100 encoded video/image frames in an MPEG-2 sequence. Axes: frame size in bytes (up to about 5x10^4) vs. video/image frame index (0 to 100).]

This figure is simply a magnified version of figure 5-1, where only the first 100 frames
are considered. From this small sample of frames, it is clear which frames are the I-frames, P-frames
or B-frames. Additionally, the pattern is also clear. For example, the 7th frame appears to be an
I-frame, followed by a sequence of B-frames until the 22nd frame, which is most likely a P-frame. This
observation is consistent with the histogram of frame sizes illustrated in figure 5-2.
5.3
Internet IP Data Injecting/Extraction Protocols
The maximum encoded frame size observed in the MPEG-2 bitstream that is described above was
46,440 bytes (46,440 x 8 = 371,520 bits). The NTSC transmission encoding standards allow for a
transmission rate of 30 frames/second. Therefore, for CBR mode of transmission, the bandwidth
requirements for this bitstream would be (371,520 x 30) bits/sec, which is about 11 Mbps. A
typical HDTV broadcast bitstream is theoretically expected to use up to a maximum of 16 Mbps
of bandwidth. In this case, the extra 5 Mbps of bandwidth could be attributed to the MPEG-2
bitstream sample that was employed not being a complete representation of the characteristics of
the entire program bitstream sequence. As such, some encoded video frames that were not observed
in the 5 minutes duration may have bigger sizes than the observed maximum size of 46,440 bytes.
Further, broadcasting stations also use some of this extra space for encoding program metadata such
as digital TV guides and copyright information. Still, in most cases, there is some space left over
that can be used for multiplexing other program sequences or data streams.
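The CBR bandwidth arithmetic above can be checked directly; all figures used here are the ones quoted in the text (maximum observed frame of 46,440 bytes, NTSC rate of 30 frames/second):

```python
# Quick check of the CBR bandwidth requirement quoted above.
max_frame_bytes = 46_440   # largest encoded frame observed in the bitstream
fps = 30                   # NTSC transmission rate, frames/second

max_frame_bits = max_frame_bytes * 8   # 371,520 bits
cbr_bps = max_frame_bits * fps         # CBR must budget for the worst frame

print(f"CBR requirement: {cbr_bps / 1e6:.1f} Mbps")  # prints "CBR requirement: 11.1 Mbps"
```

The point of the calculation is that CBR must reserve worst-case capacity in every frame slot, which is why the gap below the 16 Mbps channel budget goes unused.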
The VBR mode of transmission generates more extra space than the CBR mode. However, this
extra space is "bursty" and unpredictable, and therefore difficult to leverage for real-time transmission
applications that multiplex different program data streams, because of potential delays and buffering
requirements at the transmitter. Nevertheless, this extra space can efficiently be leveraged for IP
multicast applications, and is therefore a primary focus of this research. The VBR mode of transmission offers a better opportunity for using up all the bandwidth available in a DTV transmission
channel, and in the process, provides an alternative route for transmission of large data content
other than the Internet infrastructure. Given a channel budget, Internet data can be injected into
the unused spaces of the channel bandwidth available as a result of the varying nature of the MPEG-2 bitstream. The injected data is then transmitted alongside this bitstream and extracted at the
receiver. Two protocols, one at the encoder stage and the other at the decoder stage, were designed
and implemented (from scratch) to accomplish this task (please refer to appendices D & E) by working
alongside the MSSG software.
5.3.1
IP Internet Data Encoding (Injecting) Protocol
This protocol packages packetized Internet data (IP packets) into the unused spectrum of the DTV
channel on an as-available basis, without corrupting the MPEG-2 bitstream content. The Internet
data, which consisted mainly of media and streaming file formats (please refer to appendix A), was
first broken down into IP packets. These IP packets were composed of headers, the data itself, and
footers. The headers and footers were necessary so that the transmitted data could be reconstructed
at the receiver. After the generation of the MPEG-2 bitstream sequence through compression and
encoding of video frames, this protocol began by computing the "local maximum size" of the first
100 encoded video frames in the bitstream sequence. The number of encoded video frames, 100,
was chosen based on observations of the nature of the MPEG-2 bitstream that was used for
experimentation (please refer to figure 5-3). It is expected that the more encoded video frames
in a bitstream sequence are used to determine this value, the better the overall results in
determining the extra space that can be leveraged.
The "local maximum size" value, which is updated sequentially after the transmission of every
100 encoded video frames is completed, was necessary to determine how much extra space was
unused in those 100 encoded video frames. IP packets would then be injected into the encoded video
frames that have sizes less than the value of the "local maximum size". This process then repeats
until the end of the MPEG-2 bitstream sequence being transmitted.
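The windowed "local maximum size" computation can be sketched as follows. The frame sizes below are synthetic stand-ins drawn from the observed range; the actual injecting protocol implementation is in appendix D:

```python
# Hedged sketch (not the thesis code from appendix D): estimate the space
# available for IP data injection using the "local maximum size" rule.
# frame_sizes is a hypothetical stand-in for the encoded frame sizes (bytes),
# drawn from the min/max range observed in the 5-minute bitstream.
import random

random.seed(0)
frame_sizes = [random.randint(2_848, 46_440) for _ in range(10_500)]

WINDOW = 100  # frames used to compute each "local maximum size"

def injectable_bytes(frame_sizes, window=WINDOW):
    """Sum, per window, the gap between each frame and the window's maximum."""
    total = 0
    for start in range(0, len(frame_sizes), window):
        chunk = frame_sizes[start:start + window]
        local_max = max(chunk)  # the "local maximum size" for this window
        total += sum(local_max - size for size in chunk)
    return total

extra = injectable_bytes(frame_sizes)
print(f"{extra:,} bytes of injectable space")
```

Every frame smaller than the window's local maximum contributes its gap as injectable space, so the burstier the stream, the more IP data fits.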
The IP data injecting protocol takes full advantage of the varying nature of the MPEG-2 bitstream. It was designed to work with the VBR mode of transmission, although it also serves
the same purpose in CBR mode. Nevertheless, the focus of this research was to determine how
much extra space can be leveraged for Internet IP multicast in a VBR mode of transmission, which
would otherwise not be possible in CBR mode. For the 5 minutes long MPEG-2 bitstream that was
examined, a total of 356,701,896 bytes (356,701,896 x 8 = 2,853,615,168 bits) of unused space was
leveraged. This amounts to approximately 9.5 Mbps of extra bandwidth. Granted, this extra bandwidth
is on the high side because the other forms of data that accompany a broadcast program, such as
program metadata, were not accounted for. More importantly, a statistical multiplexer also works to
multiplex other program bitstreams with a varying MPEG-2 bitstream where possible.
A statistical multiplexer seeks to bundle multiple programs into one optimized, multiplexed
output stream. It uses high-speed programmable digital signal processors to dynamically adjust
bandwidth allocation among different program bitstreams by exploiting the variations in the encoded
MPEG-2 streams. In statistical multiplexing, different program bitstreams are multiplexed to form
a single CBR bitstream. However, in this arrangement, if one program has a higher share of the total
multiplex bandwidth, the other program streams have a correspondingly lower aggregate share of the multiplex
bandwidth. Owing to the sporadic nature of an MPEG-2 bitstream, this process cannot be
used to leverage all the unused bandwidth in a channel, especially when the bandwidth that
is left over is not large enough to multiplex any of the available program bitstreams. Therefore, even
though a statistical multiplexer may be able to leverage a significant amount of the extra bandwidth
observed in this experiment, the results obtained are a clear indication of the enormous potential
of DTV transmission infrastructure for IP multicast.
5.3.2
IP Internet Data Extraction Protocol
This protocol extracts the injected IP packets from the MPEG-2 bitstream, reconstructs the transmitted data from these packets, and buffers the extracted data to ensure that the data being transmitted can be accessed as it is received, without waiting for the entire streaming file to finish being
transmitted. This process took place prior to the decoding of the encoded MPEG-2 bitstream sequence. The longer the duration of the MPEG-2 bitstream, the larger the IP data content that was
transmitted alongside the encoded video frames. The implementation of this protocol ensured that
once the IP data was extracted from the encoded video frames, the resulting MPEG-2 bitstream
sequence was identical to the original bitstream prior to injecting the IP data at the transmitter.

In order to view the received IP data in real time, a very big buffer (on the order of several gigabytes) must be implemented at the receiver. Where this implementation is not possible, the data
can be put into a streaming file such that new incoming data is appended to this file as the file is
accessed from the top.
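The receiver-side behaviour can be sketched as follows. The actual protocol implementation is in appendix E; the packet framing used here (a 4-byte length header and a fixed footer) is a hypothetical stand-in for the real header/footer format:

```python
# Hedged sketch of the extraction side, not the thesis code from appendix E.
# Packet framing (4-byte big-endian length header, b"END!" footer) is a
# hypothetical stand-in for the actual header/footer format.
import io
import struct

FOOTER = b"END!"

def extract_packets(stream):
    """Yield IP payloads as they arrive, so the receiver can consume data
    before the whole transmission finishes (streaming-file behaviour)."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # end of the injected data
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        footer = stream.read(len(FOOTER))
        if footer != FOOTER:
            raise ValueError("corrupt packet: footer missing")
        yield payload

# Usage: packets framed at the transmitter, recovered at the receiver.
wire = io.BytesIO()
for chunk in (b"hello ", b"world"):
    wire.write(struct.pack(">I", len(chunk)) + chunk + FOOTER)
wire.seek(0)

received = b"".join(extract_packets(wire))
print(received)  # → b'hello world'
```

Because `extract_packets` is a generator, each payload can be appended to the streaming file as soon as its footer is verified, matching the append-while-reading behaviour described above.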
5.4
Results Analysis and Discussion
The research objective was clearly met: implementing IP multicast protocols that enable
spectrum sharing of DTV content and Internet IP data while enhancing DTV spectrum efficiency,
and thereby providing an alternative route for Internet streaming media data other than the Internet transport infrastructure. The alternative data route is critical for reducing Internet congestion,
mainly caused by multiple simultaneous downloads of large amounts of popular data. The implementation discussed above is based on a PC DTV viewing environment, although it could be adapted for
digital set-top boxes and digital TV sets that support software. The results from the implementation illustrate that using DTV transmission infrastructure can significantly increase the transmission
speed of popular Internet data to PC DTV viewers. The research approach and the implementation
aim to foster an economic model whereby broadcasters exchange unused bandwidth with content
providers for revenue, without complicating the transmission scheme at the transmitter.
The enormous potential of IP multicast rests with DTV transmission infrastructure, where data
is delivered just once but to many recipients. This process allows for the download of vast amounts
of data to communities of users over a high-speed data link, that is, the unused spectrum on a
DTV channel. The significant extra bandwidth in DTV transmission channels clearly illustrates
the advantage of using DTV transmission infrastructure for IP multicast. In contrast, the
current Internet IP multicast transmission rates vary from tens of kilobits per second (Kbps) to a
few megabits per second (Mbps), depending on the structure of the network between the server and
the clients. The Internet infrastructure is made up of a myriad of independent crisscrossing networks
that employ a backbone structure made up of numerous routers and servers. The variations in delays
from these servers and routers are the primary cause of the poor performance of streaming media
over the Internet. With the rapid increase in Internet usage, and the subsequent congestion, this
performance is not likely to improve.
IP multicast over DTV channels eliminates the obstacles to getting rich media and other large
content to users, because there are no routers or overloaded servers in the path to introduce delays.
Further, since IP multicast over DTV channels is a broadcast data service, even explosive growth
of the user base has no effect on performance. With the advent of DTV transmission, the potential
exists for digital broadcasting to play a role as a high-speed unidirectional "overlay" to the Internet.
Over time, data that is accessed by many people can be broadcast, leaving the traditional bi-directional Internet more available for true point-to-point communications, such as e-commerce and
video teleconferencing.
Chapter 6
Conclusion
The very structure of the Internet is contrary to the delivery of rich-media content and other forms
of large data that are accessed simultaneously by users. The Internet is made up of a myriad of
independent crisscrossing networks that employ a backbone structure with numerous routers and
servers. The variations in delays from these servers and routers are the primary problem of poor
performance of streaming media over the Internet. The Internet was not designed to deliver rich
media on a large scale and in real time. The availability and access of video, music, games, software
and other forms of rich media are growing faster than the Internet can keep up with. Even
more challenging, the user population is growing rapidly, with each new user placing an independent
demand on the network. Implementations of Internet IP multicast have not solved this problem.
However, IP multicast over DTV channels eliminates the obstacles to getting rich media and other
large content to users because there are no routers in the path to introduce delays or overloaded
servers. Further, since IP multicast over DTV channels is a broadcast data service, even explosive
growth of the user base has no effect on performance. With the advent of DTV transmission, the
potential exists for digital broadcasting to play a role as a high-speed unidirectional "overlay" to the
Internet. Over time, data that is accessed by many people can be broadcast, leaving the traditional
bi-directional Internet more available for true point-to-point communications, such as e-commerce
and video teleconferencing.
The research conducted implemented IP multicast protocols that enable spectrum sharing of
DTV content and Internet IP data while enhancing DTV spectrum efficiency, thereby providing
an alternative route for Internet streaming media data other than the Internet transport
infrastructure. The alternative data route is critical for reducing Internet congestion, mainly caused
by multiple simultaneous downloads of large amounts of popular data. The implementation
was based on a PC DTV viewing environment, although it could be adapted for digital
set-top boxes and digital TV sets that support software. The results from the implementation
illustrate that using DTV transmission infrastructure can significantly increase the transmission speed of
popular Internet data to PC DTV viewers. The research approach and the implementation aim to
foster an economic model whereby broadcasters exchange unused bandwidth with content providers
for revenue without complicating the transmission scheme at the transmitter.
The enormous potential of IP multicast rests with DTV transmission infrastructure, where data is
delivered just once but to many recipients. This process allows for download of vast amounts of data
to communities of users over a high-speed data link, that is, the unused spectrum on a DTV channel.
The significant extra bandwidth in DTV transmission channels clearly illustrates the advantage of
using DTV transmission infrastructure for IP multicast. By contrast, the current Internet IP
multicast transmission rates vary from tens of kilobits per second (Kbps) to a few megabits per
second (Mbps) depending on the structure of the network between the server and the clients. With
the rapid increase in Internet usage, and the subsequent congestion, this performance is not likely to
improve.
Chapter 7
Recommendations
7.1
Limitations
The main limitation in conducting the experiments and procedures for this research was computer
disk space and processing power. HDTV video frames (MPEG-2 files, in general) take a lot of space.
Processing these files on a desktop computer also demanded a lot of time, especially due to the
huge number of video frames in a short-duration bitstream sequence. While this limitation
can be overcome by more hard-disk space and computer processing power, there are other challenging
obstacles facing DTV transmission.
The uncertainty in standards and modulation formats may hinder the transition to DTV transmission. The current controversy over 8-VSB and COFDM could lead to problems in deployment of
DTV broadcasting. Additionally, each DTV broadcasting channel is stuck with 6 MHz (19.4 Mbps)
capacity, even with development and adoption of more efficient modulation formats. On the other
hand, cable and satellite transmission networks have the ability to increase their bandwidth more
readily.
Unlike satellite and cable transmissions, terrestrial (wireless) broadcasting experiences interference as a result of multipath. This problem needs to be solved, especially in the US
and other ATSC-compliant countries for DTV transmission to be successful. Additionally, satellite
and cable transmissions are likely to evolve faster than terrestrial broadcasting. An example of such
evolution is developing cable transmission such that content is transmitted as IP data/packets instead of the current bitstreams so that the bandwidth can be used more effectively and the switching
schemes improved.
Finally, the mobile reception of DTV IP multicast is still in question. It is still not clear how to
implement an affordable mobile receiver such that the transmitted data in a DTV channel can be
extracted using a reliable mobile receiver.
7.2
Future Work
The PC DTV IP multicast model that has been presented can be adopted for other net-connected
devices, such as digital TVs, digital set-top boxes and mobile devices. Further, transmission models
that allow broadcasters to combine their unused bandwidth to form a single cumulative high speed
data link for IP multicast can also be developed. Finally, there is a need for more research on how to
develop affordable receivers for DTV IP multicast in order to accommodate mobile data reception.
Appendix A
Media & Streaming File Formats
A.1
Media File Formats
The content from the broadcast media can be stored in several digital file formats. Table A.1 shows
a selection of common standard media file formats for video and audio representations.
A.2
Streaming File Formats
A streaming file format is one that has been specially encoded so that it can be played while it
downloads, instead of having to wait for the whole file to download. As part of the streaming format
there is usually some form of compression included. It is possible to stream some standard media file
formats; however, it is usually more efficient to encode them into streaming file formats. A streaming
file format includes extra information such as timing, compression and copyright information.
Table A.2 below shows a selection of common streaming file formats.
Table A.1: Media File Formats
Media Type and Name (Video/Audio)          File Format Extension
Quicktime Video                            .mov
MPEG Video                                 .mpg
MPEG Layer 3 Audio                         .mp3
Wave Audio                                 .wav
Audio Interchange Format                   .aif
Sound Audio File Format                    .snd
Audio File Format (Sun OS)                 .au
Audio Video Interleaved (Microsoft Win)    .avi
Table A.2: Streaming File Formats
Media Type and Name (Video/Audio)              File Format Extension
Advanced Streaming Format (Microsoft)          .asf
Real Video/Audio file (Progressive Networks)   .rm
Real Audio file (Progressive Networks)         .ra
Real Pix file (Progressive Networks)           .rp
Real Text file (Progressive Networks)          .rt
Shock Wave Flash (Macromedia)                  .swf
Vivo Movie File (Vivo Software)                .viv
Appendix B
MPEG-2 Bit-Stream Codec Model
With modest changes, the MPEG Software Simulation Group (MSSG) [15] software was adopted
for MPEG-2 bitstream encoding and decoding. Presented here is the parameter file for the encoding
and decoding process, with the changes that were made to meet NTSC transmission standards, as
well as the usage instructions.
B.1
MPEG-2 Codec Parameter File
MPEG-2 Test Sequence, 30 frames/sec
testd       /* name of source files */
qYd         /* name of reconstructed images ("-": don't store) */
-           /* name of intra quant matrix file ("-": default matrix) */
-           /* name of non intra quant matrix file ("-": default matrix) */
stat.out    /* name of statistics file ("-": stdout ) */
0           /* input picture file format: 0=*.Y,*.U,*.V, 1=*.yuv, 2=*.ppm */
150         /* number of frames */
0           /* number of first frame */
00:00:00:00 /* timecode of first frame */
15          /* N (# of frames in GOP) */
3           /* M (I/P frame distance) */
0           /* ISO/IEC 11172-2 stream */
0           /* 0:frame pictures, 1:field pictures */
704         /* horizontal_size */
480         /* vertical_size */
2           /* aspect_ratio_information 1=square pel, 2=4:3, 3=16:9, 4=2.21:1 */
5           /* frame_rate_code 1=23.976, 2=24, 3=25, 4=29.97, 5=30 frames/sec. */
5000000.0   /* bit_rate (bits/s) */
112         /* vbv_buffer_size (in multiples of 16 kbit) */
0           /* low_delay */
0           /* constrained_parameters_flag */
4           /* Profile ID: Simple = 5, Main = 4, SNR = 3, Spatial = 2, High = 1 */
8           /* Level ID: Low = 10, Main = 8, High 1440 = 6, High = 4 */
0           /* progressive_sequence */
1           /* chroma_format: 1=4:2:0, 2=4:2:2, 3=4:4:4 */
2           /* video_format: 0=comp., 1=PAL, 2=NTSC, 3=SECAM, 4=MAC, 5=unspec. */
5           /* color_primaries */
5           /* transfer_characteristics */
4           /* matrix_coefficients */
704         /* display_horizontal_size */
480         /* display_vertical_size */
0           /* intra_dc_precision (0: 8 bit, 1: 9 bit, 2: 10 bit, 3: 11 bit) */
1           /* top_field_first */
0 0 0       /* frame_pred_frame_dct (I P B) */
0 0 0       /* concealment_motion_vectors (I P B) */
1 1 1       /* q_scale_type (I P B) */
1 0 0       /* intra_vlc_format (I P B) */
0 0 0       /* alternate_scan (I P B) */
0           /* repeat_first_field */
0           /* progressive_frame */
0           /* P distance between complete intra slice refresh */
0           /* rate control: r (reaction parameter) */
0           /* rate control: avg_act (initial average activity) */
0           /* rate control: Xi (initial I frame global complexity measure) */
0           /* rate control: Xp (initial P frame global complexity measure) */
0           /* rate control: Xb (initial B frame global complexity measure) */
0           /* rate control: d0i (initial I frame virtual buffer fullness) */
0           /* rate control: d0p (initial P frame virtual buffer fullness) */
0           /* rate control: d0b (initial B frame virtual buffer fullness) */
2 2 11 11   /* P:  forw_hor_f_code forw_vert_f_code search_width/height */
1 1 3 3     /* B1: forw_hor_f_code forw_vert_f_code search_width/height */
1 1 7 7     /* B1: back_hor_f_code back_vert_f_code search_width/height */
1 1 7 7     /* B2: forw_hor_f_code forw_vert_f_code search_width/height */
1 1 3 3     /* B2: back_hor_f_code back_vert_f_code search_width/height */
B.2
Encoder Usage of MSSG Software
/* name of source frame files */
A printf format string defining the name of the input files. It has to
contain exactly one numerical descriptor (%d, %x etc.):
Example: frame%02d
Then the encoder looks for files: frame00, frame01, frame02, ...
The encoder adds an extension (.yuv, .ppm, etc.) which depends on the
input file format. Input files have to be in frame format, containing
two interleaved fields (for interlaced video).
/* name of reconstructed frame files */
This user parameter tells the encoder what name to give the reconstructed
frames.
These frames are identical to frame reconstructions of decoders
following normative guidelines (except of course for differences caused by
different IDCT implementation).
Specifying a name starting with - (or just - by itself) disables output of
reconstructed frames.
The reconstructed frames are always stored in Y,U,V format (see below),
independent of the input file format.
/* name of intra quant matrix file ("-": default matrix) */
Setting this to a value other than - specifies a file containing
a custom intra quantization matrix to be used instead of the default
matrix specified in ISO/IEC 13818-2 and 11172-2. This file has to contain
64 integer values (range 1...255) separated by white space (blank, tab,
or newline), one corresponding to each of the 64 DCT coefficients. They
are ordered line by line, i.e. v-u frequency matrix order (not by the
zig-zag pattern used for transmission). The file intra.mat contains the
default matrix as a starting point for customization. It is neither
necessary nor recommended to specify the default matrix explicitly.
Large values correspond to coarse quantization and consequently more
noise at that particular spatial frequency.
For the intra quantization matrix, the first value in the file (DC value)
is ignored. Use the parameter intra_dc_precision (see below) to define
the quantization of the DC value.
/* name of non intra quant matrix file ("-": default matrix) */
This parameter field follows the same rules as described for the above
intra quant matrix parameter, but specifies the file for the NON-INTRA
coded (predicted / interpolated) blocks. In this case the first
coefficient of the matrix is NOT ignored.
The default matrix uses a constant value of 16 for all 64 coefficients.
(a flat matrix is thought to statistically minimize mean square error).
The file inter.mat contains an alternate matrix, used in the MPEG-2 test
model.
/* name of statistics file */
Statistics output is stored into the specified file. - directs statistics
output to stdout.
/* input picture file format */
A number defining the format of the source input frames.
Code  Format description
0     separate files for luminance (.Y extension) and chrominance (.U, .V);
      all files are in headerless 8 bit per pixel format. .U and .V must
      correspond to the selected chroma_format (4:2:0, 4:2:2, 4:4:4, see
      below). Note that in this document, Cb = U, and Cr = V. This format
      is also used in the Stanford PVRG encoder.
1     similar to 0, but concatenated into one file (extension .yuv).
      This is the format used by the Berkeley MPEG-1 encoder.
2     PPM, Portable PixMap; only the raw format (P6) is supported.
/* number of frames */
This defines the length of the sequence in integer units of frames.
/* number of first frame */
Usually 0 or 1, but any other (positive) value is valid.
/* timecode of first frame */
This line is used to set the timecode encoded into the first 'Group of
Pictures' header. The format is based on the SMPTE style:
hh:mm:ss:ff (hh=hour, mm=minute, ss=second, ff=frame (0..picture_rate-1))
/* N (# of frames in GOP) */
This defines the distance between I frames (and 'Group of Pictures'
headers).
Common values are 15 for 30 Hz video and 12 for 25 Hz video.
/* M (I/P frame distance) */
Distance between consecutive I or P frames. Usually set to 3.
N has to be a multiple of M.
M = 1 means no B frames in the sequence.
(in a future edition of this program, M=0 will mean only I frames).
/* ISO/IEC 11172-2 stream */
Set to 1 if you want to generate an MPEG-1 sequence. In this case some
of the subsequent MPEG-2 specific values are ignored.
/* picture format */
0 selects frame picture coding, in which both fields of a frame are coded
simultaneously; 1 selects field picture coding, where fields are coded
separately. The latter is permitted for interlaced video only.
/* horizontal_size */
Pixel width of the frames. It does not need to be a multiple of 16.
You have to provide a correct value even for PPM files (the PPM
file header is currently ignored).
/* vertical_size */
Pixel height of the frames. It does not need to be a multiple of 16.
You have to provide a correct value even for PPM files (the PPM file
header is currently ignored).
/* aspectratio_information */
Defines the display aspect ratio. Legal values are:
Code  Meaning
1     square pels
2     4:3 display
3     16:9 display
4     2.21:1 display
MPEG-1 uses a different coding of aspect ratios. In this case codes
1 to 14 are valid.
/* frame_rate_code */
Defines the frame rate (for interlaced sequences: field rate is twice
the frame rate). Legal values are:
Code  Frames/sec   Meaning
1     24000/1001   23.976 fps -- NTSC encapsulated film rate
2     24           Standard international cinema film rate
3     25           PAL (625/50) video frame rate
4     30000/1001   29.97 -- NTSC video frame rate
5     30           NTSC drop-frame (525/60) video frame rate
6     50           double frame rate/progressive PAL
7     60000/1001   double frame rate NTSC
8     60           double frame rate drop-frame NTSC
/* bit_rate */
A positive floating point value specifying the target bitrate,
in units of bits/sec.
/* vbv_buffer_size (in multiples of 16 kbit) */
Specifies, according to the Video Buffering Verifier decoder model,
the size of the bitstream input buffer required in downstream
decoders in order for the sequence to be decoded without underflows
or overflows. You probably will wish to leave this value at 112 for
MPEG-2 Main Profile at Main Level, and 20 for MPEG-1 Constrained
Parameters Bitstreams.
/* low_delay */
When set to 1, this flag specifies that the encoder operates in low delay
mode. Essentially, no B pictures are coded and a different rate control
strategy is adopted which allows picture skipping and VBV underflows.
This feature has not yet been implemented. Please leave at zero for now.
/* constrained_parameters_flag */
Always 0 for MPEG-2. You may set this to 1 if you encode an MPEG-1
sequence which meets the parameter limits defined in ISO/IEC 11172-2
for constrained parameter bitstreams:
horizontal_size      <= 768
vertical_size        <= 576
picture_area         <= 396 macroblocks
pixel_rate           <= 396x25 macroblocks per second
vbv_buffer_size      <= 20x16384 bit
bit_rate             <= 1856000 bits/second
motion vector range  <= -64...63.5
/* Profile ID */
Specifies the subset of the MPEG-2 syntax required for decoding the
sequence. All MPEG-2 sequences generated by the current version of
the encoder are either Main Profile or Simple Profile sequences.
Code  Meaning                     Typical use
1     High Profile                production equipment requiring 4:2:2
2     Spatially Scalable Profile  Simulcasting
3     SNR Scalable Profile        Simulcasting
4     Main Profile                95 % of TVs, VCRs, cable applications
5     Simple Profile              Low cost memory, e.g. no B pictures
/* Level ID */
Specifies coded parameter constraints, such as bitrate, sample rate, and
maximum allowed motion vector range.
Code  Meaning          Typical use
4     High Level       HDTV production rates: e.g. 1920 x 1080 x 30 Hz
6     High 1440 Level  HDTV consumer rates: e.g. 1440 x 960 x 30 Hz
8     Main Level       CCIR 601 rates: e.g. 720 x 480 x 30 Hz
10    Low Level        SIF video rate: e.g. 352 x 240 x 30 Hz
/* progressive_sequence */
0 in the case of sequences containing interlaced video (e.g.
video camera source), 1 for progressive video (e.g. film source).
/* chroma_format */
Specifies the resolution of chrominance data.
Code  Meaning
1     4:2:0   half resolution in both dimensions (most common format)
2     4:2:2   half resolution in horizontal direction (High Profile only)
3     4:4:4   full resolution (not allowed in any currently defined profile)
/* video-format:
0=comp., 1=PAL, 2=NTSC, 3=SECAM, 4=MAC, 5=unspec. */
/* color_primaries */
Specifies the x, y chromaticity coordinates of the source primaries.
Code  Meaning
1     ITU-R Rec. 709 (1990)
2     unspecified
4     ITU-R Rec. 624-4 System M
5     ITU-R Rec. 624-4 System B, G
6     SMPTE 170M
7     SMPTE 240M (1987)
/* transfer_characteristics */
Specifies the opto-electronic transfer characteristic of the source picture.
Code  Meaning
1     ITU-R Rec. 709 (1990)
2     unspecified
4     ITU-R Rec. 624-4 System M
5     ITU-R Rec. 624-4 System B, G
6     SMPTE 170M
7     SMPTE 240M (1987)
8     linear transfer characteristics
/* matrix_coefficients */
Specifies the matrix coefficients used in deriving luminance and chrominance
signals from the green, blue, and red primaries.
Code  Meaning
1     ITU-R Rec. 709 (1990)
2     unspecified
4     FCC
5     ITU-R Rec. 624-4 System B, G
6     SMPTE 170M
7     SMPTE 240M (1987)
/* display_horizontal_size */
/* display_vertical_size */
display_horizontal_size and display_vertical_size specify the "intended
display's" active region (which may be smaller or larger than the
encoded frame size).
/* intra_dc_precision */
Specifies the effective precision of the DC coefficient in MPEG-2
intra coded macroblocks. 10 bits usually achieves quality saturation.
Code  Meaning
0     8 bit
1     9 bit
2     10 bit
3     11 bit
/* top_field_first */
Specifies which of the two fields of an interlaced frame comes earlier.
The top field corresponds to what is often called the "odd field," and
the bottom field is also sometimes called the "even field."
Code  Meaning
0     bottom field first
1     top field first
/* frame_pred_frame_dct (I P B) */
Setting this parameter to 1 restricts motion compensation to frame
prediction and DCT to frame DCT. You have to specify this separately for
I, P and B picture types.
/* concealment_motion_vectors (I P B) */
Setting these three flags informs the encoder whether or not to generate
concealment motion vectors for intra coded macroblocks in the
three respective coded picture types. This feature is mostly useful
in Intra-coded pictures, but may also be used in low-delay applications
(which attempt to exclusively use P pictures for video signal refresh,
saving the time it takes to download a coded Intra picture across a
channel). concealment_motion_vectors in B pictures are rather pointless
since there is no error propagation from B pictures. This feature is
currently not implemented. Please leave values at zero.
/* q_scale_type (I P B) */
These flags set linear (0) or non-linear (1) quantization scale type
for the three respective picture types.
/* intra_vlc_format (I P B) */
Selects one of the two variable length coding tables for intra coded blocks.
Table 1 is considered to be statistically optimized for Intra coded
pictures coded within the sweet spot range (e.g. 0.3 to 0.6 bit/pixel)
of MPEG-2.
Code  Meaning
0     table 0 (= MPEG-1)
1     table 1
/* alternate_scan (I P B) */
Selects one of two entropy scanning patterns defining the order in
which quantized DCT coefficients are run-length coded. The alternate
scanning pattern is considered to be better suited for interlaced video
where the encoder does not employ sophisticated forward quantization
(as is the case in our current encoder).
Code  Meaning
0     Zig-Zag scan (= MPEG-1)
1     Alternate scan
/* repeat_first_field */
If set to one, the first field of a frame is repeated after the second
by the display process. The exact function depends on progressive_sequence
and top_field_first. repeat_first_field is mainly intended to serve as
a signal for the Decoder's Display Process to perform 3:2 pulldown.
/* progressive_frame */
Specifies whether the frames are interlaced (0) or progressive (1).
MPEG-2 permits mixing of interlaced and progressive video. The encoder
currently only supports either interlaced or progressive video.
progressive_frame is therefore constant for all frames and usually
set identical to progressive_sequence.
/* intra_slice refresh picture period (P factor) */
This value indicates the number of successive P pictures in which
all slices (macroblock rows in our encoder model) are refreshed
with intra coded macroblocks. This feature assists low delay mode
coding. It is currently not implemented.
/* rate control: r (reaction parameter) */
/* rate control: avg_act (initial average activity) */
/* rate control: Xi (initial I frame global complexity measure) */
/* rate control: Xp (initial P frame global complexity measure) */
/* rate control: Xb (initial B frame global complexity measure) */
/* rate control: d0i (initial I frame virtual buffer fullness) */
/* rate control: d0p (initial P frame virtual buffer fullness) */
/* rate control: d0b (initial B frame virtual buffer fullness) */
These parameters modify the behavior of the rate control scheme. Usually
set them to 0, in which case default values are computed by the encoder.
/* P:  forw_hor_f_code forw_vert_f_code search_width/height */
/* B1: forw_hor_f_code forw_vert_f_code search_width/height */
/* B1: back_hor_f_code back_vert_f_code search_width/height */
/* B2: forw_hor_f_code forw_vert_f_code search_width/height */
/* B2: back_hor_f_code back_vert_f_code search_width/height */
This set of parameters specifies the maximum length of the motion
vectors. If this length is set smaller than the actual movement
of objects in the picture, motion compensation becomes ineffective
and picture quality drops. If it is set too large, an excessive
number of bits is allocated for motion vector transmission, indirectly
reducing picture quality, too.
All fcode values have to be in the range 1 to 9 (1 to 7 for MPEG-1),
which translate into maximum motion vector lengths as follows:
code  range (inclusive)   max search width/height
1     -8 ... +7.5         7
2     -16 ... +15.5       15
3     -32 ... +31.5       31
4     -64 ... +63.5       63
5     -128 ... +127.5     127
6     -256 ... +255.5     255
7     -512 ... +511.5     511
8     -1024 ... +1023.5   1023
9     -2048 ... +2047.5   2047
f_code is specified individually for each picture type (P, Bn), direction
(forward prediction, backward prediction) and component (horizontal,
vertical). Bn is the n'th B frame surrounded by I or P frames
(e.g.: I B1 B2 B3 P B1 B2 B3 P ...).
For MPEG-1 sequences, horizontal and vertical f_code have to be
identical and the range is restricted to 1...7.
P frame values have to be specified if N (N = # of frames in GOP) is
greater than 1 (otherwise the sequence contains only I frames).
M - 1 (M = distance between I/P frames) sets (two lines each) of values
have to be specified for B frames. The first line of each set defines
values for forward prediction (i.e. from a past frame), the second
line those for backward prediction (from a future frame).
search_width and search_height set the (half) width of the window used
for motion estimation. The encoder currently employs exhaustive
integer vector block matching. Execution time for this algorithm depends
on the product of search_width and search_height and, to a large extent,
determines the speed of the encoder. Therefore these values have to be
chosen carefully.
Here is an example of how to set these values, assuming a maximum
motion of 10 pels per frame in horizontal and 5 pels per frame in
vertical direction and M=3 (I B1 B2 P):

            search width / height:
            forward            backward
            hor.   vert.       hor.   vert.
I -> B1     10     5     B1 <- P  20     10
I -> B2     20     10    B2 <- P  10     5
I -> P      30     15

f_code values are then selected as the smallest ones resulting in a range
larger than the search widths / heights:

3 2 30 15 /* P:  forw_hor_f_code forw_vert_f_code search_width/height */
2 1 10  5 /* B1: forw_hor_f_code forw_vert_f_code search_width/height */
3 2 20 10 /* B1: back_hor_f_code back_vert_f_code search_width/height */
3 2 20 10 /* B2: forw_hor_f_code forw_vert_f_code search_width/height */
2 1 10  5 /* B2: back_hor_f_code back_vert_f_code search_width/height */
B.3
Decoder Usage of MSSG Software
mpeg2decode {options} input.m2v {upper.m2v} {outfile}
Options:
-vn   verbose output (n: level)
Instructs mpeg2decode to generate informative output about the sequence
to stdout. Increasing level (-v1, -v2, etc.) results in more detailed
output.
-on   output format (0: YUV, 1: SIF, 2: TGA, 3: PPM, 4: X11, 5: X11 HiQ)
To choose a file format for the decoded pictures. Default is 0 (YUV).
The following formats are currently supported:
YUV: three headerless files, one for each component. The luminance component
is stored with an extension of .Y, the chrominance components are
stored as .U and .V respectively. Size of the chrominance files depends
on the chroma_format used by the sequence. In case of 4:2:0 they have
half resolution in both dimensions, in case of 4:2:2 they are subsampled
in horizontal direction only, while 4:4:4 uses full chrominance
resolution. All components are stored in row storage from top left to
bottom right.
SIF: one headerless file, with interleaved components. Component order
is Cb, Y, Cr, Y. This format is also known as Abekas or CCIR Rec. 656
format.
The chrominance components have half resolution in horizontal
direction (4:2:2) and are aligned with the even luminance samples.
File extension is .SIF.
TGA: Truevision TGA [4] 24 bit R,G,B format in uncompressed (no run length
coding) format with .tga extension.
PPM: Portable PixMap format as defined in PBMPLUS [5], a graphics
package by Jef Poskanzer. Extension is .ppm.
X11: display decoded video on an X Window System server. The current version
supports only 8 bit color display. You can use the DISPLAY environment
variable to select a (non-default) display. The output routines perform
8 bit dithering and interlaced to progressive scan conversion. You can
choose among two different scan conversion algorithms (only for 4:2:0
interlaced streams):
- a high quality slow algorithm (-o5, X11 HiQ)
- a faster but less accurate algorithm (-o4, X11)
-f    store interlaced frames in frame format
By default, interlaced video is stored field by field. The -f option
permits storing both fields of a frame in one file.
-r
use double precision reference IDCT
The -r option selects a double precision inverse DCT which is primarily
useful for comparing results from different decoders. The default is to
use a faster integer arithmetic only IDCT implementation which meets the
criteria of IEEE 1180-1990 [3].
-s infile
spatial scalable sequence
Spatial scalable video is decoded in two passes. The -s option specifies
the names of the output files from the first (lower layer) pass to the
second (enhancement layer) pass. 'infile' describes the name format of
the lower layer pictures for spatial scalable sequences in a format
similar to outfile as described below.
-q
Set this switch to suppress output of warnings to stderr. Usually a bad idea.
-t
Setting this option activates low level tracing to stdout. This is mainly
for debugging purposes. Output is extremely voluminous. It currently
doesn't cover all syntactic elements.
outfile
This parameter has to be specified for output types -o0 to -o3 only. It
describes the names of the output files as a printf format string. It has
to contain exactly one integer format descriptor (e.g. %d, %02d) and,
except for frame storage (-f option or progressive video), a %c descriptor.
Example: out%02d_%c generates files out00_a.*, out00_b.*, out01_a.*, ...
('a' denotes the top field, 'b' the bottom field,
.* is the suffix appropriate for the output format)
upper.m2v
is the name of the upper layer bitstream of an SNR scalable stream or a
data partitioning scalable bitstream (input.m2v is the lower layer).
Appendix C
MPEG-2 Bit-Stream Data Analysis
This Matlab script was used to analyze data obtained from a 5 minute session of an MPEG-2
broadcast bitstream. With this script, the figures in chapter 5 were generated, as well as the values
for space availability in a typical MPEG-2 bitstream and the rate of data transmission.
close all;
clear all;
load outputStats.txt;
A = outputStats; % 10517x1 matrix with continuous values
N = size(A,1); % this is the rows of A, i.e. 10517
for i = 1:N-1
B(i,1) = A(i+1,1)-A(i,1);
end
figure(1)
plot (B)
axis([1 10516 1 48000]);
xlabel('Sequence of Image Frames')
ylabel('Size of Image Frames in Bytes')
print -deps chap5figl.eps
figure (2)
hist(B,100)
xlabel('Size Image Frames in Bytes')
ylabel('Number of Image Frames')
print -deps chap5fig2.eps
figure(3)
stairs(B(1:100))
xlabel('Sequence of Image Frames')
ylabel('Size of Image Frames in Bytes')
print -deps chap5fig3.eps
disp('Maximum Frame-Size of the Bit-Stream')
MAXB = max(B)
disp('Minimum Frame-Size of the Bit-Stream')
min(B)
disp('Mean of the Frame-Sizes in the Bit-Stream')
mean(B)
%std(B)
x = 1:10516;
size(x);
K = size(B,1); % number of frame-size samples
TOTAL = 0;
for j = 1:K
TOTAL = TOTAL + (MAXB - B(j,1));
end
disp('Size available in Bytes - For a 5 minutes bitstream')
TOTAL
Appendix D
IP Data Injecting Protocol
This protocol packages packetized Internet data (IP packets) into the unused spectrum of the DTV
channel on an as-available basis without corrupting the MPEG-2 bitstream content. It was designed
and implemented by the author from scratch.
#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>
#include <string.h>
int main(int argc, char *argv[])
{
  DIR *dp;
  struct dirent *dirp;
  int count = 0;
  int i = 0;
  int num = 0;
  int entireStreamTransmittedFlag = 0;

  /* Files */
  /* INPUT */
  const char *sourceFileDirectory = "frames/"; /* MAY HAVE TO CHANGE */
  char *directoryFiles[2000]; /* MAY HAVE TO CHANGE */
  const char *streamingFile = "stream/bitstream"; /* MAY HAVE TO CHANGE */

  /* OUTPUT */
  const char *encodedFileDirectory = "encodedFrames/"; /* MAY HAVE TO CHANGE */

  /* OTHER VARS */
  FILE *sourceFileFPtr = NULL; /* input */
  FILE *streamingFileFPtr = NULL; /* input */
  FILE *encodedFileFPtr = NULL; /* output */
  char fullSourceFileName[100] = "\0";
  char fullEncodedFileName[100] = "\0";
  char *fileContents = "\0";
  char *testContents = "\0";
  int maxFileSize = 0;
  int sizeOfAFileName = 80;

  /* read in command line arguments, check them, and convert argument into an int */
  if (argc != 2)
  {
    printf("Usage: myEncode maxFileSize\n");
    exit(0);
  }
  else
  {
    maxFileSize = atoi(argv[1]);
    printf("%s%d\n", "The maxFileSize argument entered is: ", maxFileSize);
    /* STUB!!! TAKE OUT */
    //maxFileSize = 25000;
  }

  /* read files in the frame source directory */
  if ((dp = opendir(sourceFileDirectory)) == NULL)
  {
    printf("%s%s\n", "Error: could not open: ", sourceFileDirectory);
    exit(0);
  }
  while ((dirp = readdir(dp)) != NULL)
  {
    if ((strncmp(dirp->d_name, ".", 1)) &&
        (strncmp(dirp->d_name, "..", 2)))
    {
      if ((directoryFiles[count] = calloc(sizeOfAFileName, sizeof(char))) == NULL)
      {
        printf("%s\n", "Error: could not allocate memory for filenames");
        exit(0);
      }
      memcpy(directoryFiles[count], dirp->d_name, strlen(dirp->d_name));
      count++; // number of files in the directory
    }
  }

  // open streaming file in binary mode
  if ((streamingFileFPtr = fopen(streamingFile, "rb")) == NULL)
  {
    printf("%s%s\n", "Error: could not open ", streamingFile);
    exit(0);
  }

  entireStreamTransmittedFlag = 0;
  // count is the number of files in the directory
  for (i = 0; i < count; i++) // for each file in the directory
  {
    // full source file name (input file)
    strcpy(fullSourceFileName, sourceFileDirectory);
    strcat(fullSourceFileName, directoryFiles[i]);
    // full encoded file name (output file)
    strcpy(fullEncodedFileName, encodedFileDirectory);
    strcat(fullEncodedFileName, directoryFiles[i]);
    printf("Processing file: [%s]\n", fullSourceFileName);

    /* allocate size of fileContents to be maxFileSize, then read in
       contents of source file into it */
    // allocate memory
    if ((fileContents = calloc(maxFileSize, sizeof(char))) == NULL)
    {
      printf("%s%s\n", "Error: could not allocate memory for fileContents when processing ",
             fullSourceFileName);
      exit(0);
    }
    // open file in binary mode
    if ((sourceFileFPtr = fopen(fullSourceFileName, "rb")) == NULL)
    {
      printf("%s%s\n", "Error: could not open ", fullSourceFileName);
      exit(0);
    }
    // read in the file char by char
    num = 0;
    while ((num < maxFileSize) &&
           (fread(&fileContents[num], 1, sizeof(char), sourceFileFPtr) == 1))
    {
      num++;
    }
    /* check if num is greater than maxFileSize; if it is, return an error */
    if ((testContents = calloc(1, sizeof(char))) == NULL)
    {
printf("
ss\n", "Error: could not allocate memory for fileContents
when processing
",
fullEncodedFileName);
exit(0);
}
//
if
num is
//
then error
>= maxFileSize,
and there is
if (num >= maxFileSize)
74
still
something to be read,
{
if (fread(&testContents[0],
==
sizeof(char),
1,
sourceFileFPtr)
1)
{
printf("%ss\n",
"Error: file too large ", fullSourceFileName);
exit (0);
}
}
//
close the file
fclose(sourceFileFPtr);
if (entireStreamTransmittedFlag)
{
/* print into encoded file output */
if ((encodedFileFPtr = fopen(fullEncodedFileName,
"wb"))
== NULL)
{
printf("Xss\n",
"Error: could not open ",
fullSourceFileName);
exit(0);
}
fwrite(fileContents,
sizeof(char),
num, encodedFileFPtr);
fclose(encodedFileFPtr);
}
else
{
/* append the next bytes in the stream onto fileContents */
/* check to see if there is enough space */
if (num < (maxFileSize - 2)) //
i.e. (maxFileSize - 3) or less
{
/* Set delimiter ...
*/
fileContents[num] = '1';
num++;
fileContents[num] = '1';
num++;
75
fileContents[num] = '1';
num++;
/* read in next contents of streamingFile until maxFileSize */
while ( (num < maxFileSize) &&
(fread(&fileContents[num],
==
sizeof(char),
1,
streamingFileFPtr)
1))
{
num++;
}
/* end of streamingFile */
if (num < maxFileSize)
{
printf("Xs%s\n",
"Reached end of streaming file while processing
",
printf("%s\n", "Encoding complete, entire stream transmitted");
entireStreamTransmittedFlag = 1;
}
} /* if (num < (maxFileSize - 2))
*/
/* print into encoded file output */
if ((encodedFileFPtr = fopen(fullEncodedFileName,
"wb"))
== NULL)
{
printf("%ss\n",
"Error: could not open ", fullSourceFileName);
exit(0);
}
fwrite(fileContents,
sizeof(char),
num, encodedFileFPtr);
fclose(encodedFileFPtr);
} /* else */
} /* for */
fclose(streamingFileFPtr);
closedir(dp);
if (!entireStreamTransmittedFlag)
76
fullSourceFileName);
{
printf ("%s\n",
"WARNING: Encoding incomplete,
}
exit (0);
} /* main */
77
entire stream was not transmitted");
Appendix E
IP Data Extraction Protocol
This protocol extracts the injected IP packets from the MPEG-2 bitstream, reconstructs the transmitted data from these packets, and buffers the extracted data so that it can be accessed as it is received, without waiting for the entire streaming file to finish transmitting. This protocol was also designed and implemented from scratch by the author.
#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>
#include <string.h>

// holds binary strings for writing to a file
typedef struct SData
{
    char *data; // holds s-expression
    int length; // length of data
} SData;

char *myBinaryStrstr(char *working_string,
                     int working_string_length,
                     char *search_string,
                     int search_string_length)
{
    char *p1 = "\0";
    char *p2 = "\0";
    int p2Length = 0;

    p2 = working_string;
    p2Length = working_string_length;
    while ((p1 = memchr(p2, search_string[0], p2Length)) != NULL)
    {
        if ((memcmp(p1, search_string, search_string_length)) == 0)
        {
            return p1; /* found search_string */
        }
        // p2Length = old p2Length - (new p2 - old p2)
        p2Length = p2Length - ((p1 + 1) - p2);
        p2 = (p1 + 1);
    }
    return NULL; /* failed to find search_string */
}

void myStringTokenize(SData *string,
                      SData *search_string,
                      SData **resultTokens,
                      int *resultSize)
{
    char *working_string = "\0";
    int temp_length = 0;
    char *p1 = "\0";
    int i = 0;
    int working_string_length = 0;

    //working_string = ap_pstrdup(r->pool, string);
    working_string = calloc(string->length, sizeof(char));
    memcpy(working_string, string->data, string->length);
    i = 0;
    working_string_length = string->length;
    /* insert bytes before the delimiter '111' into an array in resultTokens.
       This does it for each 111 found, and increments i with each pass */
    while ((p1 = myBinaryStrstr(working_string,
                                working_string_length,
                                search_string->data,
                                search_string->length)) != NULL)
    {
        /* copy first part over */
        temp_length = p1 - working_string;
        //temp_string = ap_pstrndup(r->pool, working_string, temp_length);
        // allocate memory for this SData
        resultTokens[i] = calloc(1, sizeof(SData));
        resultTokens[i]->data = calloc(temp_length, sizeof(char));
        memcpy(resultTokens[i]->data, working_string, temp_length);
        resultTokens[i]->length = temp_length;
        /* find last part */
        p1 = p1 + strlen(search_string->data);
        /* make last part working string */
        //temp_string = ap_pstrcat(r->pool, temp_string, p1, NULL);
        working_string_length = working_string_length -
                                (temp_length + strlen(search_string->data));
        memcpy(working_string, p1, working_string_length); // copy over the remainder
        memset((working_string + working_string_length), '\0',
               (string->length - working_string_length)); // set rest of bytes to null
        i++;
    }
    // insert the bit stream that was appended (if everything goes ok ...)
    // or insert input string if no delimiter was found
    resultTokens[i] = calloc(1, sizeof(SData));
    resultTokens[i]->data = calloc(working_string_length, sizeof(char));
    memcpy(resultTokens[i]->data, working_string, working_string_length);
    resultTokens[i]->length = working_string_length;
    i++;
    if (i > *resultSize)
    {
        *resultSize = -1;
    } else {
        *resultSize = i;
    }
}

int main(int argc, char *argv[])
{
    DIR *dp;
    struct dirent *dirp;
    int count = 0;
    int i = 0;
    int num = 0;
    SData **resultTokens;
    int resultSize = 2;

    /* Files */
    /* INPUT */
    const char *encodedFileDirectory = "encodedFrames/";  /* MAY HAVE TO CHANGE */
    char *directoryFiles[2000];                           /* MAY HAVE TO CHANGE */
    /* OUTPUT */
    const char *decodedFileDirectory = "decodedFrames/";  /* MAY HAVE TO CHANGE */
    const char *streamingFile = "decodedFrames/decodedBitstream"; /* MAY HAVE TO CHANGE */

    /* OTHER VARS */
    FILE *streamingFileFPtr = NULL; /* input */
    FILE *encodedFileFPtr = NULL;   /* input */
    FILE *decodedFileFPtr = NULL;   /* output */
    SData *encodedSData = calloc(1, sizeof(SData));
    SData *searchStringSData = calloc(1, sizeof(SData));
    char fullEncodedFileName[100] = "\0";
    char fullDecodedFileName[100] = "\0";
    char *fileContents = "\0";
    char *testContents = "\0";
    int maxFileSize = 0;
    int sizeOfAFileName = 80;

    /* read in command line arguments, check them, and convert argument into an int */
    if (argc != 2)
    {
        printf("Usage: myEncode maxFileSize\n");
        exit(0);
    }
    else
    {
        maxFileSize = atoi(argv[1]);
        printf("%s%d\n", "The maxFileSize argument entered is: ", maxFileSize);
        /* STUB!!! TAKE OUT */
        //maxFileSize = 25000;
    }

    /* read files in the encoded frame directory */
    if ((dp = opendir(encodedFileDirectory)) == NULL)
    {
        printf("%s%s\n", "Error: could not open: ", encodedFileDirectory);
        exit(0);
    }
    while ((dirp = readdir(dp)) != NULL)
    {
        if ((strncmp(dirp->d_name, ".", 1)) &&
            (strncmp(dirp->d_name, "..", 2)))
        {
            if ((directoryFiles[count] = calloc(sizeOfAFileName, sizeof(char))) == NULL)
            {
                printf("%s\n", "Error: could not allocate memory for filenames");
                exit(0);
            }
            memcpy(directoryFiles[count], dirp->d_name, strlen(dirp->d_name));
            count++; // number of files in the directory
        }
    }

    // open streaming file in binary mode
    if ((streamingFileFPtr = fopen(streamingFile, "wb")) == NULL)
    {
        printf("%s%s\n", "Error: could not open ", streamingFile);
        exit(0);
    }

    searchStringSData->data = "111";
    searchStringSData->length = 3;

    // count is the number of files in the directory
    for (i = 0; i < count; i++) // for each file in the directory
    {
        // full encoded file name (input file)
        strcpy(fullEncodedFileName, encodedFileDirectory);
        strcat(fullEncodedFileName, directoryFiles[i]);
        // full decoded file name (output file)
        strcpy(fullDecodedFileName, decodedFileDirectory);
        strcat(fullDecodedFileName, directoryFiles[i]);
        printf("Processing file: [%s]\n", fullEncodedFileName);

        /* allocate size of fileContents to be maxFileSize, then read in
           contents of source file into it */
        // allocate memory
        if ((fileContents = calloc(maxFileSize, sizeof(char))) == NULL)
        {
            printf("%s%s\n",
                   "Error: could not allocate memory for fileContents when processing ",
                   fullEncodedFileName);
            exit(0);
        }
        // open file in binary mode
        if ((encodedFileFPtr = fopen(fullEncodedFileName, "rb")) == NULL)
        {
            printf("%s%s\n", "Error: could not open ", fullEncodedFileName);
            exit(0);
        }
        // read in the file char by char
        num = 0;
        while ((num < maxFileSize) &&
               (fread(&fileContents[num], sizeof(char), 1, encodedFileFPtr) == 1))
        {
            num++;
        }

        /* check if num is greater than maxFileSize; if it is, return an error */
        if ((testContents = calloc(1, sizeof(char))) == NULL)
        {
            printf("%s%s\n",
                   "Error: could not allocate memory for fileContents when processing ",
                   fullEncodedFileName);
            exit(0);
        }
        /* if num is >= maxFileSize, and there is still something to be read,
           then error */
        if (num >= maxFileSize)
        {
            if (fread(&testContents[0], sizeof(char), 1, encodedFileFPtr) == 1)
            {
                printf("%s%s\n",
                       "Error: file too large; increase maxFileSize argument ",
                       fullEncodedFileName);
                exit(0);
            }
        }
        /* if (num >= maxFileSize)
        {
            printf("%s%s\n",
                   "Error: file too large; increase maxFileSize argument ",
                   fullEncodedFileName);
            exit(0);
        } */

        // close the file
        fclose(encodedFileFPtr);

        // make encodedSData
        encodedSData->data = fileContents;
        encodedSData->length = num;
        resultSize = 2;
        resultTokens = (SData **) calloc(resultSize, sizeof(SData *));
        myStringTokenize(encodedSData, searchStringSData, resultTokens, &resultSize);
        if (resultSize == -1)
        {
            printf("%s%s\n", "Error: more than 1 delimiter found in ", fullEncodedFileName);
            exit(0);
        }
        if ((decodedFileFPtr = fopen(fullDecodedFileName, "wb")) == NULL)
        {
            printf("%s%s\n", "Error: could not open ", fullDecodedFileName);
            exit(0);
        }
        fwrite(resultTokens[0]->data, sizeof(char),
               resultTokens[0]->length, decodedFileFPtr);
        fclose(decodedFileFPtr);
        if (resultSize == 2) {
            if (resultTokens[1] != NULL)
            {
                if (resultTokens[1]->length != 0)
                {
                    fwrite(resultTokens[1]->data, sizeof(char),
                           resultTokens[1]->length, streamingFileFPtr);
                }
            }
        }
    } /* for */

    fclose(streamingFileFPtr);
    closedir(dp);
    exit(0);
} /* main */