BIT 3193
MULTIMEDIA DATABASE
CHAPTER 1 :
INTRODUCTION TO MULTIMEDIA DATABASE
What is a Multimedia Database?
A multimedia database provides the ability to manage, store and retrieve
the different types of media (multimedia data)
(Lynne Dunckley, 2003)
What is a Multimedia Database?
• supports multimedia data types in addition to
providing facilities for traditional DBMS functions :
– database creation,
– data modeling,
– data retrieval, data access and organization, and
– data independence.
(Kosch, H. and Döller, M., 2005)
Multimedia Data
Static
• do not have a time dimension
• their contents and meanings do not depend on the
presentation time
• example : graphics/still images, alphanumeric data
Dynamic
• have a time dimension
• their meanings and correctness depend on the rate at which
they are presented
• example : animation, video and audio
TEXT
Consists of plain alphanumeric characters, e.g. ASCII
Storage space requirement in bytes is equal to the number of characters
(including spaces) in the document, at one byte per character
Example :
A typical book with 300 pages, each of which
has 3,000 characters
Storage = 300 * 3,000 = 900,000 bytes = 900 KB
GRAPHICS
Vector-based
• graphics elements are represented by mathematical formulas
• Example : rectangle element
the identifier of the rectangle + the coordinates of 2 opposite corners
are stored, so to change the rectangle, only these parameters are modified
• storage requirements are very low
Pixel-based
• the graphic is divided into small picture elements called pixels
• each pixel corresponds to a dot on the screen
• the intensity or color of each pixel is stored in a pixel-based
graphics file
ANIMATION
Produced by sequentially rendering a
number of frames of graphics
If the graphics are pixel-based, the
animation is the same as video
If the graphics are vector-based, indexing
and retrieval can be carried out in a similar
way to vector graphics, but with an extra
temporal dimension
AUDIO
Caused by a disturbance in air pressure
that reaches the human eardrum
Parameter used is amplitude (dB)
A sound wave is continuous in both time
and amplitude
Figure A : Example sound wave (amplitude plotted against time)
A sound wave is an analog signal
AUDIO
For computers to process and
communicate an audio signal, it must be
converted into a digital signal by analog-to-digital conversion (ADC)
ADC involves 3 stages:
• sampling
• quantization
• coding
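The three ADC stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (sample, quantize, encode), the 1 kHz test tone, the 8 kHz sampling rate and the 8-bit word length are all chosen for the example and do not come from the slides.

```python
import math

def sample(signal, duration_s, rate_hz):
    """Sampling: read the continuous signal at evenly spaced instants."""
    n = round(duration_s * rate_hz)
    return [signal(i / rate_hz) for i in range(n)]

def quantize(samples, bits):
    """Quantization: map each amplitude in [-1, 1] to one of 2**bits levels."""
    levels = 2 ** bits
    out = []
    for s in samples:
        s = max(-1.0, min(1.0, s))           # clip to the assumed input range
        out.append(min(levels - 1, int((s + 1.0) / 2.0 * levels)))
    return out

def encode(levels, bits):
    """Coding: write each quantization level as a fixed-width binary word."""
    return [format(q, f"0{bits}b") for q in levels]

def tone(t):
    """Assumed analog input: a 1 kHz sine wave with amplitude 1."""
    return math.sin(2 * math.pi * 1000 * t)

samples = sample(tone, duration_s=0.001, rate_hz=8000)   # 8 samples
codes = encode(quantize(samples, bits=8), bits=8)
print(codes)
```

At 8,000 samples per second and 8 bits per sample, the resulting stream is 64 Kbits/sec, which is 8 KB for each second of audio.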
VIDEO
Consists of a number of frames or images
that have to be played at a fixed rate
Parameter used is fps (frames per second)
Example :
A 10-minute video with
image size 512 pixels by
512 lines, pixel depth of
24 bits/pixel (3 bytes/pixel) and frame
rate 30 fps
Storage :
600 * 30 * 512 * 512 * 3 = 14,155,776,000 bytes, about 14.2 GB (13.2 GiB)
2 common frame rates :
• 25 fps : PAL systems
• 30 fps : NTSC systems
2 major characteristics of video:
• has a time dimension
• takes a huge amount of data to
represent
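The storage figure for the 10-minute example can be checked with a short calculation. The sketch below assumes uncompressed frames; the function name video_storage_bytes is made up for the illustration. The rounded figure depends on whether "giga" is read as 10^9 or 2^30, so both are printed.

```python
def video_storage_bytes(duration_s, fps, width, height, bits_per_pixel):
    """Uncompressed video size: number of frames times bytes per frame."""
    frames = duration_s * fps
    bytes_per_frame = width * height * bits_per_pixel // 8
    return frames * bytes_per_frame

size = video_storage_bytes(duration_s=600, fps=30, width=512, height=512,
                           bits_per_pixel=24)
print(size)                       # total bytes
print(round(size / 10**9, 2))     # decimal gigabytes (GB)
print(round(size / 2**30, 2))     # binary gibibytes (GiB)
```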
There are three challenges that arise from
multimedia data that do not occur with
other data types:
• Size
• Time
• Semantic Nature
Size
• To get an idea of the size of media data objects :
– a single colored image could require 6 MB
– a video object (30 fps, 5-minute video clip) would
require 54 GB (9,000 frames at 6 MB each)
– audio will occupy 8 KB for each second
• Data size will affect the storage, retrieval and
transmission of multimedia.
• Therefore techniques that reduce the size of
multimedia data without affecting the information
are crucial.
Time
• Video and audio must run in the correct sequence and
at an acceptable rate; otherwise they become
meaningless.
• Time has significance for the way the media objects are
stored, retrieved, transmitted and synchronized
together.
• example : in a video clip of an interview, the audio and image
data must be synchronized.
Semantic Nature
• Multimedia data is more complex than traditional data types.
• It is difficult to identify components within the media that
could be used for retrieval or transaction processing.
• Interpretations may need to be based on certain features of
the multimedia data and stored as METADATA.
Metadata
• any data that is required to interpret other data
as meaningful information
• it is used for retrieving and manipulating the
data
Multimedia Database Applications
Entertainment System
Video on Demand
Public Protection
Medical
Information System
The registered user of the system can request a video
from the catalog. The videos may be available
according to a previously advertised fixed schedule or
available at any time, subject to a small delay. The user
can select a video based on textual information of the
cast, production team and synopsis of the plot.
Production information such as storyboards, screenplay
and production notes can be included. Users can view
the video contiguously or play randomly selected scenes.
The video can be paused and resumed as
requested, within constraints.
• This is a single media application.
• The user is not involved in capturing, editing or
manipulating the media.
• Communication of the media is unidirectional.
• Delivery may be simplified by scheduling requests and
combining the delivery to several users at the same
time.
• There is high data volume that requires high
performance storage and networking system.
• There is a large number of users, who may accept some
loss of quality.
Exercise
Identify the system requirements of the
following multimedia application case
studies.
In a number of countries police use visual information to
identify people or to record the scenes of crime for
evidence. These photographic records are a valuable
archive. In the UK everyone arrested is photographed
and their images are sampled and stored with their
fingerprints. It is also planned to store sampled DNA
profiles of suspects. Until a subject is convicted, access to
photographic information is restricted. Interrogation of
the database may be on the basis of automatic
fingerprint recognition, DNA matching and face
recognition. Video surveillance also needs to be linked
to the facial recognition system.
• The user is involved in capturing, editing, or
manipulating the media.
• bi-directional data flow
• complex content modeling for complex correlated
queries.
• diverse media – maps, images, audio, video.
• interactivity with media through simple matching
queries.
The medical and related health professionals use and
store visual information in the form of x-ray, ultrasound
and other scanned images for diagnosis and monitoring
purposes. There are strict rules on confidentiality of such
information. The images are kept with patients’ records
stored by unique identifier (e.g. national insurance
number). Visual information, provided that it is
rendered anonymous, may also be used for research
purposes. Effective image processing such as edge
detection and feature extraction can be important in
assisting expert diagnosis of lesions, tumors and tracking
their growth. Images may be the result of a single
instrumental approach, e.g. x-ray, or the result of a
combination of data from several resources.
• The user is involved in capturing, editing, or
manipulating the media.
• highest quality media data with little tolerance of
data loss.
• Confidentiality and security required.
• bi-directional data flow
• complex content modeling for complex correlated
queries.
• diverse media – maps, images, audio, video.
• The physical storage describes how multimedia objects
are stored in a file system.
• Multimedia objects are typically huge, so they need
different techniques for their storage and retrieval.
• This view deals with the issue of:
• providing fast access to stored data through index
mechanisms
• can be stored in different systems
• can be accessed over computer networks
• can query multimedia data in different ways,
depending on the type of information
• provide a filtered view of the multimedia database
to retrieve only the required data
• the objects retrieved have to be appropriately
presented
Figure B : Components Involved in MMDBMS
(The figure shows, from top to bottom: the User's view, with application interfaces and windows W1 ... Wn; the Filtered view, with Query 1, Query 2 and Query 3; the Distributed view, over a communication network; the Conceptual Data View, with data models (OOP, metadata), data access (indexing) and temporal models (Petri nets); and the Physical Storage View, holding text, image, video and audio.)
PHYSICAL STORAGE VIEW
Deals with raw digitized data
The main issue is size.
Size of objects influences:
• storage capacity requirements
• retrieval bandwidth (bps) requirements
Table 1 : Media Types, Representation, Size and Bandwidth Requirements
Media | Representation | Data Size          | Disk Bandwidth
------+----------------+--------------------+-----------------------
Text  | ASCII          | 200 KB / 100 pages | Presentation dependent
Image | GIF, TIFF      | 3.2 MB/image       | Presentation dependent
Image | JPEG           | 0.4 MB/image       | Presentation dependent
Video | Uncompressed   | 20 MB/sec          | 20 MB/sec
Video | HDTV           | 110 MB/sec         | 110 MB/sec
Video | MPEG           | 0.2-1.5 Mbits/sec  | 0.2-1.5 Mbits/sec
Audio | Uncompressed   | 64 Kbits/sec       | 64 Kbits/sec
Audio | CD Quality     | 1.4 Mbits/sec      | 1.4 Mbits/sec
This table describes the size and the retrieval disk bandwidth requirements for different
media, based on their format of presentation
Disk Bandwidth Requirements
Static Media (text and images)
• the requirement depends on the multimedia database
application
• because static media do not have any inherent
temporal requirements
Dynamic Media (audio and video)
• have inherent temporal requirements
• the disk bandwidth is proportional to their temporal requirements
• can be accessed by multiple users
simultaneously, which demands new capabilities
from the file system and the OS
File System Requirements
Should have the following
capabilities:
• handling huge files
• supporting simultaneous access to
multiple files by multiple users
• supporting the required disk
bandwidth
• providing new application
programming interfaces
(play, fast forward, reverse for
dynamic media)
Operating System Requirements
Should have the capabilities for
handling real-time characteristics for:
• scheduling of application processes
– the OS reserves the resources required for an
application process
– depending on the availability of resources, an
application process may or may not be
admitted for execution
• communication between application processes
and the OS kernel
– the overhead should be kept low, because it can affect the
performance of applications
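The reserve-then-admit behaviour described for the OS can be sketched as a simple admission test. This is a minimal illustration, not how any particular operating system implements it: the ResourcePool class, its single bandwidth-like resource and the numbers used are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ResourcePool:
    """Hypothetical pool of one resource, e.g. disk bandwidth in Mbits/sec."""
    capacity: float
    reserved: float = 0.0

    def admit(self, demand: float) -> bool:
        """Reserve `demand` if it still fits; otherwise reject the process."""
        if self.reserved + demand > self.capacity:
            return False
        self.reserved += demand
        return True

    def release(self, demand: float) -> None:
        """Give a reservation back when the process finishes."""
        self.reserved = max(0.0, self.reserved - demand)

pool = ResourcePool(capacity=3.0)
for stream in ("stream 1", "stream 2", "stream 3"):
    # each stream asks for 1.5 Mbits/sec (an assumed MPEG-like rate)
    print(stream, "admitted" if pool.admit(1.5) else "rejected")
```

With a capacity of 3.0, the first two streams are admitted and the third is rejected until an earlier reservation is released.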
CONCEPTUAL DATA VIEW
Objects are in binary form
These objects are acquired (from devices)
and created (digitized, compressed and
stored) independently of their contents
To use these objects as meaningful
data, one needs to identify their content
The description of an object's content,
called metadata, is subjective
Metadata depends on:
• the media type
• the role of the application
Example : video clip of a movie
• the sequence of frames contains actors,
actresses, the background, the action going
on, etc.
Figure C : Example Description of a Video Clip
(The figure annotates a sequence of frames, marked at frames 1, 13, 20 and 30, with four actions: A1 : Hero Fights Criminal, A2 : Criminal Takes Out Gun, A3 : Criminal Points Gun at Actress, A4 : Hero Shoots Criminal.)
The conceptual data view of raw data
helps in building a set of abstractions
These abstractions form a data model
for a particular application domain
For fast access, an indexing mechanism is
needed to sort the data according to the
features that are modeled.
A multimedia database may be composed of
multiple media objects
Their presentation to the user has to be
properly synchronized
These synchronization characteristics are
described by temporal models
Hence the conceptual data view of
multimedia data consists of the following
components:
• metadata
• indexing mechanism
• temporal models
• spatial models
• data models
Metadata
• Deals with the:
– content
– structures
– semantics of media objects
• The creation of metadata depends on:
– media type
– type of information
• The availability of techniques for automatic
(or semi-automatic) generation of metadata
is important
Metadata
• Example :
• Video media:
– The techniques should identify
camera shots, characters in a shot,
background of a shot, etc.
• Image/text/audio data:
– The techniques should extract and
describe the features of interest –
recognition techniques
Indexing Mechanisms
• To provide fast access
• should be able to handle different
features of objects such as color or
texture
Temporal Models
• Describe the time and duration of
presentation of each media object
• Example :
(Figure: a timeline with a video track holding v1 and v2 and an audio track holding a1 and a2, against the time marks t1, t2, t3, t5 and t6.)
Video object v1 has to be presented at time t1
for a duration t3-t1 and has to be synchronized
with the presentation of audio object a1
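A temporal model of this kind can be sketched as one record per media object, holding a start time and a duration, plus a check that two objects are presented together. The Presentation class and the synchronized function are names invented for this sketch, and the concrete times (t1 = 0 s, t3 = 10 s) are assumed values, since the slide gives only symbols.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Presentation:
    """When one media object is presented, and for how long (seconds)."""
    name: str
    start: float
    duration: float

    @property
    def end(self) -> float:
        return self.start + self.duration

def synchronized(a: Presentation, b: Presentation, tolerance: float = 0.0) -> bool:
    """True when both objects start and end together, within a tolerance."""
    return (abs(a.start - b.start) <= tolerance
            and abs(a.end - b.end) <= tolerance)

t1, t3 = 0.0, 10.0                                   # assumed values
v1 = Presentation("v1", start=t1, duration=t3 - t1)  # video object
a1 = Presentation("a1", start=t1, duration=t3 - t1)  # audio object
print(synchronized(v1, a1))
```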
Spatial Models
• Represent the way media objects are
presented
• by specifying the layout of windows on
a monitor
• Example:
(Figure: the text stream is shown in a text window, the video stream in a video window and the image stream in an image window, while the audio stream is sent to the speaker.)
Data Models
• An object-oriented approach is normally
used
• to represent:
– the characteristics of objects
– the metadata associated with them
– their temporal and spatial requirements
DISTRIBUTED VIEW
Multimedia data can be distributed
over computer networks
The huge size of media objects requires large
bandwidth or throughput (bps)
The real-time nature of objects needs
guarantees on:
• end-to-end delay
• delay jitter
End-to-end delay
• specifies the maximum delay that data can suffer
during communication
Delay jitter
• the variation in the end-to-end delay
suffered by the data
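Both quantities can be measured from send and receive timestamps. In the sketch below the timestamps are made-up values in milliseconds, the function names are hypothetical, and jitter is taken as the spread between the largest and smallest delay, which is one common simplification rather than the only definition in use.

```python
def end_to_end_delays(sent_ms, received_ms):
    """Delay of each packet: receive time minus send time."""
    return [r - s for s, r in zip(sent_ms, received_ms)]

def delay_jitter(delays):
    """Jitter taken here as the spread between largest and smallest delay."""
    return max(delays) - min(delays)

def meets_guarantee(delays, max_delay_ms, max_jitter_ms):
    """Check measured delays against a delay bound and a jitter bound."""
    return max(delays) <= max_delay_ms and delay_jitter(delays) <= max_jitter_ms

sent = [0, 40, 80, 120]          # one frame every 40 ms (25 fps)
received = [100, 145, 178, 230]  # assumed arrival times
delays = end_to_end_delays(sent, received)
print(delays, delay_jitter(delays))
print(meets_guarantee(delays, max_delay_ms=150, max_jitter_ms=20))
```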
Guarantees are required for smooth
presentation of continuous media
Existing communication protocols
do not address real-time requirements
Hence, based on the temporal relationships, the buffer required and the available
network bandwidth, the client needs to
identify a retrieval schedule for requesting
objects from the server.
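One simple way to derive such a retrieval schedule is to work backwards from the presentation times: each object must be fully transferred before it is presented and before the next transfer starts. The sketch below is an assumption-laden simplification, not the method of any specific system: it uses a single link of constant bandwidth, ignores the client's buffer limit, and the function name and object sizes are invented for the example.

```python
def retrieval_schedule(objects, bandwidth_bps):
    """Latest request time for each object, scheduled back to front.

    objects: list of (name, size_bits, presentation_time_s).
    Assumes one transfer at a time and an unlimited client buffer.
    """
    ordered = sorted(objects, key=lambda o: o[2])
    next_start = float("inf")
    schedule = []
    for name, size_bits, present_at in reversed(ordered):
        finish = min(present_at, next_start)   # done before it is needed
        start = finish - size_bits / bandwidth_bps
        schedule.append((name, start))
        next_start = start
    return list(reversed(schedule))

objects = [("v1", 4_000_000, 10.0), ("v2", 2_000_000, 11.0)]
print(retrieval_schedule(objects, bandwidth_bps=1_000_000))
```

A fuller version would also delay requests that would overflow the client's buffer, which is the third input the slide mentions.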
FILTERED VIEW
A filtered view is provided by the user's query to get the
required information
A user's query can be of the following types:
• query on the content of media objects
• query by example
• time indexed queries
• spatial queries
• application specific queries
Content Based Query
• Considering a VOD server application, the
user can make queries by example
such as:
– get me the movie in which this scene
(an image) appears
– get the movie where this video clip
occurs
– show me the movie which contains
this song
Content Based Query
• "This" refers to the multimedia object
that is used as an example.
• The MMDBMS has to process the
example data (this object) and find one
that matches
• The requirement for similarity
(between the queried object and
the database object) can be of different
kinds:
– exact
– partial
Content Based Query
• Partial matching:
– We need to know the degree of
mismatch that can be allowed
between the example object and
the one in the database
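Exact and partial matching can be illustrated with feature vectors and a mismatch threshold. In this sketch every object is reduced to a normalised three-bin color histogram; the feature choice, the L1-based mismatch measure, the function names and the sample database are all assumptions made for the example.

```python
def mismatch(a, b):
    """Half the L1 distance between two normalised histograms (0 = identical, 1 = disjoint)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / 2

def query_by_example(example, database, max_mismatch=0.0):
    """Names of stored objects whose mismatch is within the allowed degree,
    best match first. max_mismatch=0.0 asks for exact matches only."""
    scored = sorted((mismatch(example, feat), name) for name, feat in database.items())
    return [name for score, name in scored if score <= max_mismatch]

# Hypothetical database of color histograms (red, green, blue shares).
database = {
    "sunset": [0.5, 0.25, 0.25],
    "beach":  [0.25, 0.5, 0.25],
    "forest": [0.0, 0.0, 1.0],
}
example = [0.5, 0.25, 0.25]
print(query_by_example(example, database))                     # exact
print(query_by_example(example, database, max_mismatch=0.25))  # partial
```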
Time Indexed Queries
• For continuous media, users can give
queries in the temporal dimension.
• Example:
– Show me the movie 30 minutes
after its start
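For a stream played at a fixed frame rate, a time-indexed query reduces to converting a time offset into a frame number. The sketch below assumes a constant frame rate and uses a hypothetical helper name, frame_at.

```python
def frame_at(offset_s: float, fps: float) -> int:
    """Index of the frame being shown `offset_s` seconds after the start."""
    return int(offset_s * fps)

# "30 minutes after its start" on a 25 fps (PAL) movie
print(frame_at(30 * 60, fps=25))
```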
Spatial Queries
• Media objects such as images and video
have spatial characteristics
• Example:
– Show me the image where Mawi
is seen on the left of Siti Nurhaliza
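A spatial query like this one can be answered from metadata that records where each person appears in each image. The sketch assumes that such bounding boxes already exist (for example from a recognition step); the image names, box coordinates and function names are invented for the illustration.

```python
def left_of(a, b):
    """Boxes are (x_min, y_min, x_max, y_max); a is left of b when a ends before b begins."""
    return a[2] <= b[0]

def images_where_left_of(images, first, second):
    """Names of images in which `first` appears to the left of `second`."""
    return [name for name, boxes in images.items()
            if first in boxes and second in boxes
            and left_of(boxes[first], boxes[second])]

# Hypothetical metadata: image name -> person -> bounding box in pixels.
images = {
    "concert.jpg": {"Mawi": (10, 20, 100, 200), "Siti Nurhaliza": (150, 20, 240, 200)},
    "awards.jpg":  {"Mawi": (300, 20, 380, 200), "Siti Nurhaliza": (50, 20, 140, 200)},
    "solo.jpg":    {"Mawi": (10, 20, 100, 200)},
}
print(images_where_left_of(images, "Mawi", "Siti Nurhaliza"))
```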
Application Specific Queries
• Multimedia databases are highly
application specific
• Example: medical or GIS
– Show me the video where the tissue
evolves into a cancerous one
USER'S VIEW
The user's view of an MMDBMS is characterized
by the following requirements:
• user query interface
• presentation of multimedia data
• user interaction during presentation
User Query Interface
• The query interface should allow users to
query by content, example, time, space
or a combination of these possibilities.
• For queries by example, the user query
interface has to obtain the example
objects from appropriate devices,
e.g. an image from a scanner
• In case of partial matching of the resolved
queries, the query interface can suggest
ways to modify the query to get exact
matches.
Presentation of Multimedia Data
• Media objects can be in different formats.
Example: image in TIFF or GIF
• In some cases, it might be necessary to
convert data from one format to
another format before presentation.
• The presentation of a multimedia object
may have:
– temporal constraints
– spatial constraints
• The spatial constraints describe the layout of
windows on the user's screen
User Interaction During Presentation
• The user can interact during the
presentation of multimedia objects.
• The interaction is complex since multiple
media objects are involved
• Example:
– Devices such as a microphone and
video camera can be used for
speech and gesture recognition
• Hence, simultaneous control of different
devices and handling of user input are
required.
User Interaction During Presentation
• The input from the user can be of the
following types:
– modify the quality of the
presentation, e.g. reduction or
magnification of the image
– direct the presentation, e.g. skip,
restart, etc.