BIT 3193 MULTIMEDIA DATABASE
CHAPTER 1 : INTRODUCTION TO MULTIMEDIA DATABASE

What is a Multimedia Database?
• The ability to manage, store and retrieve the different types of media (multimedia data) (Lynne Dunckley, 2003)
• Supports multimedia data types in addition to providing facilities for traditional DBMS functions: database creation, data modeling, data retrieval, data access and organization, and data independence (Kosch, H. and Döller, M., 2005)

Multimedia Data
Static
• do not have a time dimension
• their contents and meanings do not depend on the presentation time
• examples: graphics/still images, alphanumeric data
Dynamic
• have time dimensions
• their meanings and correctness depend on the rate at which they are presented
• examples: animation, video and audio

TEXT
• Consists of plain alphanumeric characters: ASCII
• The storage space requirement in bytes is equal to the number of characters (including spaces) in the document
• Example: a typical book with 300 pages, each of which has 3,000 characters. Storage = 300 × 3,000 = 900,000 bytes (900 KB)

GRAPHICS
Vector-based
• graphics elements are represented by mathematical formulas
• example: for a rectangle element, the identifier of the rectangle and the coordinates of 2 opposite corners are stored; to change the rectangle, only these parameters are modified
• storage requirements are very low
Pixel-based
• the graphic is divided into small picture elements called pixels
• each pixel corresponds to a dot on the screen
• the intensity or color of each pixel is stored in a pixel-based graphics file

ANIMATION
• Produced by sequentially rendering a number of frames of graphics
• If the graphics are pixel-based, the animation is the same as video
• If the graphics are vector-based, indexing and retrieval can be carried out in a similar way to vector graphics, but with an extra temporal dimension

AUDIO
• Caused by a disturbance in air pressure that reaches the human eardrum
• Parameter used is amplitude (dB)
• A sound wave is continuous in both time and amplitude
[Figure A : Example sound wave (amplitude plotted against time)]
• A sound wave is an analog signal
• For computers to process and communicate an audio signal, it must be converted into a digital signal by analog-to-digital conversion (ADC)
• ADC involves 3 stages: sampling, quantization and coding

VIDEO
• Consists of a number of frames or images that have to be played at a fixed rate
• Parameter used is fps (frames per second)
• Example: a 10-minute video with image size 512 pixels by 512 lines, pixel depth of 24 bits/pixel (3 bytes) and frame rate of 30 fps. Storage: 600 × 30 × 512 × 512 × 3 = 14,155,776,000 bytes, about 14.2 GB (13.2 GiB)
• 2 common frame rates:
  • 25 fps : PAL systems
  • 30 fps : NTSC systems
• 2 major characteristics of video:
  • has a time dimension
  • takes a huge amount of data to represent

There are three challenges that arise from multimedia data that do not occur with other data types: size, time and semantic nature.

Size
• To get an idea of the size of media data objects:
  • a single colored image could require 6 MB
  • a video object (30 fps, 5-minute video clip) would require 54 GB
  • audio will occupy 8 KB for each second
• Data size affects the storage, retrieval and transmission of multimedia.
• Therefore, techniques that reduce the size of multimedia data without affecting the information it carries are crucial.

Time
• Video and audio must run in the correct sequence and at an acceptable rate, otherwise they become meaningless.
• This has significance for the way the media objects are stored, retrieved, transmitted and synchronized.
• Example: a video clip of an interview combines audio and image data, which must be synchronized.

Semantic Nature
• Multimedia data is more complex than traditional data types.
• It is difficult to identify components within the media that could be used for retrieval or transaction processing.
• Interpretations may need to be based on certain features of the multimedia data and stored as METADATA.
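The storage figures quoted in this chapter follow from straightforward multiplication. The minimal Python sketch below recomputes them under stated assumptions: the function names are illustrative and not part of the course material, one byte per ASCII character is assumed, and the audio parameters (8 kHz sampling, 8 bits per sample, mono) are an assumption chosen because they reproduce the 8 KB per second figure.

```python
def text_bytes(pages, chars_per_page):
    """Plain ASCII text: one byte per character, spaces included."""
    return pages * chars_per_page


def video_bytes(seconds, fps, width, height, bits_per_pixel):
    """Uncompressed video: every frame stores every pixel."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return seconds * fps * bytes_per_frame


def audio_bytes_per_second(sample_rate_hz, bits_per_sample, channels=1):
    """Uncompressed audio: one sample per channel per sampling instant."""
    return sample_rate_hz * bits_per_sample * channels // 8


# 300-page book with 3,000 characters per page
book = text_bytes(300, 3000)

# 10-minute clip, 512 x 512 pixels, 24 bits/pixel, 30 fps
clip = video_bytes(10 * 60, 30, 512, 512, 24)

# Assumed telephone-quality audio: 8 kHz, 8 bits per sample, mono
speech = audio_bytes_per_second(8000, 8)

# 5-minute clip at 30 fps where each frame is a 6 MB image (result in MB)
five_minute_clip_mb = 6 * 30 * 5 * 60

print(book)                 # 900000
print(clip)                 # 14155776000
print(speech)               # 8000
print(five_minute_clip_mb)  # 54000
```

Expressed in decimal units, the 10-minute clip comes to about 14.2 GB, and 9,000 frames of 6 MB each come to 54 GB, which is why size reduction techniques matter so much for video.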
• Metadata is any data that is required to interpret other data as meaningful information.
• It is used for retrieving and manipulating the data.

Multimedia Database Applications: Entertainment System, Video on Demand, Public Protection, Medical Information System

The registered user of the system can request a video from the catalog. The videos may be available according to a previously advertised fixed schedule or available at any time, subject to a small delay. The user can select a video based on textual information about the cast, production team and synopsis of the plot. Production information such as storyboards, screenplay and production notes can be included. Users can view the video contiguously or play randomly selected scenes. The video can be paused and resumed as requested, within constraints.
• This is a single-media application.
• The user is not involved in capturing, editing or manipulating the media.
• Communication of the media is unidirectional.
• Delivery may be simplified by scheduling requests and combining the delivery to several users at the same time.
• There is a high data volume that requires a high-performance storage and networking system.
• There is a large number of users, who may accept some loss of quality.

Exercise
Identify the system requirements of the following multimedia application case studies.

In a number of countries, police use visual information to identify people or to record the scenes of crime for evidence. These photographic records are a valuable archive. In the UK, everyone arrested is photographed and their images are sampled and stored with their fingerprints. It is also planned to store sampled DNA profiles of suspects. Until a subject is convicted, access to photographic information is restricted. Interrogation of the database may be on the basis of automatic fingerprint recognition, DNA matching and face recognition. Video surveillance also needs to be linked to the facial recognition system.
• The user is involved in capturing, editing or manipulating the media.
• Data flow is bi-directional.
• Complex content modeling is needed for complex correlated queries.
• The media are diverse: maps, images, audio, video.
• Interactivity with the media is through simple matching queries.

Medical and related health professionals use and store visual information in the form of x-ray, ultrasound and other scanned images for diagnosis and monitoring purposes. There are strict rules on the confidentiality of such information. The images are kept with patients' records, stored by unique identifier (e.g. national insurance number). Visual information, provided that it is rendered anonymous, may also be used for research purposes. Effective image processing such as edge detection and feature extraction can be important in assisting expert diagnosis of lesions and tumors and in tracking their growth. Images may be the result of a single instrumental approach, e.g. x-ray, or the result of a combination of data from several sources.
• The user is involved in capturing, editing or manipulating the media.
• The highest quality media data is needed, with little tolerance of data loss.
• Security is required to keep the data confidential.
• Data flow is bi-directional.
• Complex content modeling is needed for complex correlated queries.
• The media are diverse: maps, images, audio, video.

• The physical storage view describes how multimedia objects are stored in a file system.
• Multimedia objects are typically huge, so they need different techniques for their storage and retrieval.
• Conceptual data view: deals with the issue of providing fast access to stored data through index mechanisms.
• Distributed view: multimedia objects can be stored in different systems and can be accessed over computer networks.
• Filtered view: users can query multimedia data in different ways depending on the type of information; queries provide a filtered view of the multimedia database so that only the required data is retrieved.
• User's view: the objects retrieved have to be appropriately presented.

[Figure B : Components Involved in MMDBMS. Labels shown: User's view with application interfaces and windows W1, W2 ... Wn; Filtered view with Query 1, Query 2 and Query 3; Distributed view with the communication network; Conceptual data view with data models (OOP, metadata), data access (indexing) and temporal models (Petri nets); Physical storage view with text, image, video and audio.]

PHYSICAL STORAGE VIEW
• Deals with raw digitized data.
• The main issue is size. The size of objects influences:
  • storage capacity requirements
  • retrieval bandwidth (bps) requirements

Table 1 : Media Types, Representation, Size and Bandwidth Requirements
Media | Representation  | Data Size                  | Disk Bandwidth
Text  | ASCII           | 200 KB / 100 pages         | Presentation dependent
Image | GIF, TIFF, JPEG | 3.2 MB/image, 0.4 MB/image | Presentation dependent
Video | Uncompressed    | 20 MB/sec                  | 20 MB/sec
Video | HDTV            | 110 MB/sec                 | 110 MB/sec
Video | MPEG            | 0.2-1.5 Mbits/sec          | 0.2-1.5 Mbits/sec
Audio | Uncompressed    | 64 Kbits/sec               | 64 Kbits/sec
Audio | CD Quality      | 1.4 Mbits/sec              | 1.4 Mbits/sec

This table describes the size and the retrieval disk bandwidth requirements for different media, based on their format of presentation.

Disk Bandwidth Requirements
Static Media (text and images)
• the requirement depends on the multimedia database application
• because static media do not have any inherent temporal requirements
Dynamic Media (audio and video)
• have inherent temporal requirements
• the bandwidth needed is proportional to their temporal requirements
• can be accessed by multiple users simultaneously, which demands new capabilities from the file system and the OS

File System Requirements
Should have the following capabilities:
• handling huge files
• supporting simultaneous access to multiple files by multiple users
• supporting the required disk bandwidth
• providing new application programming interfaces (play, fast forward, reverse for dynamic media)

Operating System Requirements
Should have the capabilities for handling real-time characteristics for:
• scheduling of application processes
  • the OS reserves the resources required for an application process
  • depending on the availability of resources, an application process may or may not be admitted for execution
• communication between the application process and the OS kernel
  • reducing this overhead can improve the performance of applications

CONCEPTUAL DATA VIEW
• Objects are in binary form.
• These objects are acquired (from devices) and created (digitized, compressed and stored) independent of their contents.
• To use these objects as meaningful data, one needs to identify their content.
• The description of the object's content, called metadata, is subjective. It depends on:
  • the media type
  • the role of an application
• Example: video clip of a movie
  • the sequence of frames contains actors, actresses, the background, the action going on, etc.

[Figure C : Example Description of a Video Clip. Annotations A1 : Hero Fights Criminal, A2 : Criminal Takes Out Gun, A3 : Criminal Points Gun at Actress and A4 : Hero Shoots Criminal are placed along a frame axis marked 1, 13, 20 and 30.]

• The conceptual data view of raw data helps in building a set of abstractions.
• These abstractions form a data model for a particular application domain.
• For fast access, an indexing mechanism is needed to sort the data according to the features that are modeled.
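The idea behind Figure C, together with the remark that an index organizes data by the features that are modeled, can be sketched in a few lines of Python. This is only an illustration under assumptions: the frame ranges attached to each annotation are invented (the figure only shows axis marks at 1, 13, 20 and 30), and the names `Annotation`, `build_keyword_index` and `annotations_at` are hypothetical, not taken from any MMDBMS.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Metadata describing what happens in a range of frames."""
    label: str
    first_frame: int
    last_frame: int  # inclusive


# Frame ranges are assumed for illustration only.
ANNOTATIONS = [
    Annotation("Hero fights criminal", 1, 30),
    Annotation("Criminal takes out gun", 1, 13),
    Annotation("Criminal points gun at actress", 13, 20),
    Annotation("Hero shoots criminal", 20, 30),
]


def build_keyword_index(annotations):
    """Inverted index: each word maps to the annotations that mention it."""
    index = defaultdict(list)
    for ann in annotations:
        for word in set(ann.label.lower().split()):
            index[word].append(ann)
    return index


def annotations_at(annotations, frame):
    """Return every annotation whose frame range covers the given frame."""
    return [a for a in annotations if a.first_frame <= frame <= a.last_frame]


index = build_keyword_index(ANNOTATIONS)
gun_scenes = [a.label for a in index["gun"]]
frame_15 = [a.label for a in annotations_at(ANNOTATIONS, 15)]

print(gun_scenes)  # ['Criminal takes out gun', 'Criminal points gun at actress']
print(frame_15)    # ['Hero fights criminal', 'Criminal points gun at actress']
```

The raw frames stay untouched; only the metadata is searched. A real system would replace the linear scan in `annotations_at` with an interval index, and would derive the annotations automatically or semi-automatically rather than by hand.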
• A multimedia database may be composed of multiple media objects.
• Presentation to the user has to be properly synchronized.
• These synchronization characteristics are described by temporal models.
• Hence the conceptual data view of multimedia data consists of the following components:
  • metadata
  • indexing mechanism
  • temporal models
  • spatial models
  • data models

Metadata
• Deals with the:
  • content
  • structures
  • semantics of media objects
• The creation of metadata depends on:
  • media type
  • type of information
• The availability of techniques for automatic (or semi-automatic) generation of metadata is important.
• Example:
  • Video media: the techniques should identify camera shots, characters in a shot, background of a shot, etc.
  • Image/text/audio data: the techniques should extract and describe the features of interest (recognition techniques)

Indexing Mechanisms
• To provide fast access
• Should be able to handle different features of objects such as color or texture

Temporal Models
• Describe the time and duration of presentation of each media object
• Example:
[Figure: timeline with video objects v1, v2 and audio objects a1, a2 placed along a time axis marked t1, t2, t3, t5, t6]
Video object v1 has to be presented at time t1 for a duration t3-t1 and has to be synchronized with the presentation of audio object a1.

Spatial Models
• Represent the way media objects are presented
• by specifying the layout of windows on a monitor
• Example:
[Figure: the video stream, text stream and image stream are shown in a video window, a text window and an image window, and the audio stream is played through a speaker]

Data Models
• An object-oriented approach is normally used to represent:
  • the characteristics of objects
  • the metadata associated with them
  • their temporal requirements
  • their spatial requirements

DISTRIBUTED VIEW
• Multimedia data can be distributed over computer networks.
• The huge sizes of media objects require large bandwidth or throughput (bps).
• The real-time nature of the objects needs
guarantees on:
  • end-to-end delay
  • delay jitter

End-to-end delay
• specifies the maximum delay that can be suffered by data during communication
Delay jitter
• the variation in the end-to-end delay suffered by the data

• Guarantees are required for smooth presentation of continuous media.
• Existing communication protocols do not address any real-time requirements.
• Hence, based on the temporal relationships, the buffer required and the available network bandwidth, the client needs to identify a retrieval schedule for requesting objects from the server.

FILTERED VIEW
• A filtered view is provided by the user's query, which retrieves the required information.
• The user's query can be of the following types:
  • query on the content of media objects
  • query by example
  • time indexed queries
  • spatial queries
  • application specific queries

Content Based Query
• Considering a VOD server application, the user can make queries by example such as:
  • get me the movie in which this scene (an image) appears
  • get the movie where this video clip occurs
  • show me the movie which contains this song
• In these queries, "this" refers to the multimedia object that is used as an example.
• The MMDBMS has to process the example data (this object) and find one that matches.
• The requirement for similarity (between the queried object and the database object) can be on different characteristics:
  • exact
  • partial
• Partial matching: we need to know the degree of mismatch that can be allowed between the example object and the one in the database.

Time Indexed Queries
• For continuous media, users can give queries in the temporal dimension.
• Example: Show me the movie 30 minutes after its start

Spatial Queries
• Media objects such as images and video have spatial characteristics.
• Example: Show me the image where Mawi is seen on the left of Siti Nurhaliza

Application Specific Queries
• Multimedia databases are highly application specific.
• Example (medical or GIS): Show me the video where the tissue evolves into a cancerous one

USER'S VIEW
The user's view of an MMDBMS is characterized by the following requirements:
• user query interface
• presentation of multimedia data
• user interaction during presentation

User Query Interface
• The query interface should allow users to query by content, example, time, space or a combination of these possibilities.
• For queries by example, the user query interface has to obtain the example objects from appropriate devices, e.g. an image from a scanner.
• In case of partial matching of the resolved queries, the query interface can suggest ways to modify the query to get exact matches.

Presentation of Multimedia Data
• Media objects can be in different formats, e.g. TIFF or GIF for images.
• In some cases, it might be necessary to convert data from one format to another before presentation.
• The presentation of a multimedia object may have:
  • temporal constraints
  • spatial constraints
• The spatial constraints describe the layout of windows on the user's screen.

User Interaction During Presentation
• The user can interact during the presentation of multimedia objects.
• The interaction is complex since multiple media objects are involved.
• Example: devices such as a microphone and a video camera can be used for speech and gesture recognition.
• Hence, simultaneous control of different devices and handling of user input are required.
• The input from the user can be of the following types:
  • modify the quality of the presentation, e.g. reduction or magnification of the image
  • direct the presentation, e.g. skip, restart