Database Systems and WWW Applications Digital Libraries Multimedia Database Systems

Database Systems and
WWW Applications
Digital Libraries
Multimedia Database Systems
2003
WWW-Lib-MM
1
Contents
• Database Systems and WWW Applications
– Internet DB Architecture
– Internet Applications
• Digital Libraries
• Multimedia Database Systems
Internet Application Architecture: Today
[Figure: three-tier architecture. Client tier: browser, authoring tools, etc., talking HTTP to the physical middle tier (Web/application server), which hosts the middle-tier application. Application messages connect the application to data integration, storage, query, and management (ORDBMS); remote messages and gateways (e.g., OLE/DB) connect to other data sources.]
Nori, A., Databases in Internet Applications: Case Studies,
in: Postmodern DBS, UC Berkeley, Spring 1999
Internet Applications
• Entertainment
– Games, Music, Films, Multi-person chat
• Public information
– Maps, Tax return helper
• Advertisement
– Interactive catalogues for products and services
• Medicine
– Diagnosis, Consultation, Remote surgery
• Education
– Learning-on-demand (for a degree), virtual museums, tour remote spaces
• Engineering
– Collaborative design, remote parallel simulation services
• Publishing
– Submit, Review, Proof-editing (text and graphics)
• Tele-communication
– Conferencing
• ...
Contents
• Database Systems and WWW Applications
• Digital Libraries
– Definitions
– Underlying concepts
– Digital Libraries Initiative
– Digital Libraries (examples)
• Multimedia Database Systems
Definitions
In the Stanford Digital Library project, we view long-term digital
library systems as collections of widely distributed, autonomously
maintained services. Of course, a digital library system must include
services that allow users to search over collections of information
objects. Examples of searchable collections include traditional
library collections, digital images, e-mail archives, video, on-line
books, and scientific article citation catalogs (containing only
metadata about the articles, not the articles themselves).
While searching services are valuable, they are not the only kind of
service in the digital library of the future. Remotely usable
information processing facilities are also important digital library
services. These services provide support for activities such as
document summarization, indexing, collaborative annotation, format
conversion, bibliography maintenance, and copyright clearance.
The Stanford Digital Library Technologies Project
Definitions
Digital libraries are organizations that provide the
resources, including the specialized staff, to select,
structure, offer intellectual access to, interpret,
distribute, preserve the integrity of, and ensure the
persistence over time of collections of digital works so
that they are readily and economically available for use
by a defined community or set of communities.
The Digital Library Federation (DLF)
Note: CS researchers tend to focus on digital libraries as content
collected on behalf of user communities, while librarians focus on
digital libraries as institutions or services.
Notions
• Content – the items in the library collection
• Annotation – information added to (associated with) an item
• Subject matter – focus of a collection, topics used in classification
• Catalog – database (card file) of bibliographic records
• Classification – assigning call number, adding keywords

• Rights to use – permissions
• License agreements – contractual right to use
• Copyright
• Watermark – a subliminal pixel pattern that identifies a digital work
• Copy detection – verifying copying, searching for copies

• Search (40% of search queries on the web are reported to be single words)
• Metasearchers – services that provide a unified query interface to multiple search engines, so that users have the illusion of a single combined document source. Three main tasks:
– choosing the best sources to evaluate a query
– evaluating the query at these sources
– merging the query results from these sources
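The three metasearcher tasks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source summaries (`topics`), the engine callables, and the score normalization are hypothetical and do not describe any particular metasearcher.

```python
def choose_sources(query, topics, k=2):
    """Task 1: prefer sources whose topic summary shares words with the query."""
    words = set(query.lower().split())
    ranked = sorted(topics, key=lambda name: len(words & topics[name]), reverse=True)
    return ranked[:k]


def metasearch(query, engines, topics, k=2):
    """Run the query at the chosen sources and merge the ranked results."""
    merged = {}
    for name in choose_sources(query, topics, k):
        results = engines[name](query)  # Task 2: evaluate the query at this source
        top = max((score for _, score in results), default=0.0)
        for doc, score in results:
            # Task 3: scores from different engines are not comparable,
            # so normalize per source before merging (an assumed heuristic).
            norm = score / top if top else 0.0
            merged[doc] = max(merged.get(doc, 0.0), norm)
    return sorted(merged, key=lambda doc: (-merged[doc], doc))
```

Real metasearchers face harder versions of each task, for example sources that expose no summary of their contents and no scores at all.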
File Formats
– Image/graphics formats
• TIFF          Tagged Image File Format
• GIF           Graphics Interchange Format
• JFIF          JPEG File Interchange Format
• SPIFF         Still Picture Interchange File Format
• PICT          Macintosh Picture
• TGA           TrueVision Targa file (bit-mapped images)
• EPS           Encapsulated PostScript
• CGM           Computer Graphics Metafile
• PhotoCD       (Kodak)
– Picture and video formats
• JPEG          Joint Photographic Experts Group
• Motion JPEG
• MPEG          Moving Picture Experts Group
– Document formats
• PostScript    (Adobe)
• PDF           Portable Document Format (Adobe)
Compression
• Compression
– lossless
• color 25%-50%-67%
• B/W 50%-90%
– lossy up to 95%
• Compression formats
– CCITT Group III or Group IV
– JPEG
– JBIG (an international compression standard)
– LZW; subsampling (lossy)
• Compression schemes
– LZW: Lempel-Ziv-Welch (lossless)
– MPEG: Group of Pictures, IBBBPBBBPB…I
– QuickTime (Apple)
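To make "lossless" concrete, here is a minimal sketch of textbook LZW over bytes. It emits a plain list of integer codes and ignores everything a real file format adds (variable code widths, table resets, bit packing), so it illustrates the idea rather than any specific format's codec.

```python
def lzw_compress(data):
    """Encode bytes as a list of dictionary codes (textbook LZW)."""
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # learn the new phrase
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the original bytes; the decoder relearns the same table."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # The code was defined by the encoder in the step just before.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)
```

Decompressing the output of `lzw_compress` returns the input byte for byte, which is what distinguishes a lossless scheme from lossy ones such as subsampling.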
Images in the Digital Library
Most image-database systems store descriptive information about the
images in a traditional text-based information retrieval system. An
additional field containing the filename of the image is added to each
record in this text-based system to link it to an image file. Images are
selected by querying the text-based system. When the query is specific
enough, the user requests a selected image (or a set of images) to view.
Extensions to user-interface software look up the filename field(s) in the
text record(s) and display the image(s), often in a new window. Each
system handles the text/image relationship in its own way, and standards
need to be developed to enable the interchange of image files among
systems.
Much research remains in the field of image databases, particularly with
respect to image-quality needs. Further studies need to stratify types of
collections, as well as users and uses of those collections, relating each to
a series of required image qualities.
Howard Besser and Jennifer Trant
Introduction to Imaging: Issues in Constructing an Image Database
http://www.getty.edu/research/institute/standards/introimages/
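The text/image linkage described above can be sketched as follows. The record fields, the `filename` key, and the `/images` root are hypothetical; the point is only that the query runs over the descriptive text and the filename field is then used to locate the image.

```python
import posixpath


def search(records, term):
    """Query the text-based system: match the descriptive fields, not the filename."""
    term = term.lower()
    return [
        record for record in records
        if any(term in str(value).lower()
               for key, value in record.items() if key != "filename")
    ]


def image_paths(matches, image_root="/images"):
    """Follow the filename field of each selected record to its image file."""
    return [posixpath.join(image_root, record["filename"]) for record in matches]
```

A viewer would call `image_paths(search(records, "rose"))` and open the resulting files, typically in a new window.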
Metadata and Cataloging
• Metadata: data about data (structure and access)
• OPAC: On-line Public Access Catalog
• Content description: structured vocabularies, data-structure guidelines
• MARC: MAchine Readable Cataloging
• Dublin Core: a classification scheme
• Indexing: abbreviation for works organized for reference
• MPEG7: metadata about MPEG data
Information Retrieval - Text
• Basic searching techniques:
– Linear search (can do regular expressions)
– Inverted files
– Hash tables
– Signature files (compressed linear search)
• Linear search requires no extra space and no preprocessing; search time is linear in the size of the text.
• Inverted files cannot search for arbitrary expressions and (usually) must start at the beginnings of words; building the index takes O(n log n) time for a file of n words. Index overhead ranges from 25% to 200%.
• Hash coding is sensitive to the exact spelling of the word and tends to scatter words that are spelled nearly the same; it requires preprocessing and has slight storage overhead.
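A minimal sketch of an inverted file illustrates two of the properties listed above: sorting the vocabulary is the O(n log n) preprocessing step, and lookups only work from the beginning of a word. Whitespace tokenization and lowercase folding are assumptions made for brevity.

```python
import bisect
from collections import defaultdict


def build_inverted_file(docs):
    """Map each word to the set of documents that contain it."""
    postings = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in text.lower().split():
            postings[word].add(doc_id)
    vocabulary = sorted(postings)  # the sort dominates the build cost
    return vocabulary, postings


def prefix_search(vocabulary, postings, prefix):
    """Find documents containing a word that starts with the prefix."""
    prefix = prefix.lower()
    start = bisect.bisect_left(vocabulary, prefix)
    hits = set()
    for word in vocabulary[start:]:
        if not word.startswith(prefix):
            break
        hits |= postings[word]
    return sorted(hits)
```

A query such as "brar" finds nothing even though "library" is indexed, because the sorted vocabulary can only be entered at word beginnings; a linear search would find it.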
Information Retrieval - Images
• Indexes that were made for other purposes
– Citations
– Reviews
– …
• Thumbnails
• Exploit layout formats (e.g., newspaper columns)
• Image alignment
– Centering
– Feature analysis (normalize rotation and X-Y orientation)
• Complementation: an image of a red rose will not normally have
the keyword "red". Thus image features and associated words
can complement and even disambiguate each other.
US Digital Libraries Initiative (Phase I)
• University of California, Berkeley
– Work-centered digital information services
• University of California, Santa Barbara
– Spatially referenced map information
• Carnegie Mellon University
– Full-content search and retrieval of video
• University of Illinois at Urbana-Champaign
– Federating repositories of scientific literature
• University of Michigan
– Intelligent agents for information location
• Stanford University
– Interoperation mechanisms among heterogeneous services
Shared vision: an entire Net of distributed repositories, where objects
of any type can be searched within and across indexed collections
US Digital Libraries Initiative (Phase I)
• University of California, Berkeley
– http://elib.cs.berkeley.edu/
– also see http://sunsite.berkeley.edu/
• University of California, Santa Barbara
– http://www.alexandria.ucsb.edu/
• Carnegie Mellon University
– http://www.informedia.cs.cmu.edu/
• University of Illinois at Urbana-Champaign
– http://dli.grainger.uiuc.edu/idli/idli.htm
• University of Michigan
– http://www.si.umich.edu/UMDL/
• Stanford University
– http://www-diglib.stanford.edu/
Examples of Technology Impact
• University of California, Berkeley
– Multivalent Documents
– Robust Hyperlinks and Robust Locations
– TilePics
– …
• Carnegie Mellon University
– Informedia Digital Video Library System
– …
• Stanford University
– Archival Digital Libraries Repositories
– Large Scale Copy Detection
– Google Search Engine
– …
• …
Multivalent Documents
• Multivalent Annotations
– Stored separately from the document they annotate
– Appear in situ – as if part of the content of the document
• Hyperlinks
• Highlights
• Notes
• Copy editor markup (executable)
– Three classes of behavior
• Spans (anchored to points or intervals)
– E.g., Hyperlinks, Rollovers, Highlights
• Lenses (anchored to geometric regions)
– E.g., Bit Magnify, Optical Character Recognition
• Structures (within the document tree)
– E.g., Book w/chapters and sections
– Combining Annotations
• Notemarks
– E.g., outlining, man pages
Robust Hyperlinks and Robust Locations
• URLs can be made robust
– if a web page moves to another location anywhere on the
web, you can find it.
• Even if that page has been edited.
• Robust Hyperlinks
– URLs are augmented with a five or so word content-based
lexical signature to make a robust hyperlink
– If the URL's address-based portion breaks:
Feed the signature into any web search engine
to find the new site of the page.
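One way to pick such a signature is to favor words that are frequent in the page but rare elsewhere. The sketch below uses a TF-IDF style weight; the weighting, the corpus statistics (`doc_freq`, `n_docs`), and the `lexical-signature` query parameter name are assumptions for illustration, not a statement of the exact scheme used by the Berkeley project.

```python
import math
from collections import Counter


def lexical_signature(page_words, doc_freq, n_docs, size=5):
    """Choose the words that best identify this page within the corpus."""
    tf = Counter(page_words)

    def weight(word):
        # Rare words across the corpus get a high inverse document frequency.
        idf = math.log(n_docs / (1 + doc_freq.get(word, 0)))
        return tf[word] * idf

    ranked = sorted(tf, key=lambda word: (-weight(word), word))
    return ranked[:size]


def robust_url(url, signature):
    """Append the signature so it travels with the address-based URL."""
    return url + "?lexical-signature=" + "+".join(signature)
```

If the address part stops resolving, the signature words can be submitted as an ordinary query to a search engine to relocate the page.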
TilePics
• A file format designed to store tiled data of arbitrary type in a
hierarchical, indexed format in order to provide fast retrieval.
• a fixed sized header
• tile index data
• an optional gap
• contiguous tile data
• optional attribute data
• Encapsulate a large amount of related, static data in a single file.
• A one or two-dimensional dataset
• At multiple scales of resolution or abstraction.
• Tileable, based on x,y coordinates for quick localized access
• Store data at multiple levels of resolution
• in multiple layers of tiles
• each layer relates to the next by the same scale factor
• Zoom by drawing just the relevant tiles at the next layer down
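The localized access and zooming described above come down to simple index arithmetic. The sketch below assumes 256-pixel square tiles, a scale factor of 2 between layers, and layer 0 as full resolution; these values and the function name are hypothetical and say nothing about the actual TilePics file layout.

```python
def tiles_for_viewport(x0, y0, x1, y1, level, tile_size=256, scale=2):
    """List the (level, column, row) tiles covering a viewport.

    The viewport is given in full-resolution pixel coordinates,
    with x1 and y1 exclusive.
    """
    factor = scale ** level  # each layer shrinks the data by the same factor
    first_col = (x0 // factor) // tile_size
    last_col = ((x1 - 1) // factor) // tile_size
    first_row = (y0 // factor) // tile_size
    last_row = ((y1 - 1) // factor) // tile_size
    return [
        (level, col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]
```

For a 1024 by 512 viewport this needs eight tiles at layer 0, two at layer 1, and one at layer 2, which is why a viewer only ever reads a handful of tiles regardless of the total dataset size.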
Informedia Digital Video Library System
• IDVLS attempts to automate cataloging by:
– Recognizing speech
– Understanding text and language
– Segmenting text
– Recognizing text within imagery
– Segmenting video
– Analysing video structure
– Image matching based on perceived color
– Region matching for content-based image retrieval
– Detecting video shot boundaries
Archival Digital Libraries Repositories
[Figure: users reach the Archival Repository through a Web Server; further labels in the figure: Info, Monitor, File System.]
Large Scale Copy Detection
• CDS: Copy Detection System
– content publishers register their valuable digital content in CDS
– CDS crawls the web
• compares the web content to the registered content
– notifies the content owners of illegal copies.
• Key challenges
– accuracy, in terms of high precision and recall,
– scalability, in terms of coping with several terabytes of data (or several tens of millions of web pages)
– resiliency to “attacks”
• Two prototypes
– SCAM (Stanford Copy Analysis Mechanism, for text)
– FRAUD (Finding Replicas of AUDio)
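The register, crawl, compare, notify loop can be sketched with word shingles and set overlap. This is a generic technique swapped in for illustration; it is not the algorithm used by SCAM or FRAUD, and the shingle length, threshold, and function names are assumptions.

```python
import hashlib


def shingles(text, k=4):
    """Hash every run of k consecutive words in the text."""
    words = text.lower().split()
    return {
        hashlib.sha1(" ".join(words[i:i + k]).encode("utf-8")).hexdigest()
        for i in range(max(len(words) - k + 1, 1))
    }


def containment(registered, candidate, k=4):
    """Fraction of the registered document's shingles found in the candidate."""
    reg = shingles(registered, k)
    return len(reg & shingles(candidate, k)) / len(reg)


def find_copies(registered_docs, crawled_pages, threshold=0.5):
    """Compare crawled pages to registered content and report likely copies."""
    reports = []
    for name, text in registered_docs.items():
        for url, page in crawled_pages.items():
            score = containment(text, page)
            if score >= threshold:
                reports.append((name, url, score))
    return reports
```

Comparing every registered document with every page does not scale to tens of millions of pages, and simple word substitution defeats exact shingles; those are the scalability and resiliency challenges named above.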
Google Search Engine
• PageRank: A Citation Importance Ranking
– Number of backlinks (~ citations)
[Figure: pages B and C link to page A; B and C are backlinks of A.]
• Approximation of importance
• Citation analysis literature
– Citation indexes
• Extreme variation in importance
– Large database of links: propagation
• Idealized Model
[Figure: page 1 links to page 2, so $l_{1,2} = 1$ and $l_{2,1} = 0$.]

$$ n_i = \sum_{j=1,\ j \neq i}^{N} l_{i,j} $$

the number of outgoing links on page i (includes multiple links to the same page)

$$ W_j = \sum_{i=1,\ i \neq j}^{N} l_{i,j} \, \frac{W_i}{n_i} $$

the PageRank of page j
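The idealized model can be solved by repeated substitution: start from equal weights and apply the formula for W_j until the values settle. The sketch below does exactly that and nothing more; it has no damping factor or dangling-page handling, so it only converges on well-behaved link graphs such as the small assumed example in the usage note.

```python
def pagerank_idealized(links, iterations=100):
    """Iterate W_j = sum over i != j of l[i][j] * W_i / n_i.

    links[i][j] is the number of links from page i to page j.
    """
    n_pages = len(links)
    # n_i: number of outgoing links on page i (self-links are ignored).
    out_links = [
        sum(links[i][j] for j in range(n_pages) if j != i)
        for i in range(n_pages)
    ]
    weights = [1.0 / n_pages] * n_pages
    for _ in range(iterations):
        weights = [
            sum(
                links[i][j] * weights[i] / out_links[i]
                for i in range(n_pages)
                if i != j and out_links[i] > 0
            )
            for j in range(n_pages)
        ]
    return weights
```

For three pages with links A→B, A→C, B→A, B→C, and C→A, that is `links = [[0, 1, 1], [1, 0, 1], [1, 0, 0]]`, the weights approach 4/9, 2/9, and 3/9: page A has the most incoming weight and ranks highest.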
Papers on the Creation of Google
• The Anatomy of a Large-Scale Hypertextual Web Search
Engine, by Sergey Brin, Lawrence Page
• Dynamic Data Mining: Exploring Large Rule Spaces by
Sampling, by Sergey Brin, Lawrence Page
• Computing Iceberg Queries Efficiently, by Min Fang,
Narayanan Shivakumar, Hector Garcia-Molina, Rajeev Motwani,
and Jeffrey D. Ullman
• The PageRank Citation Ranking: Bringing Order to the Web,
by Lawrence Page, Sergey Brin, Rajeev Motwani, and
Terry Winograd
• Extracting Patterns and Relations from the World Wide Web,
by Sergey Brin
• Finding near-replicas of documents on the web,
by Narayanan Shivakumar, Hector Garcia-Molina
• Efficient Crawling Through URL Ordering, by Junghoo Cho,
Hector Garcia-Molina, Lawrence Page
ACM Portal: ACM Digital Library
Bibliographic information, abstracts, reviews, and the full-text for
articles published in ACM periodicals and proceedings since its
founding in 1947 are available in the library together with selected
works published by affiliated organizations.
As of October 15, 2002, the Digital Library contains:
– over 102,500 full-text articles from journals, magazines, and
conference proceedings.
– Tables of Contents with over 33,000 citations from articles
published in journals and magazines from 1954 forward.
– Tables of contents with more than 69,000 citations from articles
published in over 1100 volumes of conference proceedings
since 1970.
ACM Digital Library
• The Digital Library presents all material associated with an article:
– Bibliographic data: includes the title, author(s), publication, volume, issue, and page numbers of an article.
– Index terms: compiled using article keywords and the ACM Computing Classification System.
– Abstracts: available for most articles in the Digital Library.
– Reviews: from ACM Computing Reviews (selected articles).
– Full text: view or download complete articles. Most articles are available in PDF; some are available in other formats, including HTML, PostScript, and LaTeX.
– DOI: when ACM submits a reference query and it is matched, a Uniform Resource Name (URN) in the form of a Digital Object Identifier (DOI) is returned and inserted as an external link from ACM's site to the source for the material.
University of Oslo Digital Library Project
• “Post graduate theses in the digital library” (Hovedfagsoppgaver
i digitalt bibliotek) is a pilot project where theses will be
published in full text on the world wide web.
• A step in establishing a digital library where the University of
Oslo shall keep electronic teaching materials and documents.
• A joint project between the USIT SGML group, the University of
Oslo library, and other institutions, including the Institute of
Informatics
• Students are to use Microsoft Word and a template file provided
by the project.
• Microsoft Word documents using the template styles can be
automatically converted to HTML and SGML.
• http://www.digbib.uio.no/ (in Norwegian)
Contents
• Database Systems and WWW Applications
• Digital Libraries
• Multimedia Database Systems
– Definitions
– Example Application
• MM QoS Requirements
– MMDBMS Requirements
– MMDBMS Concepts
Definitions
• Multimedia (MM); loosely: any system that can be used to present
information in more than one form: text, graphics, still images,
animation, sound, video, special computer-generated effects.
The system should have user-friendly interactive interfaces that help
the communication of complexly structured data.
• MMDBSs are the DBSs that manage MM data, support MM
presentations, and use specific tools for the storage, management,
and retrieval of MM data.
Multimedia Applications
• Entertainment
• Public information
• Advertisement
• Education
• Medicine
• Engineering
• Publishing
• Tele-communication
• ...
Data Flow for a Multimedia Network Server
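[Figure: data flows from storage through buffers in the multimedia server, across the network, into buffers on the client, and from there to the client's graphics/video hardware and audio hardware.]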
Multimedia-Supported Learning of
Practical Medical Procedures
• Provide realistic visualization of required practical skills
• Proven to be pedagogically beneficial to view the multimedia lesson on a procedure in a “learning on demand” setting before observing it in the clinic
• Lessons involve realistic multimedia elements (video and audio) recorded in Oslo hospitals, with expert commentary
• Over 17,000 multimedia elements in the OKSE-basen database
• Mostly on CD-ROM
• Learning on demand (LoD) over the Internet would enable
– Greater flexibility (time and location) for students
– Other applications
• Paramedics review skills on demand in emergency situations
• Doctors take courses in their office for lifelong learning
– Incremental release and revision of lessons or skill segments
Selective Multimedia Quality is Critical
• Quality of Service (QoS):
– The collective effects of service performance which determine
the degree of satisfaction of a user of the service.
– Performance, not operation (non-functional requirements,
independent of functional requirements)
• Video accuracy, for example, when draining the chest.
– The video must accurately show location of arteries, ribs,
where the drain can safely be inserted to avoid arteries.
• Audio fidelity, for example, when breathing is difficult.
– The audio must be clear enough to differentiate between stridor,
an obstruction of the large airways, and asthmatic breath sounds.
• Timing accuracy.
– Some procedures should be viewed in near real time, possibly at
reduced video resolution and reduced audio fidelity.
• The critical quality focus may shift within a lesson.
– The infrastructure should shift resources to the critical qualities
(and ignore others if necessary).
Requirements for MMDBSs
Ability to ...
• represent arbitrary data types and specification of programs that
interact with arbitrary data sources;
• query and modify (update, insert, delete) MM data; including
retrieval of MM data via associative search within MM data
(minimally, text);
• specify and execute abstract operations on MM data, e.g., play,
fast forward, pause, and rewind one-dimensional data like audio
or text; to display, expand, and compress two-dimensional data
like bit-mapped images;
• deal with heterogeneous data sources in a uniform manner; this
includes access to data in these sources and migration of data
from one data source to another.
Requirements - 2
MM data storage and retrieval:
• MM & object-oriented data modeling concepts;
• management of several kinds of magnetic and optical storage
devices appropriate for MM data handling;
• uniform management of very large data volumes =>
management of tertiary storage and multi-level storage
hierarchies;
• support for realtime data processing =>
appropriate scheduling and resource allocation techniques;
• support for storage and processing parallelism (performance
requirements);
• support for distribution => appropriate distributed DBMS
concepts.
Storage space requirements for uncompressed
digital multimedia data (examples)
Media type             Specifications                              Data rate per sec.
Voice-quality audio    1 channel, 8-bit samples at 8 kHz           64 Kbits
MPEG-encoded audio     Equiv. to CD quality                        384 Kbits
CD-quality audio       2 channels, 16-bit samples at 44.1 kHz      1.4 Mbits
MPEG2-encoded video    640x480 pixels/frame, 24 bits/pixel         0.42 Mbytes
NTSC-quality video     640x480 pixels/frame, 24 bits/pixel         27 Mbytes
HDTV-quality video     1280x720 pixels/frame, 24 bits/pixel        81 Mbytes
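The uncompressed rows follow directly from the specifications. The short sketch below reproduces the arithmetic; the frame rate of 30 frames per second is an assumption, since the table does not state one, and the results agree with the table only up to rounding.

```python
def audio_bits_per_second(channels, bits_per_sample, sample_rate_hz):
    """Uncompressed audio rate: channels x sample size x sampling rate."""
    return channels * bits_per_sample * sample_rate_hz


def video_bytes_per_second(width, height, bits_per_pixel, frames_per_second=30):
    """Uncompressed video rate; 30 frames/s is an assumed frame rate."""
    return width * height * bits_per_pixel // 8 * frames_per_second


voice = audio_bits_per_second(1, 8, 8_000)      # 64,000 bits/s
cd = audio_bits_per_second(2, 16, 44_100)       # 1,411,200 bits/s
ntsc = video_bytes_per_second(640, 480, 24)     # 27,648,000 bytes/s
hdtv = video_bytes_per_second(1280, 720, 24)    # 82,944,000 bytes/s
```

At these rates one hour of NTSC-quality video needs roughly 100 gigabytes, which is why the requirements that follow stress compression, tertiary storage, and multi-level storage hierarchies.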
Requirements - 3
Realtime and synchronization issues:
• “soft” realtime transfer requirements
• “hard” transaction deadlines
• synchronization between different data streams (data types)
• user interactions (synchronous and asynchronous)
=> dependent on data distribution, storage devices,
compression techniques for the various data types,
buffer management techniques, scheduling algorithms,
data placement techniques, and communication bandwidth
DBMS Concepts
• Data modeling: temporal object-oriented modeling and
presenting (HCI) of multimedia data
+ extra data types & operations
• Query processing and optimization: browsing, content
addressing
• Storage management: optimization techniques
• Transaction management: realtime processing for read
transactions (presentations);
write transactions (authoring) use an advanced transaction model
(e.g., checkout-checkin with versioned data)
User Interface Design for MM Applications
• User interaction and user interfaces become much more
complex if MM data is involved.
• State-of-the-art: buttons, text entry, scrollable areas, ...
-> does not support interaction with continuous media
• New devices (e.g., cameras, microphones, loudspeakers, ...)
have to be taken into account in addition to keyboard, mouse,
monitor, and external devices (e.g., VCRs, ...) for input and
output handling:
- simultaneous control of different devices
- efficient handling of user interrupts
- standardized interaction paradigms
- support for pen + voice input
- ...
Object-Oriented Data Modeling + ...
Data types and operations for:
• text
• graphic
• image
• audio
• speech
• video
• generated media
Temporal relationships:
- Synchronization and realtime processing
Quality-of-Service:
- to handle average delay, speed ratio, utilization,
jitter, skew, and reliability.
Required Data Model Concepts
and Related Work
• Time independent data types
• Time dependent data types (continuous types)
• Temporal concepts: valid, transaction, and play time
• Temporal data models: TIGUKAT, T_Chimera, Mediadoc,
SGML/HyTime, ...
• Multimedia data models: AMOS, SGML/HyTime, LMDM, ...
Learning-on-Demand (Asynchronous IDL)
- students should be able to retrieve
data from campus and from home
- flexible query facilities
- quality of service support
- scalable and synchronized playback
- store lectures in a DBMS
- make lectures available
for students
[Figure labels: TOOMM, Network, Query proc. & opt., ObjectStore]
Concepts of TOOMM
[Figure: the Presentation Model contains a Composite Presentation Object (CPO1) built from Atomic Presentation Objects (P_Video 13, P_Video 14, P_Video 15, P_Audio 11); these refer to Multimedia Objects in the Logical Data Model (Video 1, Video 2, Audio 1).]
Example: Modeling a Video Object
[Figure: Video 1 is modeled as Frame 0, Frame 1, ..., Frame n; each Frame i is linked through TA i to Timestamp i.]
Type Hierarchy
MMDT
  PTI_MMDT: Text, Picture, Graphics
  PTD_MMDT
    Stream: Video, Audio
    CGM: Music, Animation
Component
  LDU: Frame, Sample
  Event: Note, Anim.
Play Time
[Figure: components of a stream multimedia object. LDU 0, LDU 1, ..., LDU n are each linked through TA i to a timestamp TS i on the play time axis.]
[Figure: components of a CGM multimedia object. event 0, event 1, ..., event n are each linked through TA i to a timestamp TS i on the play time axis.]
EER Diagram
[Figure: EER diagram. Entities: Temporal_reference (used as start and stop), MMDT, Temporal_Relationships (specialized into Serial and Parallel), Effect, CPO, and P_MMDT. P_MMDT is specialized into P_PTD_MMDT (P_Stream: P_Audio, P_Video; P_CGM: P_Music, P_Anim.) and P_PTI_MMDT (P_Text, P_Picture, P_Graphics).]
Example: Using Temporal References
[Figure: a composite multimedia presentation with the multimedia presentation objects P_Video 1, P_Video 2, and P_Text 3, and a recursive temporal reference list that resolves to the actual play time value.]
Temporal references:
1  Reference: True  deviation: 0   time_point: NA
2  Reference: True  deviation: -5  time_point: NA
3  Reference: True  deviation: NA  time_point: 15
Example
Type: CPO
Name: Lecture_19_2_1998
MTU_duration: 1/44100

Type: Parallel
Name: TR 1
Temporal relationship type: Equal
Skew tolerance: 80 ms

Type: P_Video
Name: P_Video 1
Speed: 1
Start: 0
Stop: 18000
p_start.get_time_point() = 0
p_stop.get_time_point() = 31752000

Type: Video
Name: PMC_Lecture_hour1_scene1
LDU_duration: 1/25
Duration: 1800
Content description:
- (0, 4988, “Lecturer talks about files”)
- (4989, 12134, “Lecturer talks about directories”)

Type: P_Audio
Name: P_Audio 1
Speed: 1
Start: 0
Stop: 31752000
p_start.get_time_point() = 0
p_stop.get_time_point() = 31752000

Type: Audio
Name: PMC_Lecture_hour1_clip_1
LDU_duration: 1/44100
Duration: 31752000
Content description:
- (0, 4988, “Lecturer talks about files”)
- (4989, 12134, “Lecturer talks about directories”)

Type: P_HTML
Name: P_HTML 1
p_start.get_time_point() = 3987233
p_stop.get_time_point() = 7234443

Type: HTML
Name: File System

Type: P_HTML
Name: P_HTML 2
p_start.get_time_point() = 10234234
p_stop.get_time_point() = 16230933

Type: HTML
Name: Directory Example

Type: P_Light_Pen
Name: P_Light_Pen 1
p_start.get_time_point() = 4457111
p_stop.get_time_point() = 6283324

Type: Light_Pen
Name: Drawing_objects
LDU_duration: 200
Content description:
- (0, 100, “Draw a bow in File System”)
- (101, 200, “Draw a dot”)
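The time points in this example are counts of the composite object's MTU (1/44100 s), while each stream counts its own LDUs. A small sketch shows how the numbers relate; the helper names are hypothetical, and reading the skew tolerance as a bound on the difference between two stream positions is an assumption.

```python
from fractions import Fraction

MTU = Fraction(1, 44100)  # MTU_duration of Lecture_19_2_1998


def to_seconds(count, unit_duration):
    """Convert a count of time units (MTUs or LDUs) to seconds."""
    return count * unit_duration


def within_skew(t_a, t_b, tolerance=Fraction(80, 1000)):
    """Check two stream positions against the 80 ms skew tolerance."""
    return abs(t_a - t_b) <= tolerance


video_end = to_seconds(18000, Fraction(1, 25))   # Stop of P_Video 1, in video LDUs
audio_end = to_seconds(31752000, MTU)            # Stop of P_Audio 1, in audio LDUs
html_start = to_seconds(3987233, MTU)            # p_start of P_HTML 1, in MTUs
```

Both `video_end` and `audio_end` come out at 720 seconds, which is consistent with the two streams being tied together by the Parallel relationship of type Equal, and `html_start` places the "File System" page about 90 seconds into the lecture.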
Query Processing and Optimization
- Browsing:
efficient location of data elements in very large amounts of data,
exact-match (pattern-matching) queries (e.g., text)
and similarity-based queries (e.g., images, ...)
-> query refinement
-> set-oriented and navigation-oriented browsing techniques
- Content addressing:
efficient location of data with complex data types like images
(difficult to access in realtime using pattern-recognition techniques)
comprises: natural language understanding,
speech processing,
vision,
and user modeling
Meta-Data Management
• Meta-data needed especially for continuous data to support
retrieval
• Textual data describing contents of audio and video segments
• Content search mostly performed on meta-data
• Problems:
– Modeling of meta-data
– Meta-data acquisition
– Association of meta-data to “real” data
Storage Management Issues
• addressing techniques
• access paths
• data placement techniques:
clustering, partitioning, allocation
• system buffer management:
paging, ...
• disk scheduling:
sweeping, deadline-driven, …
Data Placement
• Clustering and partitioning:
– data striping and data interleaving
• Allocation:
– contiguous placement
– constrained placement
– log-structured placement
[Figure: a controller and three disks, each with sectors 0 to 2; the sectors numbered 0 on all disks together form logical sector 0.]
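Data striping can be sketched as a round-robin mapping from consecutive stripe units to disks, so that a sequential read keeps all disks busy in parallel. The unit size, the number of disks, and the function names below are assumptions; real systems differ in stripe granularity and add redundancy.

```python
def locate(stripe_unit, n_disks):
    """Map a logical stripe unit to (disk number, position on that disk)."""
    return stripe_unit % n_disks, stripe_unit // n_disks


def stripe(data, n_disks, unit_size):
    """Distribute consecutive units of data round-robin over the disks."""
    disks = [[] for _ in range(n_disks)]
    for unit, start in enumerate(range(0, len(data), unit_size)):
        disk, _ = locate(unit, n_disks)
        disks[disk].append(data[start:start + unit_size])
    return disks
```

With three disks, `stripe(b"AABBCCDDEE", 3, 2)` puts AA and DD on the first disk, BB and EE on the second, and CC on the third, so a sequential read of the first three units touches each disk once.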
Disk Scheduling
• Traditional algorithms
– FIFO (first come, first served)
– SSTF (shortest seek time first)
– SCAN (elevator algorithm)
• First-generation MM algorithms
– EDF (earliest deadline first)
– SCAN-EDF
– GSS (grouped sweeping scheme)
• Second-generation MM algorithms
– two-phase algorithms
• Criteria
- reduce seek time
- reduce rotational latency
- increase throughput
- fair stream access
- real-time constraints?
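SCAN-EDF combines the two ideas listed above: requests are served in deadline order, and requests that share a deadline are served in a single sweep of the disk arm. The sketch below is a simplified reading of that idea; the `(deadline, track)` request format and the choice to sweep upward first are assumptions.

```python
from itertools import groupby


def scan_edf(requests, head=0):
    """Order (deadline, track) requests: EDF first, SCAN among equal deadlines."""
    order = []
    by_deadline = sorted(requests, key=lambda request: request[0])
    for _, group in groupby(by_deadline, key=lambda request: request[0]):
        batch = list(group)
        # Sweep upward from the current head position, then back down.
        up = sorted((r for r in batch if r[1] >= head), key=lambda r: r[1])
        down = sorted((r for r in batch if r[1] < head), key=lambda r: -r[1])
        for request in up + down:
            order.append(request)
            head = request[1]
    return order
```

With the head at track 40, the requests `[(20, 50), (10, 90), (10, 30), (10, 60), (20, 10)]` are served as tracks 60, 90, 30 for deadline 10 and then 50, 10 for deadline 20. Pure EDF gives no seek optimization within a deadline, and pure SCAN ignores deadlines entirely.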
Transaction and Workflow Management
• distributed transaction management mechanisms
• realtime transaction management mechanisms
• various new transaction and workflow management mechanisms
Parallelism and Other Optimization
Techniques in MMDBSs
• parallelism on storage level, e.g., disk arrays -> striping, ...
• parallelism on processing level, e.g., multiprocessor machines, ...
• storage structures / data placement techniques
• query optimization
• transaction management mechanisms
MMDBS: Conclusions
• investigated functionality needed to support MM applications
• illustrated how object-oriented and other modern DBMS
technologies can be applied to realize MMDBMS
• alternative “levels” of application support by DBMS
• open issues:
- effective storage models
- MM query languages and processing techniques
(handling of imprecise queries)
- ...
• Role of (MM)DBS in distributed MM systems
Conclusions - State-of-the-Art
• Multimedia file systems and multimedia storage servers for
special multimedia applications exist today
• Implement the presented concepts
• Acceptable performance
• Multimedia database systems are still under development,
certain aspects are solved
• Retrieval problems not yet solved in a satisfying manner