
Semantic Extraction and Semantics-Based
Annotation and Retrieval for Video Databases
Authors:
Yan Liu & Fei Li
Department of Computer Science
Columbia University
Presented by: Maleq Khan
November 13, 2002
Introduction
Rapid growth and wide application of video databases
leads to the need for
fast video data retrieval upon user query
Problem Statement

Finding video clips in a large database quickly

Semantic interpretation of visual content
“Find video shots where President Bush is stepping off an airplane”

Extraction and representation of temporal information
“Find video shots where Purdue President Martin Jischke is
shaking hands with President Bush after he stepped off an airplane”

Representation of spatial information
Semantic Annotation

Manual annotation is not feasible for a large database

Many different semantic interpretations

Need automatic annotation
Background

Video shot: an unbroken sequence of frames

Key frame: the frame that represents the salient feature
or content of a shot

Video scene: a collection of semantically related and
temporally adjacent shots
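
To keep these units straight, here is a minimal data-structure sketch; the class and field names (Shot, Scene, key_frame) are illustrative assumptions, not definitions from the paper.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Shot:
    """An unbroken sequence of frames, identified by its frame range."""
    start_frame: int
    end_frame: int
    key_frame: int          # index of the frame representing the shot's salient content

@dataclass
class Scene:
    """Semantically related and temporally adjacent shots grouped together."""
    shots: List[Shot] = field(default_factory=list)

scene = Scene([Shot(0, 120, 60), Shot(121, 300, 210)])
```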
Background (continued)

Story unit, U: a collection of interesting objects in a shot

Locale, d: the background of a shot

A ≡ Ui dj: Ui takes place in locales dj
Dialogue: alternating shot pattern, e.g., A B A B A B A …, or A B A C A B A B
Action: a progressive sequence of shots with contrasting visual content
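
One way to make the dialogue pattern concrete: label each shot by its repeated visual content, then test whether the labels alternate between a small set of shots. A hypothetical heuristic, not the paper's algorithm:

```python
def is_dialogue(shot_labels, min_alternations=4):
    """Heuristic dialogue check: shots alternate between a few repeated labels,
    e.g. A B A B A B A or A B A C A B A B with an occasional cut-away."""
    if len(shot_labels) < min_alternations:
        return False
    if len(set(shot_labels)) > 3:        # too many distinct shots to be a dialogue
        return False
    switches = sum(1 for a, b in zip(shot_labels, shot_labels[1:]) if a != b)
    return switches >= min_alternations

print(is_dialogue(list("ABABABA")))      # True
print(is_dialogue(list("ABACABAB")))     # True (one cut-away tolerated)
print(is_dialogue(list("ABCDEFG")))      # False
```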
VIMS (Video Information Management System)
[System diagram: video data is segmented into shots, key frames are computed, and features (color, motion, shape, …) are extracted; the resulting index supports video browsing, query, retrieval, and production.]
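
To make the diagram concrete, here is a minimal, self-contained sketch of the offline part of such a pipeline: shot segmentation by frame differencing, middle-frame key frames, and a coarse histogram as the extracted feature. The threshold, function names, and feature choice are assumptions for illustration, not VIMS's actual components.

```python
import numpy as np

def segment_into_shots(frames, threshold=30.0):
    """Cut a shot boundary where the mean absolute frame difference exceeds a threshold."""
    cuts = [0]
    for i in range(1, len(frames)):
        if np.abs(frames[i].astype(float) - frames[i - 1].astype(float)).mean() > threshold:
            cuts.append(i)
    cuts.append(len(frames))
    return list(zip(cuts[:-1], cuts[1:]))          # list of (start, end) frame ranges

def key_frame(frames, shot):
    """Pick the middle frame of a shot as its key frame (simplest possible choice)."""
    start, end = shot
    return frames[(start + end) // 2]

def color_histogram(frame, bins=8):
    """A coarse intensity histogram standing in for the color feature."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 255))
    return hist / hist.sum()

def build_index(frames):
    """Offline pipeline: segmentation -> key frames -> features, ready for query and browsing."""
    shots = segment_into_shots(frames)
    return [(shot, color_histogram(key_frame(frames, shot))) for shot in shots]

# toy input: 20 synthetic grayscale "frames" with one hard cut in the middle
rng = np.random.default_rng(0)
frames = np.concatenate([rng.integers(0, 60, (10, 32, 32)),
                         rng.integers(180, 255, (10, 32, 32))]).astype(np.uint8)
print(len(build_index(frames)), "shots indexed")   # expect 2 shots
```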
Semantics-Based Query

Image matching and content-based retrieval are
based on visual similarity

Unable to answer semantics-based queries such as
“A red car is running by a tree”

Extracting temporal/spatial information hidden in
video is necessary for semantic description
Semantic Description Model
[Model diagram: low-level features (color, motion, direction, …) feed object recognition and object searching against a sample database; the results are stored as object tags and in a temporal diagram (temporal computation), which support high-level description and high-level retrieval.]
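
A toy illustration of the object-recognition step in this model: a low-level feature vector computed from a key-frame region is matched against a small sample database, and the best match becomes the object tag used by high-level retrieval. The feature values, labels, and nearest-neighbour rule are assumptions for illustration; the paper does not commit to this particular matcher.

```python
import numpy as np

# hypothetical sample database: low-level feature vectors with semantic labels
sample_db = {
    "car":  np.array([0.9, 0.1, 0.2]),
    "tree": np.array([0.1, 0.8, 0.3]),
}

def recognize_object(feature, database=sample_db):
    """Nearest-neighbour match against the sample database; the label becomes the object tag."""
    label, _ = min(database.items(), key=lambda kv: np.linalg.norm(kv[1] - feature))
    return label

print(recognize_object(np.array([0.85, 0.15, 0.25])))   # -> "car"
```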
Temporal Diagram
[Diagram: each scene node stores its objects with position and recording information; scenes are connected by links based on similarity in story, links to other scenes for browsing, and links to other videos using bibliographic data.]
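
Read directly off the figure, one possible in-memory shape for the temporal diagram; the class and field names are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class SceneNode:
    """One scene in the temporal diagram."""
    objects: List[Dict]                                            # position + recording info per object
    story_links: List["SceneNode"] = field(default_factory=list)   # links based on similarity in story
    browse_links: List["SceneNode"] = field(default_factory=list)  # links to other scenes for browsing
    video_links: List[str] = field(default_factory=list)           # links to other videos (bibliographic data)

a = SceneNode(objects=[{"name": "car", "bbox": (10, 20, 60, 50), "frame": 120}])
b = SceneNode(objects=[{"name": "tree", "bbox": (5, 5, 30, 80), "frame": 460}])
a.browse_links.append(b)        # scene a links to scene b for browsing
```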
Object Tracking
Position is identified with a bounding rectangle
Motion is defined as the change of position relative to a still object
If the viewing direction changes by an angle θ, multiply all position information by cos θ
Dynamic Tag Building

An array to store semantic descriptions

New query: search the tag first

If not found, run the extraction procedure and add the new
semantic description to the tag
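
A minimal sketch of the tag-as-cache behaviour described here: each shot's tag is an array of semantic descriptions, a new query checks the tag first, and only on a miss is the (expensive) extraction procedure run and its result appended. The dictionary layout and the extract_semantics callback are assumptions for illustration.

```python
semantic_tags = {}    # shot id -> list of semantic descriptions computed so far

def answer_query(shot_id, query, extract_semantics):
    """Answer from the tag if possible; otherwise run extraction and grow the tag."""
    descriptions = semantic_tags.setdefault(shot_id, [])
    if query in descriptions:
        return True                                   # answered from the tag
    description = extract_semantics(shot_id, query)   # expensive procedure
    if description is not None:
        descriptions.append(description)              # dynamically added to the tag
        return True
    return False

# usage with a stand-in extraction procedure
print(answer_query(7, "red car by a tree", lambda sid, q: q))     # miss: runs extraction
print(answer_query(7, "red car by a tree", lambda sid, q: None))  # hit: answered from the tag
```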
Summary

Automatic semantic extraction

Object tracking

Temporal diagram

Automatic tag building
Comments

Identifies moving objects only if a relatively still
reference object is given

Cannot distinguish different kinds of motion

Temporal diagram is not complete

Claimed “real-time computation for large digital
library” but no theoretical or experimental results are
given