Robust Pictorial Mapping onto Text-Keyword Content for Search Engine

International Journal of Engineering Trends and Technology (IJETT) – Volume 35 Number 3 – May 2016

Divyashree C V¹, S Malapriya S², Shambhavi B M³, Gururaj K S⁴
¹˒²˒³ VIII Sem Student, ⁴Associate Professor, Department of Information Science and Engineering
GSSS Institute of Engineering & Technology for Women, Mysuru
Abstract— Text detection in videos is an important step
to achieve multimedia content retrieval. In this paper, an
efficient algorithm which can automatically detect,
localize and extract horizontally aligned text in images
(and digital videos) with complex backgrounds is
presented. The proposed approach is based on the
application of a colour reduction technique, a method for
edge detection, and the localization of text regions using
geometrical properties. The output of the algorithm is
text boxes with a simplified background, ready to be fed
into a system for subsequent character recognition. Our
proposal is robust with respect to different font sizes, font
colours and background complexities. The performance
of the approach is demonstrated by presenting promising
experimental results for a set of images taken from
different types of video sequences.
Keywords — Detection, Extraction, Frame, Images.
I. INTRODUCTION
Digital video has become one of the most
important elements in many applications such as
education, news and games, and multimedia data
are growing faster than ever. In order to search
and extract important information from a huge
amount of video data, we need to extract the text
it contains. Text is an important element in video,
so extracting it is a key clue for understanding
video content and, for instance, for classifying
videos automatically.
Videotext detection and recognition has been
identified as one of the key components for the
video retrieval and analysis system. Videotext
detection and recognition can be used in many
applications such as semantic video indexing,
summarization, video surveillance and security,
multilingual video information access, etc.
Videotext can be classified into two broad
categories: graphic text and scene text. Graphic
text, or text overlay, is videotext added
mechanically by video editors; examples include
news/sports video captions and movie credits.
Scene text is videotext embedded in real-world
objects or scenes; examples include street names,
car licence-plate numbers, and the number/name on
the back of a soccer player. This paper addresses
the problem of accurately detecting and extracting
graphic video text for video text recognition.
Although overlay text is added to the video
manually, experiments show that it is as hard to
extract as many video objects, such as faces and
people.

ISSN: 2231-5381
The goal of a multimedia text extraction
and recognition system is to fill the gap between
the mature, existing technology of Optical
Character Recognition and the new needs for
textual information retrieval created by the
spread of digital multimedia. A text extraction
system for multimedia usually consists of four
stages: spatial text detection, temporal text
detection and tracking (for videos), image
binarization and segmentation, and character
recognition. Nowadays the size of the available
digital video content is increasing rapidly, which
leads to an urgent need for fast and effective
algorithms for information retrieval from
multimedia content. In order to detect text
efficiently, we need to analyse the discriminative
properties of text and of its basic unit, the
character.
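As a rough illustration of one such discriminative property: character strokes produce dense intensity transitions, so a window with high horizontal-gradient density is a plausible text candidate. The following is a minimal Python sketch (the paper's implementation is in MATLAB); the window interface and both thresholds are illustrative assumptions, not values from the paper.

```python
# Edge-density heuristic behind text localization: text regions pack many
# strong intensity transitions (character strokes), so a window with a high
# density of horizontal gradients is flagged as a text candidate.

def edge_density(gray, x, y, w, h, min_jump=60):
    """Fraction of pixels in the window whose horizontal gradient >= min_jump.

    gray is a 2-D list of 0-255 intensities; (x, y, w, h) is the window.
    """
    edges = 0
    for row in range(y, y + h):
        for col in range(x, x + w - 1):
            if abs(gray[row][col + 1] - gray[row][col]) >= min_jump:
                edges += 1
    return edges / (w * h)

def is_text_candidate(gray, box, density_thresh=0.15):
    """A window is a candidate when its edge density exceeds the threshold."""
    x, y, w, h = box
    return edge_density(gray, x, y, w, h) >= density_thresh
```

A flat background scores zero, while a striped (stroke-like) window scores high, which is the separation the localization step relies on.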
Text often signals the presence of a fact,
condition, or quality. In this paper, we are
particularly interested in text that has a direct
influence on a tourist from a different country or
culture.
II. PROBLEM STATEMENT
Text in images and video sequences provides highly
condensed information about their contents and can
be used for browsing a large video database. Text
superimposed on video frames provides supplemental
but important information for video indexing and
retrieval. Although text provides important
information about images and video sequences,
detecting and segmenting it is not an easy problem.
The main difficulties lie in the low
resolution of the text and the complexity of the
background. Video frames have very low resolution
and suffer from blurring effects due to lossy
compression. Additionally, the background of a
video frame is often complex, with many objects
exhibiting text-like features. A further problem is
handling the large amount of text data in video
clip images.
III. METHODOLOGY
The application designs an adaptive OCR system.
Given documented videos, the adaptability lies in
automatic training-sample extraction with limited
user interaction. This approach does not require
the support of ground-truth text, which makes it
extremely useful for processing noisy document
images.
A methodology is proposed for processing
noisy printed documents with limited user
feedback. Without the support of ground truth, a
specific collection of scanned documents can be
processed to extract character templates. The
adaptiveness of the approach lies in the fact that
the extracted templates are used to train an OCR
classifier quickly. Experimental results show that
this approach is extremely useful for processing
noisy documents with many touching characters.
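The template idea can be illustrated with a toy nearest-template classifier in Python; the binary-glyph representation and all names below are assumptions for illustration, not the authors' code.

```python
# Toy nearest-template OCR classifier: character templates extracted from the
# document train the classifier; an unknown glyph gets the label of the
# template it agrees with best, pixel for pixel.

def match_score(a, b):
    """Fraction of positions where two equal-size binary glyphs agree."""
    total = len(a) * len(a[0])
    agree = sum(1 for r in range(len(a)) for c in range(len(a[0]))
                if a[r][c] == b[r][c])
    return agree / total

def classify(glyph, templates):
    """templates: dict mapping label -> binary glyph (same size as glyph)."""
    return max(templates, key=lambda lbl: match_score(glyph, templates[lbl]))
```

A glyph identical to a stored template scores 1.0 against it, so the classifier labels it correctly; noisy glyphs are still assigned to the closest template.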
Text extraction from video consists of three steps.
The first is finding the text in the original
video; the text is then separated from the
background; finally, a binary image is produced.
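The three steps might be sketched as follows in Python, with a simple global threshold and a row-projection heuristic standing in for the paper's MATLAB-based detection method; the threshold values are illustrative assumptions.

```python
# Compact sketch of the three steps: binarize the frame, locate rows that
# look like text lines, and return the binary band containing the text.

def binarize(gray, thresh=128):
    """Step 3 primitive: intensity >= thresh becomes foreground (1)."""
    return [[1 if px >= thresh else 0 for px in row] for row in gray]

def find_text_rows(binary, min_marks=2):
    """Step 1 heuristic: rows with enough foreground pixels are text lines."""
    return [i for i, row in enumerate(binary) if sum(row) >= min_marks]

def extract_text_band(gray, thresh=128, min_marks=2):
    """Steps combined: separate text rows from the background as binary."""
    binary = binarize(gray, thresh)
    rows = find_text_rows(binary, min_marks)
    return [binary[i] for i in rows]
```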
f) Search Engine:
The selected keyword is given to the search engine
to search for related content.
g) Output:
The content related to the keyword is displayed.
The project implements an efficient system for
extracting text from given documented video clips
and recognizes the extracted text for further
applications. The implemented work finds efficient
use in video image processing for enhancement and
maintenance, in areas such as cinematography and
video presentation, and is particularly useful for
maintaining digital libraries of video databases.
The areas of application of text isolation and
recognition in video images are the following:
1. Digital library: maintenance of documented
video images in large databases.
2. Data modification: modifying the information
contained in video images.
3. Cinematographic applications: enhancing the
document information in movie video clips.
4. Instant documentation of news and reports:
putting instant reports and news matter on paper.
5. Licence-plate character recognition for toll
collection.
Figure 1: Data flow diagram
IV. IMPLEMENTATION
a) Data Set – Video:
The video that contains text is given as input, to
be read by placing it in the GUI created in MATLAB.
The functions included in Robust Text extraction
from Video are as follows:
b) Sub-sampling of video to images:
imread(filename) – Reads an image from a file.
imshow(I) – Displays an image.
rgb2gray() – Converts an RGB image to grayscale.
graythresh(I) – Computes a global threshold (level)
that can be used to convert an intensity image to a
binary image.
im2bw() – Converts an image to a binary image,
based on a threshold.
bwlabel() – Labels connected components in a 2-D
binary image.
bwareaopen() – Removes small objects from a binary
image.
find() – Finds indices and values of non-zero
elements.
videoinput(adaptorname,deviceID) – Creates a video
input object obj, where deviceID is a numeric
scalar that identifies a particular device.
getsnapshot() – Immediately returns a single image
frame from the video input object obj.
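MATLAB's graythresh() implements Otsu's method. For illustration, a from-scratch Python equivalent might look like the sketch below; note it returns an 8-bit threshold rather than MATLAB's normalized level in [0, 1].

```python
# Otsu's method: pick the grayscale threshold that maximizes the
# between-class variance of the two resulting pixel populations.

def otsu_threshold(pixels):
    """Return the 0-255 threshold maximizing between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * hist[i] for i in range(256))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for t in range(256):
        w0 += hist[t]                     # weight of the "dark" class
        if w0 == 0:
            continue
        w1 = total - w0                   # weight of the "bright" class
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0                    # class means
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2    # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t
```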
In this phase, image frames are extracted. A video
is composed of a sequence of images; therefore, the
image frames must be picked from the video.
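Frame picking can be sketched as sampling one frame every few seconds. Assuming a known frame rate (the function name and interface are illustrative, not from the paper), the selected frame indices are:

```python
# Sub-sample a video by index: with frame rate fps, taking one frame per
# interval_s seconds selects indices 0, fps*interval_s, 2*fps*interval_s, ...

def sample_frame_indices(total_frames, fps, interval_s):
    """Indices of the frames to keep when sampling every interval_s seconds."""
    step = max(1, int(fps * interval_s))
    return list(range(0, total_frames, step))
```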
c) Applying adaptive OCR:
Using the adaptive OCR approach, text is recognized
and extracted from the video frames that contain
it.
d) Text Extraction:
To annotate a video using the detected text, the
text must be extracted and recognized.
e) Store Text:
The text extracted from the video frames is stored
in a PDF document.
imwrite() – Writes an image to a graphics file.
corr2(A,B) – Returns the correlation coefficient r
between A and B, where A and B are matrices or
vectors of the same size; r is a scalar double.
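For reference, corr2 computes the 2-D correlation coefficient. A plain-Python equivalent for equal-size matrices might be sketched as:

```python
# 2-D correlation coefficient, as computed by MATLAB's corr2: Pearson
# correlation over the flattened elements of two equal-size matrices.

def corr2(a, b):
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    n = len(flat_a)
    ma = sum(flat_a) / n
    mb = sum(flat_b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(flat_a, flat_b))
    da = sum((x - ma) ** 2 for x in flat_a) ** 0.5
    db = sum((y - mb) ** 2 for y in flat_b) ** 0.5
    return num / (da * db)
```

Identical matrices correlate at 1.0 and mirrored ones at -1.0, which is why the project can use the score for template comparison.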
V. RESULTS AND DISCUSSION
This paper presents an effective approach to
extracting text from a documented video. In this
approach, the video is converted into a set of
frames, from which a particular frame containing
the text content for searching is selected. The
text required for searching is extracted and
provided to Google automatically to get the related
documents and videos.
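Handing the extracted keyword to Google can be sketched in Python by building a standard /search?q= URL and opening it in the default browser; the helper names are illustrative, not the project's code.

```python
# Build a Google search URL for an extracted keyword and open it in the
# system's default web browser.

from urllib.parse import quote_plus
import webbrowser

def search_url(keyword):
    """URL-encode the keyword into Google's standard query form."""
    return "https://www.google.com/search?q=" + quote_plus(keyword)

def search(keyword):
    webbrowser.open(search_url(keyword))  # launches the default browser
```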
Figure 2: Graphical user interface of the main screen
Figure 2 shows the graphical user interface for
Robust Pictorial Mapping onto Text-Keyword Content
for Search Engine.
Figure 3: Selection of frame and text extraction
Figure 3 shows the selection of a particular frame
from the set of frames; the text is extracted from
the frame and stored in a PDF document.
Figure 4: Searching for the selected keyword
Figure 4 shows the extracted text being given as
input to the search engine automatically.
VI. CONCLUSION