DEVELOPMENT OF IMAGE TO SPEECH/PDF CONVERSION USING DIGITAL
IMAGE PROCESSING
©M. S. Ramaiah University of Applied Sciences
Outline
• Introduction
• Motivation (project concept and its relevance)
• Aims and Objectives
• Block diagram
• Methods and Methodology
• Project Concept
• Results
• Conclusions
• References
Title
Development of Image to Speech/PDF Conversion Using Digital
Image Processing
Aim
To develop an algorithm that converts an image into speech
by means of digital image processing, for use by blind
people.
Objectives
1. To conduct a literature survey of existing image-to-speech
conversion methodologies
2. To develop the image recognition algorithm
3. To develop the speech conversion algorithm for the input
image
4. To implement algorithms for extracting the desired
parameters
5. To integrate the subsystems, then test and evaluate the system
for effective functionality
Block Diagram
Overview of the system
INPUT IMAGE → IMAGE PREPROCESSING → IMAGE TO TEXT CONVERTER → TEXT TO AUDIO → AUDIO OUTPUT
Methods and Methodology

Objective 1
Statement of the objective: To conduct a literature review of various existing image-to-speech conversion modules.
Methods/methodology: Literature review of the different systems available for image-to-speech conversion for blind people, covering both hardware and software modules.
Resources required: Articles, journals, technical papers and patents.

Objective 2
Statement of the objective: To arrive at a functional block diagram and flow chart with subsystems.
Methods/methodology: Developing a software-based design for the proposed system with its functionality.

Objective 3
Statement of the objective: Extraction of the region of interest (the text region) from the input image.
Methods/methodology:
• Maximally Stable Extremal Regions (MSER): algorithm used to extract the text regions from the input image by varying the threshold value of the image.
• Stroke Width algorithm (SWT): implemented to increase the efficiency and reliability of the text regions extracted using the MSER algorithm.
Resources required: MATLAB R2021a.

Objective 4
Statement of the objective: Character extraction from the extracted text image.
Methods/methodology: Optical Character Recognition (OCR) technique, implemented to extract the features of each character image and classify the characters according to the pattern of the image.
Resources required: MATLAB R2021a.

Objective 5
Statement of the objective: Conversion of text to speech.
Methods/methodology: Speech synthesizer. Conversion of e-text to speech is incorporated using the Microsoft Speech API interface (Win32 SAPI).
Resources required: Microsoft SAPI, MATLAB R2021a.
Project Concept
1. Image pre-processing
• This step removes the noise present in the image, which reduces errors in the later stages.
• RGB images are converted to greyscale or black-and-white (binary) images.
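The pre-processing step can be illustrated with a minimal sketch. It is written in plain Python for readability rather than in the MATLAB used for the project, and it rests on assumptions the slides do not state: the noise filter is taken to be a 3x3 median filter, the greyscale conversion uses the common ITU-R BT.601 luma weights, and the binarisation level of 128 is arbitrary. All function names are hypothetical.

```python
from statistics import median

def rgb_to_gray(image):
    """Convert rows of (R, G, B) tuples (0-255) to grey levels (0-255).
    Weights are the BT.601 luma coefficients (an assumption)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in image]

def median_filter3(gray):
    """Replace each interior pixel by the median of its 3x3 neighbourhood.
    Border pixels are copied unchanged to keep the sketch short."""
    h, w = len(gray), len(gray[0])
    out = [row[:] for row in gray]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [gray[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out

def gray_to_binary(gray, level=128):
    """Map grey levels to 1 (white) at or above `level`, else 0 (black)."""
    return [[1 if v >= level else 0 for v in row] for row in gray]

# Tiny colour image: white, black, pure red, light grey.
rgb = [[(255, 255, 255), (0, 0, 0)],
       [(255, 0, 0), (200, 200, 200)]]
gray = rgb_to_gray(rgb)
binary = gray_to_binary(gray)

# A flat grey patch with one bright noise pixel in the middle.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
denoised = median_filter3(noisy)
```

A median filter is chosen here because it removes isolated speckles without blurring character edges as much as an averaging filter would; the project may have used a different filter.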
2. Maximally Stable Extremal Region
• MSER works by varying an intensity threshold over the image: for a given threshold value, the pixels below that value are white and all those equal to or above it are black.
• The MSER feature detector works well for finding text regions because the consistent colour and high contrast of text lead to stable intensity profiles.
• The first step in implementing MSER is to sweep the intensity threshold from black to white, performing a simple luminance threshold of the image at each step. The connected components, or extremal regions, are then extracted at each threshold.
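The threshold sweep and the extraction of connected components described above can be sketched as follows. This is an illustrative plain-Python version, not the project's MATLAB code; it follows the slide's convention that pixels below the threshold form the "white" set, and it assumes 4-connectivity. The function names and the sample thresholds are hypothetical.

```python
from collections import deque

def threshold_mask(gray, t):
    """True where the pixel is below t (the 'white' set in the slide's convention)."""
    return [[v < t for v in row] for row in gray]

def connected_components(mask):
    """Return the components of True pixels as lists of (row, col), 4-connected."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, comp = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                comp.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            comps.append(comp)
    return comps

def sweep(gray, thresholds):
    """Map each threshold to the sorted areas of the components found there."""
    return {t: sorted(len(c) for c in connected_components(threshold_mask(gray, t)))
            for t in thresholds}

# Two dark "strokes" (values 20 and 30) on a bright background (200).
gray = [
    [200, 200, 200, 200, 200],
    [200,  20,  20, 200,  30],
    [200,  20,  20, 200,  30],
    [200, 200, 200, 200, 200],
]
areas = sweep(gray, [10, 50, 150, 250])
```

In this toy image the two dark regions keep the same areas for every threshold between 50 and 150, which is the kind of stability that makes high-contrast text stand out under MSER.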
• Next, the thresholds at which an extremal region is maximally stable are found. In this project we have taken 3 as our threshold.
• Non-text regions are then removed based on basic geometric properties.
• Finally, the region descriptors are obtained as the MSER features.
• Although the MSER algorithm detects most of the text, it also detects several other stable regions in the image that are not text. The stroke width algorithm helps to solve this.
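To make "maximally stable" concrete, the sketch below uses the usual MSER stability measure: for a region whose area at threshold level i is A(i), the score q(i) = (A(i + Δ) - A(i - Δ)) / A(i) is computed, and the region is most stable where q is smallest. The slides give the value 3 without saying which parameter it refers to; it is used here as the step Δ purely as an assumption. The area sequence is invented for illustration, and the code is plain Python rather than the project's MATLAB.

```python
def stability(areas, delta):
    """q[i] = (areas[i + delta] - areas[i - delta]) / areas[i].
    Entries without both neighbours (or with zero area) are None."""
    n = len(areas)
    q = [None] * n
    for i in range(delta, n - delta):
        if areas[i] > 0:
            q[i] = (areas[i + delta] - areas[i - delta]) / areas[i]
    return q

def most_stable_index(q):
    """Index of the smallest defined stability score (lower means more stable)."""
    candidates = [i for i, v in enumerate(q) if v is not None]
    return min(candidates, key=lambda i: q[i])

# Hypothetical area of one growing region at successive threshold levels:
# it grows fast, plateaus around 50 pixels, then merges with the background.
areas = [3, 10, 30, 48, 50, 50, 51, 51, 52, 52, 53, 80, 150, 300]
q = stability(areas, delta=3)   # delta=3 is an assumed reading of the slide's value
best = most_stable_index(q)
```

The chosen index falls in the middle of the plateau, where the region's area barely changes as the threshold moves by Δ in either direction.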
3. Stroke Width Algorithm
• The first step is the stroke width transform (SWT), an operator that determines, for every pixel, the width of the most likely stroke containing that pixel.
• The output produced by the SWT is an image of the same size as the input image, in which each element contains the width of the stroke associated with that pixel.
• This gives a map of the most likely stroke width for each pixel in the original image.
• The next step is to group these pixels into letter candidates. This is done by grouping neighbouring pixels that have similar stroke widths and then applying several rules to distinguish the letter candidates.
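The idea of a per-pixel stroke-width map followed by grouping can be sketched as below. This is a simplified stand-in, not the stroke width transform itself: instead of following edge gradients, each foreground pixel is given the shorter of the horizontal and vertical runs of foreground pixels passing through it. Neighbouring pixels are then grouped when the ratio of their widths does not exceed `max_ratio`; the value 3.0 and all names are assumptions for illustration, written in plain Python rather than the project's MATLAB.

```python
from collections import deque

def run_lengths(line):
    """For a 1-D sequence of 0/1, give each 1 the length of the run it sits in."""
    out = [0] * len(line)
    i = 0
    while i < len(line):
        if line[i]:
            j = i
            while j < len(line) and line[j]:
                j += 1
            for k in range(i, j):
                out[k] = j - i
            i = j
        else:
            i += 1
    return out

def stroke_width_map(mask):
    """Same-size map: min(horizontal run, vertical run) per pixel, 0 for background."""
    h, w = len(mask), len(mask[0])
    horiz = [run_lengths(row) for row in mask]
    vert = [run_lengths([mask[y][x] for y in range(h)]) for x in range(w)]
    return [[min(horiz[y][x], vert[x][y]) for x in range(w)] for y in range(h)]

def group_similar(widths, max_ratio=3.0):
    """Group 4-connected pixels whose stroke widths differ by at most max_ratio."""
    h, w = len(widths), len(widths[0])
    seen = [[False] * w for _ in range(h)]
    groups = []
    for y in range(h):
        for x in range(w):
            if widths[y][x] == 0 or seen[y][x]:
                continue
            seen[y][x] = True
            queue, group = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                group.append((cy, cx))
                a = widths[cy][cx]
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and not seen[ny][nx]:
                        b = widths[ny][nx]
                        if b > 0 and max(a, b) / min(a, b) <= max_ratio:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            groups.append(group)
    return groups

# A 1-pixel-thick horizontal line touching a 6x6 solid block.
mask = [[1 if (1 <= y <= 6 and 5 <= x <= 10) or (y == 3 and x <= 4) else 0
         for x in range(12)] for y in range(8)]
widths = stroke_width_map(mask)
groups = group_similar(widths)
```

Although the thin line and the block are connected, they end up in separate groups because their stroke widths (1 and 6) differ by more than the allowed ratio. Further rules, such as limits on the variation of width within a group, would then be applied to keep only letter-like candidates.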
4. Optical Character Recognition
• Optical Character Recognition (OCR) allows the application to recognize characters automatically through an optical technique. It is the process of translating acquired images of typewritten or printed text into digitally editable information.
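As a toy illustration of classifying a character image by its pattern, the sketch below matches a small binary glyph against stored templates and picks the one with the fewest differing pixels. The slides do not say which classifier the project's OCR stage uses, so this nearest-template approach, the 3x5 glyphs and all names are assumptions chosen only to show the principle; a practical OCR engine is far more elaborate.

```python
# Hypothetical 3x5 binary templates for three letters ("1" = ink).
TEMPLATES = {
    "I": ["111",
          "010",
          "010",
          "010",
          "111"],
    "L": ["100",
          "100",
          "100",
          "100",
          "111"],
    "T": ["111",
          "010",
          "010",
          "010",
          "010"],
}

def hamming(a, b):
    """Number of pixel positions where two same-size glyphs differ."""
    return sum(ca != cb for ra, rb in zip(a, b) for ca, cb in zip(ra, rb))

def classify(glyph):
    """Return the template letter with the fewest differing pixels."""
    return min(TEMPLATES, key=lambda ch: hamming(glyph, TEMPLATES[ch]))

def recognise(glyphs):
    """Classify a left-to-right sequence of glyphs into a text string."""
    return "".join(classify(g) for g in glyphs)

# A "T" with one ink pixel missing, as might be left after noise removal.
noisy_t = ["111",
           "010",
           "010",
           "000",
           "010"]
text = recognise([TEMPLATES["L"], TEMPLATES["I"], noisy_t])
```

The damaged glyph is still read as "T" because it differs from the "T" template in one pixel but from the other templates in several.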
5. Text To Speech Conversion
• Win32 SAPI needs to be loaded on the computer; it converts text into speech. The desired voice and pace are set, which initializes the wave player used to convert the text into speech. Finally, the speech for the given image is obtained.
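A minimal sketch of driving SAPI from a script is given below. It is in Python rather than the project's MATLAB, and it assumes a Windows machine with the third-party `pywin32` package installed, which is why the import sits inside `speak`. `SAPI.SpVoice`, its `Rate` property and its `Speak` method belong to Microsoft's SAPI automation interface; the helper names `clamp_rate` and `speak` are hypothetical.

```python
def clamp_rate(rate):
    """SAPI speaking rates run from -10 (slowest) to 10 (fastest)."""
    return max(-10, min(10, int(rate)))

def speak(text, rate=0):
    """Speak `text` aloud through Microsoft SAPI.
    Assumes Windows with pywin32 available; not runnable elsewhere."""
    import win32com.client  # third-party, imported lazily on purpose
    voice = win32com.client.Dispatch("SAPI.SpVoice")
    voice.Rate = clamp_rate(rate)   # the "pace" mentioned on the slide
    voice.Speak(text)               # blocks until the utterance has finished

# Example call (Windows only):
# speak("Text recognised from the image.", rate=-1)
```

Selecting a particular voice is also possible through the same object, but it depends on which voices are installed on the machine, so it is left out of this sketch.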
Conclusions
• An approach to image-to-speech conversion using
optical character recognition and speech synthesis has
been attempted.
• The application developed is simple to use, very cost
effective, portable and applicable in real time.
• Tests have been conducted to check the conversion, and
good results have been achieved.