ROLE OF OBJECT IDENTIFICATION IN
SONIFICATION SYSTEM FOR
VISUALLY IMPAIRED
Presented By,
Ranjan Bangalore Seetharama
AGENDA

Introduction

Hardware of NAVI System

Object Identification

Stereo Sound Generation
INTRODUCTION
The Navigation Assistance for Visually Impaired (NAVI) System includes a

single board processing system (SBPS),

vision sensor mounted on headgear and

stereo earphones.
The vision sensor captures the scene in front of the blind user.
The captured image is processed to identify the objects in it.
Object identification is achieved by a real-time image processing
methodology using fuzzy algorithms.
FUZZY ALGORITHMS


Traditional logic has only two possible outcomes, true or
false. Fuzzy logic instead uses a graded scale with many
intermediate values, such as a number between 0.0 and 1.0.
(This resembles a probability, but the number expresses a degree
of truth or membership rather than a likelihood.)
A fuzzy algorithm uses fuzzy logic to operate on its
inputs and produce a result. Applications include control logic
(controlling engine speed, for instance, where it is
useful to have intermediate values between "full
speed" and "full stop") and edge detection in images.
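As a rough illustration of graded truth values, the sketch below assigns an engine speed (in percent of full speed) a degree of membership in three fuzzy sets. The set names, the breakpoints at 0, 50 and 100, and the function names are assumptions made for this example only; they are not taken from the NAVI work.

```python
def rising(x, low, high):
    """Membership that ramps linearly from 0.0 at `low` to 1.0 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def speed_memberships(speed_percent):
    """Degrees (0.0 to 1.0) to which a speed is 'slow', 'medium' and 'fast'.

    The breakpoints are hypothetical; the three degrees always sum to 1.0.
    """
    fast = rising(speed_percent, 50, 100)
    slow = 1.0 - rising(speed_percent, 0, 50)
    medium = 1.0 - slow - fast
    return {"slow": slow, "medium": medium, "fast": fast}
```

A speed of 25 percent is then half "slow" and half "medium", instead of being forced into one crisp category.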


The processed image is mapped onto stereo acoustic patterns and
transferred to the stereo earphones in the system.
The vOICe is one of the patented image sonification systems.
It uses a video camera as the vision sensor, with dedicated hardware
constructed for image-to-sound conversion. The captured image
is scanned from left to right, with a sine wave as the
sound generator. The top portion of the image is transformed
into high-frequency tones and the bottom portion into low-frequency
tones. The brightness of each pixel is transcoded into
loudness.



Background fills more area in the image frame than the objects,
so the sound produced from an unprocessed image will
contain more information about the background than about the objects.
Most of the background is also of light colors, so
the sound produced from it will be of higher amplitude
than the sound from the objects in the scene.
Object identification is achieved using a clustering
algorithm, and the identified objects are enhanced. More importance is
given to the objects in the environment than to the background
when producing sound. This enables the
blind user to identify obstacles more easily.
HARDWARE OF NAVI SYSTEM



Navigation Assistance for Visually Impaired (NAVI)
The hardware model constructed for this vision substitution
system has a headgear carrying the vision sensor and the
stereo earphones, and a Single Board Processing System
(SBPS) carried in a vest specially designed for this application.
The SBPS is placed in a pouch at the back of the vest.
Source: Fuzzy Learning Vector Quantization in Intelligent vision Recognition for Blind Navigation
By R Nagarajan, Yaacob and Sainarayanan
OBJECT IDENTIFICATION



A digital video camera mounted on the headgear captures the
scene in front of the blind user, and the
image is processed in the SBPS in real time.
The processed image is mapped to sound patterns.
Since the processing is done in real time, the computation time
has to be critically considered.
OBJECT IDENTIFICATION



In the proposed vision substitution system, the nature of the
object to be identified is undefined, uncertain and
time-varying.
Among the important features the blind user needs from
the image of the environment are the orientation and
size of objects and obstacles.
During sonification, the amplitude of the sound generated
from the image depends directly on the pixel intensity.
In an 8-bit gray image, white has the maximum pixel value
of 255 and black the minimum of 0.




Light-colored pixels therefore produce sound of higher
amplitude than darker pixels.
If the image is converted to sound without any enhancement,
the sound is difficult to understand, which was
the major problem faced in early works.
The main objective of this work is to suppress the
background and to enhance the object; for this, the gray levels
of the object and background have to be identified.
The image used for processing is 32x32 pixels in size and has four gray
levels, namely black (BL), dark gray (DG), light gray (LG) and
white (WH).
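The reduction to four gray levels can be sketched as follows. The slides do not say how the four levels are chosen, so the equal-width bins used here (0 to 63, 64 to 127, 128 to 191, 192 to 255) and the function names are assumptions for illustration.

```python
LEVELS = ("BL", "DG", "LG", "WH")  # black, dark gray, light gray, white


def quantize(pixel):
    """Map an 8-bit gray value (0..255) to one of the four level labels.

    Equal-width bins are an assumption; the source does not give thresholds.
    """
    return LEVELS[min(pixel // 64, 3)]


def quantize_image(image):
    """Apply `quantize` to every pixel of an image given as a list of rows."""
    return [[quantize(p) for p in row] for row in image]
```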


Feature extraction is the most critical part in image processing.
The extracted features should represent the image with limited
data.

In this work each image will have four feature vectors, namely

XBL = [X1, X2, X3, X4],

XDG = [X1, X2, X3, X4],

XLG = [X1, X2, X3, X4],

XWH = [X1, X2, X3, X4]




X1 = the number of pixels of the respective gray level in the
image, i.e. the histogram value for that gray level.
X2 = the number of pixels of the respective gray level in the
central area of the image. Generally the object of interest will
be in the center of human vision.
X3 = the pixel distribution gradient. X3 is calculated as
the sum of the gradient values assigned to the pixel locations.
X4 = the gray value of the pixel. Generally most of the
background in the real world is of lighter color than the objects.
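A minimal sketch of how the four features might be computed for one gray level is given below. Several details are not specified in the slides and are assumed here: the central area is taken as the middle half of the image in each direction, the gradient value of a location grows from 1 at the border toward the center, and each level is given a representative gray value. All names are hypothetical.

```python
# Assumed representative gray values for the four levels.
GRAY_VALUE = {"BL": 0, "DG": 85, "LG": 170, "WH": 255}


def location_weight(i, j, n):
    """Hypothetical gradient value: 1 at the border, larger toward the center."""
    return min(i, j, n - 1 - i, n - 1 - j) + 1


def feature_vector(labels, level):
    """Return [X1, X2, X3, X4] for one gray level of a square label image."""
    n = len(labels)
    lo, hi = n // 4, 3 * n // 4  # assumed bounds of the central area
    x1 = x2 = x3 = 0
    for i, row in enumerate(labels):
        for j, label in enumerate(row):
            if label != level:
                continue
            x1 += 1                          # histogram count
            if lo <= i < hi and lo <= j < hi:
                x2 += 1                      # count inside the central area
            x3 += location_weight(i, j, n)   # sum of location gradient values
    return [x1, x2, x3, GRAY_VALUE[level]]
```

With these assumptions, a gray level concentrated near the center gets a large X2 and X3 relative to X1, which is the cue used to separate object levels from background levels.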
FLVQ – FUZZY LEARNING VECTOR
QUANTIZATION





Artificial Neural Networks (ANN) play a major role in
pattern classification.
An ANN has the ability to learn and is fault tolerant, which makes it
a powerful tool for pattern recognition.
One form of ANN is the Learning Vector Quantization (LVQ) network.
The objective of the LVQ network is to identify the output node
whose weight vector is nearest to the input vector.
The weights are updated by competitive learning.
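The competitive learning step can be sketched with the basic (non-fuzzy) LVQ1 rule: the nearest output node wins, and its weights move toward the input if its class matches and away otherwise. This is the plain LVQ update, not the fuzzy variant used in the cited work; the data layout, the function names and the learning rate are assumptions.

```python
def nearest(prototypes, x):
    """Index of the prototype whose weight vector is closest to x."""
    def squared_distance(weights):
        return sum((w - xi) ** 2 for w, xi in zip(weights, x))
    return min(range(len(prototypes)),
               key=lambda k: squared_distance(prototypes[k][0]))


def lvq1_step(prototypes, x, label, rate=0.1):
    """One LVQ1 update; `prototypes` is a list of (weights, class_label) pairs.

    Only the winning node is updated (competitive learning).
    Returns the index of the winner.
    """
    k = nearest(prototypes, x)
    weights, proto_label = prototypes[k]
    sign = 1.0 if proto_label == label else -1.0  # attract or repel
    new_weights = [w + sign * rate * (xi - w) for w, xi in zip(weights, x)]
    prototypes[k] = (new_weights, proto_label)
    return k
```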
FLVQ – FUZZY LEARNING VECTOR
QUANTIZATION



Let Go be the gray level classified to the object class by the FLVQ network,
Gb be the gray level classified to the background class by the FLVQ
network, and
I be the preprocessed image.
For i, j = 1, 2, …, 32
    if I(i,j) == Go
        then I(i,j) = K1
    else if I(i,j) == Gb
        then I(i,j) = K2          (1)
End
I2 = I

where K1 and K2 are chosen scalar constants with K1 >> K2.
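The background suppression rule of equation (1) can be sketched as below. The default values of K1 and K2 are placeholders chosen only so that K1 is much larger than K2; the slides do not state the values used. Pixels that match neither gray level are left unchanged, as in the rule above.

```python
def suppress_background(image, object_level, background_level, k1=255, k2=10):
    """Return a copy of `image` with object pixels set to k1 and
    background pixels set to k2 (k1 >> k2). Defaults are hypothetical.
    """
    return [
        [
            k1 if p == object_level
            else (k2 if p == background_level else p)
            for p in row
        ]
        for row in image
    ]
```

Because the sonification loudness follows the pixel value, object pixels now dominate the sound and background pixels are nearly silent.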
SUPERIMPOSING AND NORMALIZATION

Use any edge detection algorithm to detect the edges in image I. Let
the image of edges be I1.

Let I2 be the background-suppressed image of the previous stage.

I1 and I2 are superimposed to form a single image matrix, which is then normalized.

Thus, we have a normalized image which is background
suppressed, object enhanced and edge predominated.
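A minimal sketch of this stage follows. The slides leave the edge detector open and do not define "superimposed" or "normalized" precisely, so three assumptions are made here: a crude neighbour-difference edge detector stands in for "any edge detection algorithm", superimposing is done by pixel-wise addition, and normalization divides by the maximum value.

```python
def simple_edges(image):
    """Crude edge map: absolute difference with the right and lower neighbours.

    A stand-in for any edge detector; not the one used in the cited work.
    """
    rows, cols = len(image), len(image[0])
    edges = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            right = image[i][j + 1] if j + 1 < cols else image[i][j]
            below = image[i + 1][j] if i + 1 < rows else image[i][j]
            edges[i][j] = abs(image[i][j] - right) + abs(image[i][j] - below)
    return edges


def superimpose_and_normalize(i1, i2):
    """Add the edge image i1 to the suppressed image i2, then scale to 0..1."""
    summed = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(i1, i2)]
    peak = max(max(row) for row in summed)
    if peak == 0:
        return [[0.0 for _ in row] for row in summed]
    return [[v / peak for v in row] for row in summed]
```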
Source: Fuzzy Learning Vector Quantization in Intelligent vision Recognition for Blind Navigation
By R Nagarajan, Yaacob and Sainarayanan
SONIFICATION




Sonification is defined as the transformation of data relations into
perceived relations in an acoustic signal for the purpose of
facilitating communication or interpretation.
The human auditory system can sense frequencies from 20 Hz to
20,000 Hz.
From the literature and from experiments, the auditory system is observed
to be most sensitive to frequencies from 20 Hz to 4000 Hz.
This range is adopted in the proposed sonification module.
SONIFICATION


In order to create variations in pitch in the sonification
module, the pixel position in a column of the image pattern
is made to be inversely related to the frequency of the sine
wave, so that the top rows produce the highest frequencies.
The loudness is made to depend directly on the pixel value
of the processed image.
SONIFICATION


The processed image is sonified to stereo acoustic patterns.
This is done by mapping the image so that the image data
corresponding to the left side of the blind user is
transferred to the left earphone and the image data of the
right half to the right earphone.
SONIFICATION

Let f_o be the fundamental frequency of the sound generator,

G be a constant gain, and


F_D be the frequency difference between adjacent pixels in the vertical
direction.
The frequency corresponding to the (i,j)th pixel of the
32x32 image matrix is given by

$$f_i = f_o + F_D, \qquad F_D = G f_o (32 - i), \quad i = 1, 2, \ldots, 32$$
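As a worked example of the frequency formula, the sketch below evaluates the row frequency for hypothetical values of the fundamental frequency (100 Hz) and the gain (1.0); the slides do not give the values actually used. With these values the bottom row (i = 32) maps to 100 Hz and the top row (i = 1) to 3200 Hz, which stays inside the 20 Hz to 4000 Hz range, and adjacent rows differ by G times the fundamental.

```python
def row_frequency(i, f0=100.0, gain=1.0, rows=32):
    """Frequency in Hz for image row i (1 = top, rows = bottom).

    Implements f_i = f0 + gain * f0 * (rows - i); f0 and gain are assumed values.
    """
    if not 1 <= i <= rows:
        raise ValueError("row index out of range")
    return f0 + gain * f0 * (rows - i)
```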
SONIFICATION

The generated sound pattern is hence given by

$$S(j) = \sum_{i=1}^{32} I(i,j)\,\sin(\omega(i)\,t), \qquad j = 1, 2, \ldots, 32$$

where S(j) is the sound pattern for column j of the image,

t = 0 to D, where D depends on the total duration of the acoustic
information for each column of the image, and


\omega(i) = 2\pi f_i, where f_i is the frequency corresponding to row i.
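A minimal sketch of the column sound pattern: one sine wave per row, weighted by the pixel value and summed, then sampled over the column duration. The sample rate, the duration per column, the fundamental frequency and the gain are all assumed values; the row frequency follows f_i = f0 + gain * f0 * (n - i) for an image with n rows.

```python
import math


def column_sound(image, j, f0=100.0, gain=1.0, duration=0.05, sample_rate=8000):
    """Samples of the sound pattern for column j (0-based) of a square image.

    Each row i contributes pixel_value * sin(2*pi*f_i*t). All numeric
    defaults are hypothetical.
    """
    n = len(image)
    frequencies = [f0 + gain * f0 * (n - i) for i in range(1, n + 1)]
    samples = []
    for k in range(round(duration * sample_rate)):
        t = k / sample_rate
        samples.append(sum(
            image[i][j] * math.sin(2 * math.pi * frequencies[i] * t)
            for i in range(n)
        ))
    return samples
```

With the assumed defaults the highest row frequency is 3200 Hz, below half the 8000 Hz sample rate, so the tones are represented without aliasing.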
SONIFICATION




The sine wave of the designed frequency is multiplied by the
gray value of each pixel of a column, and the products are summed
to produce the sound pattern for that column.
The scanning is performed from the leftmost column towards the
center and from the rightmost column towards the center.
The sound pattern for the left earphone is SL = S(1) to S(n/2), appended
in order starting from the left side.
The sound pattern for the right earphone is SR = S(n) to S(n/2),
appended in order starting from the right side,
where n is the total number of columns. In our case n = 32.
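The stereo scan order can be sketched as follows, given one list of samples per image column. One assumption is made explicit: so that both channels have the same length, the right channel here runs from S(n) down to S(n/2 + 1), leaving S(n/2) to the left channel only. The function name is hypothetical.

```python
def stereo_patterns(column_sounds):
    """Build (left, right) sample lists from per-column sound patterns.

    Left:  columns 1 .. n/2, scanned from the left edge toward the center.
    Right: columns n .. n/2 + 1, scanned from the right edge toward the center
           (assumed split so that both channels are equally long).
    """
    half = len(column_sounds) // 2
    left = [s for column in column_sounds[:half] for s in column]
    right = [s for column in reversed(column_sounds[half:]) for s in column]
    return left, right
```

Played together, the two channels sweep inward at the same time, so an object left of center is heard in the left earphone and one right of center in the right earphone.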
FUTURE WORK


In this research, information regarding the depth of
the object is not considered.
An object is 'perceived' as bigger through the
variation in the sound pattern as the blind user moves
nearer to the object.
REFERENCES



Fuzzy Learning Vector Quantization in Intelligent vision
Recognition for Blind Navigation
By R Nagarajan, Yaacob and Sainarayanan
Role of Object Identification in Sonification System for
Visually impaired
By R Nagarajan, Yaacob and Sainarayanan
http://en.wikipedia.org/wiki/Fuzzy_clustering