A Simple and Efficient Approach to Barcode Localization

advertisement
A Simple and Efficient Approach to
Barcode Localization
Aliasgar Kutiyanawala and Xiaojun Qi
Jiandong Tian
Computer Science Department
Utah State University
Logan, UT 84322-4205
aliasgar_cc_usu@yahoo.com
Xiaojun.Qi@usu.edu
State Key Laboratory of Robotics
Shenyang Institute of Automation
Chinese Academy of Sciences
Shenyang, P.R. China, 110016
tianjd@sia.cn
Abstract— In this paper, we propose a simple and efficient
approach to localizing the barcode regions in an image. We first
apply the multichannel Gabor filtering technique to extract eight
directional texture features. We then apply a randomized
hierarchical search strategy to quickly find a sufficient number of
pairs of line segments, which have high frequency and high
similarity measures. We finally employ the histogram analysis
technique on the start and end points of each qualified pair of
line segments to localize the barcode regions. Our extensive
experimental results show that the proposed scheme outperforms
the two peer systems and can successfully localize the barcode
regions in an image with a precision of 96% and a recall of 86%.
In addition, the proposed system can be easily ported to a cell
phone to improve the ShopTalk system to aid the blind to
successfully retrieve common grocery products.
localize the barcode, recognize the barcode, and estimate the
location of the shopper within the store. Specifically, the
locations of all shelf barcodes in the store can be used to guide
the visually impaired shopper to the most probably location of
the target product on the shelf, because products are normally
located directly above their corresponding shelf barcodes. The
barcodes on the individual products can be used to inform the
visually impaired shopper about the product information such
as the brand and the price. The images captured by the cell
phone may contain text and graphics together with the
barcodes or without any barcode. As a result, identifying the
existence of a barcode and the location of a barcode in an
image is the first step to reading the barcode. In this paper, we
propose a simple and efficient technique to recognize the
Index Terms— Barcode localization, multichannel Gabor
filters, longest common subsequence, randomized hierarchical
search strategy
I.
INTRODUCTION
Barcodes have been widely used in many fields for
applications of great commercial value. Specifically, many
large- and medium-sized grocery stores use barcodes on
shelves to assist the store personnel in managing the product
inventory. That is, these inventory systems place barcodes on
the shelves immediately beneath every product area.
Nicholson and Kulyukin [1] presented ShopTalk, a wearable
small-scale system as shown in Figure 1 that enables a visually
impaired shopper to successfully retrieve common grocery
products using verbal route direction and barcode scans. In
ShopTalk, the visually impaired shopper will use the off-theshelf laser based barcode reader to read the barcodes from the
shelves and transmit data to a computing device for estimating
his location within the store. This requires the visually
impaired person carrying at least two separate pieces of
hardware, along with the inconvenience of maintaining them.
Recently, the availability of cellular phones with a built-in
camera has provided visually impaired people a mobile
platform for localizing and decoding barcodes. This recent
advance in cell phone technology can make the ShopTalk
system miniaturized to a cell phone where the camera of the
cell phone can function as a barcode reader and the processor
of the cell phone can be used as the computing device to
978-1-4244-4657-5/09/$25.00 ©2009 IEEE
Figure 1. ShopTalk: a wearable shopping system for the blind.
position and orientation of the barcode in an image.
II.
RELATED WORK
The barcode localization methods can be roughly classified
into two categories: spatial domain-based methods and
frequency domain-based methods. We will briefly review
some representative techniques in each category.
Spatial domain-based methods: Some algorithms [2, 3]
search for groups of lines with mono-oriented gradients to
locate the barcode region. However, they cannot achieve
adequate localization results in cluttered images. Ando [4-6]
gives promising results for local barcode searches in cluttered
3-D scenes by extracting connected regions with monooriented texture. However, this extraction becomes invalid
when the barcode region is large since it is regarded as
background. A Hough transformation-based global method [7]
ICICS 2009
extracts barcode lines due to its robustness against orientation
and size. However, it does not work well on images with bad
illuminations. Moreover, extra steps need to be performed to
get local information if a barcode is a small part of the image.
Other algorithms use morphological filters [8] or self-study
networks [9] to locate barcodes. However, they are capable of
locating barcode regions only under certain conditions and are
typically very time-consuming.
Frequency domain-based methods:
The following
frequency-based transformations, i.e., Fourier transform,
Gabor filtering, and wavelet transform, have been effectively
used for barcode localization. Jain and Chen [10] propose a
multichannel Gabor filtering technique together with either the
supervised one-layer feed forward neural network learning or
the unsupervised clustering technique to localize barcodes.
Their techniques are robust to variations in orientation, scale,
and shift of the barcode. Bai, Zhang, Wu, and Chen [11]
propose a Fourier transformed multichannel Gabor filtering
technique together with the back propagation network to
ensure rotation invariance in barcode localization. Oktem [12]
propose a wavelet transform based technique together with the
binary morphological operations to localize barcodes. All
these methods utilize the frequency domain-based features
together with the complicated learning methods or the
complicated morphological operations to achieve the effective
barcode localization results. However, the performance of
these learning methods highly depends on the chosen training
images. Furthermore, the learned topological structure needs
to be saved and used for localizing the barcode regions in an
image. This additional storage required for the learned
topological structure may not fit into the limited memory and
storage resources available on the cell phone. On the other
hand, the morphological operations may require a series of
dilation and erosion operations, which may drain up the
limited computational power available on the cell phone. As a
result, all these frequency domain-based method may not be
successfully ported on a cell phone.
In this paper, we propose a simple and efficient approach
to localizing the barcode regions in an image. We first apply
the multichannel Gabor filtering technique to extract eight
directional texture features. We then apply a randomized
hierarchical search strategy to quickly find a sufficient number
of pairs of line segments, which have high frequency and high
similarity measures. We finally employ the histogram analysis
technique on the start and end points of each qualified pair of
line segments to localize the barcode regions. Currently, the
entire algorithm is implemented on a personal computer but it
can be ported on a cell phone without much difficulty due to
its little memory and CPU consumption requirement. The
remainder of the paper is organized as follows: Section III
presents the proposed barcode localization technique. Section
IV shows the effectiveness of the proposed technique via
extensive experiments to compare with two peer systems.
Section V draws the conclusions and presents the future work.
III.
THE PROPOSED APPROACH
Figure 2 illustrates the block diagram of our proposed
barcode localization system. In the following subsections, we
explain each of the three blocks in detail.
Extract Gabor
Filter Based
Features
Compute Frequency
and Similarity
Measures
Estimate the
Histogram-Based
Bounding Box
Figure 2. The block diagram of the proposed system.
A. Extract Gabor Filter Based Features
Gabor filters can serve as excellent band-pass filters for
uni-dimensional signals. They are linear filters that can be
defined as a product of a Gaussian signal with a complex
sinusoid. They are defined as:
⎛ x ' 2 +γ 2 y ' 2 ⎞ ⎛
x'
⎞
⎟⎟ cos⎜ 2π + ψ ⎟ (1)
g ( x, y; λ , θ ,ψ , σ , γ ) = exp⎜⎜ −
2
2σ
λ
⎠
⎝
⎠ ⎝
where x' = x cos θ + y sin θ and y' = − x sin θ + y cos θ .
Here, λ represents the wavelength of the cosine factor, θ
represents the orientation of the normal to the parallel stripes
of a Gabor function, ψ is the phase offset, σ is the sigma of the
Gaussian envelop, and γ is the spatial aspect ratio and specifies
the ellipticity of the support of the Gabor function. In our
proposed system, we use σ=5, γ=1, ψ=0, and λ=42.25 as our
constants. We use eight orientations, i.e., θ = {0˚, 22.5˚, 45˚,
67.5˚, 90˚, 112.5˚, 135˚, 157.5˚}, to generate a total of eight
channels in the filter bank to find the orientation of the
barcode with respect to the image.
For each input image A, we first apply the canny edge
detector to get its edge image B. We then apply each of the
eight Gabor filters on the edge image B and rotate its Gabor
filtered result by a certain angle, which is consistent with the
θ’s used in the corresponding Gabor filter. That is, the Gabor
filtered image is rotated in a clockwise direction by 45˚ for the
Gabor filter of θ = 45˚. This rotation operation ensures that all
the strong responses align along the y-axis (i.e., have the
vertical orientation). We finally convert each of the eight
rotated Gabor filtered images to a binary image by applying a
simple thresholding approach. In our system, we set the
threshold as 0.9. That is, any Gabor filtered responses whose
values are bigger than 0.9 will be set as 1’s. The other smaller
responses will be set as 0’s.
Figure 3 illustrates the binarized Gabor filtering results for
an image, which contains a vertical barcode at the right corner.
It clearly shows that the Gabor filtered image with θ=0˚
achieves the highest responses. That is, the responses are
more prominent for barcode lines that are oriented at an angle
of 90˚ along the x-axis (i.e., θ=0˚ is the orientation of its
normal) and less prominent for other elements.
(a) Original image
scaling mechanism is able to handle barcodes of various sizes.
(b) Gabor result with θ=0˚
(d) Gabor result with θ=45˚
(f) Gabor result with θ=90˚
(c) Gabor result with θ=22.5˚
(e) Gabor result with θ=67.5˚
(g) Gabor result with θ=112.5˚
(h) Gabor result with θ=135˚
(i) Gabor result with θ=157.5˚
Figure 3. Eight Gabor filter results.
Figure 4 illustrates the binarized Gabor filtering results
around the expected angles for another image, which contains
a tilted (around 135˚) barcode. It clearly shows that the Gabor
filtered image with θ=45˚ achieves the highest response. The
clockwise rotation of 45˚ will make the tilted barcode have a
vertical orientation.
(a) Original image
(b) Gabor result with θ=67.5˚
(c) Gabor result with θ=45˚
(d) Gabor result with θ=22.5˚
Figure 4. Gabor filter results around the expected angles.
B. Compute Frequency and Similarity Measures
For each rotated and binarized Gabor filtered image, we
use a randomized hierarchical search strategy to quickly find
2,000 horizontal segments of the same lengths at each of the
eight scales. This search strategy guarantees that most
horizontal segments will be quickly searched starting from the
smallest possible length (i.e., the width of the image divided
by 10) at scale 1 to the largest possible length (i.e., the
smallest possible length multiplied by 8) at scale 8. This
For each horizontal segment, we then compute its
frequency measure. This frequency measure counts the
number of changes from 0’s to 1’s and from 1’s to 0’s. For
example, given the following segment of 1010001010, the
frequency measure is computed as 7. If the frequency measure
of a line segment is higher than a predefined threshold, we
consider this line segment as high frequency segment. We
experimentally decide the threshold to be 25.
For each high frequency line segment, we pick out a line
segment of the same length, which locates at 20 pixels below.
We then apply the LCS (Longest Common Subsequence)
algorithm [13] to find a common subsequence of two line
segments which has the maximum possible length. The
similarity between these two line segments is then computed
as the ratio of the length of the LCS to the length of the
original line segment. For example, the LCS of two line
segments, 1010001010 and 0101000100, is 10100010. The
similarity between the above two line segments is computed as
ratio of the length of 10100010 (i.e., 8) to the length of
1010001010 (i.e., 10). That is, the similarity is 80%. In our
system, we consider any pair of high frequency line segments
with similarity higher than 90% as candidate barcode regions.
This choice is mainly based on the four observations as
illustrated in Figure 5:
•
Case I (Red line segment pair): Both upper and
lower lines have a high frequency measure. The
similarity between these two lines is high.
•
Case II (Yellow line segment pair): Both upper and
lower lines have a high frequency measure. The
similarity between these two lines is high.
•
Case III (Blue line segment pair): The upper blue
line has a low frequency measure and the lower blue
line has a high frequency measure. The similarity
measure between these two lines is low.
•
Case IV (Green line segment pair): Both upper and
lower lines have a low frequency measure. The
similarity between these two lines is high.
In order to quickly find the potential barcode region, we
terminate the randomized hierarchical search once 50 pairs of
line segments with high frequency and high similarity
measures (i.e., 50 pairs of line segments satisfying the first two
conditions) are found.
That is, we claim that the
corresponding rotated and binarized Gabor filtered image
likely contains barcodes. All the 50 pairs will be passed to the
last step for barcode localization. On the other hand, we claim
that the rotated and binarized Gabor filtered image does not
contain barcodes at the corresponding angle, if there are fewer
than 50 pairs of line segments with high frequency and high
similarity measures after searching the horizontal segments at
all eight scales.
For the two images shown in Figure 3(a) and Figure 4(a),
50 pairs of line segments with high frequency and high
similarity measures are found in the binarized Gabor filtered
image shown in Figure 3(b) and Figure 4(c), respectively. The
other binarized and rotated Gabor filtered images yield fewer
than 50 pairs. Therefore, the start and end points of all the 50
pairs resulting from the correct Gabor filtered images will be
passed to the last step.
Figure 5. Illustration of four cases of line segment pairs.
C. Estimate the Histogram-Based Bounding Box
For each of 50 pairs of line segments obtained in a
binarized and rotated Gabor filtered image, we compute its
histogram along x-axis and y-axis, respectively. Specifically,
we use the x-coordinates of the start and end points of all pairs
as the input to compute the number of occurrences of each
input value. Similarly, we use the y-coordinates of the start
and end points of all pairs as the input to compute the number
of occurrences of each input value. If an image contains a
barcode, most of the lines will coincide with the barcode and
hence we expect the histograms along the x-axis and y-axis to
have a single peak whose amplitude is greater than a threshold.
If an image does not contain a barcode, we expect the
histograms along both x-axis and y-axis to be flat or to have
peaks whose amplitudes are smaller than a threshold. In our
proposed system, we consider the histogram contains a peak if
the occurrence of a certain value is higher than 30% of the
total number of occurrences (i.e., 50). Depending on the type
of histogram observed, we conclude whether an image
contains a barcode or not. The position of the barcodes is
obtained by analyzing the start and end points of the peak of
both histograms. Specifically, if the width of the histogram is
w, the barcode is assumed to lie between 0.02w and 0.98w.
The orientation of the barcode is the angle by which the image
is rotated (i.e., the corresponding θ value used in the Gabor
filtering operation).
Figure 6 demonstrates the histograms along both x-axis
and y-axis of all 50 pairs of line segments found for images
shown in Figure 3(a) and Figure 4(a). It also shows the
localized barcode region and the threshold lines on top of the
histograms.
Figure 6. Illustration of histogram-based bounding box results of the images
shown in Figure 3(a) and Figure 4(a).
IV.
EXPERIMENTAL RESULTS
We implemented the current system using Java on Pentium
IV Dual CPU at 3.0GHz PC running Windows XP operating
system. On average, it takes 4.05 seconds to localize the
barcode on images of size 300×300; it takes 5.29 seconds to
localize the barcode on images of size 400×400; and it takes
6.97 seconds to localize the barcode on images of size
500×500. The complexity for extracting Gabor filter based
features is O(nlogn), where n is the dimension (width or
length) of an image. The complexity for computing frequency
and similarity measures is O(Len2), where Len is the length of
the line segment.
The complexity for estimating the
histogram-based bounding box is O(Len).
To evaluate the performance of the proposed barcode
localization approach, we conducted experiments on 35
images taken in a real shopping trip using a cell phone camera.
We also compared the proposed approach with the two
approaches (e.g., one-layer feed forward neural network-based
supervised approach and clustering-based unsupervised
approach) proposed by Jain and Chen [10] using the same 35
images. We chose these two approaches for comparison
mainly because they also used the Gabor filters to obtain the
features. Table I compares the performance of the three
systems in terms of the TP (True Positive), TN (True
Negative), FP (False Positive), FN (False Negative), precision
(i.e., the ratio of TP/(TP+FP)), and recall (i.e., the ratio of
TP/(FN+TP)). It clearly shows that our proposed system
achieves the highest TP and TN, the best precision and recall,
and the lowest FP and FN. That is, our proposed system
achieves the best performance in successfully localizing the
barcode regions in an image.
TABLE I.
Methods
Ours
Unsupervised
Supervised
TP
24
19
17
COMPARISON OF THREE ALGORITHMS
TN
6
4
6
FP
1
3
1
FN
4
9
11
Precision
0.96
0.86
0.94
Recall
0.86
0.68
0.61
Figure 7 demonstrates our barcode localization results on
several images. Figure 7(a) shows four representative images
which contain barcodes of different sizes. Our system
successfully identifies and localizes all the barcodes inside
these four images. Figure 7(b) shows a representative image
which does not contain a barcode. Our system successfully
claims that barcodes do not exist in this image. Figure 7(c)
shows a representative image containing a blurred barcode and
its corresponding binarized and rotated Gabor filtered result.
Since it is impossible to read this blurry barcode, we consider
the qualified barcode does not exist in the image. The Gabor
filtering result clearly shows that the vertical lines in the
barcode are successfully detected. However, the barcode lines
are broken at a lot of places due to the blurred effect. As a
result, the system identifies the barcode but fails to localize the
barcode due to an insufficient number of pairs of line segment
with high frequency and high similarity measures at the
barcode portion. Figure 7(d) shows a representative image
containing a barcode and its corresponding binarized and
rotated Gabor filtered result. It clearly shows that the Gabor
filter successfully detects the barcode lines and other lines that
are perpendicular to the barcode and have high frequency and
similarity measures. As a result, our system identifies the
barcode and fails to precisely localize the barcode.
points of each qualified pair of line segments to
quickly estimate the location of the barcode.
We will develop a gradient-based method in the spatial
domain to replace the Gabor filters to speed up the localization
process. We will also experiment on a large set of varied
images to improve the robustness of our approach in localizing
the barcodes under different conditions. To further reduce the
false negative rate, we will adaptively decide the threshold for
computing the similarity measure used in the second step and
increase the number of pairs selected for the third step.
REFERENCES
[1]
(a) True positive cases
[2]
[3]
[4]
(b) True negative case
(c) False positive case
[5]
[6]
[7]
(d) False negative case
Figure 7. Illustration of representative images for each case.
V.
CONCLUSIONS
In this paper, we propose a simple and efficient barcode
localization approach to recognizing the position and
orientation of the barcode in an image. Our approach can be
easily ported on a cell phone to aid the blind to shop common
grocery products. The major contributions are:
•
Apply an eight-channel Gabor filter to find the
possible orientation of the barcode with respect to the
image.
•
Apply the randomized hierarchical search strategy to
quickly find a sufficient number of pairs of line
segments which have high frequency and high
similarity measures.
•
Apply the histogram analysis on the start and end
[8]
[9]
[10]
[11]
[12]
[13]
J. Nicholson and V. Kulyukin, “ShopTalk: Independent Blind Shopping
= Verbal Route Directions + Barcode Scans,” Proc. of the Rehabilitation
Engineering and Assistive Technology Society of North America
(RESNA) Conference. Phoenix, AZ, June 2007. Platform-presentation
and paper.
N. Normand, and C. Viard-Gaudin, “A Two-Dimensional Bar Code
Reader,” Pattern Recognition, Vol. 3, pp. 201-203, 1994.
C. Viard-Gaudin, N. Normand, and D. Barba, “Algorithm Using a TwoDimensional Approach,” Proc. of the Second Int. Conf. on Document
Analysis and Recognition, No. 20-22, pp. 45-48, October 1993.
S. Ando, and H. Hontanj, “Automatic Visual Searching and Reading of
Barcodes in 3-D Scene,” Proc. of the IEEE Int. Conf. on Vehicle
Electronics, No. 25-28, pp. 49-54, September 2001.
S. Ando, “Image Field Categorization and Edge/Corner Detection from
Gradient Covariance,” IEEE Trans. on Pattern Analysis and Machine
Intelligence, Vol. 22, No. 2, pp. 179-190, February 2000.
S. Ando, “Consistent Gradient Operators,” IEEE Trans. on Pattern
Analysis and Machine Intelligence, Vol. 22, No. 3, pp. 252-265, March
2000.
R. Muniz, L. Junco, and A. Otero, “A Robust Software Barcode Reader
using the Hough Transform,” Proc. of Int. Conf. on Information
Intelligence and Systems, No. 31, pp. 313-319, November 1999.
S. Arnould, G.J. Awcock, and R. Thomas, “Remote Bar-code
Localization Using Mathematical Morphology,” Image Processing and
its Applications, Vol. 2, No. 465, pp. 642-646, 1999.
S. J. Liu, H. Y. Liao, L. H. Chen, H. R. Tyan, and J. W. Hsieh, “CameraBased Barcode Recognition System Using Neural Net,” Proc. of Int.
Joint Conf. on Neural Networks, Vol. 2, pp. 1301-1305, 1993.
A. Jain and Y. Chen, “Bar Code Localization Using Texture Analysis,”
Proc. of the 2nd Int. Conf. on Document Analysis and Recognition, pp.
41-44, 1993.
Z. Bai, Z. Yang, J. Wu, and Y. Chen, “Region Localization Based on
Rotational Invariant Feature and Improved Self Organized Map,” Proc.
of the 3rd Int. Conf. on Intelligent System and Knowledge Engineering,
pp. 703-706, 2008.
R. Oktem, “Bar Code Localization in Wavelet Domain by Using Binary
Morphology,” Proc. of the 13th IEEE Conf. on Signal Processing and
Communications Applications, pp. 499-501, 2004.
D. Hirschberg, “Algorithms for the Longest Common Subsequence
Problem,” Journal of the Association for Computing Machinery, Vol. 24,
No. 4, pp. 664-675, 1977.
Download