EXTENDED MULTI-STRUCTURE LOCAL BINARY PATTERN FOR HIGH-RESOLUTION IMAGE SCENE CLASSIFICATION
Xiaoyong Bian1,2, Chen Chen3, Qian Du4, Yuxia Sheng5
1 School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, 430065, China
2 Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System
3 Department of Electrical Engineering, University of Texas at Dallas, Richardson, TX 75080 USA
4 Department of Electrical and Computer Engineering, Mississippi State University, Mississippi State, MS 39762 USA
5 School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan, 430081 China
ABSTRACT
This paper presents a novel extended multi-structure local
binary pattern (EMSLBP) approach for high-resolution
image classification, generalizing the well-known local
binary pattern (LBP) approach. In the proposed EMSLBP
approach, three-coupled descriptors with multi-structure
sampling are proposed to extract complementary features
(pixel value and radial difference) from local image
patches. The anisotropic features derived from elliptical
sampling are made rotation invariant by averaging the
histograms over rotational angles and are then combined with
the isotropic features extracted from circular sampling.
Experimental results show that the proposed method can
effectively capture local spatial patterns and local contrast,
consistently outperforming several state-of-the-art
classification algorithms.
Index Terms—Spatial classification, High-resolution
image, Random texture, Feature extraction, Rotation
invariance
1. INTRODUCTION
High-resolution image classification is challenging due to
severe within-class variation. In recent years, the
exploitation of spatial information to enhance
classification performance has become an important part
of high-resolution remote sensing research [1]. There have
been a variety of studies that utilize spatial information for
high-resolution image classification or attempt to extract
informative and spatially invariant features [2].
The early representative spatial methods for high-resolution image classification include the knowledge transfer framework [3] and the Markov random field (MRF) model [4], which yield better accuracies than conventional spectral classification algorithms. Later, a geostatistical analysis of high-resolution data across space was studied in [5], which also reported improved classification results. A probabilistic spatial classification method based on extended random walkers (ERW) was proposed in [6], again with better reported performance. In addition, the local binary pattern (LBP) operator, a simple yet efficient operator for describing local image patterns, has been widely used in texture classification [7]. Recently, impressive LBP-based classification results on hyperspectral images have also been reported [8]. Although it differs from a texture image, a high-resolution satellite image is still full of diversified textures associated with land use and land cover (LULC) classes, which is especially true of multi-cluster classes in a large scene. This fact motivates us to develop new LBP descriptors for high-resolution image classification. However, it is still unclear how LBP should be designed to classify high-resolution images effectively. Inspired by the work of Liu et al. [9] and Guo et al. [10], this paper proposes an extended multi-structure LBP (EMSLBP) based high-resolution image classification paradigm. First, high-resolution image datasets can be converted into the YCbCr color space for feature extraction. Then, the EMSLBP algorithm is adopted to extract spatial feature histograms from the scene image, and the feature histograms are combined. Finally, a support vector machine (SVM) classifier is used to obtain the classification maps.
This work was supported in part by the National Natural Science Foundation of China under Grant 61501337, in part by the Natural Science Fund of Hubei Province under Grants 2015CFC839 and Q20151101.
This paper is an extension of our previous work [11]. Here, the previously proposed framework is enhanced by considering the pixel values and the differences between central and neighboring pixels in a local patch as new LBP descriptors on the basis of extended multi-structure sampling. These enhancements lead to a substantial improvement in performance, as evidenced by extensive experimental tests on a wide range of high-resolution image datasets. The pixel values of a central pixel and of its neighboring pixels are both considered, while for pixel differences only the radial difference is studied, for simplicity. In the proposed approach, the feature code maps share the same format as conventional LBP and are readily combined to form the final feature histogram, and the implementation is simple as well.
2. EXTENDED MULTI-STRUCTURE LBP
The original LBP methods compute patterns on small local patches and only consider symmetric microstructures. Their performance may be limited, because they depend on a single microstructure and oversimplify the local structure. Moreover, they are sensitive to noise and image rotation. An improved method suggested by Ojala et al. [7] is to consider only the rotation invariant “uniform” descriptor (called $LBP^{riu2}_{p,r}$), defined as

$$LBP^{riu2}_{p,r} = \begin{cases} \sum_{n=0}^{p-1} s(x_{r,n} - x_{0,0}), & \text{if } U(LBP_{p,r}) \le 2 \\ p + 1, & \text{otherwise} \end{cases} \qquad (1)$$

where

$$U(LBP_{p,r}) = \sum_{n=0}^{p-1} \left| s(x_{r,n} - x_{0,0}) - s(x_{r,\mathrm{mod}(n+1,p)} - x_{0,0}) \right|, \qquad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \qquad (2)$$

Thus, the LBP methods encode the local image information by circularly symmetric sampling of the gray values at a central pixel $x_{0,0}$ and $p$ points $(x_{r,n})_{n=0}^{p-1}$. Suppose the coordinates of the central pixel are $(0, 0)$, and let $a = 2\pi n / p$. For the circular neighborhood, the coordinates of $x_{r,n}$ are $[-r\sin(a),\ r\cos(a)]$. For the elliptical neighborhood, let the length of the minor axis be equal to the radius $r$ of the circular neighborhood and let $m$ be the ratio of the major to the minor axis of the ellipse. With $X_0 = m r \cos(a - \theta)$ and $Y_0 = r \sin(a - \theta)$, the x-coordinate of $x_{r,n}$ is $-X_0 \sin(\theta) - Y_0 \cos(\theta)$, while its y-coordinate equals $X_0 \cos(\theta) - Y_0 \sin(\theta)$, where four different rotational angles $\theta \in \{0^\circ, 45^\circ, 90^\circ, 135^\circ\}$ are used for each ellipse. With $m = 1$ these expressions reduce to the circular coordinates for every $\theta$. Those locations not falling exactly on a pixel are estimated by interpolation. As can be seen, $LBP^{riu2}_{p,r}$ has $p + 2$ distinct output values, leading to a local image representation of low dimensionality. Note that the function $s(\cdot)$ used in the following has the same definition as in Equation (2).

However, the $LBP^{riu2}_{p,r}$ descriptor loses local texture information and may fail to classify the LULC classes in a high-resolution image, since, as mentioned above, only the sign of the difference is utilized. On the other hand, the LULC classes are often randomly distributed and exhibit anisotropic microstructures, as stated in [11]. To avoid such problems, an extended multi-structure LBP sampling is preferred, i.e., we extend the neighbor distribution in an elliptical manner to capture the anisotropic properties of LULC classes. Specifically, four local pixel value-based descriptors, $CI\text{-}LBP^{riu2}_{c,p,r}$ (abbreviated as $CI\text{-}LBP_{c,p,r}$), $NI\text{-}LBP^{riu2}_{c,p,r}$ (abbreviated as $NI\text{-}LBP_{c,p,r}$), $CI\text{-}LBP^{riu2}_{e,p,r}$ (abbreviated as $CI\text{-}LBP_{e,p,r}$) and $NI\text{-}LBP^{riu2}_{e,p,r}$ (abbreviated as $NI\text{-}LBP_{e,p,r}$), and two local difference-based descriptors, $RD\text{-}LBP^{riu2}_{c,p,r}$ (abbreviated as $RD\text{-}LBP_{c,p,r}$) and $RD\text{-}LBP^{riu2}_{e,p,r}$ (abbreviated as $RD\text{-}LBP_{e,p,r}$), are introduced. Here the subscripts c and e denote circularly symmetric sampling (isotropic properties) and elliptically asymmetric sampling (anisotropic properties), respectively. Similar to [9], we define the $CI\text{-}LBP_{c,p,r}$, $NI\text{-}LBP_{c,p,r}$ and $RD\text{-}LBP_{c,p,r}$ descriptors as follows:

$$CI\text{-}LBP_{c,p,r} = s(x_{0,0} - \mu) \qquad (3)$$

where $\mu$ is the mean of the image, and

$$NI\text{-}LBP_{c,p,r} = \begin{cases} \sum_{n=0}^{p-1} s(x_{r,n} - \mu_r), & \text{if } U(LBP_{p,r}) \le 2 \\ p + 1, & \text{otherwise} \end{cases} \qquad (4)$$

where $\mu_r = \frac{1}{p} \sum_{n=0}^{p-1} x_{r,n}$. Obviously, $NI\text{-}LBP_{c,p,r}$ and $LBP^{riu2}_{p,r}$ differ in the selection of the thresholding value. The $NI\text{-}LBP_{c,p,r}$ descriptor tends to be more robust to noise. In addition,

$$RD\text{-}LBP_{c,p,r} = \begin{cases} \sum_{n=0}^{p-1} s(x_{r,n} - x_{r-1,n}), & \text{if } U(LBP_{p,r}) \le 2 \\ p + 1, & \text{otherwise} \end{cases} \qquad (5)$$

where the objective of $RD\text{-}LBP_{c,p,r}$ is to obtain local radial difference patterns computed from the pixel values of pairs of neighboring pixels along the same radial direction. Note that $CI\text{-}LBP_{e,p,r}$ and $CI\text{-}LBP_{c,p,r}$ share the same definition format except that an elliptical neighborhood structure is employed for the former. $NI\text{-}LBP_{e,p,r}$ and $RD\text{-}LBP_{e,p,r}$ are defined in the same way as their circular counterparts. Let $EMSLBP_{x,p,r}$ be any of the three local feature descriptors mentioned above, and let $EMSLBP_{x,p,r}(i, j)$ be the extracted LBP pattern of each pixel $(i, j)$. Then the feature histogram $h_x$ of length $K$ is computed as

$$h_x(k) = \sum_{i=1}^{N} \sum_{j=1}^{M} \delta\big(EMSLBP_{x,p,r}(i, j) - k\big) \qquad (6)$$

where $0 \le k \le K - 1$, $K = 2^p$ is the number of LBP codes (of which only $p + 2$ occur for the rotation invariant uniform descriptors defined above), the subscript $x$ represents ‘c’ or ‘e’, and $\delta(\cdot)$ is the Dirac delta function, here equal to 1 when its argument is zero and 0 otherwise. $M$ and $N$ give the size of the image. Furthermore, the proposed three-coupled $CI\text{-}LBP_{x,p,r}$, $NI\text{-}LBP_{x,p,r}$ and $RD\text{-}LBP_{x,p,r}$ histogram features can be readily fused, and they can also be used to extract macrostructure information by applying them to multi-scale down-sampled pyramid images.
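As a minimal sketch of Equations (1) to (6), the following Python code works on lists of already sampled gray values rather than on an image, so interpolation and border handling are omitted. All function names are illustrative and do not come from the paper. Two assumptions are made: the uniformity measure U is evaluated on each descriptor's own bit pattern, and the elliptical coordinates use the sign convention under which m = 1 reproduces the circular coordinates.

```python
import math


def s(x):
    """Step function of Equation (2): 1 if x >= 0, else 0."""
    return 1 if x >= 0 else 0


def riu2(bits):
    """Map a circular bit pattern to its rotation invariant uniform code."""
    p = len(bits)
    # U counts the 0/1 transitions met when walking once around the ring.
    u = sum(abs(bits[n] - bits[(n + 1) % p]) for n in range(p))
    return sum(bits) if u <= 2 else p + 1


def lbp_riu2(neighbors, center):
    """Equation (1): threshold the neighbors at the central pixel value."""
    return riu2([s(x - center) for x in neighbors])


def ci_lbp(center, image_mean):
    """Equation (3): threshold the central pixel at the image mean."""
    return s(center - image_mean)


def ni_lbp(neighbors):
    """Equation (4): threshold the neighbors at their own mean."""
    mu_r = sum(neighbors) / len(neighbors)
    return riu2([s(x - mu_r) for x in neighbors])


def rd_lbp(outer, inner):
    """Equation (5): sign of the radial difference x_{r,n} - x_{r-1,n}."""
    return riu2([s(a - b) for a, b in zip(outer, inner)])


def histogram(codes, num_bins):
    """Equation (6): count how often each code value occurs."""
    h = [0] * num_bins
    for code in codes:
        h[code] += 1
    return h


def circular_offsets(p, r):
    """Offsets [-r sin(a), r cos(a)] with a = 2*pi*n/p."""
    angles = [2 * math.pi * n / p for n in range(p)]
    return [(-r * math.sin(a), r * math.cos(a)) for a in angles]


def elliptical_offsets(p, r, m, theta):
    """Offsets on an ellipse (minor axis r, axis ratio m) rotated by theta radians."""
    points = []
    for n in range(p):
        a = 2 * math.pi * n / p
        x0 = m * r * math.cos(a - theta)
        y0 = r * math.sin(a - theta)
        # Assumed sign convention; with m = 1 this equals circular_offsets.
        points.append((-x0 * math.sin(theta) - y0 * math.cos(theta),
                       x0 * math.cos(theta) - y0 * math.sin(theta)))
    return points
```

For instance, `lbp_riu2([9, 9, 9, 2, 2, 2, 2, 2], 4)` yields the uniform code 3, while an alternating ring such as `[9, 1, 9, 1, 9, 1, 9, 1]` is non-uniform and maps to p + 1 = 9. Passing `num_bins = p + 2` to `histogram` covers every code these functions can return.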
3. SPATIAL CLASSIFICATION WITH EXTENDED MULTI-STRUCTURE LBP
A schematic illustration of the proposed EMSLBP based high-resolution image classification method is shown in Fig. 1. First, the extended multi-structure LBP sampling is adopted to obtain the initial feature histograms, which measure the distribution of each feature descriptor and thereby supply the discriminative power used for prediction. Second, the anisotropic features are averaged and combined with the isotropic features, and the result is stacked as the final feature vector. Finally, the class of each test sample is determined by SVM classification.
Fig. 1. A schematic illustration of the proposed extended multi-structure LBP based spatial classification method.
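The second step of this pipeline can be sketched in a few lines of Python. The sketch assumes that one anisotropic histogram has already been computed per rotational angle and that all histograms of a descriptor have the same length. The function names are illustrative, and `joint_histogram` shows only one plausible reading of the joint 2-D combination mentioned in Section 3.2 (a row-major flattening), which the paper does not spell out.

```python
def average_histograms(histograms):
    """Element-wise mean of equally long histograms, one per rotational angle."""
    count = len(histograms)
    return [sum(column) / count for column in zip(*histograms)]


def stack(*histograms):
    """Concatenate several histograms into one feature vector."""
    vector = []
    for h in histograms:
        vector.extend(h)
    return vector


def joint_histogram(codes_a, codes_b, bins_a, bins_b):
    """2-D joint histogram of two code maps, flattened row by row to 1-D."""
    h = [0] * (bins_a * bins_b)
    for a, b in zip(codes_a, codes_b):
        h[a * bins_b + b] += 1
    return h


# Toy example: one isotropic histogram and four anisotropic histograms
# (theta = 0, 45, 90 and 135 degrees) over four bins.
isotropic = [4, 0, 2, 1]
anisotropic = [[4, 1, 1, 1], [2, 3, 1, 1], [3, 2, 1, 1], [3, 2, 1, 1]]
feature = stack(isotropic, average_histograms(anisotropic))
```

Averaging keeps the anisotropic part at the length of a single histogram, so the stacked vector grows with the number of descriptors and radii but not with the number of angles.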
3.1. EMSLBP Rotation Invariant Feature Extraction
The proposed approach introduces three-coupled local texture feature descriptors to extract spatial features from local image patches, as described above. The isotropic part of the proposed method is rotation invariant, whereas the anisotropic features extracted by elliptical sampling are not. Fortunately, several established techniques can provide rotation invariance. For instance, it could be achieved by globally searching for the corresponding angle (or minimal distance) among the extracted anisotropic histograms of all candidate samples, but that would be computationally expensive. We instead propose to derive rotation invariance from the anisotropic histograms by averaging them over the different rotational angles. The reason is that an averaged anisotropic histogram is insensitive to local image fluctuations such as rotation, so its use as a statistical feature of each image is globally invariant to these changes.
3.2. Feature Descriptors Combination
After the anisotropic part of the proposed method has been made rotation invariant, all the extracted feature histograms are either directly stacked into a final composite feature vector or combined jointly, in which case a 2-D or 3-D joint histogram is built first, converted to a 1-D histogram, and concatenated with the others. The resulting representation can be compared using standard distance metrics, which allows robust classification methods to be employed. Class label assignment on the test set is conducted using LIBSVM [12].
4. EXPERIMENTAL RESULTS
4.1. Image Data and Experimental Setup
The first dataset used in our experiments is the 19-class satellite scene dataset [13]. It consists of 19 classes of high-resolution satellite scenes, with 50 images of 600 × 600 pixels for each class. The second dataset is the 21-class land-use dataset with ground truth labeling [14]. It consists of images of 21 land-use classes, and each class contains 100 images of 256 × 256 pixels. This is a challenging dataset due to the variety of spatial patterns in those 21 classes.
For the 19-class and 21-class datasets, we randomly select 60% and 80% of the samples from each class, respectively, as training samples and use the rest for testing, with fivefold cross-validation for free parameter selection. The random partitioning is repeated ten times, and the mean accuracy over the ten splits is used to evaluate the algorithms. For simplicity, we fix the number of sampling points p and vary the radius r to find the best configuration, where p is set to 16 and three radii are chosen (i.e., r = [1, 2, 3]). The proposed EMSLBP method is compared with several classification methods, including NI/RD/CI-LBP [9], CLBP_CSM [10], LBPV [15], MS-CLBP [8] and $LBP^{riu2}_{p,r}$ (the same as $CLBP\_S^{riu2}_{p,r}$, abbreviated as $LBP^{riu2}$), using overall accuracy (OA). To make the comparison as fair as possible, we use the same experimental settings as in [8], [9], [10], [15]. For the other compared methods, the multi-resolution features are simply stacked.
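The evaluation protocol of Section 4.1 (a per-class random split, repeated ten times, with the mean accuracy reported) can be sketched as follows. This is only an illustration under stated assumptions: `evaluate` is a hypothetical callback that would train the SVM on the training indices and return the accuracy on the test indices, the fivefold cross-validation for parameter selection is not shown, and the rounding rule for the per-class training count is an assumption.

```python
import random


def stratified_split(labels, train_fraction, rng):
    """Split sample indices so each class contributes train_fraction to training."""
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        cut = round(train_fraction * len(shuffled))  # assumed rounding rule
        train.extend(shuffled[:cut])
        test.extend(shuffled[cut:])
    return train, test


def mean_accuracy(labels, train_fraction, evaluate, repeats=10, seed=0):
    """Average the accuracy returned by evaluate over repeated random splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        train, test = stratified_split(labels, train_fraction, rng)
        scores.append(evaluate(train, test))
    return sum(scores) / repeats


# 19 classes with 50 images each, 60% of every class used for training.
labels_19 = [c for c in range(19) for _ in range(50)]
train_idx, test_idx = stratified_split(labels_19, 0.6, random.Random(0))
```

With these settings each class contributes 30 training and 20 test images, so the split contains 570 training and 380 test samples in total.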
4.2. Classification Results
Fig. 2(a) compares the best scores achieved by our proposed method with those of state-of-the-art methods. It can be seen that our approach improves the best scores by 5% to 9% over LBPV and $LBP^{riu2}$ on the same test sets, whereas $LBP^{riu2}$ always produces inferior performance in both cases, most likely due in part to the limited discrimination offered by a single microstructure of the local image. Three different training rates are further investigated, as shown in Fig. 2(b)-(c), from which we see that our method outperforms the other methods at almost all training rates. Thus, it can be confirmed that our method is able to provide complementary multi-resolution, multi-structure texture features of an image patch without a significant increase in computational complexity. Table 1 gives the average confusion matrices calculated from the 10 runs of random partitions of the training and testing sets using the proposed method. As seen in Table 1, most of the LULC classes can be correctly classified, some even achieving very high classification accuracies, e.g., commercial, desert, and railway station in the 19-class dataset and chaparral, harbor, and forest in the 21-class dataset. However, there are still a few difficult classes, e.g., buildings, dense residential, storage tanks, and tennis court. This is partly due to the high similarity of these scenes.
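For reference, the quantities reported here can be computed from confusion matrices as in the short Python sketch below. It assumes that rows index the true class and columns the predicted class, which the paper does not state, and the function names are illustrative.

```python
def average_confusion(matrices):
    """Element-wise mean of equally sized confusion matrices, one per run."""
    runs = len(matrices)
    size = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / runs for j in range(size)]
            for i in range(size)]


def overall_accuracy(confusion):
    """Overall accuracy (OA): correctly classified samples over all samples."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def per_class_accuracy(confusion):
    """Fraction of each true class (row) that was classified correctly."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]


# Toy two-class example with two runs.
run_a = [[8, 2], [1, 9]]
run_b = [[10, 0], [3, 7]]
mean_matrix = average_confusion([run_a, run_b])
```

Classes with low diagonal entries in the averaged matrix correspond to the difficult classes discussed above.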
Fig. 2. Comparison of our approach, NI/RD/CI-LBP, CLBP_CSM, LBPV, MS-CLBP and $LBP^{riu2}$. (a) Results with the best classification scores for the 19-class and 21-class datasets. (b) Results with different training rates by SVM for the 19-class dataset. (c) Results with different training rates by SVM for the 21-class dataset.
Table 1. Average confusion matrices for the proposed EMSLBP method on the two datasets: (a) 19-class, (b) 21-class.
5. CONCLUSIONS
This paper introduces a novel extended multi-structure LBP based (EMSLBP) spatial method for high-resolution image scene classification. By combining three-coupled complementary descriptors with multi-structure sampling in the classification framework, the classification accuracy of SVM can be consistently improved. The experimental results show that the proposed approach achieves consistent improvements compared to the conventional LBP method and its variants.
6. REFERENCES
[1] G. Camps-Valls, D. Tuia, L. Bruzzone, and J. A.
Benediktsson, “Advances in hyperspectral image
classification: Earth monitoring with statistical learning
methods,” IEEE Signal Process. Mag., vol. 31, no. 1, pp. 45–
54, Jan. 2014.
[2] L. Bruzzone and C. Persello, “A novel approach to the
selection of spatially invariant features for the classification
of hyperspectral images with improved generalization
capability,” IEEE Trans. Geosci. Remote Sens., vol. 47, no.
9, pp. 3180–3191, Sep. 2009.
[3] S. Rajan, J. Ghosh, and M. Crawford, “Exploiting class
hierarchies for knowledge transfer in hyperspectral data,”
IEEE Trans. Geosci. Remote Sens., vol. 44, no. 11, pp.
3408–3417, Nov. 2006.
[4] Y. Tarabalka, J. Benediktsson, M. Fauvel, and J. Chanussot,
“SVM- and MRF-based method for accurate classification
of hyperspectral images,” IEEE Geosci. Remote Sens. Lett.,
vol. 7, no. 4, pp. 736–740, Oct. 2010.
[5] G. Jun and J. Ghosh, “Spatially adaptive classification of
land cover with remote sensing data,” IEEE Trans. Geosci.
Remote Sens., vol. 49, no. 7, pp. 2662–2673, Jul. 2011.
[6] X. Kang, S. Li, L. Fang, M. Li, and J. A. Benediktsson,
“Extended random walker-based classification of
hyperspectral images,” IEEE Trans. Geosci. Remote Sens.,
vol. 53, no. 1, pp. 144–153, Jan. 2015.
[7] T. Ojala, M. Pietikäinen, and T. Mäenpää, “Multiresolution
gray-scale and rotation invariant texture classification with
local binary patterns,” IEEE Trans. Pattern Anal. Mach.
Intell., vol. 24, no. 7, pp. 971–987, Jul. 2002.
[8] C. Chen, B. Zhang, H. Su, W. Li, and L. Wang, “Land-Use
Scene Classification Using Multi-Scale Completed Local
Binary Patterns,” Signal, Image and Video Processing, vol.
10, no. 4, pp. 745–752, Apr. 2016.
[9] L. Liu, L. Zhao, Y. Long, G. Kuang, and P. Fieguth,
“Extended local binary patterns for texture classification,”
Image Vis. Comput., vol. 30, no. 2, pp. 86–99, Feb. 2012.
[10] Z. Guo, L. Zhang, and D. Zhang, “A completed modeling
of local binary pattern operator for texture classification,”
IEEE Trans. Image Process., vol. 19, no. 6, pp. 1657–1663,
Jun. 2010.
[11] X. Bian, X. Zhang, R. Liu, L. Ma, and X. Fu, “Adaptive
classification of hyperspectral images using local
consistency,” J. Electron. Imaging, vol. 23, no. 6, pp.
063014-1–063014-17, Nov. 2014.
[12] C.-C. Chang and C.-J. Lin, “LIBSVM: a library for support
vector machines,” ACM Trans. Intell. Syst. Technol., vol. 2,
no. 3, pp. 1–27, Apr. 2011.
[13] D. Dai and W. Yang, “Satellite image classification via
two-layer sparse coding with biased image representation,”
IEEE Geosci. Remote Sens. Lett., vol. 8, no. 1, pp. 173–176,
Jan. 2011.
[14] Y. Yang and S. Newsam, “Bag-of-visual-words and spatial
extensions for land-use classification,” in Proc. Int. Conf.
Advances in Geographic Information Systems, San Jose,
CA, pp. 270–279, Nov. 2010.
[15] Z. Guo, L. Zhang, and D. Zhang, “Rotation invariant
texture classification using LBP variance with global
matching,” Pattern Recogn., vol. 43, no. 3, pp. 706–719,
Mar. 2010.