INF 386, V-2003
Selected Themes
from Digital Image Analysis
Lecture 3
05.03.2003
Statistical Texture Analysis
Static or Adaptive?
Fritz Albregtsen
Department of Informatics
University of Oslo
INF 386, 2003, Lecture 3, page 1 of 32
What is texture ?
• Visual textures are spatially extended visual patterns of more or less
accurate repetitions of some basic texture elements, called texels.
• Each texel usually contains several pixels.
• The characteristics and placement of the texels can be periodic, quasi-periodic
or random. Thus, textures may have statistical or structural properties,
or both.
• Texture features characterize the statistical or structural relationship
between pixels (or texels), and provide measures of properties such as
contrast, smoothness, coarseness, randomness, regularity, linearity,
directionality, periodicity, and structural complexity.
• Morphometric features measure the size and shape of objects,
independent of the gray level values of the pixels within the object.
• Densitometric features measure the distribution of gray levels or
optical density within an object, but not the positions of the pixels.
Categories of Methods
A large number of texture analysis methods have been developed for
automated analysis of visual texture.
We can broadly divide them into three groups:
• Statistical methods are often based on accumulating second or higher
order statistics (matrices), and using feature vectors that describe
these probability distributions directly, and therefore describe the
image texture only indirectly.
• Structural methods are based upon an assumption that textures are
composed of texels which are regular and repetitive.
Both texels and placement rules have to be described.
• Structural-statistical methods characterize the texel by a feature
vector and describe the probability distribution of these features
statistically.
Surveys and Reviews
• R.M. Haralick, “Statistical and structural approaches to texture”,
Proc. IEEE, 67, 786-804, 1979.
• J.M.H. du Buf et al., “Texture feature performance for image
segmentation”, Pattern Recognition 23, 291-309, 1990.
• P.P. Ohanian and R.C. Dubes, “Performance evaluation for four classes
of textural features”, Pattern Recognition 25, 819-833, 1992.
• T.R. Reed and J.M.H. du Buf, “A review of recent texture
segmentation and feature extraction techniques”,
CVGIP: Image Understanding, 57, 359-372, 1993.
• M. Tuceryan and A.K. Jain, “Texture analysis”, In C.H. Chen et al.
(eds.) “Handbook of Pattern Recognition and Computer Vision”,
World Scientific Publ., 235-276, 1993.
• T. Randen and J.H. Husøy, “Filtering for texture classification: A comparative
study”, IEEE PAMI 21, 291-310, 1999.
Gray Level Cooccurrence Matrices
• The matrix element P(m, n) gives the 2. order statistical probability of
going from gray level m to n when moving a distance d in direction θ
within the image (or a sub-image).
• Given an M × N image having G gray levels, let f(m, n) be the pixel
value at (m, n). Then we have

P(i, j | ∆x, ∆y) = W Q(i, j | ∆x, ∆y)

where

W = 1 / ((M − ∆x)(N − ∆y))

and

Q(i, j | ∆x, ∆y) = Σ_{n=1}^{N−∆y} Σ_{m=1}^{M−∆x} A,

A = 1 if f(m, n) = i and f(m + ∆x, n + ∆y) = j, and A = 0 elsewhere.

• Alternative notation, given distance and direction (d, θ): P(i, j | d, θ)

GLCM Features
• Angular Second Moment (ASM) :
ASM = Σ_{i=0}^{G−1} Σ_{j=0}^{G−1} {P(i, j)}²
• Entropy :
ENTROPY = − Σ_{i=0}^{G−1} Σ_{j=0}^{G−1} P(i, j) × log(P(i, j))
• Correlation :
CORR = Σ_{i=0}^{G−1} Σ_{j=0}^{G−1} (i − µx)(j − µy) P(i, j) / (σx × σy)
• Contrast :
CONTRAST = Σ_{n=0}^{G−1} n² { Σ_{i=1}^{G} Σ_{j=1}^{G} P(i, j) }, |i − j| = n
• Inverse Difference Moment (IDM) :
IDM = Σ_{i=0}^{G−1} Σ_{j=0}^{G−1} P(i, j) / (1 + (i − j)²)
• Sum of Squares, Variance :
VARIANCE = Σ_{i=0}^{G−1} Σ_{j=0}^{G−1} (i − µ)² P(i, j)
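The construction above is easy to state in code. The sketch below (the function names and the naive double loop are my own, kept simple for clarity) accumulates Q, normalizes by W, and evaluates two of the features:

```python
import numpy as np

def glcm(f, dx, dy, G):
    """P(i, j | dx, dy): count pixel pairs a displacement (dx, dy) apart,
    then normalize by W = 1 / ((M - dx)(N - dy))."""
    M, N = f.shape
    Q = np.zeros((G, G))
    for m in range(M - dx):
        for n in range(N - dy):
            Q[f[m, n], f[m + dx, n + dy]] += 1   # A = 1 for this (i, j) pair
    return Q / ((M - dx) * (N - dy))

def asm(P):
    """Angular Second Moment: sum of squared matrix elements."""
    return (P ** 2).sum()

def contrast(P):
    """Contrast via the equivalent (i - j)^2 element weighting."""
    i, j = np.indices(P.shape)
    return ((i - j) ** 2 * P).sum()
```

With this normalization the matrix sums to 1 for any displacement. Vectorizing the double loop (or using scikit-image's `graycomatrix`) is the usual next step for real images.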
GLCM Features
• Sum Average :
AVER = Σ_{i=0}^{2G−2} i Px+y(i)
• Sum Entropy :
SENT = − Σ_{i=0}^{2G−2} Px+y(i) log(Px+y(i))
• Difference Entropy :
DENT = − Σ_{i=0}^{G−1} Px−y(i) log(Px−y(i))
• Inertia :
INERTIA = Σ_{i=0}^{G−1} Σ_{j=0}^{G−1} {i − j}² × P(i, j)
• Cluster Shade :
SHADE = Σ_{i=0}^{G−1} Σ_{j=0}^{G−1} {i + j − µx − µy}³ × P(i, j)
• Cluster Prominence :
PROM = Σ_{i=0}^{G−1} Σ_{j=0}^{G−1} {i + j − µx − µy}⁴ × P(i, j)

Static GLCM features
• Static GLCM features are weighted sums of the cooccurrence matrix
element values.
• Two general categories:
1: weighting based on GLCM value
2: weighting based on GLCM position
• Examples of the first category:
ASM = Σ_{i=0}^{G−1} Σ_{j=0}^{G−1} {P(i, j)}²
ENTROPY = − Σ_{i=0}^{G−1} Σ_{j=0}^{G−1} P(i, j) × log(P(i, j))
• Examples of the second category:
IDM = Σ_{i=0}^{G−1} Σ_{j=0}^{G−1} P(i, j) / (1 + (i − j)²)
INERTIA = Σ_{i=0}^{G−1} Σ_{j=0}^{G−1} {i − j}² × P(i, j)
• Note that the shape of the feature function will depend on G.
Sum and difference histogram
• Sum and difference define the principal components of a 2. order
probability density function.
• Define normalized sum and difference histograms for a domain D:

Ps(i | ∆x, ∆y) = W Card{(m, n) ∈ D, s∆x,∆y(m, n) = i}
Pd(j | ∆x, ∆y) = W Card{(m, n) ∈ D, d∆x,∆y(m, n) = j}

where

W = 1 / ((M − ∆x)(N − ∆y))
s∆x,∆y(m, n) = f(m, n) + f(m + ∆x, n + ∆y)
d∆x,∆y(m, n) = f(m, n) − f(m + ∆x, n + ∆y)

• There are 2G − 1 possible values in the histogram.
• The most frequently used features from the GLCM can be found exactly
from Ps and Pd, e.g.:

Contrast from Pd:
CON = Σ_{j=0}^{2G−2} j² Pd(j | ∆x, ∆y)

Contrast from GLCM:
CON = Σ_{n=0}^{G−1} n² { Σ_{i=1}^{G} Σ_{j=1}^{G} P(i, j) }, |i − j| = n

Gray Level Run Length
• A “run” is a set of consecutive (8-neighbor), collinear pixels having the
same gray level value.
• “Run length” = number of pixels in a “run”.
• “Run length value” = number of “runs” in an image.
• Each element p(i, j | θ) of a GLRLM gives the number of “runs” of gray
level i, of length j, in a given direction θ.
• Let P(i, j | θ) be the elements of the normalized GLRLM, i.e.

P(i, j | θ) = p(i, j | θ) / Σ_{i=1}^{G} Σ_{j=1}^{R} p(i, j | θ) = p(i, j | θ) / S

where S is the total number of runs in the image.
• The number of gray levels must be low.
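A minimal sketch of run extraction along rows (θ = 0°), with gray levels assumed to lie in 0..G−1 and the function name my own:

```python
import numpy as np

def glrlm_horizontal(f, G, R):
    """p[i, j-1] counts the runs of gray level i and length j (theta = 0)."""
    p = np.zeros((G, R), dtype=int)
    for row in f:
        run_val, run_len = row[0], 1
        for v in row[1:]:
            if v == run_val:
                run_len += 1                  # run continues
            else:
                p[run_val, run_len - 1] += 1  # close the finished run
                run_val, run_len = v, 1
        p[run_val, run_len - 1] += 1          # close the run at row end
    return p
```

Normalizing by S = p.sum() gives P(i, j | θ). Note that R must be at least the image width for this direction, since a whole homogeneous row is a single run.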
Simplifying GLRLM
• Let

r(j | θ) = Σ_{i=1}^{G} p(i, j | θ)

represent the number of runs of length j, while

g(i | θ) = Σ_{j=1}^{R} p(i, j | θ)

is the number of runs having gray level i.
• Let S be the total number of runs in the image:

S = Σ_{i=1}^{G} Σ_{j=1}^{R} p(i, j | θ) = Σ_{i=1}^{G} g(i | θ) = Σ_{j=1}^{R} r(j | θ)

• Example, p(i, j | θ) with gray level i against run length j:

             j=1  j=2  j=3  j=4 | g(i | θ)
  i=1         4    0    0    0  |    4
  i=2         1    0    1    0  |    2
  i=3         3    0    0    0  |    3
  i=4         3    1    0    0  |    4
  r(j | θ)   11    1    1    0  | S = 13

Simplification GLRLM
• The expressions for the static GLRLM features :

SRE = (1/S) Σ_{i=1}^{G} Σ_{j=1}^{R} p(i, j | θ) / j² = (1/S) Σ_{j=1}^{R} r(j | θ) / j²
LRE = (1/S) Σ_{i=1}^{G} Σ_{j=1}^{R} j² p(i, j | θ) = (1/S) Σ_{j=1}^{R} j² r(j | θ)
GLN = (1/S) Σ_{i=1}^{G} [ Σ_{j=1}^{R} p(i, j | θ) ]² = (1/S) Σ_{i=1}^{G} [g(i | θ)]²
RLN = (1/S) Σ_{j=1}^{R} [ Σ_{i=1}^{G} p(i, j | θ) ]² = (1/S) Σ_{j=1}^{R} [r(j | θ)]²
RP = (1/n) Σ_{i=1}^{G} Σ_{j=1}^{R} p(i, j | θ) = (1/n) Σ_{j=1}^{R} r(j | θ), n = number of pixels
LGRE = (1/S) Σ_{i=1}^{G} Σ_{j=1}^{R} p(i, j | θ) / i² = (1/S) Σ_{i=1}^{G} g(i | θ) / i²
HGRE = (1/S) Σ_{i=1}^{G} Σ_{j=1}^{R} i² p(i, j | θ) = (1/S) Σ_{i=1}^{G} i² g(i | θ)

• Note that all features may be computed without actually accumulating
a 2D GLRL matrix.
• The feature weights contain run length or gray level,
and never both of them at the same time.
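The simplification is easy to verify in code: every static feature follows from the marginals g(i | θ) and r(j | θ) alone. The dictionary keys and the n_pixels argument (run percentage divides the number of runs by the number of pixels n) are my choices:

```python
import numpy as np

def glrlm_features(p, n_pixels):
    """Static GLRLM features from the 1D marginals of the G x R run count
    matrix p, never using the 2D matrix in the weighted sums."""
    G, R = p.shape
    i = np.arange(1, G + 1)      # gray levels 1..G
    j = np.arange(1, R + 1)      # run lengths 1..R
    g = p.sum(axis=1)            # g(i | theta): runs having gray level i
    r = p.sum(axis=0)            # r(j | theta): runs of length j
    S = r.sum()                  # total number of runs
    return {
        "SRE": (r / j ** 2).sum() / S,
        "LRE": (r * j ** 2).sum() / S,
        "GLN": (g ** 2).sum() / S,
        "RLN": (r ** 2).sum() / S,
        "RP": S / n_pixels,
        "LGRE": (g / i ** 2).sum() / S,
        "HGRE": (g * i ** 2).sum() / S,
    }
```

For the example matrix with S = 13 runs in a 16-pixel image, this gives LRE = 24/13 ≈ 1.85 and RP = 13/16.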
GLCM and GLRLM
• GLCM : probability of pixel gray level pairs
having a given spatial and intensity relation.
— expresses pixel pair contrast, not texel size.
— several matrices are needed to estimate texel properties
(e.g. size, quasi-periodicity, orientation).
• GLRLM : probability of several connected, colinear pixels
so close in gray level that they form "gray level runs".
— does not capture the true shape aspects of the texels
— comes much closer to doing this than the GLCM
— discards information on contrast between gray levels.
Generalized Cooccurrence Matrices
• Davis et al. (1979) introduced generalized matrices (GCM).
• GCM was based on local maxima of the gradient image of the texture.
• Cooccurrence of gradient magnitude and direction, using spatial
constraint predicates instead of specific geometric distances.
• Could be “cooccurrence of anything”.
Cooccurrence of Gray Level Runs
• 1D Histograms → probability distribution of single pixel intensity.
• 2D GLCM’s → probability distribution of intensity of pixel pairs.
• 2D GLRLM’s → probability distribution of intensity of runs of pixels.
• 4D CGLRLM’s → probability of neighboring pairs of runs of pixels.
• An increasing amount of information involved,
⇒ better description of image texture
⇒ more matrix bins to populate.
From 2D GLCM to 1D Histograms
• The sum and difference
define the principal axes
of the second order GLCM
probability distribution function.
• Replace 2D GLCM by 1D sum and difference histograms.
• The usual (Haralick) texture features (or some close approximations)
associated with the 2D GLCM can be computed directly
from the sum and difference histograms.
• This is widely used, mostly for computational reasons.
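A sketch of the two histograms and of contrast recovered from Pd alone; the names, and the shift of the difference values by G − 1 into non-negative bins, are mine:

```python
import numpy as np

def sum_diff_histograms(f, dx, dy, G):
    """Normalized sum and difference histograms for displacement (dx, dy).
    Sums lie in 0..2G-2; differences in -(G-1)..G-1, stored shifted by G-1."""
    M, N = f.shape
    a = f[:M - dx, :N - dy]          # f(m, n)
    b = f[dx:, dy:]                  # f(m + dx, n + dy)
    W = 1.0 / ((M - dx) * (N - dy))
    Ps = np.bincount((a + b).ravel(), minlength=2 * G - 1) * W
    Pd = np.bincount((a - b).ravel() + G - 1, minlength=2 * G - 1) * W
    return Ps, Pd

def contrast_from_Pd(Pd, G):
    """GLCM contrast computed from the difference histogram alone."""
    d = np.arange(-(G - 1), G)       # actual difference values
    return (d ** 2 * Pd).sum()
```

contrast_from_Pd agrees exactly with the contrast computed from the full 2D GLCM, which is the point of the slide: one 1D pass replaces a G × G matrix.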
From 4D to 2D Matrices
• Two independent runs of (graylevel,runlength) = (i, j) and (k, l)
may be viewed as two random variables with the same variance.
• A 4D CGLRLM probability matrix P (i, j, k, l) may be replaced by
- one 2D sum run length matrix, Ps(ξ, ψ),
- one 2D difference run length matrix, Pd(γ, δ).
• So for all neighboring runs in a given image :
— compute the sum ξ and difference γ of the two graylevels
and the sum ψ and difference δ of the two run lengths.
— accumulate the entries in the two 2D matrices.
— Finally, normalize the sum and difference run length matrices.
• A complexity reduction of GR/8 (G = gray levels, R = max run length).
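The accumulation step can be sketched as below. How neighboring run pairs are collected from an image is left to the caller, since that pairing rule is not spelled out here; gray levels are assumed 0..G−1 and run lengths 1..R:

```python
import numpy as np

def sum_diff_run_matrices(run_pairs, G, R):
    """Accumulate and normalize the 2D sum and difference run length
    matrices from neighboring run pairs ((i, j), (k, l)), where each run
    is a (graylevel, runlength) tuple."""
    Ps = np.zeros((2 * G - 1, 2 * R - 1))    # xi = i+k, psi = j+l (shifted by 2)
    Pd = np.zeros((2 * G - 1, 2 * R - 1))    # gamma = i-k, delta = j-l (shifted)
    for (i, j), (k, l) in run_pairs:
        Ps[i + k, j + l - 2] += 1
        Pd[i - k + G - 1, j - l + R - 1] += 1
    n = len(run_pairs)
    return Ps / n, Pd / n
```

Storing two (2G−1) × (2R−1) matrices instead of the 4D G × R × G × R matrix is what gives the complexity reduction of roughly GR/8 stated above: G²R² entries shrink to about 8GR.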
Ad hoc Features
Fk = Σ_{i,j} P(i, j) Wk(i, j)
• As shown earlier, GLCM feature extraction is usually performed
by computing a number of non-adaptive pre-defined,
(ad hoc) weighted sums of matrix elements,
— either based on the value of each matrix element
— or based on the position of the element within the matrix
— shape of weight function depends on quantization.
• The GLRLM feature weights only contain run length or gray level,
never both of them at the same time.
• Complex matrix structures are not captured in any single feature.
• Features do not adapt to problem-specific matrix structures.
Adaptive Features - I
• Assume that the n-th image of class ωc gives a 2D matrix Pn(i, j|ωc).
• Calculate the average matrix over all N(ωc) images in each class ωc:

P̄(i, j|ωc) = (1/N(ωc)) Σ_{n=1}^{N(ωc)} Pn(i, j)

• the class variance matrix:

σ²P(i, j|ωc) = (1/N(ωc)) Σ_{n=1}^{N(ωc)} (Pn(i, j) − P̄(i, j|ωc))²

• the Class Difference Matrix:

∆P(i, j|ω1, ω2) = P̄(i, j|ω1) − P̄(i, j|ω2)

• and finally the Mahalanobis Class Distance Matrix:

JP(i, j|ω1, ω2) = 2 (P̄(i, j|ω1) − P̄(i, j|ω2))² / (σ²P(i, j|ω1) + σ²P(i, j|ω2))
Adaptive Features - II
• We use the squared Mahalanobis class distance as weights.
• We use the disjoint positive/negative parts of the class difference
matrices as the domains of the weighted summation.
• An image having a matrix Pk(i, j) → two adaptive features:

F+ = Σ_{∆P(i,j|ω1,ω2) ≥ 0} Pk(i, j) [JP(i, j|ω1, ω2)]²

F− = Σ_{∆P(i,j|ω1,ω2) < 0} Pk(i, j) [JP(i, j|ω1, ω2)]²
• Highest weight on the most discriminatory parts of the matrices!!!
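A sketch under the assumption that training data arrive as per-class lists of normalized matrices; the small eps guarding zero-variance bins is my addition, not part of the formulas above:

```python
import numpy as np

def adaptive_features(Pk, class1_mats, class2_mats, eps=1e-12):
    """Adaptive features F+ / F- for a new matrix Pk: squared Mahalanobis
    class distance as weights, summed separately over the positive and
    negative parts of the class difference matrix."""
    P1 = np.mean(class1_mats, axis=0)            # average matrix, class 1
    P2 = np.mean(class2_mats, axis=0)            # average matrix, class 2
    v1 = np.var(class1_mats, axis=0)             # class variance matrices
    v2 = np.var(class2_mats, axis=0)
    delta = P1 - P2                              # class difference matrix
    J = 2 * (P1 - P2) ** 2 / (v1 + v2 + eps)     # Mahalanobis distance matrix
    F_plus = (Pk[delta >= 0] * J[delta >= 0] ** 2).sum()
    F_minus = (Pk[delta < 0] * J[delta < 0] ** 2).sum()
    return F_plus, F_minus
```

The same function works unchanged whether Pk is a GLCM or one of the sum/difference run length matrices of the following slides.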
Difference and Distance Matrices
• Here, two class difference matrices ∆ have to be used,
- one from the two sum run length matrices (class 1 and 2)
- one from the two difference run length matrices (class 1 and 2)
∆s(ξ, ψ|ω1, ω2) = P̄s (ξ, ψ|ω1) − P̄s(ξ, ψ|ω2)
∆d(γ, δ|ω1, ω2) = P̄d(γ, δ|ω1) − P̄d(γ, δ|ω2)
where P̄k (., .|ωn) is the average normalized sum (k = s)
or difference (k = d) run length matrix for class ωn, n = 1, 2.
• Two Mahalanobis class distance matrices must be used,
- JPs (ξ, ψ|ω1, ω2) for the sum run length matrices,
- JPd (γ, δ|ω1, ω2) for the difference run length matrices.
The Four CGLRLM Features
• The four features from the sum and difference run length matrices
for an image from class ωn, where n ∈ {1, 2}, are then given by
Fs+ = Σ_{∆s(ξ,ψ|ω1,ω2) ≥ 0} Ps(ξ, ψ) [JPs(ξ, ψ|ω1, ω2)]²

Fs− = Σ_{∆s(ξ,ψ|ω1,ω2) < 0} Ps(ξ, ψ) [JPs(ξ, ψ|ω1, ω2)]²

Fd+ = Σ_{∆d(γ,δ|ω1,ω2) ≥ 0} Pd(γ, δ) [JPd(γ, δ|ω1, ω2)]²

Fd− = Σ_{∆d(γ,δ|ω1,ω2) < 0} Pd(γ, δ) [JPd(γ, δ|ω1, ω2)]²
• Note that the weighted summations are performed
over the two disjoint (+/-) partitions of each class difference matrix.
Results - Brodatz Textures
• From 112 Brodatz textures we have selected the 10 most relevant,
i.e. stochastic, isotropic, homogeneous and fine-grained textures.
• Each texture image was partitioned into 48 non-overlapping
75 × 75 pixel sub-images.
• Each sub-image was normalized to the same mean value and standard
deviation (µ = 127.5 and σ = 50.0).
• The 48 sub-images were divided randomly but equally into a training set
and a test set.
• Given 10 textures, we get 45 texture pairs.

Best single features :

Method                      ERR, training   ERR, test   JB
best GLCM average (CON)          9.6            9.4     2.89
average of best GLCM             6.5            5.7     3.08
best GLRLM average (RLN)        10.2           10.8     1.92
average of best GLRLM            8.3            7.8     2.00
adaptive CGLRLM                  3.7            2.5     5.00

Best combinations of two features :

Method                      ERR, training   ERR, test   JB
average of best GLCM             4.9            3.7     5.5
average of best GLRLM            6.2            4.9     4.4
adaptive CGLRLM                  3.2            1.9     6.6

Brodatz Textures
(Figure: the ten selected texture images.)

TEM images of mouse liver cell nuclei
Examples of liver cell nuclei from normal (top) and noduli (bottom)
samples. The borders between the 30% peripheral and 70% central part
are outlined as a thin white line.

Liver cell results

Method            Features   JB      Error (%)
classical GLRLM   SRE        1.50       10
classical GLRLM   LRE        1.21       10
classical GLRLM   RLN        1.45       10
classical GLRLM   RP         1.06       10
adaptive CGLRLM   F s+       0.87       10
adaptive CGLRLM   F s−       1.30        5
adaptive CGLRLM   F d+       1.35        5
adaptive CGLRLM   F d−       1.33       10

• The Bhattacharyya distance JB bounds the Bayes error ε1,2 of a class pair:

(1/2) [1 − (1 − 4 P(ω1) P(ω2) e^{−2JB(ω1,ω2)})^{1/2}] < ε1,2 < [P(ω1) P(ω2)]^{1/2} e^{−JB(ω1,ω2)}

(Figure: histogram over the range 0.8–1.6, counts up to 25.)
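For intuition about the JB values, the Bhattacharyya bounds on the pairwise Bayes error can be evaluated directly (equal priors are assumed by default; the function name is mine):

```python
import math

def bayes_error_bounds(JB, P1=0.5, P2=0.5):
    """Lower and upper bounds on the two-class Bayes error eps_{1,2}
    implied by the Bhattacharyya distance JB and the priors P1, P2."""
    upper = math.sqrt(P1 * P2) * math.exp(-JB)
    lower = 0.5 * (1 - math.sqrt(1 - 4 * P1 * P2 * math.exp(-2 * JB)))
    return lower, upper
```

For example, JB = 5.00 confines the pairwise Bayes error to roughly 0.001%–0.34%, while JB = 0 gives the uninformative 50%.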
Mouse liver cell nuclei in TEM
→ Class differences and distances are very localized!
→ Adaptive features give high class distances!
Class Difference Matrix: normal − premalignant, run length difference matrix.
Class Distance Matrix: normal − premalignant, run length difference matrix.
Ovarian Cancer
Eight monolayer cell nuclei from a good
prognosis sample (upper) and eight nuclei
from a bad prognosis sample (lower).
FIGO stage I Ovarian Cancer
peripheral and central segments of the cell nuclei
(Figure: surface plots of the class difference matrices for the central
(left) and peripheral (right) segments.)
• The positive/negative parts of the class difference
are found in different locations in the matrices
from center (left) and periphery (right).
• These subtle texture differences are very hard to see
from the gray level images themselves.
INF 386, 2003, Lecture 3, page 29 of 32
FIGO stage I Ovarian Cancer
chromatin structure size and contrast differences
• A gray level difference of one between neighboring runs is less probable
in cell nuclei from good prognosis samples than in those from bad
prognosis samples.
• Larger gray level differences are more probable in good prognosis samples.
• Subvisual difference in texel size and contrast between classes.
Gray Level Entropy Matrices
• GLEM is a way of extracting higher order texture information.
• The GLEM element P (i, H|w) gives an estimate of the probability of finding a first
order (histogram) entropy H for a window of size w × w centered on a pixel having
gray level value i.
• The entropy value H is defined from the normalized gray level histogram p(g) within
the window by

H = − Σ_{g=1}^{G} p(g) log [p(g)] , p(g) ≠ 0
• The GLEM may be computed for a variety of window sizes, w, and it is natural to
presume that the probability distribution within the matrix will vary as w is altered. It
is therefore important that P (i, H|w) is estimated for all possible locations where the
whole window is inside the image or the image segment. Otherwise, the results from
different window sizes would be mixed.
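A sketch of GLEM accumulation; binning H into n_bins equal parts of [0, log G] is my implementation choice, which the definition above leaves open:

```python
import numpy as np

def glem(f, w, G, n_bins=16):
    """Estimate P(i, H | w): for every w x w window fully inside the image,
    pair the center pixel's gray level with the first order entropy of the
    window's normalized histogram."""
    k = w // 2
    Nx, Ny = f.shape
    M = np.zeros((G, n_bins))
    H_max = np.log(G)                       # largest possible window entropy
    for x in range(k, Nx - k):
        for y in range(k, Ny - k):
            win = f[x - k:x + k + 1, y - k:y + k + 1]
            p = np.bincount(win.ravel(), minlength=G) / (w * w)
            p = p[p > 0]                    # skip the p(g) = 0 terms
            H = -(p * np.log(p)).sum()
            b = min(int(H / H_max * n_bins), n_bins - 1)
            M[f[x, y], b] += 1
    return M / M.sum()
```

Only windows wholly inside the image contribute, so M.sum() before normalization is exactly (Nx − 2k)(Ny − 2k), as required above when comparing different window sizes.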
Complexity Graylevel Matrix
• The basic concept of the CGM is that the texture information is extracted from a local
neighborhood of w × w pixels, w = 2k + 1, k ≥ 1.
• The local texture information is represented by the complexity value, which is the
number of black-to-white transitions within the neighborhood, when the center
pixel value is used as a threshold.
• The complexity value c will vary from 0 for a homogeneous binary neighborhood to
c = 2w(w − 1) = 8k² + 4k for the most complex binary neighborhood (12 for a
checkered 3 × 3 pattern).
• Computing the local complexity value over the whole graylevel image, we
accumulate a 2D histogram N (i, j), giving the number of windows having center
pixel graylevel value i and complexity value j. Normalizing the 2D histogram N (i, j)
we get the CGM
CGM(i, j) = N(i, j) / ((Nx − 2k)(Ny − 2k)), i ∈ {0, 1, ..., G − 1}, j ∈ {0, 1, ..., 2w(w − 1)}
where the Nx × Ny image has G graylevels, and a sliding w × w window has been used.
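The complexity value for one neighborhood can be sketched as below; counting horizontal plus vertical transitions in the binarized window yields exactly the range 0..2w(w − 1), and thresholding with ≥ at the center value is my convention:

```python
import numpy as np

def complexity(win):
    """Black-to-white transitions in a w x w window binarized at the
    center pixel value; 0 for homogeneous, 2w(w-1) for a checkerboard."""
    w = win.shape[0]
    b = (win >= win[w // 2, w // 2]).astype(int)  # threshold at center pixel
    horiz = np.abs(np.diff(b, axis=1)).sum()      # w rows of w-1 transitions
    vert = np.abs(np.diff(b, axis=0)).sum()       # w-1 rows of w transitions
    return horiz + vert
```

Sliding this over the whole gray level image and histogramming the (center gray level, complexity) pairs gives N(i, j), which the normalization above turns into the CGM.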