WORKING PAPER
ALFRED P. SLOAN SCHOOL OF MANAGEMENT
ASYMPTOTIC PROPERTIES OF BIVARIATE K-MEANS CLUSTERS
M. Anthony Wong
W.P. #1216-81
May 1981
MASSACHUSETTS
INSTITUTE OF TECHNOLOGY
50 MEMORIAL DRIVE
CAMBRIDGE, MASSACHUSETTS 02139
ASYMPTOTIC PROPERTIES OF BIVARIATE K-MEANS CLUSTERS
M. Anthony Wong
Sloan School of Management
Massachusetts Institute of Technology
Cambridge, MA 02139
Key Words and Phrases: k-means clusters; within cluster sum of squares; graph theory; regular hexagons.
ABSTRACT
A bounded region in $\mathbb{R}^2$ with a uniform density function defined over it is partitioned into $k$ sub-regions such that the within cluster sum of squares is minimized. An asymptotic ($k \to \infty$) lower bound for the within cluster sum of squares of this optimal k-means partition is obtained. This lower bound is useful in suggesting that the graph configuration of the optimal k-partition would consist of regular hexagons of equal size when $k$ is large enough. An empirical study illustrating these asymptotic properties of bivariate k-means clusters is also presented.
1. INTRODUCTION
Let the observations $x_1, \ldots, x_N$ be sampled from a distribution $F$ with density function $f$. In cluster analysis, the k-means clustering method (see Hartigan (1975), Chapter 4) is often used to partition the sample of $N$ observations into $k$ clusters with means $\bar{x}_1, \ldots, \bar{x}_k$. The resultant clusters satisfy the property that no movement of an observation from one cluster to another reduces the sample within cluster sum of squares
$$\mathrm{WSS}_N = \sum_{i=1}^{N} \min_{1 \le j \le k} \|x_i - \bar{x}_j\|^2 / (N - k).$$
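The sample criterion above can be evaluated directly once a set of cluster means is given. The following is a minimal sketch, assuming points and means are stored as tuples of floats; the function name `sample_wss` and the toy data are illustrative and do not appear in the original paper.

```python
def sample_wss(points, means):
    """Return WSS_N = sum_i min_j ||x_i - mean_j||^2 / (N - k)."""
    n, k = len(points), len(means)
    if n <= k:
        raise ValueError("need more observations than clusters")
    total = 0.0
    for x in points:
        # Squared distance from x to its closest cluster mean.
        total += min(sum((a - b) ** 2 for a, b in zip(x, m)) for m in means)
    return total / (n - k)


# Toy data (hypothetical): two clusters whose means are (0, 0.5) and (4, 1).
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0), (4.0, 2.0)]
means = [(0.0, 0.5), (4.0, 1.0)]
print(sample_wss(points, means))
```

The divisor $N - k$ follows the definition in the text; replacing it with $N$ would give the plain average squared distance instead.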
For these k-means clusters, a k-partition of the sampled space can be defined by associating each cluster mean $\bar{x}_j$ with the convex polyhedron $C_j$ of all points in the sampled space closer to $\bar{x}_j$ than to any other cluster mean. The corresponding optimal k-means partition in the population $F$ is defined by the population cluster means $\mu_1, \ldots, \mu_k$, which are selected in such a way that the within cluster sum of squares
$$\mathrm{WSS} = \int \min_{1 \le j \le k} \|x - \mu_j\|^2 \, dF$$
is minimized.
For a fixed number of clusters $k$, the asymptotic convergence (as $N \to \infty$) of the sample k-means clusters to the population k-means clusters has been studied by MacQueen (1967), Hartigan (1978), and Pollard (1981). The most recent result can be found in Pollard (1981), in which conditions are found that ensure the almost sure convergence of the set of means of the k-means clusters. However, the asymptotic properties of k-means clustering in the case where the number of clusters $k$ increases with the sample size $N$ have not received much attention.
In Hartigan and Wong (1979a) and Wong (1980), some asymptotic properties (as $k \to \infty$) of the population k-means clusters in one dimension are obtained. It is shown that the within sums of squares of the $k$ clusters are asymptotically equal, and that the length of the jth cluster interval ($1 \le j \le k$) is inversely proportional to $f(c_j)^{1/3}$, where $c_j$ is the midpoint of the jth cluster interval. It is also shown that if $k(N) \to \infty$ as $N \to \infty$ with $k(N) = o((N/\log N)^{1/3})$, then the sample k-means clusters have asymptotic properties similar to those of the population clusters. Using these results, it can also be shown that a uniformly consistent histogram estimate of $f$, which is constant over each k-means cluster interval, can be constructed from the sample using the k-means method.
Unfortunately, these univariate results cannot be easily generalized to the multivariate case. Only empirical evidence exists to support the conjecture that similar asymptotic results hold in several dimensions for k-means clusters, and that a uniformly consistent histogram estimate of a multivariate density $f$ can be constructed by the k-means method. The latter uniform consistency result is of special practical interest, as it would justify the use of the computationally efficient k-means method (Hartigan and Wong, 1979b) for estimating multivariate density functions from large samples.
In this paper, some asymptotic properties of the population k-means clusters for uniform distributions in $\mathbb{R}^2$ are given. In Section 2, an asymptotic lower bound for the WSS of the optimal k-means partition is obtained. Since this lower bound is attained when all $k$ clusters of the partition are regular hexagons of equal area, this result suggests that the graph configuration of the optimal k-means partition would consist of regular hexagons when $k$ is large enough. An empirical study is performed to illustrate these asymptotic properties of bivariate k-means clusters, and the results are given in Section 3.
The results given in this paper fall short of generalizing the asymptotic properties of the univariate population k-means clusters to the bivariate case. However, they are the first results obtained in the investigation of the properties of bivariate k-means clusters.
2. AN ASYMPTOTIC LOWER BOUND FOR THE WITHIN CLUSTER SUM OF SQUARES OF K-MEANS CLUSTERS
In this paper, some asymptotic properties ($k \to \infty$) of the population k-means clusters for the uniform density in two dimensions are given. The main result is Theorem 2, which gives an asymptotic lower bound for the within cluster sum of squares of the optimal k-means partition.
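Theorem 2: Let $\mathcal{A}$ be a region of area $A$ with a connected interior in $\mathbb{R}^2$. Suppose that the boundary of $\mathcal{A}$ has finite length and let $1/A$ be the constant density over $\mathcal{A}$. Let WSS be the minimum within cluster sum of squares over all k-partitions. Then
$$\lim_{k \to \infty} \mathrm{WSS} \Big/ \Bigl( \frac{5\sqrt{3}}{54} \, \frac{A}{k} \Bigr) = 1.$$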
(Remark: Since the asymptotic lower bound given in Theorem 2 is attained when all $k$ clusters of the partition are regular hexagons of equal area, this result suggests that the graph configuration of the optimal k-means partition would consist of regular hexagons when $k$ is large enough.)
In outline, the proof of Theorem 2 requires first showing that the polygon with $k$ edges and area $A$ which has the minimum within polygon sum of squares is regular (see Lemma 1). For any polygon divided into $k$ clusters, a lower bound for the limiting value (lim inf) of the within cluster sum of squares may then be found by assuming the clusters are regular hexagons (Theorem 1). An upper bound may also be found by covering the polygon with regular hexagons. Since the ratio of the two bounds approaches 1 as $k \to \infty$, Theorem 2 follows. Hence, to show the result in Theorem 2, we need the following lemmas:
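LEMMA 1
If $f$ is the constant density over an n-sided polygon $\mathcal{A}$ of area $A$, then a lower bound for $\mathrm{WPSS}_n$, the within polygon sum of squares of $\mathcal{A}$, is given by
$$\mathrm{WPSS}_n \ge \frac{f A^2}{2n} \Bigl[ \frac{1}{\tan(\pi/n)} + \frac{1}{3} \tan\frac{\pi}{n} \Bigr].$$
However, to prove Lemma 1, we need Lemma 1.1 and Lemma 1.2.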
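The bound in Lemma 1 can be checked numerically against the exact second moment of a polygon about its centroid. The sketch below is an illustration only: it assumes the standard vertex-based formulas for the centroid and polar second moment of a simple polygon with vertices listed counter-clockwise, and the helper names (`regular_bound`, `wpss`, `regular_polygon`) are hypothetical rather than taken from the paper.

```python
import math


def regular_bound(n, area=1.0, f=1.0):
    """Lemma 1 bound: f * A^2 / (2n) * [1/tan(pi/n) + tan(pi/n)/3]."""
    t = math.tan(math.pi / n)
    return f * area ** 2 / (2 * n) * (1.0 / t + t / 3.0)


def centroid(verts):
    """Centroid of a simple polygon (vertices in counter-clockwise order)."""
    a2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(verts, verts[1:] + verts[:1]):
        cross = x0 * y1 - x1 * y0
        a2 += cross                 # accumulates twice the signed area
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * a2), cy / (3.0 * a2)


def wpss(verts, f=1.0):
    """f times the integral of r^2 over the polygon, r measured from its centroid."""
    cx, cy = centroid(verts)
    pts = [(x - cx, y - cy) for x, y in verts]
    total = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        cross = x0 * y1 - x1 * y0
        total += cross * (x0 * x0 + x0 * x1 + x1 * x1
                          + y0 * y0 + y0 * y1 + y1 * y1)
    return f * total / 12.0


def regular_polygon(n, area=1.0):
    """Vertices of a regular n-gon of the given area, centred at the origin."""
    r = math.sqrt(2.0 * area / (n * math.sin(2.0 * math.pi / n)))
    return [(r * math.cos(2.0 * math.pi * i / n),
             r * math.sin(2.0 * math.pi * i / n)) for i in range(n)]


hexagon = regular_polygon(6)
rectangle = [(0.0, 0.0), (2.0, 0.0), (2.0, 0.5), (0.0, 0.5)]  # area 1, not regular
print(wpss(hexagon), regular_bound(6), 5.0 * math.sqrt(3.0) / 54.0)
print(wpss(rectangle), regular_bound(4))
```

For the regular hexagon the exact moment coincides with the bound, while the elongated rectangle lies strictly above the bound for $n = 4$, which is the behaviour the lemma describes.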
Lemma 1.1
For a triangle VTS with fixed area $A_0$ and fixed angle $\theta_0$ at the vertex $V$, the minimum value of $\int r^2 \, d(\mathrm{Area})$ (where $r$ is the distance from $V$) is
$$\frac{A_0^2}{2} \Bigl[ \frac{1}{\tan \frac{1}{2}\theta_0} + \frac{1}{3} \tan \tfrac{1}{2}\theta_0 \Bigr],$$
achieved when the triangle is isosceles with equal edges adjacent to $V$.
Proof.
Consider the triangle VTS with angle $SVT = \theta_0$ and area $A_0$. Let $M$ be the midpoint between the vertices $S$ and $T$ (see Fig. 1). Without loss of generality, let $M$ be at the origin. Let $\|TM\| = t$ and let $e$ be the unit vector along the base ST (x-axis). Also, let $V$ be represented by $z$. If ST is rotated by $d\theta$ about $M$ to S'T' (see Fig. 1), it follows that the increment in $\int r^2 \, d(\mathrm{Area})$, the second moment about $V$, is given by
$$\int_0^t \bigl( \|z + xe\|^2 - \|z - xe\|^2 \bigr) \, x \, dx \, d\theta = 4 (z \cdot e) \int_0^t x^2 \, dx \, d\theta = \tfrac{4}{3} (z \cdot e) \, t^3 \, d\theta.$$
Thus, this increment is positive or negative as $z \cdot e$ is positive or negative. Hence the minimum occurs when $z \cdot e = 0$; that is, when the triangle is isosceles.
Now, if we move towards the isosceles position by such a rotation, the area increases. Thus, for a given triangle VTS, we can decrease the 2nd moment about $V$ by first rotating to the isosceles position, and then sliding the base back towards $V$ until the triangle has area equal to $A_0$. Since the 2nd moment about $V$ for an isosceles triangle of area $A_0$ is
$$\frac{A_0^2}{2} \Bigl[ \frac{1}{\tan \frac{1}{2}\theta_0} + \frac{1}{3} \tan \tfrac{1}{2}\theta_0 \Bigr],$$
the lemma follows.
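The closed form for the isosceles triangle can be worked out directly. The derivation below is a sketch under an assumed coordinate setup that is not in the original: the apex $V$ is placed at the origin, the axis of symmetry lies along the x-axis, and $h$ denotes the height of the triangle.

```latex
\begin{align*}
A_0 &= \int_0^h 2x \tan\tfrac{\theta_0}{2} \, dx = h^2 \tan\tfrac{\theta_0}{2}, \\
\int r^2 \, d(\mathrm{Area})
  &= \int_0^h \int_{-x\tan(\theta_0/2)}^{x\tan(\theta_0/2)} (x^2 + y^2) \, dy \, dx
   = \int_0^h \Bigl( 2x^3 \tan\tfrac{\theta_0}{2} + \tfrac{2}{3} x^3 \tan^3\tfrac{\theta_0}{2} \Bigr) dx \\
  &= \frac{h^4}{2} \Bigl[ \tan\tfrac{\theta_0}{2} + \tfrac{1}{3} \tan^3\tfrac{\theta_0}{2} \Bigr]
   = \frac{A_0^2}{2} \Bigl[ \frac{1}{\tan(\theta_0/2)} + \frac{1}{3} \tan\tfrac{\theta_0}{2} \Bigr],
\end{align*}
```

where the last step substitutes $h^4 = A_0^2 / \tan^2(\theta_0/2)$.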
Lemma 1.2
Let VT and VS be two lines in $\mathbb{R}^2$ with angle $TVS = \theta_0$. Suppose that $Q$ is a union of quadrilaterals (whose interiors are disjoint), all of whose vertices lie on VT and VS. Let the area of $Q$ be $A_0$. Then $\int_Q r^2 \, d(\mathrm{Area})$, the second moment about $V$, is minimized when $Q$ is an isosceles triangle.
Proof.
[I] Fix an integer $i$ and a real number $u$, and let $\mathcal{Q}(i,u)$ be the set of plane figures such that every $Q \in \mathcal{Q}(i,u)$ is a union of $i$ quadrilaterals (whose interiors are disjoint), all of whose vertices $L$ lie on VT or VS with $\|VL\| \le u$. Using the labeling system shown in Fig. 2, it is clear that every $Q \in \mathcal{Q}(i,u)$ is uniquely determined by the set of vertices $(x_1, \ldots, x_{4i})$, where the $x_j$'s are the distances from $V$, ordered along each line and bounded by $u$.
Fig. 2. An example of a $Q \in \mathcal{Q}(4,u)$ (the $x_j$'s are the distances from $V$).
Thus, $\mathcal{Q}(i,u)$ can be identified with a compact subset of $[0,u]^{4i}$, from which it inherits a natural topology (pointwise convergence of the $x_j$'s). Under this topology, the two mappings $f_1, f_2 : \mathcal{Q}(i,u) \to \mathbb{R}$, where
$$f_1(Q) = \int_Q r^2 \, d(\mathrm{Area}) \quad \text{and} \quad f_2(Q) = \text{area of } Q,$$
are continuous on $\mathcal{Q}(i,u)$. Therefore, if $A_0$ is no larger than the area of the triangle available within distance $u$ of $V$, the set $\{ Q \in \mathcal{Q}(i,u) : f_2(Q) = A_0 \}$ is nonempty and compact. It follows that the minimum of $f_1(Q)$ over this set is attained by some $Q_0$.
[II] Next, we will show that, for any $i$ and $u$, the minimum could not be attained at $Q_0 \in \mathcal{Q}(i,u)$ if $Q_0$ were not an isosceles triangle. Using Lemma 1.1 to eliminate other cases, it is sufficient to consider the case when $i = 2$ and $Q_0$ = triangle VBA $\cup$ triangle AFH (see Fig. 3), with (1) $\|VB\| > \|VA\|$ and (2) $\|VF\| > \|VH\|$.
Consider the triangle AFH. Choose a point $C$ on BF close to $F$. Let the line through $C$ parallel to AH cut the triangle AFH along the segment CG. Translate CG along this line to C*G*, with $\|C^*G^*\| = \|CG\|$, to form a trapezium C*G*HA along VS. By this construction (see Fig. 3), the trapeziums CGHA and C*G*HA have the same area, but
$$\int_{C^*G^*HA} r^2 \, d(\mathrm{Area}) < \int_{CGHA} r^2 \, d(\mathrm{Area}).$$
Produce HG to intersect VT at $D$. Since $\|CG\| = \|C^*G^*\|$, triangle C*FG* has a larger area than triangle CDG. Part of the area of C*FG* can therefore be redistributed to complete the quadrilateral CDHA with a decrease in 2nd moment. The remaining area can be added to triangle VBA to produce triangle VB*A. If $\|CF\|$ is small enough, all the points of triangle C*FG* are further from $V$ than all the points of triangle ABB*; the 2nd moment is thus decreased.
[III] From the results of [I] and [II], the minimum of $f_1(Q)$ over $\{ Q \in \mathcal{Q}(i,u) : f_2(Q) = A_0 \}$ is attained by $Q_0$, the isosceles triangle with equal edges adjacent to $V$. Notice that it is the same triangle $Q_0$ for each $\mathcal{Q}(i,u)$. Thus $Q_0$ gives the minimum value of $\int_Q r^2 \, d(\mathrm{Area})$ over $\bigcup \mathcal{Q}(i,u)$.
Now, we can proceed to obtain the result given in Lemma 1.
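Proof of Lemma 1.
[I] Consider a given n-sided polygon $\mathcal{A}$ of area $A$. By joining the centroid $C$ and each of the $n$ vertices of $\mathcal{A}$, we obtain an n-partition of $\mathcal{A}$ defined by the $n$ cones radiating from $C$ (see Fig. 4). Let $\mathcal{A}_i$ be the subset of $\mathcal{A}$ in the ith cone, let $A_i$ be the area of $\mathcal{A}_i$, and let $2\theta_i$ ($< \pi$) be the angle subtended at $C$ by the ith cone. Then
$$\sum_{i=1}^{n} A_i = A \quad \text{and} \quad \sum_{i=1}^{n} \theta_i = \pi.$$
From Lemma 1.1 and Lemma 1.2, we have
$$(1) \quad f \int_{\mathcal{A}_i} r^2 \, d(\mathrm{Area}) \ge \frac{f}{2} A_i^2 \Bigl[ \frac{1}{\tan\theta_i} + \frac{1}{3} \tan\theta_i \Bigr] \quad \text{for each } 1 \le i \le n,$$
where $r$ is the distance from $C$. Summing over $i$, we have
$$\mathrm{WPSS}_n \ge \frac{f}{2} \sum_{i=1}^{n} A_i^2 \Bigl[ \frac{1}{\tan\theta_i} + \frac{1}{3} \tan\theta_i \Bigr].$$
[II] Next, we will find the minimum of $\sum_{i=1}^{n} A_i^2 \bigl[ \frac{1}{\tan\theta_i} + \frac{1}{3} \tan\theta_i \bigr]$ under the constraints: (i) $\sum_{i=1}^{n} A_i = A$ and (ii) $\sum_{i=1}^{n} \theta_i = \pi$.
Now, using Lagrange multipliers, the minimum must satisfy:
$$(2) \quad 2 A_i \Bigl[ \frac{1}{\tan\theta_i} + \frac{1}{3} \tan\theta_i \Bigr] = c_1$$
and
$$(3) \quad A_i^2 \Bigl[ -(1 + \tan^{-2}\theta_i) + \frac{1}{3} (1 + \tan^2\theta_i) \Bigr] = c_2$$
for $1 \le i \le n$, where $c_1$ and $c_2$ are constants independent of $i$. Thus, squaring (2) and then dividing by (3), we have
$$\tan^{-2}\theta_i \bigl( 1 + \tfrac{1}{3} \tan^2\theta_i \bigr)^2 \Big/ \Bigl[ -(1 + \tan^{-2}\theta_i) + \tfrac{1}{3} (1 + \tan^2\theta_i) \Bigr] = \text{constant},$$
and hence
$$(4) \quad \bigl( \tfrac{1}{3} \tan^4\theta_i - \tfrac{2}{3} \tan^2\theta_i - 1 \bigr) \Big/ \bigl( 1 + \tfrac{2}{3} \tan^2\theta_i + \tfrac{1}{9} \tan^4\theta_i \bigr) = c,$$
where $c$ is a constant independent of $i$. Since the left side of (4) is a strictly increasing function of $\tan^2\theta_i$, $\tan^2\theta_i$ is a constant. But $\theta_i < \pi/2$ for all $i$, so we must have $\theta_i = \pi/n$ for all $1 \le i \le n$. Also, from (2), $A_i = A/n$ for all $1 \le i \le n$.
[III] Using the result of [II], we have from (1) that
$$\mathrm{WPSS}_n \ge \frac{f}{2} \sum_{i=1}^{n} A_i^2 \Bigl[ \frac{1}{\tan\theta_i} + \frac{1}{3} \tan\theta_i \Bigr] \ge \frac{f A^2}{2n} \Bigl[ \frac{1}{\tan(\pi/n)} + \frac{1}{3} \tan\frac{\pi}{n} \Bigr],$$
and the equality holds when the given n-polygon $\mathcal{A}$ is regular, which gives the lemma.
(Remark: In the application of this lemma, $f$ is usually the constant density over a region containing the n-sided polygon $\mathcal{A}$. Thus, $f \le 1/A$ for most applications.)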
Next, in Theorem 1, we will obtain a lower bound for the within cluster sum of squares of the optimal k-means partition of a polygon in $\mathbb{R}^2$. However, it is important to first establish that it is sufficient to consider only "3-edge" k-partitions (k-partitions whose corresponding graph configurations have the property that all the interior vertices are associated with exactly 3 edges). Hence, we need Lemma 2.
Lemma 2
Let $\mathcal{A}$ be a region with connected interior in $\mathbb{R}^2$. Suppose that the constant density over $\mathcal{A}$ is $f$ and that $\mathcal{A}$ is partitioned into $k$ regions. Then for every k-partition and for every $\epsilon > 0$, there exists a "3-edge" partition whose within cluster sum of squares differs from that of the given partition by at most $\epsilon$.
Proof.
Since the within cluster sum of squares, WSS, can be expressed in the form
$$\mathrm{WSS} = \sum_{i=1}^{k} \mathrm{WSS}_i, \quad \text{where } \mathrm{WSS}_i = f \int_{\mathcal{A}_i} r_i^2 \, d(\mathrm{Area}),$$
$r_i$ = distance to ith cluster centroid and $\mathcal{A}_i$ is the ith cluster, it is clear that WSS is a continuous function of the vertices of the k-partition, for every fixed $k$. The lemma follows.
Theorem 1: Let $\mathcal{A}$ be a polygon with a connected interior in $\mathbb{R}^2$. Suppose that $\mathcal{A}$ has area $A$ and that $f$ is the constant density over $\mathcal{A}$. Let WSS be the minimum within cluster sum of squares over all possible k-partitions of $\mathcal{A}$. Then
$$\liminf_{k \to \infty} \mathrm{WSS} \Big/ \Bigl( \frac{f A^2}{k} \cdot \frac{5\sqrt{3}}{54} \Bigr) \ge 1.$$
Proof.
Let $A_i$ be the area of the ith cluster and $E_i$ be the number of edges of the ith cluster.
[I] We will first obtain an expression for $\sum_{i=1}^{k} E_i$. Consider the configuration of the optimal k-partition of $\mathcal{A}$. Using Lemma 2 and by choosing an arbitrarily small $\epsilon$, it is enough to examine a "3-edge" k-partition.
Let $n$ be the number of vertices of the polygon $\mathcal{A}$. Then, using a continuity argument, it can be shown that it is enough to consider partitions with $n = n_2$, where $n_2$ is the number of vertices in the configuration of the given partition associated with exactly two edges. Let $n_3$ be the number of vertices associated with 3 edges in the configuration. Using some results in graph theory, we have
$$(1) \quad 2E = 3 n_3 + 2n,$$
where $E$ is the total number of edges. Moreover, Euler's formula gives $E + 1 = F + V$, where $F$ is the number of faces (clusters) and $V$ is the number of vertices. Therefore, from (1), we have
$$\tfrac{1}{2} (3 n_3 + 2n) + 1 = k + (n_3 + n),$$
which gives
$$(2) \quad n_3 = 2(k - 1).$$
Hence from (1) and (2),
$$(3) \quad \sum_{i=1}^{k} E_i = 2E - \text{number of edges around the perimeter} = 2E - (n_B + n) = 6(k-1) + n - n_B,$$
where $n_B$ is the number of "3-edge" vertices on the boundary of $\mathcal{A}$. (Relationship (3) holds for all partitions in which the vertices of the polygon $\mathcal{A}$ have two edges meeting them, and all remaining vertices have three.)
[II] Next, we will find a lower bound for WSS. Let $\mathrm{WSS}_i$ be the within polygon sum of squares of the ith cluster. Then $\mathrm{WSS} = \sum_{i=1}^{k} \mathrm{WSS}_i$. By Lemma 1, we have $\mathrm{WSS}_i \ge f A_i^2 \, g(E_i)$. Therefore,
$$\mathrm{WSS} \ge f \sum_{i=1}^{k} A_i^2 \, g(E_i), \quad \text{where } g(E_i) = \frac{1}{2 E_i} \Bigl[ \frac{1}{\tan(\pi/E_i)} + \frac{1}{3} \tan\frac{\pi}{E_i} \Bigr].$$
Now the minimum of $\sum_{i=1}^{k} A_i^2 g(E_i)$ with all the $E_i$'s being real numbers is not greater than its minimum with all the $E_i$'s being integers, when both are subjected to the constraints:
(i) $\sum_{i=1}^{k} A_i = A$ and (ii) $\sum_{i=1}^{k} E_i = 6(k-1) + n - n_B$.
Consider the minimum value of $\sum_{i=1}^{k} A_i^2 g(E_i)$ with all the $E_i$'s being real numbers. Using Lagrange multipliers, this minimum must satisfy
(iii) $A_i \, g(E_i) = c_1$ and (iv) $A_i^2 \, g'(E_i) = c_2$ for $i = 1, \ldots, k$,
where $c_1$ and $c_2$ are constants independent of $i$. It follows from (iii) and (iv) that
$$g'(E_i) / g(E_i)^2 = \text{constant}.$$
But it can be shown that the first derivative of $-1/g$ is a monotone decreasing function of $E_i$. Therefore, the minimum of $\sum_{i=1}^{k} A_i^2 g(E_i)$ must have
(v) $E_i = \sum_{i=1}^{k} E_i / k = 6 - (n_B + 6 - n)/k$ and (vi) $A_i = A/k$ for all $1 \le i \le k$.
Thus,
$$(4) \quad \mathrm{WSS} = \sum_{i=1}^{k} \mathrm{WSS}_i \ge \frac{f A^2}{k} \, g\bigl( 6 - (n_B + 6 - n)/k \bigr).$$
Now, for $k > 4$, a lower bound of $n_B$ is 3. It follows that
$$6 - (n_B + 6 - n)/k \le 6 \bigl( 1 + (n - 9)/6k \bigr).$$
Since $g$ is a monotone decreasing function of $E_i$,
$$g\bigl( 6 - (n_B + 6 - n)/k \bigr) \ge g\bigl( 6 (1 + (n - 9)/6k) \bigr).$$
Therefore, from (4), we have
$$(5) \quad \mathrm{WSS} \ge \frac{f A^2}{k} \, g\bigl( 6 (1 + (n - 9)/6k) \bigr).$$
Now for fixed $n$, $6(1 + (n - 9)/6k) \to 6$ as $k \to \infty$. Therefore, since $g$ is continuous and $g(6) = 5\sqrt{3}/54$,
$$\liminf_{k \to \infty} \mathrm{WSS} \Big/ \Bigl( \frac{f A^2}{k} \cdot \frac{5\sqrt{3}}{54} \Bigr) \ge 1,$$
and the theorem follows.
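The two properties of $g$ used at the end of this proof (monotone decrease in the number of edges, and $g(E) \to g(6) = 5\sqrt{3}/54$ as the argument tends to 6) are easy to inspect numerically. The following is a minimal sketch; the function names and the choice $n = 12$ for the number of polygon vertices are illustrative assumptions, not values from the paper.

```python
import math


def g(e):
    """g(E) = 1/(2E) * [1/tan(pi/E) + tan(pi/E)/3], defined for real E > 2."""
    t = math.tan(math.pi / e)
    return (1.0 / t + t / 3.0) / (2.0 * e)


HEX = 5.0 * math.sqrt(3.0) / 54.0  # value of g at E = 6 (regular hexagon)


def shifted_argument(n, k):
    """The argument 6(1 + (n - 9)/(6k)) appearing in inequality (5)."""
    return 6.0 * (1.0 + (n - 9.0) / (6.0 * k))


# g decreases as the number of edges grows (checked here on E = 3, ..., 12).
values = [g(e) for e in range(3, 13)]
print(all(a > b for a, b in zip(values, values[1:])))

# For a fixed polygon (n = 12 vertices, an arbitrary choice), the ratio of the
# bound in (5) to the hexagonal constant approaches 1 as k grows.
for k in (10, 100, 1000, 10000):
    print(k, g(shifted_argument(12, k)) / HEX)
```

Because $g$ changes slowly near $E = 6$, the ratio printed in the loop is already close to 1 for moderate $k$, which is consistent with the limit taken in the proof.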
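Corollary
Let $\mathcal{A}$ be a region with a connected interior in $\mathbb{R}^2$ whose boundary is of finite length. Let $A$ be the area of $\mathcal{A}$ and let $1/A$ be the constant density over $\mathcal{A}$. Let WSS be the minimum within cluster sum of squares over all k-partitions of $\mathcal{A}$. Then
$$\liminf_{k \to \infty} \mathrm{WSS} \Big/ \Bigl( \frac{5\sqrt{3}}{54} \, \frac{A}{k} \Bigr) \ge 1.$$
Proof.
For each $n$, let $\mathcal{A}_n$ be an n-sided polygon of area $A_n$ approximating $\mathcal{A}$ from the inside. Then $A_n / A = 1 + \epsilon_n$, where $\epsilon_n \to 0$ as $n \to \infty$. Denote the minimum within cluster sum of squares over all k-partitions of $\mathcal{A}_n$ by $\mathrm{WSS}_n$. Then, since $\mathcal{A}_n \subset \mathcal{A}$, $\mathrm{WSS}_n \le \mathrm{WSS}$. Thus, by putting $f = 1/A$, we have
$$\liminf_{k \to \infty} \mathrm{WSS} \Big/ \Bigl( \frac{5\sqrt{3}}{54} \, \frac{A}{k} \Bigr) \ge \liminf_{k \to \infty} \mathrm{WSS}_n \Big/ \Bigl( \frac{5\sqrt{3}}{54} \, \frac{A}{k} \Bigr).$$
By Theorem 1 applied to $\mathcal{A}_n$ with $f = 1/A$, the right side is at least $(A_n / A)^2 = (1 + \epsilon_n)^2$. Letting $n \to \infty$, we obtain the Corollary.
Theorem 2
Under the hypothesis stated in the Corollary,
$$\lim_{k \to \infty} \mathrm{WSS} \Big/ \Bigl( \frac{5\sqrt{3}}{54} \, \frac{A}{k} \Bigr) = 1.$$
Proof.
Using the Corollary, it is sufficient to show
$$\limsup_{k \to \infty} \mathrm{WSS} \Big/ \Bigl( \frac{5\sqrt{3}}{54} \, \frac{A}{k} \Bigr) \le 1.$$
Now given the region $\mathcal{A}$, we can always construct a region $\mathcal{B}$ of area $B$ consisting of $k$ connected regular hexagons such that
(1) $\mathcal{B} \supset \mathcal{A}$ (see Fig. 5), and
(2) $B/A \to 1$ as $k \to \infty$.
Let $1/A$ be the constant density over $\mathcal{B}$ and let $\mathrm{WSS}_{\mathcal{B}}$ be the minimum within cluster sum of squares over all k-partitions of $\mathcal{B}$. Then, since the $k$ regular hexagons form a k-partition of $\mathcal{B}$,
$$\frac{5\sqrt{3}}{54} \, \frac{B^2}{A k} \ge \mathrm{WSS}_{\mathcal{B}}.$$
Now, from (1), $\mathrm{WSS}_{\mathcal{B}} \ge \mathrm{WSS}$, and hence
$$\frac{5\sqrt{3}}{54} \, \frac{B^2}{A k} \ge \mathrm{WSS}.$$
Thus,
$$(3) \quad \mathrm{WSS} \Big/ \Bigl( \frac{5\sqrt{3}}{54} \, \frac{A}{k} \Bigr) \le B^2 / A^2,$$
and the theorem follows from (2) and (3).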
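The hexagonal constant and the upper bound used in the proof of Theorem 2 follow from Lemma 1 with $n = 6$. The short computation below is a worked check, assuming (as in the proof) $k$ regular hexagons of equal area $B/k$ carrying the constant density $1/A$.

```latex
\begin{align*}
\frac{1}{2 \cdot 6} \Bigl[ \frac{1}{\tan(\pi/6)} + \frac{1}{3} \tan\frac{\pi}{6} \Bigr]
  &= \frac{1}{12} \Bigl[ \sqrt{3} + \frac{\sqrt{3}}{9} \Bigr]
   = \frac{5\sqrt{3}}{54}, \\
k \cdot \frac{1}{A} \Bigl( \frac{B}{k} \Bigr)^2 \cdot \frac{5\sqrt{3}}{54}
  &= \frac{5\sqrt{3}}{54} \, \frac{B^2}{A k}.
\end{align*}
```

Dividing the second line by $\frac{5\sqrt{3}}{54} \frac{A}{k}$ gives the ratio $B^2 / A^2$ appearing in (3).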
The result of Theorem 2 only gives a lower bound for the overall within cluster sum of squares of the k-means partition. It falls short of showing that the within sums of squares of the $k$ clusters are asymptotically equal (a conjecture due to Professor John A. Hartigan). Much work has yet to be done to prove the conjecture for distributions in two or more dimensions.
3. EMPIRICAL ILLUSTRATIONS
In order to illustrate the asymptotic properties of bivariate k-means clusters obtained in Section 2, an empirical study is performed using bivariate samples generated according to the uniform distribution on the unit square. It is necessary to estimate the within cluster sum of squares WSS for various values of $k$ from generated samples because the WSS for the optimal k-means partition of the unit square cannot be obtained analytically for large values of $k$. Here, the results of three sets of experiments using different sample sizes are reported.
In Experiment One, four different samples of size $N = 1500$ are generated from the uniform distribution on the unit square. Using $k$ = 40, 50, 60, and 70 for the different samples, unbiased estimates $\mathrm{WSS}_N$ of WSS for the different numbers of clusters are obtained. The values of $\mathrm{WSS}_N$ for the various values of $k$ are given alongside the corresponding asymptotic lower bounds for WSS (that is, $\frac{5\sqrt{3}}{54} \cdot \frac{1}{k}$) in Table 1, and the corresponding pairs are found to be in close agreement with one another. Similarly, three different samples of size $N = 2500$ are generated in Experiment Two, and the values of $k$ used for the three samples are $k$ = 50, 60, and 70; while in Experiment Three, the three generated samples are of size $N = 4000$, and the values of $k$ used are $k$ = 60, 80, and 100. The resulting $\mathrm{WSS}_N$ values for these six experimental trials are also given in Table 1. Again, the values of $\mathrm{WSS}_N$ for the various values of $k$ are found to be in close agreement with the corresponding lower bounds for WSS. Hence, these empirical results tend to indicate that the asymptotic lower bound obtained in Theorem 2 is the WSS for the optimal k-means partition when $k$ becomes large.
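An experiment of this kind can be sketched in a few lines. The code below is an illustration under stated assumptions, not a reproduction of the paper's study: it uses a plain Lloyd iteration from a random start rather than the Hartigan and Wong (1979b) algorithm, pure standard-library Python, and a smaller sample than the paper's so that it runs quickly. All function names are hypothetical.

```python
import math
import random


def hexagon_bound(k, area=1.0):
    """Asymptotic lower bound (5*sqrt(3)/54) * A / k for the population WSS."""
    return 5.0 * math.sqrt(3.0) / 54.0 * area / k


def nearest(p, means):
    """Index of, and squared distance to, the mean closest to point p."""
    best, best_d = 0, float("inf")
    for j, (mx, my) in enumerate(means):
        d = (p[0] - mx) ** 2 + (p[1] - my) ** 2
        if d < best_d:
            best, best_d = j, d
    return best, best_d


def lloyd_kmeans(points, k, iters=40, seed=0):
    """Plain Lloyd iteration; a stand-in for the algorithm used in the paper."""
    rng = random.Random(seed)
    means = rng.sample(points, k)
    for _ in range(iters):
        sums = [[0.0, 0.0, 0] for _ in range(k)]
        for p in points:
            j, _d = nearest(p, means)
            sums[j][0] += p[0]
            sums[j][1] += p[1]
            sums[j][2] += 1
        # Keep the old mean if a cluster happens to be empty.
        new_means = [(s[0] / s[2], s[1] / s[2]) if s[2] else means[j]
                     for j, s in enumerate(sums)]
        if new_means == means:
            break
        means = new_means
    return means


def run_trial(n, k, seed=0):
    """Sample n uniform points on the unit square and return WSS_N."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    means = lloyd_kmeans(points, k, seed=seed)
    total = sum(nearest(p, means)[1] for p in points)
    return total / (n - k)


estimate = run_trial(600, 20, seed=1)
print(estimate, hexagon_bound(20))
```

Calling `run_trial(1500, 50)` would mimic one of the settings described in the text, at the cost of a longer run. Since Lloyd's iteration only finds a local optimum and boundary cells are not hexagonal, the estimate should be read as a rough comparison with the bound rather than a test of it.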
[Table 1: columns include Sample Size (N); the remaining entries are not recoverable.]
REFERENCES
Hartigan, J. A. (1978). Asymptotic distributions for clustering criteria. Annals of Statistics, 6, 117-131.
Hartigan, J. A. and Wong, M. A. (1979a). Hybrid Clustering. Proceedings of the 12th Interface Symposium on Computer Science and Statistics, ed. Jane Gentleman, University of Waterloo Press, 137-143.
Hartigan, J. A. and Wong, M. A. (1979b). Algorithm AS 136: A K-means clustering algorithm. Applied Statistics, 28, 100-108.
MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. Proceedings: Fifth Berkeley Symposium on Math. Statist. Prob., 1, 281-297.
Pollard, D. (1981). Strong consistency of k-means clustering. Annals of Statistics, 9, 135-140.
Wong, M. A. (1980). Asymptotic properties of k-means clustering algorithm as a density estimation procedure. Sloan School of Management Working Paper #2000-80, M.I.T., Cambridge.
Figure 1. Graph configuration of sample k-means partition (k = 50) obtained for 1500 observations from the uniform distribution.