Kernel Density Estimation
Theory and Application in Discriminant Analysis

Thomas Ledl
Universität Wien

Contents:
 Introduction
 Theory
 Aspects of Application
 Simulation Study
 Summary
Introduction

25 observations: which distribution?

[Figure: histogram of the 25 observations together with several candidate density curves, each marked "?"]
Kernel density estimator model:

K(.) and h to choose.

[Figure: kernel density estimates of the 25 observations for a triangular and a Gaussian kernel, each with a "small" and a "large" bandwidth h]
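The model itself (the formula appears only as an image on the slide) is the standard univariate kernel density estimator:

\[
\hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right),
\]

where K(.) is the kernel function and h > 0 the bandwidth.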
Question 1:
Which choice of K(.) and h is the best for a descriptive purpose?
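One way to explore Question 1 empirically — a minimal sketch, not from the slides — is to evaluate the estimator for the two kernels from the figure above at a "small" and a "large" bandwidth; the sample and the bandwidth values below are arbitrary illustrations:

```python
import numpy as np

def kde(x_grid, data, h, kernel="gaussian"):
    """Univariate kernel density estimate evaluated on x_grid."""
    u = (x_grid[:, None] - data[None, :]) / h          # scaled distances, shape (grid, n)
    if kernel == "gaussian":
        k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    elif kernel == "triangular":
        k = np.clip(1.0 - np.abs(u), 0.0, None)
    else:
        raise ValueError(kernel)
    return k.mean(axis=1) / h                          # average of the kernel bumps

rng = np.random.default_rng(0)
data = rng.normal(2.0, 0.7, size=25)                   # 25 observations, as in the example
grid = np.linspace(0.0, 4.0, 200)
dx = grid[1] - grid[0]

for kernel in ("triangular", "gaussian"):
    for h in (0.15, 0.6):                              # "small" and "large" bandwidth
        fhat = kde(grid, data, h, kernel)
        # each estimate integrates to roughly 1 (Riemann sum over the grid)
        print(kernel, h, round(float((fhat * dx).sum()), 3))
```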
Classification:

[Figure: class-wise histograms of the training data]

Levelplot – LDA (based on the assumption of a multivariate normal distribution):
Classification:

[Figure: levelplot of the LDA classification regions in the (V1, V2) plane, overlaid with the training observations labelled by class (1-5)]
Classification:

Levelplot – KDE classifier:

[Figure: levelplot of the kernel-density-based classification regions in the (V1, V2) plane, overlaid with the same training observations labelled by class (1-5)]
Question 2:
Performance of classification based on KDE in more than 2 dimensions?
Theory
Essential issues
 Optimization criteria
 Improvements of the standard model
 Resulting optimal choices of the model parameters K(.) and h
Optimization criteria

Lp-distances:

[Figure: a true density f(.) and an estimate g(.); the area under |f - g| gives the IAE ("integrated absolute error"), the area under (f - g)^2 gives the ISE ("integrated squared error")]
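In formulas (reconstructed; the slides show these criteria only graphically):

\[
\mathrm{IAE} = \int \big|\hat{f}(x) - f(x)\big|\,dx, \qquad
\mathrm{ISE} = \int \big(\hat{f}(x) - f(x)\big)^{2}\,dx .
\]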
Other ideas:
 Minimization of the maximum vertical distance
 Consideration of horizontal distances for a more intuitive fit (Marron and Tsybakov, 1995)
 Comparison of the number and position of modes
Overview of some minimization criteria

 L1-distance = IAE – difficult mathematical tractability
 L2-distance = ISE, MISE, AMISE, ... – most commonly used
 L∞-distance = maximum difference – does not consider the overall fit
 "Modern" criteria, which include a kind of measure of the horizontal distances – difficult mathematical tractability
ISE, MISE, AMISE, ...

 ISE is a random variable
 MISE = E(ISE), the expectation of the ISE
 AMISE = Taylor approximation of the MISE, easier to calculate

[Figure: MISE, IV, ISB and their asymptotic counterparts AMISE, AIV, AISB as functions of log10(h)]
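For reference, the standard L2 quantities behind the plot (reconstructed; R(g) = ∫ g(x)² dx and μ₂(K) = ∫ u²K(u) du):

\[
\mathrm{MISE}(h) = \mathbb{E}\,\mathrm{ISE}(h)
= \underbrace{\int \mathrm{Var}\,\hat f_h(x)\,dx}_{\mathrm{IV}}
+ \underbrace{\int \big(\mathbb{E}\,\hat f_h(x) - f(x)\big)^{2}\,dx}_{\mathrm{ISB}},
\]
\[
\mathrm{AMISE}(h) = \frac{R(K)}{nh} + \frac{h^{4}}{4}\,\mu_2(K)^{2}\,R(f'') .
\]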
The AMISE-optimal bandwidth
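The bandwidth minimizing the AMISE (the formula is an image in the original slides; this is the standard form) is

\[
h_{\mathrm{AMISE}} = \left(\frac{R(K)}{\mu_2(K)^{2}\,R(f'')\,n}\right)^{1/5},
\]

which depends on the sample size n, on the kernel function K(.) and, through R(f''), on the unknown density f(.).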
The AMISE-optimal bandwidth: dependence on the kernel function K(.)

The kernel-dependent part of the AMISE is minimized by the "Epanechnikov kernel".

[Figure: the Epanechnikov kernel on [-1, 1]]
The AMISE-optimal bandwidth: dependence on the unknown density f(.)

How to proceed?
Data-driven bandwidth selection methods

 Leave-one-out selectors
   – Maximum likelihood cross-validation
   – Least-squares cross-validation (Bowman, 1984)
 Criteria based on substituting R(f'') in the AMISE formula
   – "Normal rule" ("rule of thumb"; Silverman, 1986)
   – Plug-in methods (Sheather and Jones, 1991; Park and Marron, 1990)
   – Smoothed bootstrap
Least-squares cross-validation (LSCV)

 Undisputed selector in the 1980s
 Gives an unbiased estimator of the ISE, up to a term that does not depend on h
 Suffers from more than one local minimizer – no agreement about which one to use
 Bad convergence rate of the resulting bandwidth h_opt
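The criterion that is minimized (reconstructed; f̂_{h,-i} denotes the estimate computed without observation x_i):

\[
\mathrm{LSCV}(h) = \int \hat f_h(x)^{2}\,dx \;-\; \frac{2}{n}\sum_{i=1}^{n} \hat f_{h,-i}(x_i).
\]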
Normal rule ("rule of thumb")

 Assumes f(x) to be N(μ, σ²)
 Easiest selector
 Often oversmooths the function
 The resulting bandwidth is given by the formula below
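For a Gaussian kernel the normal-rule bandwidth (shown as an image in the original slides; this is the standard form) is

\[
h_{\mathrm{opt}} = \left(\frac{4}{3n}\right)^{1/5} \hat\sigma \;\approx\; 1.06\,\hat\sigma\, n^{-1/5}.
\]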
Plug-in methods (Sheather and Jones, 1991; Park and Marron, 1990)

 Do not substitute R(f'') in the AMISE formula, but estimate it via R(f^(IV)), and R(f^(IV)) via R(f^(VI)), etc.
 Another parameter i to choose (the number of stages to go back) – one stage is mostly sufficient
 Better rates of convergence
 Do not finally circumvent the problem of the unknown density, either
The multivariate case

h → H ... the bandwidth matrix
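In one common parameterization the d-dimensional estimator reads (reconstructed; H is a symmetric, positive definite d×d bandwidth matrix):

\[
\hat f_H(x) = \frac{1}{n}\sum_{i=1}^{n} |H|^{-1/2}\, K\!\big(H^{-1/2}(x - x_i)\big).
\]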
Issues of generalization in d dimensions

 d² bandwidth parameters instead of one
 Unstable estimates
 Bandwidth selectors are essentially straightforward to generalize
 For plug-in methods it is "too difficult" to give succinct expressions for d > 2 dimensions
Aspects of Application
Essential issues
 Curse of dimensionality
 Connection between goodness-of-fit and optimal classification
 Two methods for discriminatory purposes
The "curse of dimensionality"

 The data "disappears" into the distribution tails in high dimensions

[Figure: probability mass NOT in the "tail" of a multivariate normal density, for 1 to 20 dimensions; it drops from nearly 100% towards 0%]

d ↑ : a good fit in the tails is desired!
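A small illustration of this effect (not part of the slides): the probability that a standard d-variate normal observation falls within a fixed radius of the mean, computed from the chi-square distribution of the squared norm; the radius of 2 is an arbitrary choice for illustration.

```python
from scipy.stats import chi2

# For X ~ N(0, I_d), ||X||^2 follows a chi-square distribution with d degrees
# of freedom, so P(||X|| <= r) = chi2.cdf(r**2, df=d).
r = 2.0  # "non-tail" region: sphere of radius 2 around the mean (arbitrary)
for d in (1, 2, 5, 10, 20):
    inside = chi2.cdf(r**2, df=d)
    print(f"d = {d:2d}: P(||X|| <= {r}) = {inside:.4f}")
```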
The "curse of dimensionality"

 Much data is necessary to maintain a constant estimation error in high dimensions

Dimensionality | Required sample size
1  | 4
2  | 19
3  | 67
4  | 223
5  | 768
6  | 2790
7  | 10700
8  | 43700
9  | 187000
10 | 842000
Essential issues

AMISE-optimal parameter choice:
 L2-optimal
 Worse fit in the tails
 Many observations required for a reasonable fit

Optimal classification (in high dimensions):
 L1-optimal (misclassification rate)
 Estimation of the tails is important
 Calculation intensive for large n
Method 1:

 Reduction of the data onto a subspace which allows a somewhat accurate estimation but does not destroy too much information → "trade-off"
 Use the multivariate kernel density concept to estimate the class densities (see the sketch below)
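A minimal sketch of Method 1 (not the author's code): project onto the first few principal components, estimate one multivariate density per class, and assign each test point to the class with the highest estimated density times prior. scipy's gaussian_kde with its Silverman bandwidth rule stands in here for the multivariate "normal rule"; the function name kde_classify and all data values are illustrative assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.decomposition import PCA

def kde_classify(X_train, y_train, X_test, n_components=3):
    """Method 1 sketch: PCA reduction, then one multivariate KDE per class."""
    pca = PCA(n_components=n_components).fit(X_train)
    Z_train, Z_test = pca.transform(X_train), pca.transform(X_test)

    classes = np.unique(y_train)
    # gaussian_kde expects data with shape (n_dims, n_obs)
    kdes = {c: gaussian_kde(Z_train[y_train == c].T, bw_method="silverman")
            for c in classes}
    priors = {c: np.mean(y_train == c) for c in classes}

    # class-wise density * prior; pick the maximum for each test point
    scores = np.column_stack([priors[c] * kdes[c](Z_test.T) for c in classes])
    return classes[np.argmax(scores, axis=1)]

# Toy usage with simulated data (600 observations per class, 10 dimensions,
# mirroring the dataset layout of the simulation study)
rng = np.random.default_rng(1)
X0 = rng.normal(0.0, 1.0, size=(600, 10))
X1 = rng.normal(0.7, 1.0, size=(600, 10))
X = np.vstack([X0, X1])
y = np.repeat([0, 1], 600)
X_test = rng.normal(0.35, 1.0, size=(5, 10))
print(kde_classify(X, y, X_test, n_components=3))
```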
Method 2:

 Use the univariate concept to "normalize" the data nonparametrically (see the figure and the sketch below)
 Use the classical methods LDA and QDA for classification
 Drawback: calculation intensive
Method 2:

[Figure: (a) a skewed marginal density f(x); (b) its distribution function F(x) and a target distribution function G(x); a point x (and x+Δ) is mapped to t(x) (and t(x+Δ)) so that the transformed variable follows G]
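A minimal sketch of Method 2 (not the author's code): each variable is transformed by a kernel-smoothed estimate of its distribution function and then mapped through the standard normal quantile function, so that every marginal looks roughly normal; classical LDA (or QDA) is applied to the transformed data. The univariate normal-rule bandwidth is used as a stand-in for the selectors compared in the study; the helper names marginal_normalize and transform are illustrative.

```python
import numpy as np
from scipy.stats import norm
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def marginal_normalize(train_col, col):
    """One variable: kernel CDF estimate, then standard normal quantiles."""
    n = train_col.size
    h = 1.06 * train_col.std(ddof=1) * n ** (-1 / 5)   # univariate "normal rule"
    # Smoothed CDF estimate: average of Gaussian kernel CDFs centred at the data
    F = norm.cdf((col[:, None] - train_col[None, :]) / h).mean(axis=1)
    F = np.clip(F, 1e-6, 1 - 1e-6)                     # keep the quantiles finite
    return norm.ppf(F)

def transform(X_train, X):
    return np.column_stack([marginal_normalize(X_train[:, j], X[:, j])
                            for j in range(X.shape[1])])

# Toy usage: skewed marginals, then LDA on the normalized data
rng = np.random.default_rng(2)
X0 = rng.exponential(1.0, size=(600, 10))
X1 = rng.exponential(1.0, size=(600, 10)) + 0.4
X = np.vstack([X0, X1])
y = np.repeat([0, 1], 600)

Xt = transform(X, X)
lda = LinearDiscriminantAnalysis().fit(Xt, y)
print("training error:", np.mean(lda.predict(Xt) != y))
```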
Simulation Study
Criticism of former simulation studies

 Carried out 20-30 years ago
 Outdated parameter selectors
 Restriction to uncorrelated normals
 Fruitless estimation because of high dimensions
 No dimension reduction
The present simulation study

 21 datasets
 14 estimators
 2 error criteria
 → 21 × 14 × 2 = 588 classification scores
 Many results
Each dataset has...

 ... 2 classes for distinction
 ... 600 observations per class
 ... 200 test observations, 100 produced by each class
 ... therefore dimension 1400 × 10
Univariate prototype distributions:

[Figure: density plots of the prototypes – normal; normal with small, medium and large noise; skewed, i.e. exponential(1); bimodal with close modes; bimodal with far-apart modes]
Dataset Nr. | Abbrev. | Contains
1  | NN1  | 10 normal distributions with "small noise"
2  | NN2  | 10 normal distributions with "medium noise"
3  | NN3  | 10 normal distributions with "large noise"
4  | SkN1 | 2 skewed (exp-)distributions and 7 normals
5  | SkN2 | 5 skewed (exp-)distributions and 5 normals
6  | SkN3 | 7 skewed (exp-)distributions and 3 normals
7  | Bi1  | 4 normals, 4 skewed and 2 bimodal (close) distributions
8  | Bi2  | 4 normals, 4 skewed and 2 bimodal (close) distributions
9  | Bi3  | 8 skewed and 2 bimodal (far) distributions
10 | Bi4  | 8 skewed and 2 bimodal (far) distributions

10 datasets having equal covariance matrices
+ 10 datasets having unequal covariance matrices
+ 1 insurance dataset
= 21 datasets in total
Method 1 (multivariate density estimator):
Principal component reduction onto 2, 3, 4 and 5 dimensions (4) × multivariate "normal rule" and multivariate LSCV criterion, respectively (2)
→ 8 estimators

Method 2 ("marginal normalizations"):
Univariate normal rule and Sheather-Jones plug-in (2) × subsequent LDA and QDA (2)
→ 4 estimators

Classical methods:
LDA and QDA (2)
→ 2 estimators

= 14 estimators in total
Misclassification criteria

 The classical misclassification rate ("error rate")
 The Brier score
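For completeness (these formulas are not spelled out on the slides), one common form of the two criteria, with N test observations, predicted classes ŷ_i, class indicators y_{ik} and estimated class membership probabilities p̂_{ik}:

\[
\text{error rate} = \frac{1}{N}\sum_{i=1}^{N} \mathbf{1}\{\hat y_i \neq y_i\}, \qquad
\text{Brier score} = \frac{1}{N}\sum_{i=1}^{N}\sum_{k} \big(\hat p_{ik} - y_{ik}\big)^{2}.
\]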
Results

The choice of the misclassification criterion is not essential.

[Figure: Brier score plotted against error rate for all classification scores]
Results

The choice of the multivariate bandwidth parameter (Method 1) is not essential in most cases.

[Figure: error rates for Method 1, LSCV versus the "normal rule"; LSCV is superior in the case of bimodals having unequal covariance matrices]
Results

The choice of the univariate bandwidth parameter (Method 2) is not essential.

[Figure: error rates for Method 2, Sheather-Jones selector versus the "normal rule"]
Results

The best trade-off is a projection onto 2-3 dimensions.

[Figure: error rate for the NN-, SkN- and Bi-distributions as a function of the number of projected dimensions (2-5)]
Results

Equal covariance matrices: Method 1 performs inferior to LDA; Method 2 sometimes improves slightly on LDA.

[Figure: error rates per dataset (equal-covariance versions of NN1-Bi4) for classical LDA, the normal rule in Method 2, and LSCV(3) in Method 1]
Results

Unequal covariance matrices: Method 1 often performs quite poorly, but Method 2 often improves essentially – though not for skewed distributions.

[Figure: error rates per dataset (unequal-covariance versions of NN1-Bi4) for classical QDA, LSCV(3) in Method 1, and the normal rule in Method 2]
Results

Is the additional calculation time justified?

[Figure: required calculation time for LDA/QDA, for the multivariate "normal rule", and for the preliminary univariate normalizations, LSCV and the Sheather-Jones plug-in]
Summary
Summary (1/3) – Classification performance

 Restriction to only a few dimensions
 Improvements with respect to the classical discrimination methods by marginal normalizations (especially for unequal covariance matrices)
 Poor performance of the multivariate kernel density classifier
 LDA is undisputed in the case of equal covariance matrices and equal prior probabilities
 The additional computation time does not seem to be justified
Summary (2/3) – KDE for data description

 Great variety in error criteria, parameter selection procedures and additional model improvements (3 dimensions)
 No consensus about a feasible error criterion
 Nobody knows what is finally optimized ("upper bounds" in L1-theory; in L2-theory ISE → MISE → AMISE; several minima in LSCV; ...)
 Different parameter selectors are of varying quality with respect to different underlying densities
Summary (3/3) – Theory vs. application

 Comprehensive theoretical results about optimal kernels or optimal bandwidths are not relevant for classification
 For discriminatory purposes the issue of estimating log-densities is much more important
 Some univariate model improvements are not generalizable
 The – widely ignored – "curse of dimensionality" forces the user to achieve a trade-off between necessary dimension reduction and information loss
 Dilemma: much data is required for accurate estimates – much data leads to an explosion of the computation time
The End