Inefficiency with Bagging
• Inefficient bootstrap sampling:
  • Every example has an equal chance of being sampled
  • No distinction between “easy” examples and “difficult” examples
• Inefficient model combination:
  • A constant weight for each classifier
  • No distinction between accurate classifiers and inaccurate classifiers
[Figure: Bagging. Bootstrap sampling of the training set D produces datasets D_1, D_2, …, D_k; each D_i is used to train a classifier h_i.]
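As a concrete illustration of the diagram above, here is a minimal Python sketch of bagging with equal-weight voting; the data format, base learner, and function names are illustrative assumptions, not taken from the slides.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, k=10, seed=0):
    """Train k classifiers, each on a bootstrap sample D_i drawn from the data D."""
    rng = np.random.default_rng(seed)
    n = len(X)
    classifiers = []
    for _ in range(k):
        idx = rng.integers(0, n, size=n)   # every example has an equal chance of being drawn
        classifiers.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return classifiers

def bagging_predict(classifiers, X):
    """Combine the classifiers with a constant (equal) weight: majority vote over {-1, +1} labels."""
    votes = np.array([h.predict(X) for h in classifiers])
    return np.sign(votes.sum(axis=0))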
Improve the Efficiency of Bagging
• Better sampling strategy
  • Focus on the examples that are difficult to classify
• Better combination strategy
  • Accurate models should be assigned larger weights
Intuition
[Figure: Three classifiers are combined additively, Classifier1 + Classifier2 + Classifier3. Classifier1 is trained on (X1,Y1), (X2,Y2), (X3,Y3), (X4,Y4); Classifier2 focuses on the still-difficult examples (X1,Y1) and (X3,Y3); Classifier3 focuses on (X1,Y1).]
AdaBoost Algorithm
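The algorithm box did not survive extraction from this slide. Below is a minimal sketch of the standard AdaBoost loop in its reweighting form, assuming labels in {-1, +1} and a generic weak learner; the function and variable names are illustrative, and the example on the next slide instead fixes α_t = ln 2 and resamples the data at each round.

import numpy as np

def adaboost(X, y, weak_learner, T=50):
    """Standard AdaBoost with labels y in {-1, +1}.

    weak_learner(X, y, D) must return a classifier h whose h.predict(X) is in {-1, +1}.
    """
    n = len(X)
    D = np.full(n, 1.0 / n)                  # D_0: uniform weights over the training examples
    classifiers, alphas = [], []
    for t in range(T):
        h = weak_learner(X, y, D)
        pred = h.predict(X)
        eps = D[pred != y].sum()             # weighted training error of h_t
        if eps == 0 or eps >= 0.5:           # stop if the weak learner is perfect or no better than chance
            break
        alpha = 0.5 * np.log((1 - eps) / eps)
        D = D * np.exp(-alpha * y * pred)    # increase the weights of misclassified examples
        D = D / D.sum()                      # renormalize to obtain D_{t+1}
        classifiers.append(h)
        alphas.append(alpha)
    return classifiers, alphas

def adaboost_predict(classifiers, alphas, X):
    """H_T(x) = sign(sum_t alpha_t * h_t(x))."""
    scores = sum(a * h.predict(X) for a, h in zip(alphas, classifiers))
    return np.sign(scores)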
AdaBoost Example: α_t = ln 2
Sample D_0 and train h_1 on the drawn examples (x1,y1), (x3,y3), (x5,y5); update the weights to obtain D_1.
Sample D_1 and train h_2 on the drawn examples (x1,y1), (x3,y3); update the weights to obtain D_2. Sample D_2, and so on.

Example weights at each round (each column sums to 1):
          D_0    D_1    D_2
(x1,y1)   1/5    2/7    2/9
(x2,y2)   1/5    1/7    1/9
(x3,y3)   1/5    1/7    1/9
(x4,y4)   1/5    2/7    4/9
(x5,y5)   1/5    1/7    1/9
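As a sanity check on the table above, the following sketch reproduces the weight updates under the usual AdaBoost rule with α_t = ln 2 (weights of misclassified examples are multiplied by e^α = 2, then the distribution is renormalized). That h_1 misclassifies x1 and x4, and h_2 misclassifies x4, is inferred from the doubled weights rather than stated on the slide.

import numpy as np

alpha = np.log(2)                 # alpha_t = ln 2, so e^alpha = 2
D0 = np.full(5, 1 / 5)            # uniform initial distribution over x1..x5

def update(D, misclassified):
    """Multiply the weights of misclassified examples by e^alpha, then renormalize."""
    D = D.copy()
    D[misclassified] *= np.exp(alpha)
    return D / D.sum()

D1 = update(D0, [0, 3])   # assume h_1 misclassifies x1 and x4 -> [2/7, 1/7, 1/7, 2/7, 1/7]
D2 = update(D1, [3])      # assume h_2 misclassifies x4        -> [2/9, 1/9, 1/9, 4/9, 1/9]
print(D1, D2)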
How To Choose α_t in AdaBoost?
How to construct the best distribution D_{t+1}(i)?
1. D_{t+1}(i) should be significantly different from D_t(i)
2. D_{t+1}(i) should create a situation in which classifier h_t performs poorly
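A standard way to satisfy both requirements (stated here as a reconstruction, since the formula itself is not in the extracted text) is the usual AdaBoost update

D_{t+1}(i) = \frac{D_t(i)\,\exp\bigl(-\alpha_t\, y_i\, h_t(x_i)\bigr)}{Z_t},
\qquad Z_t = \sum_j D_t(j)\,\exp\bigl(-\alpha_t\, y_j\, h_t(x_j)\bigr),

under which, for the standard choice of α_t derived below, h_t has weighted error exactly 1/2 on D_{t+1}, i.e. it performs no better than random guessing on the new distribution.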
How To Choose α_t in AdaBoost?
Optimization view for choosing α_t:
• h_t(x): x → {+1, -1}; a base (weak) classifier
• H_T(x): a linear combination of the base classifiers, H_T(x) = Σ_{t=1}^T α_t h_t(x)
• Goal: minimize the training error
• Approximate the error with an exponential function
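The objective behind these bullets is not in the extracted text; the standard formulation (an assumption about what the slide showed) bounds the 0-1 training error by the exponential loss:

\frac{1}{n}\sum_{i=1}^{n} \mathbf{1}\bigl[y_i \ne \operatorname{sign}(H_T(x_i))\bigr]
\;\le\;
\frac{1}{n}\sum_{i=1}^{n} \exp\bigl(-y_i H_T(x_i)\bigr),

which holds because exp(-y_i H_T(x_i)) ≥ 1 whenever example i is misclassified.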
AdaBoost: Greedy Optimization
Fix H_{T-1}(x), then solve for h_T(x) and α_T
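The resulting closed-form solution (the standard AdaBoost result, stated here since the derivation is not in the extracted text): letting ε_T = Σ_i D_T(i) 1[h_T(x_i) ≠ y_i] be the weighted error of h_T, the weak classifier h_T is chosen to minimize ε_T, and its weight is

\alpha_T = \frac{1}{2}\ln\frac{1-\varepsilon_T}{\varepsilon_T},

which is positive whenever ε_T < 1/2 and grows as h_T becomes more accurate, matching the intuition that accurate models should receive larger weights.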
Empirical Study of AdaBoost
AdaBoosting decision trees:
• Generate 50 decision trees by AdaBoost
• Linearly combine the decision trees using the weights of AdaBoost
In general:
• AdaBoost = Bagging > C4.5
• AdaBoost usually needs fewer classifiers than Bagging
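A hedged sketch of this kind of comparison using scikit-learn; the synthetic dataset, cross-validation setup, and default base learners are assumptions (the original study used C4.5 trees, which scikit-learn does not provide).

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

models = {
    "single decision tree": DecisionTreeClassifier(random_state=0),
    "bagging, 50 trees": BaggingClassifier(n_estimators=50, random_state=0),
    "AdaBoost, 50 trees": AdaBoostClassifier(n_estimators=50, random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)   # 5-fold cross-validated accuracy
    print(f"{name}: {scores.mean():.3f}")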
Bias-Variance Tradeoff for AdaBoost
• AdaBoost can reduce both variance and bias simultaneously
[Figure: bar chart of bias and variance for a single decision tree, bagged decision trees, and AdaBoosted decision trees]