Assignment 4 (Paper Based)

Name: Abdulaziz Alobaid
ID: 1215830
In this assignment I summarize the following paper: Y. Freund and R.
Schapire, "Experiments with a New Boosting Algorithm," International
Conference on Machine Learning, 1996.
In this paper the authors assess the performance of the AdaBoost algorithm on real
learning problems. They do this by performing two experiments. The first experiment
compares AdaBoost with the bagging algorithm when each is used to aggregate various
classifiers, measuring their performance on a collection of machine learning benchmark
problems. The second experiment measures the performance of AdaBoost combined with a
nearest-neighbor classifier on an OCR problem. The authors chose bagging as the point
of comparison because both methods work by combining many classifiers, but each
presents the weak learner with a different distribution over the training examples on
each round: boosting concentrates its distribution on the examples that earlier
classifiers misclassified, while bagging simply resamples the training set uniformly
at random. The weak learning algorithms chosen for the comparison are: an algorithm
that searches for a very simple prediction rule testing a single attribute (which
they call FindAttrTest), an algorithm that searches for a single good decision rule
testing a conjunction of attribute tests (which they call FindDecRule), and the C4.5
decision-tree algorithm.
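
To make the round-by-round reweighting concrete, here is a minimal sketch of the
two-class AdaBoost update, with a single-attribute threshold rule standing in for
FindAttrTest. This is my own illustration under assumed conditions (numeric
attributes, labels in {-1, +1}), not the authors' code.

import numpy as np

def train_stump(X, y, w):
    # Weighted search for the best single-attribute threshold rule;
    # a stand-in for the paper's FindAttrTest weak learner (assumption:
    # numeric attributes, labels in {-1, +1}).
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for thresh in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(X[:, j] <= thresh, sign, -sign)
                err = w[pred != y].sum()
                if err < best_err:
                    best_err, best = err, (j, thresh, sign)
    return best, best_err

def stump_predict(stump, X):
    j, thresh, sign = stump
    return np.where(X[:, j] <= thresh, sign, -sign)

def adaboost(X, y, rounds=20):
    # Two-class AdaBoost: each round, the distribution w over training
    # examples is reweighted so that examples misclassified by earlier
    # hypotheses become relatively more important.
    n = len(y)
    w = np.full(n, 1.0 / n)             # start with a uniform distribution
    hypotheses = []
    for _ in range(rounds):
        stump, err = train_stump(X, y, w)
        if err >= 0.5:                  # weak-learning condition violated
            break
        err = max(err, 1e-12)           # guard against a perfect round
        beta = err / (1.0 - err)
        pred = stump_predict(stump, X)
        w *= np.where(pred == y, beta, 1.0)  # shrink weights of correct examples
        w /= w.sum()                         # renormalize to a distribution
        hypotheses.append((stump, np.log(1.0 / beta)))
    return hypotheses

def predict(hypotheses, X):
    # Final hypothesis: weighted majority vote over the weak hypotheses.
    votes = sum(alpha * stump_predict(s, X) for s, alpha in hypotheses)
    return np.sign(votes)

The log(1/beta) voting weights give hypotheses with lower weighted error a larger
say in the final weighted majority vote.
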
The results of the first experiment show that boosting performs better than bagging
when the weak learning algorithm generates fairly simple classifiers, but with C4.5
there is no large difference in performance. Moreover, this experiment shows that
boosting can be used with a very simple algorithm to construct classifiers that are
quite good relative to more sophisticated ones. In the second experiment, the results
show an improvement in performance when boosting is applied to a nearest-neighbor
classifier. This improvement is due to two reasons. The first is that boosting
generates a hypothesis whose error on the training set is small by combining many
hypotheses whose individual errors may be large (though still better than random
guessing). The second is that boosting takes a weighted majority vote over many
hypotheses, which has the effect of reducing the random variability of the combined
hypothesis.
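
As a rough, modern analogue of the benchmark comparison, one could run boosting and
bagging over the same simple weak learner and compare accuracies. The sketch below
assumes scikit-learn version 1.2 or later (where the base-learner parameter is named
estimator); the dataset and the number of rounds are illustrative choices of mine,
not the paper's setup.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# A depth-1 tree (decision stump) plays the role of a very simple
# weak learner, similar in spirit to FindAttrTest.
X, y = load_breast_cancer(return_X_y=True)
stump = DecisionTreeClassifier(max_depth=1)

boosted = AdaBoostClassifier(estimator=stump, n_estimators=100)
bagged = BaggingClassifier(estimator=stump, n_estimators=100)

for name, model in [("boosting", boosted), ("bagging", bagged)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")

The expectation, in line with the paper's first result, is that boosting gains more
than bagging when the base classifiers are this simple.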