An introduction to the 40 algorithms whose correct rate on the training set is higher than 50%.

BayesNet: BayesNet learns Bayesian networks under the assumptions of nominal attributes and no missing values, with two different algorithms for estimating the conditional probability tables of the network. The K2 or TAN algorithm, or a more sophisticated method, is employed to search for the network structure.

ComplementNaiveBayes: ComplementNaiveBayes builds and uses a Complement class Naive Bayes classifier. (Jason et al. 2003)

NaiveBayes: NaiveBayes implements the probabilistic Naive Bayes classifier; kernel density estimators can be employed in this classifier. (George and Pat 1995)

NaiveBayesMultinomial: NaiveBayesMultinomial implements the multinomial Naive Bayes classifier, a modified form of Naive Bayes that accommodates word frequencies. (Andrew and Kamal 1998)

NaiveBayesSimple: NaiveBayesSimple builds and uses a simple Naive Bayes classifier; a normal distribution is employed to model numeric attributes. (Richard and Peter 1973)

NaiveBayesUpdateable: NaiveBayesUpdateable is the updateable version of NaiveBayes, which processes one instance at a time. A kernel estimator, but not discretization, is employed in this classifier. (Jason et al. 2003)

Logistic: Logistic builds and uses a multinomial logistic regression model with a ridge estimator, which guards against overfitting by penalizing large coefficients. (le Cessie and van Houwelingen 1992)

MultilayerPerceptron: MultilayerPerceptron is a neural network trained with backpropagation to classify instances. The network can be built by hand or by an algorithm, and it can be monitored and modified during training.

SimpleLogistic: SimpleLogistic builds linear logistic regression models. To fit these models, LogitBoost with simple regression functions as base learners is employed. The optimal number of iterations to perform is determined using cross-validation, which supports automatic attribute selection. (Niels et al. 2005, Marc et al. 2005)

SMO: SMO implements John Platt's sequential minimal optimization algorithm for training a support vector classifier, using polynomial or Gaussian kernels. (Platt 1998, Keerthi et al. 2001, Trevor and Robert 1998)

IB1: IB1 is a nearest-neighbour classifier. Normalized Euclidean distance is employed to find the training instance closest to the given test instance, and the test instance is predicted to have the same class as that training instance. If several training instances have the same (smallest) distance to the test instance, the first one found is used. (Aha and Kibler 1991)

IBk: IBk is a k-nearest-neighbour classifier that uses the Euclidean distance metric. The number of nearest neighbours can be determined automatically using leave-one-out cross-validation. (Aha and Kibler 1991)

KStar: KStar is a nearest-neighbour classifier using a generalized distance function, defined as the complexity of transforming one instance into another. Its entropy-based distance function distinguishes it from other instance-based learners. (John and Leonard 1995)

BFTree: BFTree builds a best-first decision tree, using binary splits for both nominal and numeric attributes. (Shi 2007, Jerome et al. 2000)

J48: J48 generates a pruned or unpruned C4.5 decision tree. (Ross 1993)

J48graft: J48graft generates a grafted (pruned or unpruned) C4.5 decision tree. (Geoff 1999)

NBTree: NBTree is a hybrid of decision tree and Naive Bayes; it creates trees whose leaves are Naive Bayes classifiers for the instances that reach the leaf. (Ron 1996)
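All of the single classifiers above are used through the same WEKA interface: load a dataset, set the class attribute, call buildClassifier, and evaluate. The following sketch assumes the WEKA 3 Java API and a placeholder data.arff file (both are assumptions, not part of the original text); it builds a few of the classifiers just described and reports the resubstitution accuracy on the training set, which is the "correct rate on the training set" used as the 50% criterion above.

    import weka.classifiers.Classifier;
    import weka.classifiers.Evaluation;
    import weka.classifiers.bayes.NaiveBayes;
    import weka.classifiers.functions.SMO;
    import weka.classifiers.lazy.IBk;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class TrainingSetAccuracy {
        public static void main(String[] args) throws Exception {
            // "data.arff" is a placeholder; the last attribute is taken as the class.
            Instances data = DataSource.read("data.arff");
            data.setClassIndex(data.numAttributes() - 1);

            // A few of the classifiers described above; all share the same interface.
            Classifier[] learners = { new NaiveBayes(), new J48(), new SMO(), new IBk(3) };

            for (Classifier c : learners) {
                c.buildClassifier(data);             // fit on the full training set
                Evaluation eval = new Evaluation(data);
                eval.evaluateModel(c, data);         // resubstitution (training-set) evaluation
                System.out.printf("%s: %.2f%% correct on the training set%n",
                        c.getClass().getSimpleName(), eval.pctCorrect());
            }
        }
    }

The choice of k = 3 for IBk is illustrative only; as noted above, IBk can also select k itself via leave-one-out cross-validation.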
RandomForest: RandomForest constructs random forests by bagging ensembles of random trees. (Leo 2001)

REPTree: REPTree builds a decision or regression tree using information gain or variance, and reduced-error pruning is employed to prune the tree.

SimpleCart: SimpleCart implements minimal cost-complexity pruning and deals with missing values by using the method of fractional instances instead of the surrogate-split method. (Leo 1984)

DecisionTable: DecisionTable builds a simple decision table majority classifier; it evaluates feature subsets using best-first search and uses cross-validation for evaluation. (Ron 1995)

JRip: JRip implements Repeated Incremental Pruning to Produce Error Reduction (RIPPER), an optimized version of IREP. (William 1995)

PART: PART generates a PART decision list using separate-and-conquer: in each iteration it builds a partial C4.5 decision tree and makes the best leaf into a rule. (Eibe and Ian 1998)

AttributeSelectedClassifier: AttributeSelectedClassifier selects attributes to reduce the data's dimensionality before passing the data to the base classifier.

Bagging: Bagging bags a classifier to reduce variance; it can perform classification or regression, depending on the base learner. (Leo 1996)

ClassificationViaClustering: ClassificationViaClustering uses a clusterer for classification, with a fixed number of clusters in the clustering algorithm. To obtain a useful model, the number of clusters to generate is set equal to the number of class labels in the dataset.

ClassificationViaRegression: ClassificationViaRegression performs classification using regression methods. The class is binarized and one regression model is built for each class value. (Frank et al. 1998)

Dagging: Dagging creates a number of disjoint, stratified folds of the data and feeds each chunk of data to a copy of the supplied base classifier. All generated base classifiers are put into a Vote classifier, and majority voting is employed to predict. (Ting and Witten 1997)

Decorate: Decorate builds diverse ensembles of classifiers by using specially constructed artificial training examples. (Melville and Mooney 2003, Melville and Mooney 2004)

END: END builds an ensemble of nested dichotomies to handle multi-class datasets with 2-class classifiers. (Dong et al. 2005, Eibe and Stefan 2004)

EnsembleSelection: EnsembleSelection uses the ensemble selection method to combine several classifiers from libraries of thousands of models generated using different learning algorithms and parameter settings. (Caruana et al. 2004)

FilteredClassifier: FilteredClassifier runs an arbitrary classifier on data that has been passed through an arbitrary filter whose structure is based exclusively on the training data; test instances are processed by the filter without changing their structure.

LogitBoost: LogitBoost performs additive logistic regression using a regression scheme as the base learner, and it can handle multi-class problems. (Friedman et al. 1998)

MultiClassClassifier: MultiClassClassifier handles multi-class datasets with 2-class classifiers using any of the following methods: one-versus-all-the-rest, pairwise classification with voting to predict, exhaustive error-correcting codes, and randomly selected error-correcting codes.

RacedIncrementalLogitBoost: RacedIncrementalLogitBoost learns from large datasets by racing LogitBoosted committees, operating incrementally by processing the datasets in batches.
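The meta schemes just described all wrap a base learner behind the same Classifier interface. A minimal sketch follows, again assuming the WEKA 3 Java API and a placeholder data.arff file; it configures Bagging around REPTree, LogitBoost with its default base learner, and MultiClassClassifier around Logistic, and compares them with 10-fold cross-validation. The particular base learners and iteration counts are illustrative choices, not settings prescribed by the text.

    import java.util.Random;

    import weka.classifiers.Classifier;
    import weka.classifiers.Evaluation;
    import weka.classifiers.functions.Logistic;
    import weka.classifiers.meta.Bagging;
    import weka.classifiers.meta.LogitBoost;
    import weka.classifiers.meta.MultiClassClassifier;
    import weka.classifiers.trees.REPTree;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class MetaClassifierSketch {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("data.arff");   // placeholder path
            data.setClassIndex(data.numAttributes() - 1);

            // Bagging: reduces variance by training the base learner on bootstrap samples.
            Bagging bagging = new Bagging();
            bagging.setClassifier(new REPTree());
            bagging.setNumIterations(10);

            // LogitBoost: additive logistic regression with a regression scheme as base learner.
            LogitBoost boost = new LogitBoost();
            boost.setNumIterations(20);

            // MultiClassClassifier: reduces a multi-class problem to 2-class subproblems
            // (one-versus-rest by default) for a binary learner such as Logistic.
            MultiClassClassifier ovr = new MultiClassClassifier();
            ovr.setClassifier(new Logistic());

            for (Classifier c : new Classifier[] { bagging, boost, ovr }) {
                Evaluation eval = new Evaluation(data);
                eval.crossValidateModel(c, data, 10, new Random(1));   // 10-fold CV
                System.out.printf("%s: %.2f%% correct (10-fold CV)%n",
                        c.getClass().getSimpleName(), eval.pctCorrect());
            }
        }
    }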
RandomCommittee: RandomCommittee builds an ensemble of randomizable base classifiers, each built using a different random number seed (but based on the same data). The final prediction is a straight average of the predictions generated by the individual base classifiers.

RandomSubSpace: RandomSubSpace constructs a decision-tree-based classifier that maintains the highest accuracy on training data and improves generalization accuracy as it grows in complexity. The classifier consists of multiple trees constructed systematically by pseudorandomly selecting subsets of the components of the feature vector, that is, trees constructed in randomly chosen subspaces. (Tin 1998)

ClassBalancedND: ClassBalancedND handles multi-class datasets with 2-class classifiers by building a random class-balanced tree structure. (Dong et al. 2005, Eibe and Stefan 2004)

DataNearBalancedND: DataNearBalancedND handles multi-class datasets with 2-class classifiers by building a random data-balanced tree structure. (Dong et al. 2005, Eibe and Stefan 2004)

ND: ND handles multi-class datasets with 2-class classifiers by building a random tree structure. (Dong et al. 2005, Eibe and Stefan 2004)
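The randomized ensembles and nested-dichotomy wrappers in this last group are used the same way as the other meta classifiers. The short sketch below, again assuming the WEKA 3 Java API and the same placeholder data.arff file, cross-validates RandomCommittee, RandomSubSpace, and END with their default base learners; any multi-class dataset will exercise the nested-dichotomy decomposition.

    import java.util.Random;

    import weka.classifiers.Classifier;
    import weka.classifiers.Evaluation;
    import weka.classifiers.meta.END;
    import weka.classifiers.meta.RandomCommittee;
    import weka.classifiers.meta.RandomSubSpace;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class RandomizedEnsembleSketch {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("data.arff");   // placeholder path
            data.setClassIndex(data.numAttributes() - 1);

            // RandomCommittee: averages randomizable base classifiers that differ
            // only in their random seed.
            // RandomSubSpace: grows each tree in a pseudorandomly chosen feature subspace.
            // END: an ensemble of nested dichotomies, i.e. random binary trees of
            // 2-class problems solved by a 2-class base classifier.
            Classifier[] ensembles = { new RandomCommittee(), new RandomSubSpace(), new END() };

            for (Classifier c : ensembles) {
                Evaluation eval = new Evaluation(data);
                eval.crossValidateModel(c, data, 10, new Random(1));
                System.out.printf("%s: %.2f%% correct (10-fold CV)%n",
                        c.getClass().getSimpleName(), eval.pctCorrect());
            }
        }
    }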
References:
Aha, D., Kibler, D. 1991. Instance-based learning algorithms. Machine Learning. 6:37-66.
Andrew McCallum, Kamal Nigam. 1998. A Comparison of Event Models for Naive Bayes Text Classification. In: AAAI-98 Workshop on Learning for Text Categorization.
Caruana, Rich, Niculescu, Alex, Crew, Geoff, Ksikes, Alex. 2004. Ensemble Selection from Libraries of Models. In: The International Conference on Machine Learning (ICML'04).
Dong Lin, Eibe Frank, Stefan Kramer. 2005. Ensembles of Balanced Nested Dichotomies for Multi-class Problems. In: PKDD, 84-95.
Eibe Frank, Ian H. Witten. 1998. Generating Accurate Rule Sets Without Global Optimization. In: Fifteenth International Conference on Machine Learning, 144-151.
Eibe Frank, Stefan Kramer. 2004. Ensembles of nested dichotomies for multi-class problems. In: Twenty-first International Conference on Machine Learning.
Frank, E., Wang, Y., Inglis, S., Holmes, G., Witten, I.H. 1998. Using model trees for classification. Machine Learning. 32(1):63-76.
Friedman, J., Hastie, T., Tibshirani, R. 1998. Additive Logistic Regression: a Statistical View of Boosting. Stanford University.
Geoff Webb. 1999. Decision Tree Grafting From the All-Tests-But-One Partition. San Francisco, CA.
George H. John, Pat Langley. 1995. Estimating Continuous Distributions in Bayesian Classifiers. In: Eleventh Conference on Uncertainty in Artificial Intelligence, San Mateo, 338-345.
Ian H. Witten, Eibe Frank. 2005. Data Mining: Practical Machine Learning Tools and Techniques (Second Edition). Morgan Kaufmann Publishers.
Jason D. Rennie, Lawrence Shih, Jaime Teevan, David R. Karger. 2003. Tackling the Poor Assumptions of Naive Bayes Text Classifiers. In: ICML, 616-623.
Jerome Friedman, Trevor Hastie, Robert Tibshirani. 2000. Additive logistic regression: A statistical view of boosting. Annals of Statistics. 28(2):337-407.
John G. Cleary, Leonard E. Trigg. 1995. K*: An Instance-based Learner Using an Entropic Distance Measure. In: 12th International Conference on Machine Learning, 108-114.
Keerthi, S.S., Shevade, S.K., Bhattacharyya, C., Murthy, K.R.K. 2001. Improvements to Platt's SMO Algorithm for SVM Classifier Design. Neural Computation. 13(3):637-649.
le Cessie, S., van Houwelingen, J.C. 1992. Ridge Estimators in Logistic Regression. Applied Statistics. 41(1):191-201.
Leo Breiman, Jerome H. Friedman, Richard A. Olshen, Charles J. Stone. 1984. Classification and Regression Trees. Wadsworth International Group, Belmont, California.
Leo Breiman. 1996. Bagging predictors. Machine Learning. 24(2):123-140.
Leo Breiman. 2001. Random Forests. Machine Learning. 45(1):5-32.
Marc Sumner, Eibe Frank, Mark Hall. 2005. Speeding up Logistic Model Tree Induction. In: 9th European Conference on Principles and Practice of Knowledge Discovery in Databases, 675-683.
Melville, Mooney, R.J. 2003. Constructing Diverse Classifier Ensembles Using Artificial Training Examples. In: Eighteenth International Joint Conference on Artificial Intelligence, 505-510.
Melville, Mooney, R.J. 2004. Creating Diversity in Ensembles Using Artificial Data. Information Fusion: Special Issue on Diversity in Multiclassifier Systems.
Niels Landwehr, Mark Hall, Eibe Frank. 2005. Logistic Model Trees.
Platt, J. 1998. Machines using Sequential Minimal Optimization. In: B. Schoelkopf, C. Burges, A. Smola, editors, Advances in Kernel Methods - Support Vector Learning.
Richard Duda, Peter Hart. 1973. Pattern Classification and Scene Analysis. Wiley, New York.
Ron Kohavi. 1995. The Power of Decision Tables. In: 8th European Conference on Machine Learning, 174-189.
Ron Kohavi. 1996. Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid. In: Second International Conference on Knowledge Discovery and Data Mining, 202-207.
Ross Quinlan. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo, CA.
Shi Haijian. 2007. Best-first decision tree learning. Hamilton, NZ.
Tin Kam Ho. 1998. The Random Subspace Method for Constructing Decision Forests. IEEE Transactions on Pattern Analysis and Machine Intelligence. 20(8):832-844.
Ting, K.M., Witten, I.H. 1997. Stacking Bagged and Dagged Models. In: Fourteenth International Conference on Machine Learning, San Francisco, CA, 367-375.
Trevor Hastie, Robert Tibshirani. 1998. Classification by Pairwise Coupling. In: Advances in Neural Information Processing Systems.
William W. Cohen. 1995. Fast Effective Rule Induction. In: Twelfth International Conference on Machine Learning, 115-123.