IJCSMS International Journal of Computer Science and Management Studies, Vol. **, Issue **, Month Year ISSN (Online): 2231-5268 www.ijcsms.com

Grouping Tournament: A New Email Classification Approach

Sabah Sayed1, Samir AbdelRahman2 and Ibrahim Farag3

1 Teaching Assistant, Computer Science, Faculty of Computers and Information, Cairo, Egypt, s.sayed@fci-cu.edu.eg
2 Associate Professor, Computer Science, Faculty of Computers and Information, Cairo, Egypt, s.elsayed@fci-cu.edu.eg
3 Professor, Computer Science, Faculty of Computers and Information, Cairo, Egypt, i.farag@fci-cu.edu.eg

Abstract
Email has become one of the fastest, easiest, and cheapest means of communication. The growing volume of email has led to many automated tools for email management, and email classification has recently attracted much attention as a way to help users organize their email into folders. This paper presents a new email classification approach, called grouping tournament, in which classes (folders) compete with each other for a new incoming email under rules similar to those of the World Cup finals. It uses the Un-weighted Pair-Group Method with Arithmetic mean (UPGMA) similarity algorithm to group the classes according to their similarities. The proposed Grouping Tournament approach is applied to the Winnow and Maximum Entropy classifiers using the cross-validation evaluation method. The experimental evaluation on the Enron corpus shows that the introduced classification method outperforms the compared email classification methods.

Keywords: Email Classification, Round Robin Tournament, Elimination Tournament, Grouping Tournament, Multi-class Binarization.

1. Introduction

Text mining [1] refers generally to the process of extracting interesting and non-trivial patterns or information from text documents. It can be viewed as an extension of data mining or knowledge discovery from databases [2, 3], but it deals with higher complexity because text is by nature unstructured.
Text mining is a multidisciplinary field [4] involving information retrieval, text analysis, information extraction, clustering, classification, database technology, and machine learning. Text classification (or text categorization) is an important text mining application: the task of assigning new (previously unseen) text to one or more pre-defined classes based on knowledge accumulated during training. Email has become an efficient and popular communication mechanism as the number of internet users increases. Users spend considerable time filtering email messages and organizing them into folders to facilitate later retrieval and search [5]. Email management is therefore an important and growing problem for individuals and organizations, as the volume of email messages grows by the minute. Over the past decades, many tools have been presented for automatically managing email data, such as detecting spam, searching emails, and organizing emails into specific folders. For these reasons, email classification [5] is one of the most important text classification problems; here the data store is the user's inbox, which contains a set of email messages. Email classification is the process of automatically assigning new emails to predefined classes or specific folders based on their contents and properties [5]. Email classification differs from, and is more complex than, document classification, because the email domain presents several challenges and categorization habits vary from one user to another; that is why automated categorization methods may perform well for one user and fail for another [6]. An email is a semi-structured document with few, informal sentences, which makes preprocessing more difficult. Email users create new folders and abandon others, so many folders and messages are created, destroyed, and reorganized into other folders.
They may create folders to hold groups of different topics, project groups, certain senders, or unfinished tasks, which means that these folders do not necessarily correspond to unified semantic topics. Some users categorize email messages not according to content but according to their reaction to these messages: reply, forward to specific persons, keep as a reminder of some task, and so on. Folder hierarchies may also contain a large number of folders that are similar in content, and a subfolder may not be a subtopic of its parent folder. We present a new approach for classifying emails, the grouping tournament, in which the multi-class classification process is broken down into a set of binary classification tasks. The grouping tournament has a simple implementation and reduces execution time by dividing the main problem into smaller ones that can run in parallel. It organizes classes into groups of four and inherits its rules from the World Cup finals. The grouping tournament approach brings clustering thinking into the email classification problem in a simple way: it utilizes the Un-weighted Pair-Group Method with Arithmetic mean (UPGMA) [7] as a similarity measure when forming the class clusters according to their similarities. In the grouping tournament approach, all email parts are exploited without ignoring any field, so that hidden useful information can be extracted from the user's email data; this is very effective in the feature construction and selection steps. To accomplish our goals, the following four decisions were made: 1) A large set of email messages, the Enron corpus [8], was chosen as benchmark data for our experiments. The Enron corpus was made public during the legal investigation concerning the Enron Corporation [8].
2) The Wide-Margin Winnow and Maximum Entropy classifiers were chosen for the email classification task. 3) The cross-validation evaluation method was used to calculate accuracies and the other evaluation measures. To the best of our knowledge it is the best way to measure the real effectiveness of an email classifier, because it considers each email message in both the training and testing phases. 4) Our approach was tested against the current tournament methods, the classical N-way method, and a proposed voting classification method. The remainder of this paper is organized as follows. Section 2 gives a brief overview of related work in email classification. Section 3 states our design decisions on issues such as data preprocessing and feature construction. Section 4 briefly explores the classification procedures against which we compare our proposed approach. Section 5 explains the proposed email classification approach. Section 6 describes how we implemented these classification methods and on what data, together with the experimental analysis and results. Finally, Section 7 concludes the work and points out some important future directions.

2. Related Works

There are many studies, with different features, design aspects, and types of classifiers, for categorizing emails into spam/non-spam or into multi-topic folders. Svetlana Kiritchenko and Stan Matwin [9] proposed an email classification system that explores co-training, an algorithm that uses unlabeled data along with a few labeled examples to boost the performance of a classifier on the email domain. Their results showed that the performance of co-training depends on the learning algorithm it uses. They also argued that Support Vector Machines significantly outperform Naive Bayes on email classification.
Svetlana Kiritchenko, Stan Matwin, and Suhayya Abu-Hakima [10] proposed a method of combining temporal (time-related) features with the traditional content-based features in email classification. They noticed that, with temporal features added to the traditional features, SVM and Naive Bayes performed better than decision trees. They concluded that temporal characteristics have complex dependencies with content-based features: tested alone, they were not promising, and they had value only when combined with content in more sophisticated ways, as in Naive Bayes and SVM. Bekkerman, McCallum, and Huang [6] presented a benchmark case study of email foldering on both the Enron and SRI email datasets, comparing the email classification accuracies of four classifiers (Maximum Entropy, Naive Bayes, SVM, and Winnow). They found that SVM outperforms the other three classifiers, while Naive Bayes was the worst. They also proposed an evaluation method that divides the dataset into time-based splits and incrementally tests classifier performance on each split while training on all the previous splits. Unfortunately, this evaluation method is more complex to implement and yields lower accuracies than random training/testing splits. Yunqing Xia, Wei Liu, Louise Guthrie, and Kam-Fai Wong [11] implemented a tournament-like classification scheme using a probabilistic classification method and showed that the tournament methods, round robin and elimination, outperform the N-way method by 11.7% in precision. They also found that round robin takes more execution time due to its more complex implementation. They used an email collection of ten users and 50 folders, mixing all the folders together and extracting the 15 folders containing the most emails. This corpus is not a standard email corpus for the email categorization evaluation purpose.
Moreover, these 15 folders belong to ten different users, which is not enough to capture users' habits in the email categorization task and makes the relationships among folders disappear.

3. Design Issues

There exist many design choices for how to conduct the email classification task [6]. In this section we state our design decisions for preprocessing the data, constructing features, evaluating experimental performance, and choosing the classifier algorithms.

3.1 Dataset Preprocessing

Raw email datasets are usually unstructured, and the Enron dataset is no exception [6]. We apply some preprocessing and organization steps to the data before running the experiments. In [6] the authors removed the non-topical folders "all documents", "calendar", "contacts", "deleted items", "discussion threads", "inbox", "notes inbox", "sent", "sent items", and "sent mail", and then flattened all the folder hierarchies. They also removed the attachments and the X-folder field in the message headers, which actually contains the class label. We follow the same steps, but we additionally ignore small folders containing fewer than ten messages, to provide the classifiers with enough training examples.

3.2 Feature Construction

Classifier accuracy increases for all classification methods as the feature set grows [12]. From this we conclude that how features are extracted from the email matters greatly to the classifier. Each email field (subject, sender, recipient, body) gives a piece of information about the email. To build an effective classifier we use all email fields without ignoring any of them; this helps extract hidden useful information from the user's email documents. All email fields are therefore parsed, tokenized, and down-cased.
Date-related information is also considered and is combined with the information from the other email fields. The bag-of-words representation is used, and each email message is represented as a vector of word counts. Stop words (auxiliary words, common verbs, adverbs, and conjunctions) are removed because they do not help the classification process. Finally, features that appear more than 100 times, or only once, across all messages are ignored.

3.3 Evaluation Method

There are many approaches that can be employed in the evaluation phase. In [6] an incremental time-based-splits approach is employed. Its problem is that emails are usually related to other emails received in nearby periods, rather than to emails received long before or after, so low classification accuracies result [6]. In [11] the email collection is split into training and testing portions, with 20% used for training and 80% for testing. The problem with this approach is that only a small portion of the emails is used for training, so if that portion changes, the classification performance may differ. We chose ten-fold cross-validation to evaluate our experiments. The data is divided into ten partitions, taken at random but ensuring that all categories appear in both the training and testing splits. The classification task is run ten times; each time, one partition is held out for testing and the remaining nine are randomly combined for training. This ensures that all email messages in all categories are used in both the training and testing sets, which is more reliable for email classification.

3.4 Classifier Algorithms

A classifier is a system that classifies text into discrete sets of predefined categories [12]. Here, two classification algorithms are used: Maximum Entropy and Wide-Margin Winnow.
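As a concrete reading of the feature construction rules in Section 3.2 (down-casing, stop-word removal, and dropping features that occur only once or more than 100 times across all messages), the following minimal sketch may help. The stop-word list here is a tiny illustrative stand-in for MALLET's standard English stop-list, and the function names are ours, not the paper's implementation:

```python
from collections import Counter

# Tiny illustrative stop-word list; the paper actually uses MALLET's
# standard English stop-list.
STOP_WORDS = {"the", "a", "an", "is", "to", "and", "of"}

def build_vocabulary(tokenized_emails, min_count=2, max_count=100):
    """Bag-of-words vocabulary per Section 3.2: tokens are down-cased,
    stop words removed, and features appearing only once or more than
    100 times across all messages are ignored."""
    counts = Counter()
    for tokens in tokenized_emails:
        counts.update(t.lower() for t in tokens if t.lower() not in STOP_WORDS)
    return {w for w, c in counts.items() if min_count <= c <= max_count}

def to_count_vector(tokens, vocabulary):
    """Represent one email as a vector (dict) of word counts restricted
    to the retained vocabulary."""
    return Counter(t.lower() for t in tokens if t.lower() in vocabulary)
```

Note that pruning singleton features assumes the vocabulary is built once over the whole training split, as the text describes.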
3.4.1 Maximum Entropy Classifier

The maximum entropy classifier [13] does not assume statistical independence of the independent variables or features that serve as predictors. As mentioned in [6], the unique distribution that satisfies the constraints in the training data and has maximum entropy belongs to the exponential family, as in Eq. (1):

P(c \mid d) = \frac{1}{Z(d)} \exp\Big( \sum_{i} \lambda_i f_i(d, c) \Big)   (1)

where Z(d) is the normalization factor that depends on the document d, f_i(d, c) is a feature, and \lambda_i is the weight or relevance of the feature. The Maximum Entropy classifier is implemented as part of the Mallet software system [14].

3.4.2 Wide-Margin Winnow

Winnow [6] belongs to the family of on-line learning algorithms. It attempts to minimize the number of incorrect guesses during the training phase while training examples are presented to it one by one. It uses a multiplicative weight-update scheme that allows it to perform much better when many dimensions are irrelevant. The Winnow training implementation in [14] is used with θ = 0.5, α = 2.0, and β = 2.0.

4. Classification Procedures

In this section we briefly describe the candidate classification methods against which we compare our proposed classification approach.

4.1 N-Way Classification

This is a multi-class classification method where one classifier is trained on a portion of the data and then tested on the remaining portion, with the condition that all categories are included in both the training and testing portions.

4.2 Tournament Classification

The tournament classification methods perform classification on multi-class tasks using a tournament-like decision process, in which classes compete against each other to produce the final winner class [11]. Tournament classification can be implemented in two ways, elimination and round robin; we briefly explain each type.
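The multiplicative update behind the Winnow classifier of Section 3.4.2 can be sketched as follows. This is a simplified single-class, mistake-driven sketch using the paper's parameter values (θ = 0.5, α = 2.0, β = 2.0); `winnow_update` and `winnow_predict` are illustrative names, not MALLET's actual API:

```python
def winnow_predict(weights, features, theta=0.5):
    """Predict positive when the summed weight of the active features
    reaches the threshold theta."""
    return sum(weights.get(f, 0.0) for f in features) >= theta

def winnow_update(weights, features, predicted_positive, actual_positive,
                  alpha=2.0, beta=2.0):
    """One mistake-driven Winnow update: on a false negative the weights
    of active features are multiplied by alpha (promotion); on a false
    positive they are divided by beta (demotion). Correct predictions
    leave the weights unchanged."""
    if predicted_positive == actual_positive:
        return weights
    factor = alpha if actual_positive else 1.0 / beta
    return {f: (w * factor if f in features else w)
            for f, w in weights.items()}
```

The multiplicative (rather than additive) correction is what makes Winnow robust when many feature dimensions are irrelevant, as the text notes.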
4.2.1 The Elimination Tournament Method (ET)

The idea behind elimination tournament classification is inherited from the Wimbledon Championship [11]. In this method, every class competes against a pre-determined set of classes. When a competition is over, one of the two classes advances to the next round and the losing class is eliminated. The winner of the last round is the optimal class for the incoming email message.

4.2.2 The Round Robin Tournament Method (RRT)

Round robin tournament classification follows the competition rules of the English Premiership [11]. Instead of eliminating the losing classes, the round robin method assigns scores to both classes after each competition, so every class accumulates a total score after all competitions are over. The class with the highest score is considered the optimal class for the incoming email message. A score table for the winning and losing classes is maintained. Under the round robin tournament method, a tie may occur in the final score table [11]; a tie occurs when two or more classes have the same final score in the competition table.

5. The Proposed Classification Approach: "Grouping Tournament"

The grouping tournament classification approach (GTA) is another way to perform multi-class classification, in which a multi-class problem is broken into a set of binary-class problems. All classes then compete with each other in groups under competition rules similar to those of the World Cup finals. There are two approaches for mapping one multi-class problem to a set of binary-class problems: 1) The one-against-all approach, or unordered binarization [15], in which the number of binary classifiers equals the number of classes (C). Each classifier is trained on the whole training set, with the training examples of one class as positive examples and all the others as negative examples.
2) Pair-wise classification, or round robin binarization [15], in which there is one classifier for each pair of classes; the number of classifiers is therefore C(C-1)/2, where C is the number of classes, since the classifier for classes (i, j) is the same as the classifier for classes (j, i). A classifier for two classes is trained only on the training data of those two classes and ignores all other training data. Figure 1 illustrates a 4-class problem under the multi-class, pair-wise, and one-against-all approaches. In 1(a) one classifier separates all classes from each other. The one-against-all approach needs four classifiers for this problem; the one shown in 1(b) separates class 2 from all other classes. The pair-wise approach produces six classifiers, one for each pair of classes; 1(c) shows the classifier separating class 1 and class 4.

Fig. 1 Pair-wise versus one-against-all approach

It is clear from the figure that in the pair-wise approach each classifier uses fewer training examples, which gives a more efficient boundary between the two classes than the classifiers of the one-against-all approach. The conclusion is that the pair-wise approach may yield significant improvements over the one-against-all approach without a high risk of decreasing performance [15]. GTA relies on the pair-wise classification approach, learning one classifier for each pair of classes using only the training data of those two classes and ignoring all the others. In GTA, classes are organized into groups of four. RRT is then applied within each group to determine the winner class of the group, and ET is then applied to the winner classes of all groups. The winner of the last round is the optimal class for the incoming email message. GTA for three groups is illustrated in Figure 2.
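The two-stage decision just described, a round robin (Section 4.2.2) inside each group of four followed by elimination (Section 4.2.1) among the group winners, can be sketched as follows, together with the UPGMA class-similarity measure of Section 5.1 used to form the groups. Here `beats(a, b, email)`, `sim(d1, d2)`, and the function names are illustrative stand-ins, not the paper's implementation; `beats` plays the role of the pairwise binary classifier:

```python
def upgma_similarity(class_a, class_b, sim):
    """Eq. (2) of Section 5.1: the similarity of two classes is the
    average similarity over all cross-class email pairs (within-class
    pairs are not included). `sim(d1, d2)` is any document similarity."""
    total = sum(sim(d1, d2) for d1 in class_a for d2 in class_b)
    return total / (len(class_a) * len(class_b))

def grouping_tournament(groups, beats, email):
    """GTA sketch: given classes already clustered into groups of four
    (e.g. by UPGMA similarity), run a round robin inside each group to
    find the group winner, then an elimination bracket on the winners."""
    winners = []
    for group in groups:
        # Round-robin stage (Sec. 4.2.2): one point per pairwise win.
        scores = {c: 0 for c in group}
        for i, a in enumerate(group):
            for b in group[i + 1:]:
                scores[beats(a, b, email)] += 1
        best = max(scores.values())
        tied = [c for c in group if scores[c] == best]
        while len(tied) > 1:  # tie-breaking as chained two-way ties (Sec. 5.2)
            tied = [beats(tied[0], tied[1], email)] + tied[2:]
        winners.append(tied[0])
    # Elimination stage (Sec. 4.2.1) on the group winners.
    while len(winners) > 1:
        nxt = [beats(winners[i], winners[i + 1], email)
               for i in range(0, len(winners) - 1, 2)]
        if len(winners) % 2:
            nxt.append(winners[-1])  # odd class out gets a bye
        winners = nxt
    return winners[0]
```

With G groups this invokes the binary classifiers 6G times in the group stage plus G-1 times in the elimination stage, far fewer than a full round robin over all classes.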
The difficult point lies in class grouping: finding the way that produces the optimal grouping scheme. One simple approach is random pick, in which classes are randomly picked to form groups. Another approach is to group classes in the order in which they are seen by the algorithm in the training step. In GTA, classes are clustered according to their similarities using the UPGMA similarity measure described in Section 5.1. The ET and RRT methods are designed and implemented according to [11].

Fig. 2 Grouping Tournament method

5.1 The UPGMA Similarity Measure

The Un-weighted Pair-Group Method with Arithmetic mean (UPGMA) [7] is a popular distance analysis method. In this method, the distance between two classes is calculated as the average distance between all pairs of emails in the two different classes, i.e., the similarities of all emails in the first class with all emails in the second class, excluding the internal similarities of the emails within each class. UPGMA differs from the single- and complete-linkage similarity methods in that it uses information about all pairs of distances, not just the nearest or the furthest; for this reason it is preferred over those methods for grouping analysis [7]. We calculate the similarity between two classes C1 and C2 according to Eq. (2):

sim(C_1, C_2) = \frac{1}{N_1 N_2} \sum_{d_1 \in C_1} \sum_{d_2 \in C_2} sim(d_1, d_2)   (2)

where N_1 and N_2 are the numbers of emails in the first class C_1 and the second class C_2, and d_1, d_2 are email documents in C_1 and C_2, respectively.

5.2 Tie-Breaking Technique

The tie that may occur in the score table of the RRT method has two types: 1) A two-way tie occurs when two classes have the same final score in the competition table. In this case, the binary classification between these two classes is run, and the winner of that competition wins the tie. 2) A many-way tie occurs when three or more classes have the same final score in the competition table. In this case, the many-way tie is divided into many two-way ties.
These two-way ties are solved one by one: the winner of one two-way tie competes with another class from the many-way tie to form the next two-way tie, and so on until only one two-way tie remains. The winner of the many-way tie is the winner of that last two-way tie.

5.3 Voting Classification

The voting classification method performs multi-class classification using all the mentioned classification methods. It applies the N-way, tournament, and GTA methods and then votes among the winner classes of these methods. If the three winner classes (or at least two of them) are the same, that class is the final winner. If the three winner classes all differ, it applies the binary classifier of the first two classes, and the resulting class then competes with the third winner class using their binary classifier. The resulting class is the winner of the voting method.

6. Experiments

The evaluation of an email classification system depends on two main factors: the corpus data collection and the evaluation measure(s). This section first describes the corpus data, then states the evaluation measures used, and finally describes and analyzes the experimental results.

6.1 Corpus Description

Currently there are several public email corpora, such as Ling-spam, PU1, PU123A, and the Enron corpus [5]. We evaluated our experiments on the Enron benchmark corpus for the email classification problem. The complete Enron dataset, along with an explanation of its origin, is available at [8]. We used the email directories of seven former Enron employees that are especially large: beck-s, farmer-d, kaminski-v, kitchen-l, lokay-m, sanders-r, and williams-w3. From each user directory we chose the largest folders by number of emails. This resulted in sixteen folders each for beck-s, farmer-d, kaminski-v, kitchen-l, and sanders-r. For lokay-m and williams-w3 we chose the largest folders containing more than ten messages, which are eight and twelve folders, respectively. The reason is to provide the classifier with enough training examples and to ensure that all folders contribute examples to each data split in the ten-fold cross-validation runs. Our filtered dataset contains around 17,861 emails (~89 MB) for 7 users distributed among 100 folders. Table 1 shows statistics on the seven resulting datasets.

Table 1: Statistics on our portion of the Enron datasets

User        | Folders | Messages | Smallest folder (messages) | Largest folder (messages)
beck-s      | 16      | 1030     | 39                         | 166
farmer-d    | 16      | 3578     | 18                         | 1192
kaminski-v  | 16      | 3871     | 21                         | 547
kitchen-l   | 16      | 3119     | 21                         | 715
lokay-m     | 8       | 2446     | 51                         | 1159
sanders-r   | 16      | 1077     | 20                         | 420
williams-w3 | 12      | 2740     | 11                         | 1398

It is notable from the table that even the smallest user dataset contains more than 1000 emails, which is a suitable amount for an email classifier.

6.2 Evaluation Criteria

We used the accuracy measure to evaluate all the mentioned classification methods. We also used Micro Average Precision, Micro Average Recall, and Micro Average F-measure to present the overall system performance for each test split of the cross-validation splits. Suppose we have n classes for one Enron dataset user, where m_i is the number of emails manually classified (by the user) to class i, p_i is the number of emails classified by the program to class i, and c_i is the number of emails correctly classified by the program to class i. Micro Average Precision (MAP), Micro Average Recall (MAR), Micro Average F-measure (MAF), and Accuracy are calculated according to Eq. (3), Eq. (4), Eq. (5), and Eq. (6):

MAP = \frac{\sum_{i=1}^{n} c_i}{\sum_{i=1}^{n} p_i}   (3)

MAR = \frac{\sum_{i=1}^{n} c_i}{\sum_{i=1}^{n} m_i}   (4)

MAF = \frac{2 \cdot MAP \cdot MAR}{MAP + MAR}   (5)

Accuracy = \frac{\sum_{i=1}^{n} c_i}{\sum_{i=1}^{n} m_i}   (6)

6.3 Experiment Description

In this experiment, the effectiveness of the email classification methods is evaluated.
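The micro-averaged measures of Section 6.2 can be computed directly from the per-class counts; a short sketch, where the tuples and the function name are our illustrative choices:

```python
def micro_averages(per_class):
    """Micro-averaged precision, recall, and F-measure from a list of
    (m_i, p_i, c_i) tuples, where m_i = emails the user placed in class i,
    p_i = emails the program assigned to class i, and c_i = correctly
    assigned emails for class i (Section 6.2)."""
    m = sum(t[0] for t in per_class)  # total manually labeled emails
    p = sum(t[1] for t in per_class)  # total program-assigned emails
    c = sum(t[2] for t in per_class)  # total correct assignments
    precision = c / p
    recall = c / m
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

Note that in a single-label setting where every email receives exactly one predicted class, the totals m and p coincide, so micro precision, micro recall, and accuracy all reduce to the same ratio.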
A Java-based open source package called MALLET [14] (A Machine Learning for Language Toolkit) is used to implement our email classification system. The system applies the tokenization method described in [14] to turn emails into sequences of word terms, then removes stop words using the standard English stop-list in [14]. After that, the feature construction steps of Section 3 are applied. The four classification methods are applied to each user in the Enron dataset with the ten-fold cross-validation evaluation method. For each Enron user's data, the following steps are performed: (1) Randomly divide the whole dataset into ten splits, where each split contains email messages from all folders (classes). (2) In each run, prepare the training set by randomly combining nine splits; the remaining split is the testing set for that run. (3) Separate the training examples of each class into its own training set. (4) Produce all binary classifiers, one for each pair of distinct classes, by randomly combining the separate training examples of only those two classes. (5) Produce the multi-class classifier from all the training data of step (2), to be used in the N-way classification method. (6) Apply the four classification methods ten times, with different training and testing sets each time, and calculate the evaluation measures according to Section 6.2.

6.4 Experimental Results

To evaluate the performance of our proposed email classification approach, GTA, we compare it with the currently popular email classification methods, tournament and N-way, as well as with the proposed voting classification method of Section 5.3. Our experimental study is performed through two experiments.
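The splitting procedure in steps (1)-(2) of Section 6.3 can be sketched as follows, distributing each class's emails across the ten folds so that every fold contains messages from every folder; the function name and data layout are illustrative, not the paper's code:

```python
import random

def ten_fold_splits(emails_by_class, folds=10, seed=0):
    """Yield (train, test) pairs for ten-fold cross-validation. Each
    class's emails are shuffled and dealt round-robin into the folds so
    every fold contains messages from all classes; each fold serves once
    as the test set while the other nine are combined for training."""
    rng = random.Random(seed)
    splits = [[] for _ in range(folds)]
    for label, emails in emails_by_class.items():
        shuffled = emails[:]
        rng.shuffle(shuffled)
        for i, email in enumerate(shuffled):
            splits[i % folds].append((label, email))
    for k in range(folds):
        train = [x for i, s in enumerate(splits) if i != k for x in s]
        yield train, splits[k]
```

Dealing per class (rather than shuffling the pooled data) is one simple way to honor the paper's requirement that all categories appear in both the training and testing splits, provided each folder has at least ten messages, which is exactly why folders with fewer than ten messages were dropped in Section 3.1.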
6.4.1 Experiment 1

This experiment was conducted on the seven Enron email directories mentioned in Section 6.1 using the Wide-Margin Winnow classifier. Figure 3 reports the results of this experiment, showing the average accuracy values for the four classification methods. It is clear from the figure that GTA outperforms the tournament, N-way, and voting methods for all seven users.

Fig. 3 Winnow accuracy values of the four methods for the seven Enron users

From Figure 3 we note that the voting method's accuracy values are almost identical to those of GTA for all users except lokay-m: the GTA accuracy for lokay-m is 53.81%, while its accuracy with the voting method is 47.59%. This can be explained by noting that the voting method produces accuracy values that average over the GTA, tournament, and N-way methods; for lokay-m, the difference between the three methods' accuracy values is large, so the voting accuracy falls below the GTA accuracy. We also note in Figure 3 that on the williams-w3 dataset the accuracies of all methods are very high. This is explained by observing that in this dataset 4 classes hold 2587 of the 2740 emails, while the remaining 8 classes hold only 153 emails, so predicting one of the big four classes for any new email has a high probability of being correct. Figures 4, 5, and 6 show the Micro Average Precision, Micro Average Recall, and Micro Average F-measure values of the proposed GTA, tournament, and N-way methods over the ten cross-validation runs for three Enron users.

Fig. 4 Kitchen-l Micro Avg Precision

Fig. 5 Farmer-d Micro Avg Recall

Fig. 6 Beck-s Micro Avg F-measure

All values in the figures demonstrate that the GTA method is superior to both the N-way and tournament methods on all the mentioned evaluation measures; this also holds for all the other Enron users.

6.4.2 Experiment 2

In this experiment, we study the effect of applying the proposed GTA to the Maximum Entropy classifier. The experiment was conducted on the seven Enron email directories mentioned in Section 6.1. Figure 7 shows the average accuracy values for the four classification methods. It is clear from the figure that GTA is superior to the tournament, N-way, and voting methods for all seven users.

Fig. 7 Maximum Entropy accuracy values of the four methods for the seven Enron users

Figures 8, 9, and 10 show the Micro Average Precision, Micro Average Recall, and Micro Average F-measure values of the proposed GTA and the N-way method over the ten cross-validation runs for three Enron users.

Fig. 8 Williams-w3 Micro Avg Precision

Fig. 9 Kaminski-v Micro Avg Recall

Fig. 10 Beck-s Micro Avg F-measure

It is clear from the figures that GTA outperforms the N-way classification method; this is also the case for all the other Enron users. We can conclude from these figures that GTA improves the Maximum Entropy classifier's accuracy for all seven Enron users.

7. Conclusions and Future Work

In this paper, we presented a new email classification approach to categorize user email data into predefined folders; it can also be used in general multi-class classification problems. Our classification approach utilizes the UPGMA similarity algorithm to cluster the user's email classes into groups according to their similarities, applying rules similar to those of the World Cup finals. The classification method is applied using the Winnow classifier and, in addition, the Maximum Entropy classifier. In all experiments, the cross-validation evaluation technique is used, which applies random training/test splits and obtains much higher results than the step-incremental time-based splits.
We carried out empirical evaluations on the email data of seven users from the Enron email dataset. The results showed that our proposed Grouping Tournament outperforms the compared classification methods. Hence, our classification method reflects email users' needs in the classification task, obtaining both better performance and lower execution cost. The experiments showed that running the Winnow and Maximum Entropy classifiers with the Grouping Tournament classification method improves their accuracy, and that the improvement for Winnow is slightly greater than for Maximum Entropy. In the future, we plan to investigate other similarity strategies to find the optimal grouping scheme for our Grouping Tournament classification method. We will also study how to benefit from the user profile to enhance it.

References

[1] R. Feldman and I. Dagan, "Knowledge Discovery in Textual Databases", in Proceedings of the First International Conference on Knowledge Discovery and Data Mining (KDD-95), Montreal, Canada, August 1995, pp. 112-117.
[2] U. Fayyad, G. Piatetsky-Shapiro, and P. Smyth, "From Data Mining to Knowledge Discovery: An Overview", 1996.
[3] E. Simoudis, "Reality Check for Data Mining", IEEE Expert, 1996, 11(5).
[4] A.-H. Tan, "Text Mining: The State of the Art and the Challenges", in Proceedings of the PAKDD 1999 Workshop on Knowledge Discovery from Advanced Databases, 1999, pp. 65-70.
[5] P. Li, J. Li, and Q. Zhu, "An Approach to Email Categorization with the ME Model", 2005, pp. 229-234.
[6] R. Bekkerman, A. McCallum, and G. Huang, "Automatic Categorization of Email into Folders: Benchmark Experiments on Enron and SRI Corpora", 2004, pp. 1-23.
[7] G. McLachlan, "Cluster Analysis", 1984, pp. 361-392.
[8] Enron Email Dataset, http://www.cs.cmu.edu/~enron/
[9] S. Kiritchenko and S. Matwin, "Email Classification with Co-Training", 2001.
[10] S. Kiritchenko, S. Matwin, and S. Abu-Hakima, "Email Classification with Temporal Features", Intelligent Information Systems, 2004, pp. 523-533.
[11] Y. Xia, W. Liu, and L. Guthrie, "Email Categorization with Tournament Methods", in Natural Language Processing and Information Systems, Lecture Notes in Computer Science, Springer Berlin/Heidelberg, 2005, pp. 150-160.
[12] S. Youn and D. McLeod, "A Comparative Study for Email Classification", 2006, pp. 387-391.
[13] Maximum Entropy Classifier, http://en.wikipedia.org/wiki/Maximum_entropy_classifier
[14] A. K. McCallum, "MALLET: A Machine Learning for Language Toolkit", http://mallet.cs.umass.edu, 2002.
[15] J. Fürnkranz, "Round Robin Classification", The Journal of Machine Learning Research, 2002, vol. 2, pp. 721-747.