By Adeyemo O.O., Adewole A.P., Ogunbiyi T.D., Oni Samson.

ABSTRACT

The decision tree is a data mining technique that can accurately classify data and make effective predictions. It has been successfully employed for data analysis as a comprehensible knowledge representation in a broad range of fields such as customer relationship management, engineering, medicine, agriculture, computational biology, business management and fraudulent statement detection. In this paper, we review research publications that have explored the accuracy of the prediction and classification capabilities of decision trees in developing data mining models, in comparison with several other algorithms in different application domains. This will give researchers a general overview of the knowledge gaps in decision tree data mining algorithms. Data mining takes advantage of the large sets of data available to carry out prediction and classification activities, so we used data consisting of records of heart disease patients gathered over the years and performed data mining on them using a decision tree.

INTRODUCTION

The decision tree is a classification and prediction tool. It is widely used because the knowledge discovered from it is illustrated in a hierarchical structure, which makes it easily understood by people who are not experts in data mining. It is a predictive modeling technique developed by Ross Quinlan. It is a sequential classifier in the form of a recursive tree structure. The data set is analyzed by developing a branch-like structure with an appropriate decision tree algorithm. Each internal node of the tree splits into branches based on the splitting criteria. Each test node denotes a test on an attribute, and each terminal node represents the decision (the class). Decision trees can work on both continuous and categorical attributes (Manpreet Singh et al., 2013).
RESEARCH OBJECTIVES

The objective is to adopt a fast and reliable means of predicting or detecting heart disease, which has claimed many lives in Nigeria, Africa and the world at large, so that it becomes possible to eradicate it. With a decision-making system that implements a decision tree (whose predictive capability in heart disease prediction and some other domains is critically reviewed in this paper), heart disease could be eradicated or reduced to a minimal level in Nigeria.

PROCESSES OF DEVELOPING A DECISION TREE MODEL

TREE GROWING

The initial stage of creating a decision tree model is tree growing, which includes two steps: tree merging and tree splitting.

Tree merging: The non-significant predictor categories and the significant categories within a dataset are grouped together.

Tree splitting: The data are split into different leaves to remove the impurities within the model, which increase as the tree grows and may reduce the accuracy of the model (Mutasem Sh. Alkhasawneh et al., 2012).

TREE PRUNING

Tree pruning removes irrelevant splitting nodes. Removing irrelevant nodes helps reduce the chance of creating an over-fitted tree. Such a procedure is particularly useful because an over-fitted tree model may misclassify data in real-world applications (Mutasem Sh. Alkhasawneh et al., 2012).

TREE SELECTION

The final stage of developing a decision tree model is tree selection. At this stage, the created decision tree model is evaluated using either cross-validation or a testing dataset. This stage is essential as it can reduce the chances of misclassifying data in real-world applications and, consequently, minimize the cost of developing further applications (Mutasem Sh. Alkhasawneh et al., 2012).

DECISION TREE ALGORITHMS

The different decision tree algorithms are:
- ID3
- C4.5
- C5.0
- CHAID
- CART
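The tree splitting step described above depends on some measure of impurity. As a minimal sketch, assuming Shannon entropy as that measure (the one used by ID3; other algorithms such as CART commonly use the Gini index instead), the reduction in impurity from splitting on a categorical attribute can be computed as follows. The function names, attribute names and toy records are hypothetical and are not taken from any of the reviewed studies.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Impurity reduction obtained by splitting on one categorical attribute."""
    # Group the class labels by the value each record has for the attribute.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average impurity of the partitions produced by the split.
    weighted = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


# Hypothetical toy records, invented for illustration only.
rows = [
    {"chest_pain": "yes", "smoker": "yes"},
    {"chest_pain": "yes", "smoker": "no"},
    {"chest_pain": "no", "smoker": "yes"},
    {"chest_pain": "no", "smoker": "no"},
]
labels = ["present", "present", "absent", "absent"]

# Score every candidate attribute and keep the one that reduces impurity most.
gains = {a: information_gain(rows, labels, a) for a in ("chest_pain", "smoker")}
best_attribute = max(gains, key=gains.get)
```

In this toy data the `chest_pain` attribute separates the two classes completely, so it would be chosen as the splitting attribute, while `smoker` leaves the impurity unchanged.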
ALGORITHM FOR DECISION TREE INDUCTION

BASIC ALGORITHM (A GREEDY ALGORITHM)

- The tree is constructed in a top-down, recursive, divide-and-conquer manner.
- At the start, all the training examples are at the root.
- Attributes are categorical (if continuous-valued, they are discretized in advance).
- Examples are partitioned recursively based on selected attributes.
- Test attributes are selected on the basis of a heuristic or statistical measure (e.g., information gain).

CONDITIONS FOR STOPPING PARTITIONING

- All samples for a given node belong to the same class.
- There are no remaining attributes for further partitioning, in which case majority voting is employed to classify the leaf.
- There are no samples left.

(Jiawei Han, 2006)

DECISION TREE APPLICATIONS

Decision trees have been used to develop models for prediction and classification in different domains, including business management, customer relationship management, fraudulent statement detection, engineering, energy consumption, fault diagnosis, healthcare management and agriculture, as explained in the studies below.

CLASSIFICATION

Decision tree algorithms used for classification in different domains, both independently and in combination with other algorithms, are discussed below.

Mohd Najwadi Yusoff and Aman Jantan (2011) proposed the use of a Genetic Algorithm (GA) to optimize a Decision Tree (DT) for malware classification, in comparison with current techniques in malware classification. A new classifier, named the Anti-Malware System (AMS) Classifier, was developed by combining GA with DT in order to classify unique types of malware. Their results show that the AMS Classifier achieves an accuracy increase of 4.5% to 6.5% over the DT Classifier.

Baisen Zhang and Russ Tillman (2007) investigated the potential of a decision tree approach for modelling NFUE (nitrogen fertilizer use efficiency) in New Zealand pastures. In validation, their models were correct for 11 of the 16 trials tested, a predictive accuracy of 69%.
D. Senthil Kumar et al. focused on medical diagnosis by learning patterns from collected data on diabetes, hepatitis and heart diseases, with the aim of developing intelligent medical decision support systems to help physicians. They proposed the use of the C4.5, ID3 and CART decision tree algorithms to classify these diseases and compared the effectiveness and correction rate of the algorithms.

Abolfazl Kazemia et al. (2011) researched the use of the CRT, QUEST, C5.0 and CHAID decision tree algorithms to help organizations determine the criteria needed to identify potential customers in the competitive environment of their business. The tree obtained with the C5.0 algorithm provided the best variables and decision tree, with 83.96% accuracy, which was closest to the field results used for the comparison and performed better in practice.

Zhang and Tillman (2007) further concluded that the decision tree modelling approach can be used to predict NFUE (nitrogen fertilizer use efficiency) and thereby to assist decisions on when and where to apply N fertilizer in pastures, increasing productivity while reducing the environmental impact.

Abishek Suresh et al. investigated the application of decision tree models to the formation of protein homodimer complexes for molecular catalysis and regulation. The decision tree model produced positive predictive values (PPV) of 72% for 2S, 58% for 3SMI and 57% for 3SDI in cross-validation. It was thus concluded that the method finds application in assigning homodimers with a folding mechanism.

Majoobi, J. (2007) studied the performance of decision tree classification for the prediction of wave parameters, which are necessary for many applications in coastal and offshore engineering.
According to the researchers, although various prediction models have been proposed in the literature for this purpose, decision tree models were found to give better accuracy.

Wang Wei (2012) used a decision tree for image classification. The tree was established based on the analysis of spectral characteristics, texture characteristics and other auxiliary information such as NDVI, NDBI and topographic characteristics. The results of the study indicated that the accuracy of decision tree classification was 4.06% higher than that of maximum likelihood classification and that the Kappa coefficient increased by 5.61%.

Kuldeep Kumar et al. (2006) discussed the effectiveness of using decision trees for classification in mammography. The results obtained using algorithms based on decision trees were compared with those produced by a neural network, and the decision tree was reported to have a higher classification rate.

Michael D. Twa (2011) described the application of decision tree induction, an automated machine learning classification method, to discriminate between normal and keratoconic corneal shapes in an objective and quantitative way, with the aim of addressing the challenge of interpreting the volume and complexity of data produced during videokeratography examinations. The proposed method was compared with other known classification methods, and the decision tree classifier performed equal to or better than the other classifiers tested.

Gregor Stiglic et al. (2012) presented an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The results demonstrate a significant increase in accuracy for less complex, visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, higher accuracy gains were observed in bioinformatics datasets.
Peng Du and Ding Xiaoqing (2008) presented a method based on a decision tree classifier to identify the gender of a person. Their results show that the performance of the decision tree classifier is superior to that of the ordinary classifier.

Felipe Lirra (2013) developed a decision tree model that indicated the action range of peptides on the types of microorganisms on which they can exercise biological activity. The aim was to assist recent attempts to find effective substitutes to combat infections, which have been directed at identifying natural antimicrobial peptides in order to circumvent resistance to commercial antibiotics. The results of the study showed that the use of decision trees to evaluate the antimicrobial activity of synthetic peptides enables the creation of more effective models for use in the development of new drugs.

PREDICTION

Decision tree algorithms used for prediction in different domains, both independently and in combination with other algorithms, are discussed below.

Jay Gholap (2013) used attribute selection and boosting meta-techniques to tune the performance of the J48 decision tree algorithm on the large amounts of data that are harvested along with crops, in predicting the soil fertility class with a view to achieving and maintaining appropriate levels of soil fertility. J48 gave an accuracy of 96.73%, which makes it a good predictive model for soil fertility in agriculture.

Mohammad Taha Khan et al. (2012) researched the application of two decision tree algorithms, C4.5 and C5.0, to breast cancer and heart disease prediction. On a breast cancer dataset of 400 records, C4.5 showed 5 training errors whereas C5.0 showed only 3. C5.0 produces rules in a very readable form, whereas C4.5 generates the rule set in the form of a decision tree.

Yoshikazu Goto et al.
(2010) developed a simple and generally applicable bedside model for predicting outcomes after out-of-hospital cardiac arrest (OHCA). This simple prediction model may provide clinicians with a practical bedside tool for stratifying OHCA patients in the emergency department.

Atul Kumar Pandey et al. (2013) compared a pruned J48 decision tree prediction model using the reduced error pruning approach against the simple pruned and unpruned approaches for classifying heart disease based on clinical data of patients. They also developed a heart disease prediction model that can assist medical professionals in predicting heart disease status based on these clinical features. From the results obtained, it was found that fasting blood sugar is the most important attribute, giving better classification than the other attributes, although it does not give better accuracy.

A. R. Senthil Kumar et al. (2013) investigated the performance of soft computing techniques in modeling qualitative and quantitative water resource variables such as stream flow. It was found that the REPTree (decision tree) model performed well compared with other soft computing techniques such as MLR, ANN, fuzzy logic and M5P.

B.S. Zhang et al. (2004) applied decision tree models to predict annual and seasonal pasture production and investigated the interactions between pasture production and environmental and management factors in the North Island hill country. The decision tree models for annual, spring, summer, autumn and winter pasture production correctly predicted 82%, 71%, 90%, 88% and 90% of cases, respectively, in the model validation.

Sevgi Zeynep Dogan et al. (2008) compared the performance of three different decision-tree-based methods of assigning attribute weights for use in a case-based reasoning (CBR) prediction model.
The study compares the impact of attribute weights generated by the three methods and, hence, highlights the fact that the prediction rate of models such as CBR depends largely on the data associated with the parameters used in the model.

Bark Cheung Chiu et al. (2013) adopted Input-Output Agent Modelling (IOAM), an approach to modelling an agent in terms of relationships between the inputs and outputs of the cognitive system, together with a leading inductive learning algorithm, C4.5, to build a subtraction skill modeller, C4.5-IOAM. Experimental results in the domain of modelling elementary subtraction skills showed that the tree quality and the leaf quality of a decision path provided valuable references for resolving contradicting predictions, and that a single-tree model representation performed nearly as well as the multi-tree model representation.

Middendorf et al. used alternating decision trees to predict whether an S. cerevisiae gene would be up- or down-regulated under particular conditions of transcription regulator expression, given the sequence of its regulatory region. In addition to good performance in predicting the expression state of target genes, they were able to identify motifs and regulators that appear to control the expression of the target genes.

Lee S. and Park I. (2013) analyzed ground subsidence hazard (GSH) using factors that can affect ground subsidence and a decision tree approach in a geographic information system (GIS). The highest accuracy was achieved by the decision tree model using the CHAID algorithm (94.01%), compared with the QUEST algorithm (90.37%) and the frequency ratio model (86.70%). These accuracies are higher than previously reported results for decision trees. Decision tree methods can therefore be used efficiently for GSH analysis and might be widely used for the prediction of various spatial events.

Heiko Milde et
al. (1999) introduced the MAD system, which generates decision trees based on a new method for qualitative electrical circuit analysis. In particular, their new approach to qualitative reasoning about faults in electrical circuits has reached a level of achievement at which it can be used to generate diagnosis systems employed in industry.

Smitha T. and Dr. V. Sundaram (2012) studied the application of the ID3 algorithm to build a decision tree model that predicts the chances of occurrence of disease in an area by identifying the significant parameters for the prediction process. A prediction accuracy of 95% was achieved with the decision tree classification model, and the researchers concluded that those suffering from the disease are mostly female inhabitants with a hereditary history, living in poor environmental conditions and with an average age greater than 35.

METHODOLOGY

In this research, the decision tree algorithm ID3 (Iterative Dichotomiser 3) was used. This classification algorithm was selected because it has the potential to yield good results in prediction and classification applications. A heart disease data record set with medical attributes was obtained online from a hospital. With the help of the dataset, the patterns significant to heart attack prediction were extracted using the developed ID3 data mining model. The records were split equally into two datasets: a training dataset and a testing dataset. To avoid bias, the records for each set were selected randomly. The data include values for the following:

HEART DISEASE PREDICTOR INTERFACE

The result page shows the result of the prediction, which can be either heart disease Present or Absent.

RESULTS

A decision tree is a flowchart-like structure in which each internal node represents a test on an attribute, each branch represents an outcome of the test and each leaf node represents a class label (the decision taken after computing all attributes). A path from root to leaf represents a classification rule.
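As a rough illustration of how a root-to-leaf path becomes a classification rule, the following Python sketch uses a small hand-built tree. The attribute names (`chest_pain`, `high_blood_pressure`), the shape of the tree and the helper functions `classify` and `extract_rules` are hypothetical; they are not taken from the dataset or from the Java system described in this paper.

```python
# Hypothetical hand-built tree; attribute names and values are invented for
# illustration and are not taken from the paper's dataset or learned model.
# Internal nodes are dicts (a test on an attribute); leaves are class labels.
tree = {
    "attribute": "chest_pain",
    "branches": {
        "yes": {
            "attribute": "high_blood_pressure",
            "branches": {"yes": "Present", "no": "Absent"},
        },
        "no": "Absent",
    },
}


def classify(node, record):
    """Follow the branch matching each tested attribute until a leaf is reached."""
    while isinstance(node, dict):
        node = node["branches"][record[node["attribute"]]]
    return node


def extract_rules(node, conditions=()):
    """Return one IF-THEN rule per root-to-leaf path."""
    if not isinstance(node, dict):
        premise = " AND ".join(conditions) or "TRUE"
        return [f"IF {premise} THEN heart disease = {node}"]
    rules = []
    for value, child in node["branches"].items():
        test = f"{node['attribute']} = {value}"
        rules.extend(extract_rules(child, conditions + (test,)))
    return rules


patient = {"chest_pain": "yes", "high_blood_pressure": "no"}
prediction = classify(tree, patient)
rules = extract_rules(tree)
```

Representing internal nodes as dictionaries and leaves as plain labels keeps the traversal to a short loop. Each string in `rules` corresponds to exactly one path from the root to a leaf.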
The Java program consists of several packages, but ID3 Logic is the package that does the main work. The system has been built into a jar file, which runs once double-clicked on a system with a Java runtime installed.

CONCLUSION

The decision tree has been found useful in classification and prediction modeling because it can accurately discover hidden relationships between variables and is capable of removing insignificant attributes within a dataset. Twenty-one studies published between 1999 and 2014 in more than three application domains were examined in this research and met the minimum criteria for inclusion in our literature review. The decision tree data mining model developed and employed in this research was used to predict the existence of heart disease in any diagnosed patient, providing a solution that helps remove the bottleneck at hospitals. It also gives an indication of the possible heart disease status of a patient, without carrying out laboratory tests, simply by using the symptoms felt by the patient. Interestingly, anybody can make use of the system, since training is required just once for any particular data set.