
Decision Tree

Bhargavi TL
Bhaskar Raj R
What is a Decision Tree?
• Decision Trees are tree-like models that resemble an upside-down tree.
• Decision Trees build the tree by asking a series of questions of the data to reach a decision.
• Hence Decision Trees are said to mimic the human decision-making process. During the
tree-building process, the entire data is divided into subsets until a decision is reached.
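The "series of questions" can be seen directly by fitting a small tree and printing its learned rules. A minimal sketch, assuming scikit-learn is installed and using its bundled Iris dataset for illustration:

```python
# Fit a shallow tree and print the questions it asks on the way to a decision.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Each indented line below is one question (a split rule) asked of the data.
print(export_text(tree, feature_names=list(iris.feature_names)))
```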
DECISION TREE TERMINOLOGIES
A few terminologies used in Decision Trees:
Root Node: The topmost node of the tree is the Root Node. All the data is present at this
Root Node. The arrows in a decision tree generally point away from the Root Node.
Leaf Node or Terminal Node: If a particular node cannot be split further, it is considered a
Leaf Node (also called a Terminal Node). The decisions or predictions are held by the Leaf
Nodes. The arrows in a decision tree generally point towards the Leaf Nodes.
Internal Node or Decision Node: The nodes between the root node and the leaf nodes are
called internal nodes. These nodes can be split further into sub-nodes.
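The three node types can be pictured as a small data structure. The following is an illustrative sketch only; the class and field names are my own, not from any library:

```python
# An internal/decision node holds a split rule; a leaf holds a prediction.
class Node:
    def __init__(self, feature=None, threshold=None, left=None, right=None, prediction=None):
        self.feature = feature        # index of the variable tested at this node
        self.threshold = threshold    # split rule: go left if x[feature] <= threshold
        self.left = left              # child for samples satisfying the rule
        self.right = right            # child for the remaining samples
        self.prediction = prediction  # set only on leaf (terminal) nodes

    def is_leaf(self):
        return self.prediction is not None

# Root node: the topmost node; arrows point away from it towards the leaves,
# which hold the decisions.
root = Node(feature=0, threshold=50.0,
            left=Node(prediction="no"),    # leaf / terminal node
            right=Node(prediction="yes"))  # leaf / terminal node
```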
3 ELEMENTS OF A DECISION TREE
Decisions
Uncertainties
Payoffs (gets/pays)
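In a decision-analysis style tree, a decision branches into uncertain outcomes, and each outcome carries a payoff (what we get or pay). A hedged sketch with made-up probabilities and payoffs, only to show how the three elements fit together:

```python
# Each decision maps to its uncertain outcomes as (probability, payoff) pairs.
# The numbers below are purely illustrative.
decisions = {
    "launch product": [(0.6, 120_000), (0.4, -50_000)],
    "do not launch":  [(1.0, 0)],
}

def expected_payoff(outcomes):
    # Uncertainties weight the payoffs by their probabilities.
    return sum(p * payoff for p, payoff in outcomes)

for d, outcomes in decisions.items():
    print(f"{d}: expected payoff = {expected_payoff(outcomes):,.0f}")
print("Best decision:", max(decisions, key=lambda d: expected_payoff(decisions[d])))
```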
How to Create a Decision Tree?
A decision tree is created in a simple, top-down manner.
It consists of nodes that form a rooted, directed tree: the root node has no incoming
edges, while all other nodes, called decision nodes, have exactly one incoming edge.
The main goal is to minimize the generalization error by finding the optimal tree for the
given data set.
The steps involved in the tree-building process are as follows (a minimal sketch in Python
follows this list):
1. Recursively partition the data into multiple subsets.
2. At each node, identify the variable and the rule associated with that variable that give
the best split.
3. Apply the split at that node using the best variable and the rule defined for it.
4. Repeat steps 2 and 3 on the sub-nodes.
5. Repeat this process until a stopping condition is reached.
6. Assign the decisions at the leaf nodes: the majority class label present at the node for a
classification task, or the average of the target variable values at the node for a
regression task.
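The sketch below walks through these steps for a classification task. It assumes the Gini impurity as the split criterion; all function and variable names are illustrative, not from any particular library:

```python
from collections import Counter

def gini(labels):
    # Impurity of a node: 0 when all labels at the node are the same.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    # Step 2: find the variable (feature index) and rule (threshold) giving the best split.
    best, best_impurity = None, gini(labels)
    for feature in range(len(rows[0])):
        for threshold in {row[feature] for row in rows}:
            left = [i for i, row in enumerate(rows) if row[feature] <= threshold]
            right = [i for i in range(len(rows)) if i not in left]
            if not left or not right:
                continue
            weighted = (len(left) * gini([labels[i] for i in left]) +
                        len(right) * gini([labels[i] for i in right])) / len(rows)
            if weighted < best_impurity:
                best_impurity, best = weighted, (feature, threshold, left, right)
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    split = best_split(rows, labels)
    # Step 5: stop when no split improves purity or the depth limit is reached.
    if split is None or depth >= max_depth:
        # Step 6: the leaf predicts the majority class present at this node.
        return Counter(labels).most_common(1)[0][0]
    feature, threshold, left, right = split
    # Steps 1, 3 and 4: apply the split and recurse on the two subsets.
    return {
        "rule": (feature, threshold),
        "left": build_tree([rows[i] for i in left], [labels[i] for i in left], depth + 1, max_depth),
        "right": build_tree([rows[i] for i in right], [labels[i] for i in right], depth + 1, max_depth),
    }

# Toy usage: two numeric features, two classes.
X = [[2.7, 2.5], [1.3, 3.3], [3.6, 4.4], [7.6, 2.7], [8.6, -0.2], [7.9, 3.5]]
y = ["A", "A", "A", "B", "B", "B"]
print(build_tree(X, y))
```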
CREATING A DECISION TREE
• Let us consider a scenario where a new planet is discovered by a group of astronomers.
Now the question is whether it could be 'the next Earth'.
• Let us create a decision tree to find out whether we have discovered a new habitable planet.
Does the surface temperature fall into the habitable range of 0 to 100 °C?
Is water present?
Do flora and fauna flourish?
Does the planet have a stormy surface?
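Read in order, these questions form one path through the tree and can be written as nested conditions. A hedged sketch; the order of the questions and the final labels are chosen for illustration:

```python
def is_potential_habitat(temp_celsius, has_water, has_flora_fauna, stormy_surface):
    # Each condition is one decision node; each return is a leaf node.
    if not (0 <= temp_celsius <= 100):   # outside the habitable temperature range
        return "not habitable"
    if not has_water:
        return "not habitable"
    if not has_flora_fauna:
        return "not habitable"
    if stormy_surface:
        return "probably not habitable"
    return "could be the next Earth"

print(is_potential_habitat(25, True, True, False))   # could be the next Earth
print(is_potential_habitat(150, True, True, False))  # not habitable
```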
Advantages of Decision Trees:
Versatile
Fast
Minimal data preprocessing
Easily interpretable
Able to handle non-linear relationships
Handles Multicollinearity
Disadvantages of Decision Trees:
Loss of Inference
Loss of the numerical nature of the variable
Unstable
Biased response
Overfitting
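The overfitting point can be seen in a quick experiment: an unconstrained tree tends to memorise the training data, while limiting the depth trades training accuracy for better generalisation. A hedged sketch, assuming scikit-learn is installed and using its bundled breast-cancer dataset:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for depth in (None, 3):   # None = grow until every leaf is pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
    print(f"max_depth={depth}: train={tree.score(X_train, y_train):.2f}, "
          f"test={tree.score(X_test, y_test):.2f}")
```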