d) Design a classifier based on a decision tree and predict the target attribute for the new instance
"5 - 1.1 - 2 - 2". Show detailed calculations when you design the classifier.
Step 1: Root of the tree
To determine the root of the tree, we need to select the attribute that has the highest information gain.
Information gain is a measure of how much an attribute reduces the uncertainty of the target variable.
To calculate information gain, we first need to calculate the entropy of the target variable. Entropy is a
measure of uncertainty.
Entropy(Decision) = - (p(high) * log2(p(high)) + p(medium) * log2(p(medium)) + p(low) * log2(p(low)) +
p(very-low) * log2(p(very-low)))
where p(x) is the probability of the target variable being equal to x.
Entropy(Decision) = - (3/8 * log2(3/8) + 2/8 * log2(2/8) + 1/8 * log2(1/8) + 2/8 * log2(2/8)) = 1.625
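As an illustrative sketch only (not part of the original calculation), the entropy formula above can be written as a small Python helper. The function name entropy is hypothetical, and the class counts 3, 2, 1, 2 passed in are simply the ones that appear in the expression above, not the raw training data.

```python
from math import log2

def entropy(counts):
    """Entropy (in bits) of a class distribution, given raw class counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

# Class counts (high, medium, low, very-low) as read off the expression above.
h_decision = entropy([3, 2, 1, 2])
```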
Next, we need to calculate the entropy of the target variable after splitting on each attribute. We can do
this by calculating the weighted average of the entropy of each child node.
Entropy(Decision | Case Length)
If Case Length <= 4.5: Entropy(Decision) = - (2/4 * log2(2/4) + 1/4 * log2(1/4) + 1/4 * log2(1/4)) = 0.75
If Case Length > 4.5: Entropy(Decision) = - (2/4 * log2(2/4) + 2/4 * log2(2/4)) = 1
Weighted average entropy:
Entropy(Decision | Case Length) = (4/8 * 0.75) + (4/8 * 1) = 0.875
Similarly, we can calculate the entropy of the target variable after splitting on the other attributes:
Entropy(Decision | Height) = 1.125
Entropy(Decision | Width) = 0.875
Entropy(Decision | Weight) = 1.125
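Continuing the same illustrative sketch, and reusing the entropy helper defined above, the weighted-average (conditional) entropy of a split can be computed as follows. The branch class counts shown for the Case Length split are read from the two expressions above; the function name is again hypothetical.

```python
def conditional_entropy(branches):
    """Weighted-average entropy of the child nodes produced by a split.
    `branches` is a list of per-branch class-count lists."""
    total = sum(sum(branch) for branch in branches)
    return sum((sum(branch) / total) * entropy(branch) for branch in branches)

# Case Length split, with branch class counts read from the expressions above:
# left branch (Case Length <= 4.5) and right branch (Case Length > 4.5).
h_case_length = conditional_entropy([[2, 1, 1], [2, 2]])
```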
Information gain for each attribute:
Information Gain(Case Length) = Entropy(Decision) - Entropy(Decision | Case Length) = 1.5 - 0.875 = 0.625
Information Gain(Height) = Entropy(Decision) - Entropy(Decision | Height) = 1.5 - 1.125 = 0.375
Information Gain(Width) = Entropy(Decision) - Entropy(Decision | Width) = 1.5 - 0.875 = 0.625
Information Gain(Weight) = Entropy(Decision) - Entropy(Decision | Weight) = 1.5 - 1.125 = 0.375
As we can see, Case Length has the highest information gain (tied with Width), so we will make it the root node of the
tree.
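Continuing the sketch, information gain is simply the difference between the parent entropy and the weighted child entropy defined above; the attribute whose split gives the largest value is chosen as the root. The helper below reuses the hypothetical entropy and conditional_entropy functions from the earlier snippets.

```python
def information_gain(parent_counts, branches):
    """Information gain of a split: entropy of the parent node minus the
    weighted-average entropy of its child nodes."""
    return entropy(parent_counts) - conditional_entropy(branches)
```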
Step 2: Split the data on the root node
We will split the data on the root node (Case Length) into two branches:
If Case Length <= 4.5:
    If Height <= 1.6:
        The target variable is medium.
    Else:
        The target variable is high.
If Case Length > 4.5:
    If Weight <= 1.3:
        The target variable is very-low.
    Else:
        The target variable is low.
Step 3: Repeat steps 1 and 2 for each child node
We will repeat steps 1 and 2 for each child node until we reach a point where all of the data points in a
node belong to the same class, or until there are no more attributes to split on.
The resulting decision tree is shown below:
Case Length <= 4.5
|   Height <= 1.6: medium
|   Height > 1.6: high
Case Length > 4.5
|   Weight <= 1.3: very-low
|   Weight > 1.3: low
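As a further illustrative sketch, this tree can also be written directly as nested conditionals. The classify name and signature below are hypothetical, and the attribute order Case Length, Height, Width, Weight is assumed.

```python
def classify(case_length, height, width, weight):
    """Apply the decision tree above to a single instance.
    Width is accepted for completeness but is never tested by this tree."""
    if case_length <= 4.5:
        return "medium" if height <= 1.6 else "high"
    return "very-low" if weight <= 1.3 else "low"
```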
Prediction for the new instance "5 - 1.1 - 2 - 2"
1. Case Length is 5, which is greater than 4.5, so we follow the Case Length > 4.5 branch of the tree.
2. Weight is 2, which is greater than 1.3, so we follow the Weight > 1.3 branch.
Therefore, the predicted target attribute for the new instance is low.
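Using the hypothetical classify sketch above, the same walk through the tree can be reproduced in a single call (again assuming the attribute order Case Length, Height, Width, Weight):

```python
# New instance: Case Length = 5, Height = 1.1, Width = 2, Weight = 2
print(classify(5, 1.1, 2, 2))
# Case Length 5 > 4.5 and Weight 2 > 1.3, so the tree returns "low".
```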