Score 7 out of 10
Assignment question 1.
If age <= 43 & sex = female
Then life insurance promotion = yes
6 / 7 = 86%
If age <= 43 & sex = male & credit card insurance = no
Then life insurance promotion = no
4 / 7 = 57%
If age > 43
Then life insurance promotion = no
If age <= 43 & sex = male & credit card insurance = yes
Then life insurance promotion = yes
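Read together, the four rules above behave like a small classifier. A minimal Python sketch of that reading is given here; the function name `predict_life_insurance_promotion` and the lowercase string encodings of the attribute values are illustrative assumptions, not part of the assignment.

```python
def predict_life_insurance_promotion(age, sex, credit_card_insurance):
    """Apply the four rules listed above and return "yes" or "no".

    The function name and the string encodings ("female", "male",
    "yes", "no") are assumptions made for this sketch.
    """
    # Rule: age > 43 -> no
    if age > 43:
        return "no"
    # Rule: age <= 43 and sex = female -> yes
    if sex == "female":
        return "yes"
    # Remaining case is age <= 43 and sex = male:
    # credit card insurance = yes -> yes, otherwise -> no
    return "yes" if credit_card_insurance == "yes" else "no"


print(predict_life_insurance_promotion(40, "female", "no"))  # yes
print(predict_life_insurance_promotion(40, "male", "no"))    # no
```

The order of the checks matters only for readability; the four rules do not overlap, so each instance matches exactly one of them.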
Simplified rule:
If age <= 43 & credit card insurance = no
Then life insurance promotion = no
4 / 7 = 57%
If the individual is male, we can ignore the attribute AGE.
Score 9 out of 10
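Assignment question 2.
[Decision tree figure; layout lost in extraction. Recoverable labels: root node age with branches > 43 and <= 43; branch and leaf labels yes, no, female, male; leaf counts 3/0, 6/0, 7]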
Janet L. Austell
Total Score 80 out of 90
One possibility is to split the Credit Card Insurance No branch on age > 29 and age <= 29.
The two instances following age > 29 will have life insurance promotion = no. The two instances following age <= 29 once again split on attribute age. This time, the split is age <= 27 and age > 27.
Score 10 out of 10
Assignment question 3.
Very good
If magazine promotion = yes
Then watch promotion = no
Confidence 4 / 7 = 57%
Support 4 / 11 = 36%
If credit card insurance = yes
Then life insurance promotion = no
Confidence 5 / 8 = 63%
Support 5 / 11 = 45%
If sex = male
Then credit card insurance = no
Confidence 4 / 6 = 67%
Support 4 / 11 = 36%
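The confidence and support figures above are simple ratios of instance counts. As a hedged sketch of the arithmetic, assuming confidence is the count matching the whole rule divided by the count matching the antecedent, and support is the count matching the whole rule divided by the total number of instances (the helper names `confidence` and `support` are illustrative):

```python
def confidence(both, antecedent):
    """Fraction of instances matching the antecedent that also match the consequent."""
    return both / antecedent


def support(both, total):
    """Fraction of all instances that match both antecedent and consequent."""
    return both / total


# (rule, count matching whole rule, count matching antecedent, total instances)
# Counts are taken from the fractions written in the answers above.
rules = [
    ("magazine promotion = yes -> watch promotion = no", 4, 7, 11),
    ("credit card insurance = yes -> life insurance promotion = no", 5, 8, 11),
    ("sex = male -> credit card insurance = no", 4, 6, 11),
]

for name, both, antecedent, total in rules:
    print(f"{name}: confidence {confidence(both, antecedent):.1%}, "
          f"support {support(both, total):.1%}")
```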
Score 9 out of 10
Assignment question 4.
For the third iteration
Center of cluster C1 = (1.33, 2.5), C2 = (3.33, 4.00)
The new cluster center for cluster 1 is (1.5, 5). The new cluster center for cluster 2 is (4.0, 4.25).
1. Distance (c1 – 1) = 1.05 Distance (c2 – 1) = 3.42
2. Distance (c1 – 2) = 2.03 Distance (c2 – 2) = 2.38
3. Distance (c1 – 3) = 1.20 Distance (c2 – 3) = 2.83
4. Distance (c1 – 4) = 1.20 Distance (c2 – 4) = 1.42
5. Distance (c1 – 5) = 1.67 Distance (c2 – 5) = 1.54
6. Distance (c1 – 6) = 5.07 Distance (c2 – 6) = 2.61
c1 contains instances 1, 2, 3, 4; c2 contains instances 5 and 6 (instance 5 is nearer c2: 1.54 < 1.67)
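The assignment step of K-Means places each instance in the cluster whose center is nearest. The sketch below applies that step to the six pairs of distances listed above, without recomputing them; the dictionary layout and the function name `assign_to_nearest` are assumptions made for illustration. With the distances as listed, instance 5 falls on the c2 side, since 1.54 < 1.67.

```python
# instance number -> (distance to c1, distance to c2), copied from the list above
distances = {
    1: (1.05, 3.42),
    2: (2.03, 2.38),
    3: (1.20, 2.83),
    4: (1.20, 1.42),
    5: (1.67, 1.54),
    6: (5.07, 2.61),
}


def assign_to_nearest(distances):
    """Group instance numbers by the nearer of the two cluster centers."""
    clusters = {"c1": [], "c2": []}
    for instance, (d1, d2) in sorted(distances.items()):
        # Ties go to c1 here; none occur in this data.
        clusters["c1" if d1 <= d2 else "c2"].append(instance)
    return clusters


clusters = assign_to_nearest(distances)
print(clusters)  # {'c1': [1, 2, 3, 4], 'c2': [5, 6]}
```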
Score 10 out of 10
Assignment question 5.
Good
10
~Line is blank
18
~Bad numerical data for attribute Age
Score 10 out of 10
Assignment question 6. a.
Two classes:
53% are male
Very good
These males are less likely to participate in a promotional purchase
47% are female
These females are more likely to participate in a promotional purchase
b.
Males predictability:
50% of these males do not participate in Magazine Promo
50% of these males do not participate in Watch Promo
63% of these males do not participate in Life Ins Promo
75% of these males do not participate in Credit Card Ins
Males predictiveness:
With a certainty of 57% we can predict that those that do not participate in Magazine Promo are male
With a certainty of 57% we can predict that those that do not participate in Watch Promo are male
With a certainty of 83% we can predict that those that do not participate in Life Ins Promo are male
With a certainty of 50% we can predict that those that do not participate in Credit Card Ins are male
Females predictability:
57% of these females participate in Magazine Promo
50% of these females participate in Watch Promo
86% of these females participate in Life Ins Promo
86% of these females do not participate in Credit Card Ins
Females predictiveness:
With a certainty of 50% we can predict that those that participate in Magazine Promo are female
With a certainty of 50% we can predict that those that participate in Watch Promo are female
With a certainty of 67% we can predict that those that participate in Life Ins Promo are female
With a certainty of 50% we can predict that those that do not participate in Credit Card Ins are female
c.
*******************************
Rules for Class Male
8 instances
*******************************
Life Ins Promo = No
:rule accuracy 83.33%
:rule coverage 62.50%
**Total Percent Coverage = 62.50%
*******************************
Rules for Class Female
7 instances
*******************************
38.00 <= Age <= 41.00
:rule accuracy 100.00%
:rule coverage 57.14%
38.00 <= Age <= 41.00
and Life Ins Promo = Yes
:rule accuracy 100.00%
:rule coverage 57.14%
**Total Percent Coverage = 57.14%
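The rule statistics for Life Ins Promo = No in class Male line up with the part b figures for males: rule coverage corresponds to predictability (63%) and rule accuracy to predictiveness (83%). A small sketch of that arithmetic follows. The counts 5 and 6 are not stated in the assignment; they are inferred here from the reported percentages and the 8 male instances, so treat them as assumptions, as are the helper names.

```python
def predictability(value_and_class, class_size):
    """Share of the class that has the attribute value: P(value | class)."""
    return value_and_class / class_size


def predictiveness(value_and_class, value_total):
    """Share of instances with the value that belong to the class: P(class | value)."""
    return value_and_class / value_total


males = 8               # stated above: "8 instances"
males_no_life_ins = 5   # assumed: inferred from 62.50% coverage of 8 males
all_no_life_ins = 6     # assumed: inferred from 83.33% rule accuracy

print(f"{predictability(males_no_life_ins, males):.2%}")            # 62.50%
print(f"{predictiveness(males_no_life_ins, all_no_life_ins):.2%}")  # 83.33%
```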
Score 10 out of 10 Very good
Assignment question 7. a.
Res. Score:
Class Sick 0.553
Class Healthy 0.581
Domain 0.52
b.
Male 140 of 203 = 69%
Female 63 of 203 = 31%
c. Flat
d. 51.945
e. Normal
f. blood pressure
125
130
g. 16 of 93 = 17%
Female 16 0.17
h. With a certainty of 75% I can predict that patients with the condition angina are sick
i. #colored vessels
j. 82%
k. We have a 95% confidence in the test instance being classed correctly
l. 13
m. thal = Rev and chest pain type = Asymptomatic
:rule accuracy 91.53%
# Covered = 54
# remaining= 39
**Total Percent Coverage = 58.06%
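The total percent coverage reported above can be checked from the covered and remaining counts, assuming coverage is the covered count divided by the class size (covered plus remaining). The function name `percent_coverage` is illustrative.

```python
def percent_coverage(covered, remaining):
    """Percentage of class instances covered by the rule."""
    return 100.0 * covered / (covered + remaining)


# Counts from the rule output above: # Covered = 54, # remaining = 39
print(f"{percent_coverage(54, 39):.2f}%")  # 58.06%
```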
Score 8 out of 10
Assignment question 8.
j. 138 sick instances exist. 45% of all the test instances are classed as sick. This has an error rate of 49.3% to 60.7%. Accuracy rate 81%.
k. 165 healthy instances exist. 54% of all test instances are classed as healthy. This has an error rate of 51.7% to 54%. Accuracy rate 34%.
Score 7 out of 10
Assignment question 9. a.
Yes Age and Sex
The predictiveness score for sex = female is 0.81 for the survivors. The predictiveness score for sex = male is 0.77 for the non-survivors. For the non-survivors, class = third has a predictiveness score of 0.72 and class = crew has a predictiveness score of 0.76.
b.
77% Good
c.
A 95% test set accuracy would result in a coverage of 0.0%. The error rate is 19.8% to 26.2%; therefore, an accuracy of 95% is unachievable.
The lower-bound accuracy is 73.8%. The upper-bound accuracy is 80.2%.
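Bounds of this kind can be reproduced approximately as a 95% interval of two standard errors around the 23% test set error rate. The sketch below shows that calculation under stated assumptions: the test set size of 700 is a hypothetical value chosen for illustration because the assignment does not state it, and the two-standard-error convention is assumed rather than confirmed by the answer.

```python
import math


def error_rate_interval(error_rate, n, z=2.0):
    """Approximate confidence interval for a test set error rate.

    Uses the standard error sqrt(E * (1 - E) / n); z = 2.0 is the assumed
    two-standard-error convention for roughly 95% confidence.
    """
    se = math.sqrt(error_rate * (1.0 - error_rate) / n)
    return error_rate - z * se, error_rate + z * se


# 23% error corresponds to the 77% accuracy in part b; n = 700 is hypothetical.
low, high = error_rate_interval(0.23, 700)
print(f"error rate: {low:.1%} to {high:.1%}")
print(f"accuracy:   {1 - high:.1%} to {1 - low:.1%}")
```

Because the upper accuracy bound stays well below 95% for any test set of this scale, the conclusion in part c does not depend on the exact value assumed for `n`.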
d.
The example in section 4.8 uses randomly selected test data.
The test set for the example in section 4.8 contains 190 non-survivors and 10 survivors. That is, 95% of the test data holds non-survivor instances. The test data does not reflect the ratio of survivors to non-survivors seen in the entire dataset. The test set for this problem contains 77% non-survivors and 23% survivors. This non-survivor to survivor ratio more closely matches the ratio seen in the entire domain of data instances.