Test2_2016.docx

Test 2, Data Mining Spring 2016
1. (20 pts.) In running a regression tree model to predict salary in thousands of dollars from years of
postgraduate education, I split the root node into (exactly) two child nodes, one with 200 people and
predicted salary $30 thousand and a second with 800 people and predicted salary $50 thousand.
Find, if possible from this information, the following (write “NP” if not possible).
(A) The number of people ________in the root node
(B) The average salary ________ in my training sample
(C) The misclassification rate __________in the first child node
(D) What do we minimize in choosing a regression tree split point?
2. (20 pts.) In a city park 80% of visitors are men and 20% women. I know that 30% of men litter and 40%
of women litter.
What proportion of visitors litter? __________
I see a piece of litter. What is the probability ________ that it was left by a woman?
3. (18 pts.) My odds of finding a parking place are 1 to 4 (odds=1/4)
(A) What is my probability _____ of finding a parking place?
(B) The odds would double if I had a handicap placard. What would be my probability ______of
finding a parking place then?
4. (10 pts.) I want to find the X for which $f(X) = X^3 + 3 - e^X$ is 0. I start with a guess that X=0. What will X
become _____ after taking one full Gauss-Newton step from X=0?
5. (16 pts.) I use discriminant analysis to predict from which of 3 subpopulations a defendant in a trial
comes. Each subpopulation contains 1/3 of the overall population and all have the same variance-covariance matrix.
Using equal priors, my overall error rate comes out to 12%. However, using priors of 10%, 60%,
and 30%, I get error rates of 40%, 1%, and 2%, respectively, for the three subpopulations.
(A) What is the overall error rate _____ for the model with unequal priors?
(B) Explain which model I should use in testimony and why.
6. (16 pts) This function gives the probability of an event as a function of features X1 and X2.
e4 2 x1  x2
f ( x1 , x2 ) 
1  e4 2 x1  x2
(A) Give the formula for the relationship between X2 and X1 that makes events and
nonevents equally likely.
X2 = ______________________________
(B) For X1=3 and X2=1, I observe an event. For X1=5 and X2=3 I observe a non-event.
Is this pair of points concordant? Show, with numbers, how you came to this conclusion.
**************answers **************************
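1. (A) 200 + 800 = 1000 in the root node (and in the training sample).
(B) Total dollars: 200($30 K) + 800($50 K) = $6,000 K + $40,000 K = $46,000 K, so the average is $46,000 K / 1000 = $46 K.
(C) NP. A misclassification rate requires a classification (decision) tree with a categorical target; a regression tree predicts a numeric value.
(D) Criterion: minimize the error sum of squares (summed over both child nodes),
or equivalently the average squared error (pooled across the 2 child nodes), a weighted average
of the estimated variances.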
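The arithmetic for the root node and the split criterion can be sketched in Python. This is only an illustration: the helper name `split_sse` and the five made-up salaries passed to it are assumptions, not exam data.

```python
# Child-node summaries from question 1 (counts, predicted salaries in $K).
counts = [200, 800]
means = [30.0, 50.0]

# Root node size is the sum of the child node sizes.
n_root = sum(counts)

# The training-sample average is the count-weighted mean of the child predictions.
avg_salary = sum(n * m for n, m in zip(counts, means)) / n_root


def split_sse(left, right):
    """Error sum of squares for a candidate split, summed over both children.

    Hypothetical helper: each child node predicts its own mean.
    """
    total = 0.0
    for node in (left, right):
        node_mean = sum(node) / len(node)
        total += sum((y - node_mean) ** 2 for y in node)
    return total


print(n_root, avg_salary)
# Made-up salaries, used only to show the criterion being evaluated.
print(split_sse([28.0, 32.0], [45.0, 50.0, 55.0]))
```

A tree-growing routine would evaluate a quantity like `split_sse` at every candidate split point and keep the split with the smallest value.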
2. Pr{Litter} = (.8)(.3) + (.2)(.4) = .24 + .08 = .32 (32% of visitors litter)
Pr{Woman | Litter} = .08/.32 = 0.25 (the odds are 1 to 3)
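The same total-probability and Bayes' rule steps, written as a short Python check. The variable names are illustrative choices.

```python
# Visitor mix and littering rates from question 2.
p_man, p_woman = 0.8, 0.2
p_litter_given_man, p_litter_given_woman = 0.3, 0.4

# Law of total probability: overall share of visitors who litter.
p_litter = p_man * p_litter_given_man + p_woman * p_litter_given_woman

# Bayes' rule: probability that a given piece of litter came from a woman.
p_woman_given_litter = p_woman * p_litter_given_woman / p_litter

print(round(p_litter, 4), round(p_woman_given_litter, 4))
```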
3. (A) p/(1-p) = 1/4, so 4p = 1-p and p = 0.20.
(B) Doubled odds are 2 to 4, so p/(1-p) = 2/4, 4p = 2-2p, and p = 1/3.
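Solving p/(1-p) = odds for p gives p = odds/(1 + odds). A minimal sketch of that conversion, with `odds_to_prob` as an assumed helper name:

```python
def odds_to_prob(odds):
    """Convert odds = p / (1 - p) back to the probability p."""
    return odds / (1.0 + odds)


base_odds = 1 / 4
p_base = odds_to_prob(base_odds)         # odds of 1 to 4
p_placard = odds_to_prob(2 * base_odds)  # doubled odds, 2 to 4

print(round(p_base, 4), round(p_placard, 4))
```

Note that doubling the odds does not double the probability: it moves from 0.20 to 1/3, not to 0.40.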
4. $f(0) = 0 + 3 - e^0 = 3 - 1 = 2$.
$f'(0) = 3(0)^2 + 0 - e^0 = -1$, so the change is f(0)/f'(0) = 2/(-1) = -2 and new X = old X - (-2) = 0 + 2 = 2.
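The update used here is new X = old X - f(old X)/f'(old X), which for a single equation in one unknown is the Newton-Raphson step. A small sketch, with function names chosen for illustration:

```python
import math


def f(x):
    """Function whose root is sought: x^3 + 3 - e^x."""
    return x ** 3 + 3 - math.exp(x)


def f_prime(x):
    """Derivative of f: 3x^2 - e^x."""
    return 3 * x ** 2 - math.exp(x)


def newton_step(x):
    """One full step: subtract f(x) / f'(x) from the current guess."""
    return x - f(x) / f_prime(x)


x_new = newton_step(0.0)
print(x_new)
```

Calling `newton_step` repeatedly on its own output would continue the iteration; the question asks only for the first step.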
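5. (A) Weight each subpopulation's error rate by its prior:
Proportion (prior)  .10   .60   .30
Misclassify         .40   .01   .02
Overall: (.10)(.40) + (.60)(.01) + (.30)(.02) = .04 + .006 + .006 = .052 (< 0.12)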
(B) Even though the results using priors that are not the true ones have a smaller misclassification
rate, those results have nothing to do with reality or truth, so use equal priors.
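The overall rate is a weighted average of the three per-subpopulation error rates. As a check that goes slightly beyond the answer key, the sketch below also reweights the same error rates by the true proportions of 1/3 each, as stated in the question. The names `overall_error`, `reported`, and `actual` are illustrative.

```python
# Per-subpopulation error rates from the unequal-prior model (question 5).
error_rates = [0.40, 0.01, 0.02]
priors = [0.10, 0.60, 0.30]         # priors used to fit the model
true_props = [1 / 3, 1 / 3, 1 / 3]  # true shares stated in the question


def overall_error(weights, rates):
    """Weighted average of the per-subpopulation error rates."""
    return sum(w * r for w, r in zip(weights, rates))


reported = overall_error(priors, error_rates)    # rate under the assumed priors
actual = overall_error(true_props, error_rates)  # rate under the true shares

print(round(reported, 4), round(actual, 4))
```

Under the true shares the unequal-prior model misclassifies about 14.3% of cases, which is worse than the 12% of the equal-prior model, so the apparent improvement comes from the priors rather than from better classification.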
6. (A) We recognize a logistic function here, so when the logit is 0 the probability is ½.
-4 + 2X1 - X2 = 0 implies X2 = 2X1 - 4.
(B) With logit L = -4 + 2X1 - X2:
X1  X2  L  Y
3   1   1  1
5   3   3  0
The non-event has the larger L, and so the larger p, but its Y is 0 => discordant.
No need to compute the actual probabilities, as they are monotonically increasing in L.
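The concordance check can be confirmed numerically. This sketch assumes the logit L = -4 + 2*X1 - X2, which is the form that reproduces the L values 1 and 3 in the table; the function names are illustrative.

```python
import math


def logit(x1, x2):
    """Linear predictor L = -4 + 2*x1 - x2 (assumed from the table values)."""
    return -4 + 2 * x1 - x2


def prob(x1, x2):
    """Logistic probability e^L / (1 + e^L), written as 1 / (1 + e^(-L))."""
    return 1.0 / (1.0 + math.exp(-logit(x1, x2)))


event = (3, 1)      # observed event (Y = 1)
nonevent = (5, 3)   # observed non-event (Y = 0)

l_event = logit(*event)
l_nonevent = logit(*nonevent)

# A pair is concordant when the event has the higher predicted probability.
concordant = prob(*event) > prob(*nonevent)

print(l_event, l_nonevent, concordant)
```

Because the probability is increasing in L, comparing `l_event` with `l_nonevent` gives the same verdict as comparing the two probabilities.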