Programming: R or Python

A marketing research firm was engaged by an automobile manufacturer to conduct a pilot study examining the feasibility of using logistic regression to ascertain the likelihood that a family will purchase a new car during the next year. A random sample of 33 suburban families was selected. Data on annual family income (X1, in thousand dollars) and the current age of the oldest family automobile (X2, in years) were obtained. A follow-up interview conducted 12 months later was used to determine whether the family actually purchased a new car (Y = 1) or did not purchase a new car (Y = 0) during the year. You can find all the data in the CarPurchaseData.txt attachment.

  i     Xi1   Xi2   Yi
  1     32    3     0
  2     45    2     0
  3     60    2     1
  ...   ...   ...   ...
  31    21    3     0
  32    32    5     1
  33    17    1     0

A multiple logistic regression model with two predictor variables in first-order terms is assumed to be appropriate.

a. Find the maximum likelihood estimates of β0, β1, and β2. State the fitted response function.

b. Obtain exp(b1) and exp(b2) and interpret these numbers.

c. What is the estimated probability that a family with an annual income of $50 thousand and an oldest car of 3 years will purchase a new car next year?
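Once the coefficients are estimated, parts b and c are arithmetic on the fitted response function, estimated P(Y = 1) = 1 / (1 + exp(-(b0 + b1 X1 + b2 X2))). The following Python sketch shows that arithmetic only. The function names are illustrative, and the coefficient values are hypothetical placeholders chosen to make the example run; they are not estimates from CarPurchaseData.txt.

```python
import math


def odds_ratios(b):
    """Return (exp(b1), exp(b2)).

    Each value is the multiplicative change in the estimated odds of a
    purchase for a one-unit increase in that predictor, holding the
    other predictor fixed.
    """
    _b0, b1, b2 = b
    return math.exp(b1), math.exp(b2)


def purchase_probability(b, income, car_age):
    """Fitted response function: estimated P(Y = 1) at the given X1 and X2."""
    b0, b1, b2 = b
    logit = b0 + b1 * income + b2 * car_age
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical placeholder coefficients, NOT the fitted values for the car data.
b_placeholder = (-2.0, 0.04, 0.3)

or_income, or_age = odds_ratios(b_placeholder)
# Part c asks for income = 50 (thousand dollars) and oldest car age = 3 years.
p_hat = purchase_probability(b_placeholder, income=50, car_age=3)
```

Replacing `b_placeholder` with the maximum likelihood estimates from part a gives the quantities asked for in parts b and c.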

Knight & Skagen (1988) collected the data shown in the table during a field study on the foraging behavior of wintering Bald Eagles in Washington State, USA. The data concern 160 attempts by one (pirating) Bald Eagle to steal a chum salmon from another (feeding) Bald Eagle. The abbreviations used are: L = large, S = small, A = adult, I = immature.

  Successful attempts   Total attempts   Size of pirating eagle   Age of pirating eagle   Size of feeding eagle
  17                    24               L                        A                       L
  29                    29               L                        A                       S
  17                    27               L                        I                       L
  20                    20               L                        I                       S
  1                     12               S                        A                       L
  15                    16               S                        A                       S
  0                     28               S                        I                       L
  1                     4                S                        I                       S

Report on the factors that explain the success of the pirating attempt and give a prediction formula for the probability of success.
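One possible starting point for this analysis is a binomial logistic regression with main effects only, sketched below in plain Python. Several things here are assumptions rather than part of the problem: the first eight counts are read as successes and the next eight as totals (the only reading in which successes never exceed attempts and the totals sum to 160), the 0/1 coding of the factors is arbitrary, interactions are left out, and all function names are illustrative. The fit uses Newton-Raphson with step halving; a library routine such as R's `glm` with `family = binomial` would normally replace it.

```python
import math

# (successes, attempts, pirate size, pirate age, feeder size), as read from the table.
EAGLES = [
    (17, 24, "L", "A", "L"), (29, 29, "L", "A", "S"),
    (17, 27, "L", "I", "L"), (20, 20, "L", "I", "S"),
    (1, 12, "S", "A", "L"), (15, 16, "S", "A", "S"),
    (0, 28, "S", "I", "L"), (1, 4, "S", "I", "S"),
]


def design_row(pirate_size, pirate_age, feeder_size):
    """Intercept plus 0/1 indicators: large pirate, adult pirate, large feeder."""
    return [1.0, float(pirate_size == "L"), float(pirate_age == "A"),
            float(feeder_size == "L")]


def linear_predictor(beta, x):
    return sum(b * v for b, v in zip(beta, x))


def sigmoid(z):
    """Numerically stable inverse logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log1pexp(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def log_likelihood(beta, rows, y, n):
    """Binomial log-likelihood, up to an additive constant."""
    total = 0.0
    for x, yi, ni in zip(rows, y, n):
        eta = linear_predictor(beta, x)
        total -= yi * log1pexp(-eta) + (ni - yi) * log1pexp(eta)
    return total


def score_and_information(beta, rows, y, n):
    """Gradient of the log-likelihood and the Fisher information matrix."""
    k = len(beta)
    score = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for x, yi, ni in zip(rows, y, n):
        p = sigmoid(linear_predictor(beta, x))
        w = ni * p * (1.0 - p)
        for i in range(k):
            score[i] += x[i] * (yi - ni * p)
            for j in range(k):
                info[i][j] += w * x[i] * x[j]
    return score, info


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_logistic(rows, y, n, tol=1e-8, max_iter=50):
    """Maximum likelihood fit by Newton-Raphson with step halving."""
    beta = [0.0] * len(rows[0])
    for _ in range(max_iter):
        score, info = score_and_information(beta, rows, y, n)
        if max(abs(g) for g in score) < tol:
            break
        step = solve(info, score)
        current = log_likelihood(beta, rows, y, n)
        t = 1.0
        while t > 1e-8:
            trial = [b + t * s for b, s in zip(beta, step)]
            # Small slack so rounding error does not reject a good step.
            if log_likelihood(trial, rows, y, n) >= current - 1e-10:
                break
            t /= 2.0
        beta = trial
    return beta


ROWS = [design_row(p, a, v) for _, _, p, a, v in EAGLES]
Y = [row[0] for row in EAGLES]
N = [row[1] for row in EAGLES]
beta_hat = fit_logistic(ROWS, Y, N)  # [intercept, large pirate, adult pirate, large feeder]


def success_probability(pirate_size, pirate_age, feeder_size, beta=beta_hat):
    """Prediction formula: inverse logit of the fitted linear predictor."""
    return sigmoid(linear_predictor(beta, design_row(pirate_size, pirate_age, feeder_size)))
```

For example, `success_probability("L", "A", "S")` evaluates the prediction formula for a large adult pirate attacking a small feeding eagle. Whether interaction terms improve on this main-effects sketch is part of what the report should examine.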

The following data are part of a survey by Dr Mutch of low-weight births in Scotland between 1981 and 1988. The table refers to 661 children with birth weights between 650g and 1749g, all of whom survived for at least one year. The variables of interest are:

Cardiac: mild heart problems of the mother during pregnancy;
Comps: gynaecological problems during pregnancy;
Smoking: mother smoked at least one cigarette per day during the first 6 months of pregnancy;
BW: was the birth weight less than 1250g?

Each variable takes the values Yes or No. The 16 cell counts of the Cardiac x Comps x Smoking x BW table, in the order in which they appear, are:

10 7 25 5 12 22 15 19 18 10 12 42 45 12 202 205

Analyse this table.

Refer to the car purchase data (the 33 suburban families) above.

a. To assess the appropriateness of the logistic regression function, form three groups of 11 cases each according to their fitted logit values π̂'. Plot the estimated proportions pj against the midpoints of the π̂' intervals. Is the plot consistent with a response function of monotonic sigmoidal shape? Explain.

b. Obtain the deviance residuals and present them in an index plot. Do there appear to be any outlying cases?

c. Construct a half-normal probability plot of the absolute deviance residuals. Do any cases here appear to be outlying?
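The quantities behind these three diagnostics can be computed from the 0/1 responses and the fitted values alone. The Python sketch below is one way to do it under stated assumptions: the function names are illustrative, the "midpoint" of a group is taken as halfway between its smallest and largest fitted logit, and the half-normal quantiles use one common approximation, z((k + n - 1/8) / (2n + 1/2)), which the problem itself does not specify. Plotting is left to the reader.

```python
import math
from statistics import NormalDist


def deviance_residuals(y, p_hat):
    """Signed deviance residuals for 0/1 responses and fitted probabilities."""
    residuals = []
    for yi, pi in zip(y, p_hat):
        # For a binary response only one of the two log terms is non-zero.
        loglik_i = math.log(pi) if yi == 1 else math.log(1.0 - pi)
        sign = 1.0 if yi == 1 else -1.0
        residuals.append(sign * math.sqrt(-2.0 * loglik_i))
    return residuals


def grouped_proportions(fitted_logits, y, n_groups=3):
    """Sort cases by fitted logit, split them into equal-size groups, and
    return (midpoint of the group's logit range, proportion with Y = 1)."""
    cases = sorted(zip(fitted_logits, y))
    size = len(cases) // n_groups
    summary = []
    for g in range(n_groups):
        # The last group absorbs any remainder when n is not divisible.
        end = (g + 1) * size if g < n_groups - 1 else len(cases)
        group = cases[g * size:end]
        logits = [logit for logit, _ in group]
        midpoint = (min(logits) + max(logits)) / 2.0
        proportion = sum(resp for _, resp in group) / len(group)
        summary.append((midpoint, proportion))
    return summary


def half_normal_points(residuals):
    """Pairs (approximate expected half-normal quantile, ordered |residual|)."""
    n = len(residuals)
    ordered = sorted(abs(r) for r in residuals)
    std_normal = NormalDist()
    return [
        (std_normal.inv_cdf((k + n - 0.125) / (2 * n + 0.5)), r)
        for k, r in enumerate(ordered, start=1)
    ]
```

With the 33 fitted logits and responses from the car purchase model, `grouped_proportions(fitted_logits, y)` returns the three points for part a, an index plot of `deviance_residuals(y, p_hat)` answers part b, and the pairs from `half_normal_points` give the coordinates for part c.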