Bootstrapping – worked solutions

Before you start:
1. Insert your name into this document’s header
2. Save this file in your drive with your name in the file name
3. Now follow the instructions below, and fill in the gaps where appropriate.

Computer package:
Open iNZight: subjects/maths/MAS301/Randomisation DA/iNZight
Select VIT module – Bootstrap confidence interval construction
Import appropriate data
Drag data headers down to bottom of sheet
Click on Analyse tab at top
Select MEDIAN or MEAN, then “Record my choice”
Go play….
iNZightVIT guide is saved under subjects/maths/MAS301/Randomisation DA if needed

Task 1 – Auckland Zoo

A manager at the Auckland Zoo was interested in the median attendance at the zoo during summer weekend days and also whether it is affected by how nice the weather is on the day. The number of visitors to the Auckland Zoo on a random sample of weekend days in the months of December, January, February and March (summer months) and whether or not the day was a nice day (based on the number of sunshine hours on the day) was recorded.

Import the data file “Zoo Data.csv”
The data contains two variables
Attendance: The number of visitors to the zoo on the day
NiceDay: whether or not the day was classified as nice (Yes or No)

(a) (i) Generate a bootstrap confidence interval for the median daily attendance at the zoo on summer weekends.
Copy the output (final graphs) into the document here

(ii) What is the parameter we are estimating using this bootstrap confidence interval?
The parameter we are interested in estimating is the median number of visitors to the zoo for all summer weekend days

(iii) Do we know the value of this parameter?
NO – we only have sample information we can use to estimate it.

(iv) Interpret the bootstrap confidence interval.
We’re pretty sure (it’s a fairly safe bet) that the median number of visitors to the zoo on summer weekend days is somewhere between 1096 and 1296 people.

(v) Briefly explain why students in this class will not all get the same values for their bootstrap confidence interval.
Bootstrap confidence intervals are generated by randomly re-sampling from the initial sample. Since this is a random process, there will be random variation between different bootstrap confidence intervals.

Task 1 – Auckland Zoo (continued)

(b) (i) Generate a bootstrap confidence interval for the difference in the median daily attendance at the zoo on summer weekends between days that are classified as nice and days that are classified as not nice.
Copy the output (final graphs) into the document here

(ii) What is the parameter we are estimating using this bootstrap confidence interval?
The parameter we are interested in estimating is the difference in the median daily attendance at the zoo on summer weekends between days that are classified as nice and days that are classified as not nice.

(iii) Interpret the bootstrap confidence interval.
We’re pretty sure (it’s a fairly safe bet) that the median number of visitors to the zoo on summer weekend days that are classified as nice is somewhere between 302 lower and 274 higher than the median number of visitors to the zoo on summer weekend days that are classified as not nice.

(iv) Based on the bootstrap confidence interval, is it believable that the median attendance at the zoo on summer weekends is the same on days that are classified as nice as for days that are classified as not nice? Briefly justify your answer.
It is believable that the median attendance at the zoo on summer weekends is the same on days that are classified as nice as on days that are classified as not nice. This is because 0 is in the bootstrap confidence interval, so a difference of 0 (i.e. no difference) is believable.

Task 2 – Fastest speed

The data for this task records the fastest speed each student has driven at, which is one of the variables in the full Auckland University Statistics Students Survey.
This data set also includes information on Gender and Ethnicity.

PART A – summary question

Question: I wonder what the median fastest speed driven at by male Stage 1 Statistics students at Auckland University is?

Import the data file “fastest speed males.csv”
Use iNZight to answer this question
Make sure you include:
1) A bootstrapped confidence interval for the median, and a screenshot of your graphs
2) An interpretation of the confidence interval
From our bootstrap confidence interval, we are pretty sure (it is a fairly safe bet) that the median fastest speed driven at by male Stage 1 Statistics students at Auckland University is between 100 km/h and 120 km/h.
3) An answer to the question above, with justification
See the interpretation in 2) above.

Task 2 – Fastest speed

PART B – does our method work?

Import the data file “fastest speed POPmales.csv”
Use iNZight and the Confidence interval coverage module to complete the statement below:
From the software output, approximately 99.7% of bootstrap confidence intervals contain the population median.
Put a screenshot of your graphs to support your answer.

Task 2 – Fastest speed

PART C – comparison question

Question: I wonder what the difference in median fastest speed driven at is between male and female first-year statistics students at Auckland University?

Import the data file “fastest speed.csv”
Use iNZight to answer this question
Make sure you include:
1) A bootstrapped confidence interval for the difference in medians, and a screenshot of your graphs
2) An interpretation of the confidence interval
We’re pretty sure (it’s a fairly safe bet) that the median fastest speed driven at by male first-year statistics students at Auckland University is somewhere between 10 km/h lower and 35 km/h higher than the median fastest speed driven at by female first-year statistics students at Auckland University.
3) An answer to the question above, with justification
Based on these samples, I would not make the call that there is a difference in median fastest speed driven at between male and female first-year statistics students at Auckland University. That is, I would not make the call that male first-year statistics students at Auckland University tend to have a higher fastest speed driven at than female first-year statistics students in the whole population. The bootstrap confidence interval for the difference between the population median fastest speed driven at by male first-year statistics students and the population median fastest speed driven at by female first-year statistics students contains zero, so a difference of zero (i.e. no difference between the two population medians) is believable. This does not show that the medians are equal, only that the samples do not rule it out.
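
Optional extra: what the software is doing

The intervals in Tasks 1 and 2 come from iNZight's VIT module. The Python sketch below is a rough illustration of the idea, not a copy of how VIT works. It assumes a percentile-style interval (the middle 95% of the re-sample medians), 1000 re-samples, and, for the comparison, re-sampling separately within each group. The function names and the attendance numbers are made up for illustration and are not the values in “Zoo Data.csv”.

```python
import random
import statistics


def percentile_interval(values, level=0.95):
    """Return the ends of the central `level` share of the values."""
    ordered = sorted(values)
    tail = (1 - level) / 2
    low_index = int(tail * (len(ordered) - 1))
    high_index = int(round((1 - tail) * (len(ordered) - 1)))
    return ordered[low_index], ordered[high_index]


def bootstrap_median_ci(sample, resamples=1000, rng=None):
    """Bootstrap interval for one median (Task 1 (a), Task 2 Part A)."""
    if rng is None:
        rng = random.Random()
    medians = []
    for _ in range(resamples):
        # Re-sample WITH replacement, same size as the original sample.
        resample = rng.choices(sample, k=len(sample))
        medians.append(statistics.median(resample))
    return percentile_interval(medians)


def bootstrap_median_diff_ci(group_a, group_b, resamples=1000, rng=None):
    """Bootstrap interval for median(A) - median(B) (Task 1 (b), Part C)."""
    if rng is None:
        rng = random.Random()
    differences = []
    for _ in range(resamples):
        # Assumption: each group is re-sampled on its own.
        median_a = statistics.median(rng.choices(group_a, k=len(group_a)))
        median_b = statistics.median(rng.choices(group_b, k=len(group_b)))
        differences.append(median_a - median_b)
    return percentile_interval(differences)


if __name__ == "__main__":
    # Made-up attendance figures, for illustration only.
    nice = [1310, 980, 1420, 1150, 1205, 1600, 890, 1340, 1275, 1100]
    not_nice = [1190, 1020, 1480, 950, 1230, 1360, 1075, 1290]

    print("Median attendance:", bootstrap_median_ci(nice + not_nice))
    print("Nice minus not nice:", bootstrap_median_diff_ci(nice, not_nice))
```

Running the script twice without fixing a seed gives slightly different intervals each time, which is the random variation described in Task 1 (a) (v). If the interval for the difference has a negative lower end and a positive upper end, zero is inside it, and "no difference" stays believable.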