Bootstrapping worksheet 1 solutions - Dr-Ds-wiki

Bootstrapping – worked solutions
Before you start:
1. Insert your name into this document’s header
2. Save this file in your drive with your name in the file name
3. Now follow the instructions below, and fill in the gaps where appropriate.
Computer package:
1. Open iNZight: subjects/maths/MAS301/Randomisation DA/iNZight
2. Select VIT module – Bootstrap confidence interval construction
3. Import appropriate data
4. Drag data headers down to bottom of sheet
5. Click on Analyse tab at top
6. Select MEDIAN or MEAN, then “Record my choice”
7. Go play…
The iNZightVIT guide is saved under subjects/maths/MAS301/Randomisation DA if needed.
Task 1 – Auckland Zoo
A manager at the Auckland Zoo was interested in the median attendance at the zoo
during summer weekend days and also whether it is affected by how nice the weather is
on the day. The number of visitors to the Auckland Zoo on a random sample of weekend
days in the months of December, January, February and March (summer months) and
whether or not the day was a nice day (based on the number of sunshine hours on the
day) were recorded.
Import the data file “Zoo Data.csv”
The data contains two variables:
• Attendance: the number of visitors to the zoo on the day
• NiceDay: whether or not the day was classified as nice (Yes or No)
(a) (i) Generate a bootstrap confidence interval for the median daily attendance at
the zoo on summer weekends. Copy the output (final graphs) into the
document here.
(ii) What is the parameter we are estimating using this bootstrap confidence
interval?
The parameter we are interested in estimating is the median number of visitors
to the zoo for all summer weekend days.
(iii) Do we know the value of this parameter?
No. We only have sample information, which we can use to estimate it.
(iv) Interpret the bootstrap confidence interval.
We’re pretty sure (it’s a fairly safe bet) that the median number of visitors
to the zoo on summer weekend days is somewhere between 1096 and 1296 people.
(v) Briefly explain why students in this class will not all get the same values for their
bootstrap confidence interval.
Bootstrap confidence intervals are generated by randomly re-sampling, with
replacement, from the initial sample. Since this is a random process, each run
produces a different set of re-samples, so there will be random variation
between different bootstrap confidence intervals.
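The explanation above can be illustrated with a minimal Python sketch. It assumes a percentile-style interval (the central 95% of 1000 re-sampled medians), which may not match exactly what iNZight VIT does internally. The function name, the seeds and the attendance figures are hypothetical and are not taken from “Zoo Data.csv”.

```python
import random
import statistics

def bootstrap_median_ci(sample, n_resamples=1000, level=0.95, seed=None):
    """Percentile bootstrap confidence interval for the median (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(sample)
    # Re-sample with replacement, same size as the original sample,
    # and record the median of each re-sample.
    medians = sorted(
        statistics.median(rng.choices(sample, k=n))
        for _ in range(n_resamples)
    )
    # Keep the central `level` share of the re-sampled medians.
    tail = (1 - level) / 2
    lower = medians[round(tail * n_resamples)]
    upper = medians[round((1 - tail) * n_resamples) - 1]
    return lower, upper

# Hypothetical attendance figures, NOT the values in "Zoo Data.csv".
attendance = [812, 950, 1033, 1100, 1150, 1187, 1204, 1260, 1315, 1402, 1498, 1650]

# Two "students" with different random seeds.
ci_a = bootstrap_median_ci(attendance, seed=1)
ci_b = bootstrap_median_ci(attendance, seed=2)
# The same seed reproduces the same re-samples, hence the same interval.
ci_a_again = bootstrap_median_ci(attendance, seed=1)

print("Student A:", ci_a)
print("Student B:", ci_b)
```

Different seeds stand in for different students: each gets a different set of re-samples, so the interval endpoints will usually differ slightly, while re-using a seed reproduces the same interval.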
Task 1 – Auckland Zoo - continued
(b) (i) Generate a bootstrap confidence interval for the difference in the median daily
attendance at the zoo on summer weekends between days that are classified
as nice and days that are classified as not nice. Copy the output (final graphs)
into the document here.

(ii) What is the parameter we are estimating using this bootstrap confidence
interval?
The parameter we are interested in estimating is the difference in the median
daily attendance at the zoo on summer weekends between days that are classified
as nice and days that are classified as not nice.
(iii) Interpret the bootstrap confidence interval.
We’re pretty sure (it’s a fairly safe bet) that the median number of visitors
to the zoo on summer weekend days that are classified as nice is somewhere
between 302 visitors lower and 274 visitors higher than the median number of
visitors to the zoo on summer weekend days that are classified as not nice.
(iv) Based on the bootstrap confidence interval, is it believable that the median
attendance at the zoo on summer weekends is the same on days that are
classified as nice as for days that are classified as not nice? Briefly justify your
answer.

Yes. It is believable that the median attendance at the zoo on summer weekends
is the same on days that are classified as nice as for days that are classified
as not nice. This is because 0 is in the bootstrap confidence interval, so a
difference of 0 (i.e. no difference) is believable.
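The reasoning above (check whether 0 lies inside the interval for the difference) can be sketched in Python. This is an illustration under stated assumptions: it re-samples each group separately and uses a percentile interval, which may differ from the way iNZight VIT re-samples. The function name and both data lists are hypothetical and are not the worksheet’s data.

```python
import random
import statistics

def bootstrap_median_diff_ci(group_a, group_b, n_resamples=1000, level=0.95, seed=None):
    """Percentile bootstrap CI for median(group_a) - median(group_b) (illustrative sketch)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Re-sample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.median(resample_a) - statistics.median(resample_b))
    diffs.sort()
    # Keep the central `level` share of the re-sampled differences.
    tail = (1 - level) / 2
    lower = diffs[round(tail * n_resamples)]
    upper = diffs[round((1 - tail) * n_resamples) - 1]
    return lower, upper

# Hypothetical attendance figures, NOT the values in "Zoo Data.csv".
nice = [1050, 1120, 1190, 1230, 1275, 1340, 1410, 1520]
not_nice = [880, 990, 1105, 1160, 1215, 1290, 1365]

lower, upper = bootstrap_median_diff_ci(nice, not_nice, seed=1)

# If 0 lies inside the interval, "no difference" is a believable value.
zero_is_believable = lower <= 0 <= upper
print(f"Interval for (nice - not nice): {lower} to {upper}")
print("Is a difference of 0 believable?", zero_is_believable)
```

An interval that includes 0 does not show that the medians are equal; it only means the data cannot rule out equality.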
Task 2 – Fastest speed
The data for this task records the fastest speed a student has driven at,
which is one of the variables in the full Auckland University Statistics
Students Survey. The data set also includes information on Gender and
Ethnicity.
PART A – summary question
Question: I wonder what the median fastest speed driven at is for
male Stage 1 Statistics students at Auckland University?
Import the data file “fastest speed males.csv”
Use iNZight to answer this question
Make sure you include:
1) A bootstrapped confidence interval for the median, and screen shot of
your graphs
2) An interpretation of the confidence interval
From our bootstrap confidence interval, we are pretty sure (it is a fairly safe
bet) that the median fastest speed driven at for male Stage 1 Statistics
students at Auckland University is between 100 km/h and 120 km/h.
3) An answer to the question above, with justification
See above!
Task 2 – Fastest speed
PART B – does our method work?
Import the data file “fastest speed POPmales.csv”
Use iNZight and the Confidence interval coverage module to complete the
statement below:
From the software output, approximately 99.7% of bootstrap
confidence intervals contain the population median.
Put a screen shot of your graphs to support your answer.
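The idea behind the coverage check can be sketched as a small simulation: draw many samples from a known population, build a bootstrap confidence interval from each, and count how often the interval contains the population median. This is only a sketch. The population below is simulated and hypothetical (it is not “fastest speed POPmales.csv”), and the sample size, number of samples and number of re-samples are assumptions chosen to keep it quick.

```python
import random
import statistics

def percentile_ci_median(sample, rng, n_resamples=200, level=0.95):
    """Percentile bootstrap CI for the median, using the supplied random generator."""
    n = len(sample)
    medians = sorted(
        statistics.median(rng.choices(sample, k=n))
        for _ in range(n_resamples)
    )
    tail = (1 - level) / 2
    return medians[round(tail * n_resamples)], medians[round((1 - tail) * n_resamples) - 1]

def coverage(population, sample_size=30, n_samples=300, seed=0):
    """Share of bootstrap intervals that contain the population median."""
    rng = random.Random(seed)
    true_median = statistics.median(population)
    hits = 0
    for _ in range(n_samples):
        # Take a random sample (without replacement) from the population.
        sample = rng.sample(population, sample_size)
        lower, upper = percentile_ci_median(sample, rng)
        if lower <= true_median <= upper:
            hits += 1
    return hits / n_samples

# Hypothetical population of fastest speeds in km/h (simulated, not real survey data).
pop_rng = random.Random(42)
population = [round(pop_rng.gauss(120, 30)) for _ in range(2000)]

rate = coverage(population)
print(f"{rate:.1%} of bootstrap confidence intervals contain the population median")
```

The percentage printed depends on the population, the sample size and the seed, so it should not be expected to match the figure reported by the software.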
Task 2 – Fastest speed
PART C – comparison question
Question: I wonder what the difference in median fastest speed driven at is
between male and female first-year statistics students at Auckland University?
Import the data file “fastest speed.csv”
Use iNZight to answer this question
Make sure you include:
1) A bootstrapped confidence interval for the difference in medians, and
screen shot of your graphs
2) An interpretation of the confidence interval
We’re pretty sure (it’s a fairly safe bet) that the median fastest speed driven
at for male first-year statistics students at Auckland University is somewhere
between 10 km/h lower and 35 km/h higher than the median fastest speed driven
at for female first-year statistics students at Auckland University.
3) An answer to the question above, with justification
Based on these samples, I would not make the call that there is a difference in
median fastest speed driven at between male and female first-year statistics
students at Auckland University. That is, I would not make the call that male
first-year statistics students at Auckland University tend to have a faster
fastest speed driven at than female students in the whole population.
The bootstrap confidence interval for the difference between the population
median fastest speed driven at of male first-year statistics students and that
of female first-year statistics students contains zero, so a difference of zero
(no difference between the two genders) is believable.