Homework 4.


Stat 410/510

Lab Week 4

Due: Friday, February 21.

(1) Create a short R function that will calculate a confidence interval for a population mean. The input will be a vector of data and the desired confidence level, given as a proportion (for example, 0.95 for a 95% interval). Below is the start of the function and some test output. Comment your code.

CI <- function(x,pct=0.95)
{
  YOUR CODE HERE

  return( A VECTOR OF LENGTH 2 )
}

## Test the function

# Make practice data
set.seed(410510)
x <- rnorm(50,mean=20,sd=4)

## First test
CI(x)
t.test(x)$conf.int

## Second test
CI(x,pct=.90)
t.test(x,conf.level=.90)$conf.int
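For reference, the interval reported by t.test() is the t-based interval: the sample mean plus or minus a t critical value (n - 1 degrees of freedom) times the estimated standard error sd(x)/sqrt(n). The following is a minimal sketch of that arithmetic on a small made-up vector. The names `y`, `level`, `t_crit`, `half_width`, and `manual` are illustrative assumptions and not part of the assignment, and this shows one possible way to do the computation rather than the required solution.

```r
# Hypothetical example data (not the assignment's practice data)
y <- c(18.2, 21.5, 19.9, 23.1, 20.4, 17.8, 22.6, 20.0)
level <- 0.95                      # confidence level as a proportion

n <- length(y)
# Critical value from the t distribution with n - 1 degrees of freedom
t_crit <- qt(1 - (1 - level) / 2, df = n - 1)
# Half-width: critical value times the estimated standard error of the mean
half_width <- t_crit * sd(y) / sqrt(n)
manual <- c(mean(y) - half_width, mean(y) + half_width)

manual
t.test(y, conf.level = level)$conf.int   # should agree with `manual`
```

Comparing a hand computation against t.test() in this way is the same check the two tests above perform on CI().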

(2) For this problem you will create a function that demonstrates the Central Limit Theorem. The input will be a large vector which represents the population. The output will be four graphs on one page. The first graph will be a histogram of the original population data. The second graph will be a qq-plot of the population data to demonstrate how close it is to a normal distribution. The third graph will be a histogram of the N sample means, each calculated from a sample of size n from the population. The fourth graph will be a qq-plot of the sample means. For the histograms, use breaks=30. Below I provide you some code to start with.

a. Show your commented code.

b. Demonstrate your code works by showing the output from the following code:

CLT(population=runif(1000),n=100,N=500)

c. Use the provided test code to test your function. What happens to the standard deviation of the sample means as n increases? What happens to the shape of the distribution of sample means?

d. What does the distribution of the sample means look like when n=1?


# Make practice population and see it’s not normal
set.seed(410510)
x1 <- rchisq(5000,df=6)+rnorm(5000,mean=60,sd=1)
x2 <- rchisq(10000,df=6)+rnorm(10000,mean=30,sd=2)
y1 <- rnorm(10000,mean=45,sd=12)
z <- sample(x=c(x1,x2,y1),size=90000,replace=TRUE)
hist(z, breaks=30)
qqnorm(z)
qqline(z)

# Beginning of function

CLT <- function(population,n=30,N=1000)
{
  dev.new()
  par(mfrow=c(2,2))
  hist(population, freq=F, breaks=30, main="Population",
       sub=paste("mean=",signif(mean(population),4),
                 " sd=",signif(sd(population),4)) )
  qqnorm(population)
  qqline(population)
  xbar <- rep(NA,N)

  Your code goes here
}

CLT(population=z,n=1,N=500)

CLT(population=z,n=4,N=500)

CLT(population=z,n=16,N=500)

CLT(population=z,n=64,N=500)

CLT(population=z,n=256,N=500)

CLT(population=z,n=1024,N=500)
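When interpreting the output of these calls, it may help to know what theory predicts: for samples drawn with replacement, the standard deviation of the sample means is approximately sd(population)/sqrt(n). The sketch below checks this numerically without any plotting. The population `pop`, the helper `sample_means`, and the use of replicate() are assumptions made for illustration; they are not the required structure of CLT().

```r
set.seed(1)
pop <- rexp(10000, rate = 1)       # hypothetical skewed population

# Draw N samples of size n (with replacement) and return their N means
sample_means <- function(population, n, N) {
  replicate(N, mean(sample(population, size = n, replace = TRUE)))
}

# Compare the observed sd of the sample means with sd(pop)/sqrt(n)
for (n in c(4, 16, 64)) {
  xbar <- sample_means(pop, n = n, N = 2000)
  cat("n =", n,
      " sd of sample means =", signif(sd(xbar), 3),
      " sd(pop)/sqrt(n) =", signif(sd(pop) / sqrt(n), 3), "\n")
}
```

The two printed columns should be close for each n, with small differences due to sampling variability.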

