CSSS 508: Intro to R
1/27/06
Homework 3 Solutions
Rebecca Nugent, Department of Statistics, U. of Washington

1) Using the distribution commands:

a) What is P(X <= 1) for a normal with mean 2 and st dev 1?

> pnorm(1,2,1)
[1] 0.1586553

P(X > 2.5)?

> 1-pnorm(2.5,2,1)
[1] 0.3085375

b) Test grades in an introductory statistics class are distributed normally with a mean of 65 and a st dev of 10. What is the probability of getting a grade between 60 and 80?

P(60 <= X <= 80) = P(X <= 80) - P(X <= 60)

(Remember that in a continuous distribution, there is no probability for a single value, i.e. P(X <= 80) = P(X < 80).)

> pnorm(80,65,10)-pnorm(60,65,10)
[1] 0.6246553

c) Same scenario. What grade do you have to get to be in the 90th percentile?

> qnorm(.90,65,10)
[1] 77.81552

d) Generate 15 values from a Poisson distribution with rate 10.

> rpois(15, 10)

(The second argument of rpois is the mean, lambda, so a rate of 10 is passed as 10, not 1/10. The 15 values are random and change with every run.)

e) Flip 5 fair coins and count the number of heads. Repeat 49 more times. In what percentage of the 50 trials did you get 3 or more heads?

> n.heads<-rbinom(50,5,.5)
> n.heads
[1] 2 1 2 5 1 3 1 3 2 2 0 1 3 3 1 2 1 3 4 3 4 0 3 3 2 3 2 2 2 2 2 4 3 4 1 4 2 2 1 2 3 3 2 2 4 1 3 4 4 3

Each number is the number of heads from flipping 5 coins once. We then count how many are >= 3 and divide by the total number of trials.

> sum(n.heads>=3)/50
[1] 0.46

So 46% of the 50 trials had 3 or more heads.

f) In the t-distribution, what is P(X <= -1):

If df = 2?

> pt(-1,2)
[1] 0.2113249

If df = 5?

> pt(-1,5)
[1] 0.1816087

If df = 10?

> pt(-1,10)
[1] 0.1704466

Note that the higher the degrees of freedom, the smaller this probability gets. Fewer degrees of freedom mean heavier tails (more probability in the tails). The higher the degrees of freedom, the closer we get to a standard normal distribution.

> pnorm(-1,0,1)
[1] 0.1586553

2) Build a 4 by 4 matrix using a for loop or a while loop where each column is a random sample from the numbers 1 through 20.
Using a for loop:

> m.2<-matrix(0,4,4)   # Initializing the matrix
> for(i in 1:4){
+ m.2[,i]<-sample(seq(1,20),4)
+ }
> m.2
     [,1] [,2] [,3] [,4]
[1,]    5   18    9   15
[2,]    1    7    4    5
[3,]   18   15    8    9
[4,]   17   10    6   12

Using a while loop:

> m.2<-matrix(0,4,4)   # Initializing the matrix
> j<-1
> while(j<=4){
+ m.2[,j]<-sample(seq(1,20),4)
+ j<-j+1
+ }
> m.2
     [,1] [,2] [,3] [,4]
[1,]    3   10   20    8
[2,]   17    4    7    2
[3,]    4    5    6   10
[4,]   13   19    3    5

3) Write a for loop that prints out the sum of the consecutive integers from 1 to i.

To find the sum of consecutive integers, we can use the sum function on a sequence of numbers: sum(seq(1,i)).

> for(i in 1:5){
+ cat("The sum of 1 to",i,"=",sum(seq(1,i)),"\n")
+ }
The sum of 1 to 1 = 1
The sum of 1 to 2 = 3
The sum of 1 to 3 = 6
The sum of 1 to 4 = 10
The sum of 1 to 5 = 15

4) Generate 2 vectors of random uniform data from [1,3], 100 observations each. You now have 100 pairs of two-dimensional data on the square [1,3] by [1,3]. Think of it as a box on a graph where x ranges from 1 to 3 and y ranges from 1 to 3. You could divide this box into four smaller squares by dividing each dimension in half. Create a categorical variable that indicates in which quadrant each observation lies.

First, creating the data:

> box.data<-cbind(runif(100,1,3),runif(100,1,3))

Initializing the categorical variable:

> box.cat<-rep(0,100)

> for(i in 1:100){
+ if(box.data[i,1]<=2 & box.data[i,2]<=2) box.cat[i]<-1
+ if(box.data[i,1]<=2 & box.data[i,2]> 2) box.cat[i]<-2
+ if(box.data[i,1]> 2 & box.data[i,2]<=2) box.cat[i]<-3
+ if(box.data[i,1]> 2 & box.data[i,2]> 2) box.cat[i]<-4
+ }

We can look at how many points we have in each category/smaller square.

> table(box.cat)
box.cat
 1  2  3  4
22 24 26 28

We could do this with conditional statements as well, without a loop.

> box.cat[box.data[,1]<=2 & box.data[,2]<=2]<-1
> box.cat[box.data[,1]<=2 & box.data[,2]> 2]<-2
> box.cat[box.data[,1]> 2 & box.data[,2]<=2]<-3
> box.cat[box.data[,1]> 2 & box.data[,2]> 2]<-4
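
One more option for problem 4, offered as a sketch and not as part of the original solutions: the two comparisons are TRUE/FALSE vectors, and R treats TRUE as 1 and FALSE as 0 in arithmetic, so the quadrant label can be computed in a single line. The names box.cat.loop and box.cat.arith are made up for this example, and set.seed (with an arbitrary seed) is only there so the run can be repeated.

```r
# Sketch: quadrant labels by arithmetic on logical vectors.
# box.cat.loop and box.cat.arith are illustrative names, not from the homework.
set.seed(508)  # arbitrary seed, used only for repeatability
box.data <- cbind(runif(100, 1, 3), runif(100, 1, 3))

# Loop version, same logic as the for loop in the solution
box.cat.loop <- rep(0, 100)
for (i in 1:100) {
  if (box.data[i, 1] <= 2 & box.data[i, 2] <= 2) box.cat.loop[i] <- 1
  if (box.data[i, 1] <= 2 & box.data[i, 2] >  2) box.cat.loop[i] <- 2
  if (box.data[i, 1] >  2 & box.data[i, 2] <= 2) box.cat.loop[i] <- 3
  if (box.data[i, 1] >  2 & box.data[i, 2] >  2) box.cat.loop[i] <- 4
}

# Arithmetic version: start at 1, add 1 if y > 2, add 2 if x > 2.
#   x <= 2, y <= 2  ->  1 + 0 + 0 = 1
#   x <= 2, y >  2  ->  1 + 1 + 0 = 2
#   x >  2, y <= 2  ->  1 + 0 + 2 = 3
#   x >  2, y >  2  ->  1 + 1 + 2 = 4
box.cat.arith <- 1 + (box.data[, 2] > 2) + 2 * (box.data[, 1] > 2)

# The two codings agree on every observation
all(box.cat.loop == box.cat.arith)  # TRUE
table(box.cat.arith)
```

The arithmetic version is compact, but the four explicit conditions in the solution are easier to read and to adapt if the cut points or labels change.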