CSSS 508: Intro R

2/03/06
Homework 4 Solutions
These solutions are just one way you could write these functions. There will be lots of
correct solutions. I will post clever/good/different ones for your perusal.
Coding style is personal, but I’ve tried to mix it up.
1) Write a function that takes in a vector of numbers and returns only the minimum and
the maximum of the vector. Generate 10 random normal values (print them) and test
your function on them.
########################################
##Function 1
##
##Arguments: vector of numbers
##Outputs:   min and max of the vector
########################################
min.max<-function(vector){
min.number<-min(vector)
max.number<-max(vector)
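##return() accepts only one object in current R, so bundle both in a named list
return(list(min.number=min.number, max.number=max.number))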
}
> test.vec<-rnorm(10,0,1)
> test.vec
[1] -2.39799479 -0.90877966 -0.04567755 0.83787039 -0.79376690
-0.55464859 -0.27002418 -0.79613343 -0.95182715 0.31625419
> min.max(test.vec)
$min.number
[1] -2.397995
$max.number
[1] 0.8378704
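Base R's range() already returns the minimum and maximum together, so a shorter variant is possible. The sketch below is one alternative, not the posted solution; the name min.max.range is made up for illustration, and it returns a named vector rather than a list.

```r
## Hypothetical alternative: range() gives c(min, max) in one call
min.max.range <- function(x) {
  r <- range(x)
  ## name the two elements so the output is self-describing
  c(min.number = r[1], max.number = r[2])
}

test.vec <- rnorm(10, 0, 1)
min.max.range(test.vec)
```

A named vector is handy when both values are numbers; a list (as in the solution above) is the safer choice when the returned pieces have different types or shapes.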
2) Write a function that adds up the odd numbers between 1 and n, a passed-in number.
Return the sum. Test on n = 16 and n = 17.
########################################
##Function 2
##
##Arguments: n: the max number
##Outputs:   sum of all odd numbers between 1 and n
########################################
add.odd<-function(n){
#starting at 1 and stepping by 2
#this will find all odds if n is odd or even.
odd.numbers<-seq(1,n,by=2)
sum.odd.numbers<-sum(odd.numbers)
return(sum.odd.numbers)
}
> add.odd(5)
[1] 9
> add.odd(6)
[1] 9
> add.odd(16)
[1] 64
> add.odd(17)
[1] 81
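The test values suggest a pattern: 64 is 8 squared and 81 is 9 squared. The sum of the first k odd numbers is k^2, so for n >= 1 the answer can be checked without building the sequence. The sketch below assumes n is a positive whole number; add.odd.closed is a made-up name used only for this check.

```r
## Hypothetical closed-form check, assuming n is a whole number >= 1
add.odd.closed <- function(n) {
  ## number of odd values in 1..n
  k <- (n + 1) %/% 2
  k^2
}

## compare against the seq()-based sum for a range of n
n.values <- 1:50
by.seq <- sapply(n.values, function(n) sum(seq(1, n, by = 2)))
by.formula <- sapply(n.values, add.odd.closed)
all(by.seq == by.formula)
```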
If you’re fancy, you could write it to handle both positive and negative numbers.
add.odd<-function(n){
#starting at 1 and stepping by 2 if n is positive, by -2 otherwise
#(so for negative n the sum still includes the starting 1)
if(n>0)odd.numbers<-seq(1,n,by=2)
else odd.numbers<-seq(1,n,by=-2)
sum.odd.numbers<-sum(odd.numbers)
return(sum.odd.numbers)
}
> add.odd(-1)
[1] 0     (1 + -1)
> add.odd(-5)
[1] -8    (1 + -1 + -3 + -5)
> add.odd(-6)
[1] -8    (1 + -1 + -3 + -5)
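The arithmetic in those annotations can be checked in one go. The sketch below re-implements the same if/else logic under a made-up name (add.odd.signed) so that it runs on its own, and compares it with 1 - m^2, where m is the number of negative odd values included. That formula is a side observation for negative n, not part of the assignment.

```r
## Hypothetical helper mirroring the positive/negative version above
add.odd.signed <- function(n) {
  step <- if (n > 0) 2 else -2
  sum(seq(1, n, by = step))
}

neg.values <- c(-1, -5, -6)
sums <- sapply(neg.values, add.odd.signed)

## for negative n: the starting 1 plus m negative odds, which sum to -m^2
m <- (abs(neg.values) + 1) %/% 2
closed <- 1 - m^2

sums
all(sums == closed)
```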
3) Write a function that only takes in the dimensions of a matrix and a vector of n
means. Generate a matrix of the given size where the first row is a random sample from
a normal with the first mean and a standard deviation of 1, the second row is a random
sample from a normal with the second mean and a standard deviation of 1, etc. (All
standard deviations should be 1.) Find the median number in each row. Return both the
matrix and the vector of n medians.
########################################
##Function 3
##
##Arguments: n: nrow of a matrix; p: ncol of a matrix;
##           mean.vec: vector of n means
##Outputs:   generated matrix; vector of medians
########################################
gen.norm.matrix<-function(n,p,mean.vec){
##Two ways to initialize variables:
##NULL then build as you go or build space ahead of time
##We're going to do one of each here
##Initializing the median.vector variable (to build as we go)
median.vector<-NULL
##Build space for the matrix ahead of time
norm.matrix<-matrix(NA,n,p)
##Now filling in the matrix row-by-row
##each row has p elements & is from rnorm with mean.vec[i], sd 1
##then adding the median of the new row onto the median.vector
for(i in 1:n){
norm.matrix[i,]<-rnorm(p,mean.vec[i],1)
median.vector<-c(median.vector,median(norm.matrix[i,]))
}
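##return() accepts only one object in current R, so bundle both in a named list
return(list(norm.matrix=norm.matrix,median.vector=median.vector))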
}
> gen.norm.matrix(4,9,c(1,3,5,7))
$norm.matrix
         [,1]     [,2]     [,3]     [,4]       [,5]     [,6]      [,7]     [,8]
[1,] 1.399916 1.899768 1.079491 1.774799 -0.3238758 0.761389 0.4850021 1.452818
[2,] 3.546173 3.430820 4.162291 2.931297  3.4001752 1.589864 2.0640705 5.043899
[3,] 3.749772 2.763394 5.110853 3.599903  5.0365784 5.885288 2.7233115 6.074198
[4,] 8.326227 7.238382 7.114816 7.331289  5.8258485 7.239392 5.4906797 5.817341
          [,9]
[1,] -1.422157
[2,]  1.251162
[3,]  3.835126
[4,]  6.371859
$median.vector
[1] 1.079491 3.400175 3.835126 7.114816
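The loop can also be replaced by a single rnorm() call plus apply(). The sketch below is one possible vectorized variant, not the posted solution; gen.norm.matrix.vec is a made-up name, and it assumes length(mean.vec) equals n. It relies on matrix() filling column by column, so repeating mean.vec once per column lines each mean up with its row.

```r
## Hypothetical vectorized variant, assuming length(mean.vec) == n
gen.norm.matrix.vec <- function(n, p, mean.vec) {
  ## matrix() fills by column, so the means must cycle 1..n within each column
  means <- rep(mean.vec, times = p)
  norm.matrix <- matrix(rnorm(n * p, mean = means, sd = 1), nrow = n, ncol = p)
  ## apply over rows (MARGIN = 1) to get one median per row
  median.vector <- apply(norm.matrix, 1, median)
  list(norm.matrix = norm.matrix, median.vector = median.vector)
}

out <- gen.norm.matrix.vec(4, 9, c(1, 3, 5, 7))
out$median.vector
```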
The more columns you have (bigger p), the closer each row's median gets to that row's mean. Try p=5000.
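To see this effect directly, the sketch below compares how far the row medians land from the true means for p=9 and p=5000. It is self-contained and uses a made-up helper (row.medians) rather than the function above; the seed is arbitrary and only makes the run repeatable. The error should shrink roughly like 1/sqrt(p).

```r
set.seed(508)  ## arbitrary seed, only so the run is repeatable
mean.vec <- c(1, 3, 5, 7)

## Hypothetical helper: draw p values per mean, return the median of each draw
row.medians <- function(p, mean.vec) {
  sapply(mean.vec, function(m) median(rnorm(p, mean = m, sd = 1)))
}

## absolute distance between each median and its true mean
err.small <- abs(row.medians(9, mean.vec) - mean.vec)
err.large <- abs(row.medians(5000, mean.vec) - mean.vec)

round(err.small, 3)
round(err.large, 3)
```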
4) Write a function that takes in a vector of incomes and creates a categorical variable
where the income categories are 0-10,000; 10,000-20,000; 20,000-30,000; etc. Return
the categorical variable.
########################################
##Function 4
##
##Arguments: vector of income values
##Outputs:   categorical variable (counts by 10,000s)
########################################
categorize.income<-function(vector){
##Dimensions
n<-length(vector)
##Need to find out how many categories we need
##Maximum Income
max.value<-max(vector)
n.categ<-ceiling(max.value/10000)
##Creating space for the category counts
#needs to be zeroes here because we’re going to add to the number
cat.var<-rep(0,n.categ)
##first finding which category vector[i] belongs to
##then upping the count in that category by 1
for(i in 1:n){
which.categ<-ceiling(vector[i]/10000)
cat.var[which.categ]<-cat.var[which.categ]+1
}
return(cat.var)
}
> inc.data<-read.table("hw4prob3.dat")
> dim(inc.data)
[1] 150   1
##was read in like a matrix; just want a vector
> inc.data<-inc.data[,1]
> categorize.income(inc.data)
[1]  5  6  4  8  8  9  8  5  5 11 10  4 10  8  3  5  6  9  6  4  8  8
> sum(categorize.income(inc.data))
[1] 150
(all of them were categorized...)
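The same binning can be done without a loop using cut() and table(). The sketch below is an alternative under stated assumptions, not the posted solution: categorize.income.cut and the width argument are made-up names, incomes are assumed non-negative, and an income of exactly 0 is placed in the first category. (In the loop version, an income of exactly 0 gives index 0, which R silently ignores, so it would not be counted.)

```r
## Hypothetical alternative using cut(); assumes non-negative incomes
categorize.income.cut <- function(income, width = 10000) {
  n.categ <- max(1, ceiling(max(income) / width))
  breaks <- seq(0, n.categ * width, by = width)
  ## right = TRUE makes intervals (0,10000], (10000,20000], ...,
  ## which matches ceiling(income/width); include.lowest = TRUE keeps 0
  categ <- cut(income, breaks = breaks, right = TRUE,
               include.lowest = TRUE, dig.lab = 6)
  ## categ is the per-person factor; counts matches the loop's output
  list(categ = categ, counts = as.vector(table(categ)))
}

## made-up incomes for illustration only
incomes <- c(500, 10000, 10001, 25000, 39999)
out <- categorize.income.cut(incomes)
out$categ
out$counts
```

Returning the factor alongside the counts keeps one category label per person, which is closer to a categorical variable in the usual sense, while the counts vector reproduces what the loop returns.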