Answers

Assignment_GA_solution:
Laerd Statistics. Multiple Regression Analysis using Stata URL https://statistics.laerd.com/statatutorials/multiple-regression-using-stata.php
A health researcher wants to be able to predict "VO2max", an indicator of fitness and health.
Normally, measuring VO2max requires expensive laboratory equipment and requires
individuals to exercise to their maximum (i.e., until they can no longer continue
exercising due to physical exhaustion). This can put off individuals who are not very active/fit and
those who might be at higher risk of ill health (e.g., older unfit subjects). For these reasons, it has
been desirable to find a way of predicting an individual's VO2max based on attributes that can be
measured more easily and cheaply. To this end, a researcher recruited 100 participants to
perform a maximum VO2max test and also recorded their "age", "weight", "heart rate" and
"gender". Heart rate is the average of the last 5 minutes of a 20-minute, much easier, lower
workload cycling test. The researcher's goal is to be able to predict VO2max based on these four
attributes: age, weight, heart rate and gender. Once the analysis was completed, the estimated
multiple linear regression equation was found to be:
predicted VO2max = 87.83 – 0.165 × age – 0.385 × weight – 0.118 × heartrate + 13.208 × gender
where
20 < age < 65
130 < weight < 230 (in pounds)
100 < heartrate < 185
gender = 0 for female and 1 for male
The goal of this assignment is to implement the steps that lead to the evolution of the first
generation of variable combinations that optimize the predicted VO2max.
Complete the following problems below. Within each part, please include your R program output
with code inside of it and any additional information needed to explain your answer.
a.) (2 points) Generate a population with 4 individuals/chromosomes using a seed of 9999
The population of 4 individuals/chromosomes is as follows:
              Age  Weight  Heartrate  Gender
Individual 1   58     198        184  Female
Individual 2   49     214        162  Male
Individual 3   56     151        169  Female
Individual 4   29     211        102  Male
> set.seed(9999)
> a<-floor(runif(4, 20, 65))
> w<-floor(runif(4, 130, 230))
> h<-floor(runif(4, 100, 185))
> g<-round(runif(4, 0, 1),0)
>
> pop_chrom<-matrix(c(a,w,h,g),byrow = TRUE,nrow=4,ncol=4) #population units
> pop_chrom # generated population
     [,1] [,2] [,3] [,4]
[1,]   58   49   56   29
[2,]  198  214  151  211
[3,]  184  162  169  102
[4,]    0    1    0    1
b.) (2 points) Reorder the population according to their fitness probabilities. Use a seed of
4554
Each chromosome's generated probability (gen_prob) is compared with its fitness probability (prob):
Chromosome 1: The generated probability (0.0369) is less than the corresponding fitness
probability (0.6203), so Chromosome 1 is not reordered.
Chromosome 2: The generated probability (0.8470) is greater than the corresponding fitness
probability (0.2696), so Chromosome 2 is reordered. We check the other fitness probabilities
to see whether any is greater than the generated probability for Chromosome 2. None is, so
Chromosome 2 is replaced by Chromosome 4, which was the last chromosome compared.
Chromosome 3: The generated probability (0.2758) is greater than the corresponding fitness
probability (0.0162), so Chromosome 3 is reordered. We check the fitness probabilities of the
chromosomes not yet placed to see whether any is greater than the generated probability for
Chromosome 3. None is, so Chromosome 3 is replaced by Chromosome 2, which was the last
chromosome compared.
Chromosome 4: This chromosome is replaced by the only chromosome still available,
Chromosome 3.
Reordered Pop  Age  Weight  Heartrate  Gender
Individual 1    58     198        184  Female
Individual 2    29     211        102  Male
Individual 3    49     214        162  Male
Individual 4    56     151        169  Female
What is the fit() function here?
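The fit() function is called in the transcript that follows but is never defined in this solution. The sketch below is an assumption, not the author's code: it evaluates the regression equation stated earlier for each column of pop_chrom (rows are age, weight, heartrate, gender; columns are individuals). Under that assumption it reproduces the fitness values reported in the transcript.

```r
# Hypothetical definition: the solution calls fit() but does not show it.
# Assumes rows = age, weight, heartrate, gender and columns = individuals.
fit <- function(pop) {
  vo2 <- 87.83 - 0.165 * pop[1, ] - 0.385 * pop[2, ] -
    0.118 * pop[3, ] + 13.208 * pop[4, ]
  matrix(vo2, nrow = 1)  # 1 x n row of predicted VO2max values
}

# Population from part a, entered by hand so the example is self-contained.
pop_chrom <- matrix(c( 58,  49,  56,  29,
                      198, 214, 151, 211,
                      184, 162, 169, 102,
                        0,   1,   0,   1),
                    byrow = TRUE, nrow = 4, ncol = 4)

fitness <- fit(pop_chrom)
print(round(fitness, 3))

# Selection probabilities as computed in the solution.
prob <- abs(fitness) / sum(abs(fitness))
print(round(prob, 4))
```

Note that the solution takes absolute values of the fitness before normalising, so the individual with the most negative predicted VO2max receives the largest probability.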
> fitness<-fit(pop_chrom)
> fitness # evaluated values of the fitness function
        [,1]   [,2]  [,3]  [,4]
[1,] -19.682 -8.553 0.513 2.982
> obj<-abs(fitness)
> total<-sum(obj)
> prob<-obj/total
> set.seed(4554)
> gen_prob<-runif(4, min=0, max=1)
> data.frame(t(prob), gen_prob)
     t.prob.   gen_prob
1 0.62029625 0.03689196
2 0.26955563 0.84700668
3 0.01616766 0.27576226
4 0.09398046 0.87378774
> newC1<-pop_chrom[,1]
> newC2<-pop_chrom[,4]
> newC3<-pop_chrom[,2]
> newC4<-pop_chrom[,3]
> Chrom_reordered<-matrix(c(newC1, newC2, newC3,newC4),nrow=4,ncol=4)
> Chrom_reordered
     [,1] [,2] [,3] [,4]
[1,]   58   29   49   56
[2,]  198  211  214  151
[3,]  184  102  162  169
[4,]    0    1    1    0
c.) (4 points) If the crossover probability is 0.35 find the chromosome(s) that will undergo
crossover (use a seed of 5551) and check for the position(s) where there will be a
crossover and produce the new chromosome(s) if any. Use a seed of 6661.
Since the 3rd and 4th randomly generated values are less than the crossover probability,
chromosome 3/individual 3 and chromosome 4/individual 4 will undergo crossover.
> NewChrom<-matrix(NA,nrow=4, ncol=4)
> rho<-0.35
> R<-matrix(NA,nrow=4, ncol=1)
> set.seed(5551)
> for(k in 1:4){
+   R[k]<-runif(1, min=0, max=1)
+   R[k]
+ }
> R
           [,1]
[1,] 0.54010553
[2,] 0.78905902
[3,] 0.07633008
[4,] 0.31832900
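The selection rule used here (a chromosome crosses over when its random draw is below the crossover probability) can be checked in one line. This is a small sketch that copies the four draws from the output above rather than regenerating them; the name selected is introduced only for illustration.

```r
rho <- 0.35  # crossover probability from the question
# Draws copied from the transcript (seed 5551), not regenerated here.
R <- c(0.54010553, 0.78905902, 0.07633008, 0.31832900)

# Indices of the chromosomes whose draw falls below rho.
selected <- which(R < rho)
print(selected)
```

Only the draws for chromosomes 3 and 4 fall below 0.35, which matches the pair chosen in the solution.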
The crossover will take place in position 1.
> set.seed(6661)
> cross<-floor(runif(1, min=1, max=3))
> cross
[1] 1
> new_c1<-newC1
> new_c2<-newC2
> new_c3<-c(49,151,169,0)
> new_c4<-c(56,214,162,1)
> Chrom_cross<-matrix(c(new_c1, new_c2, new_c3,new_c4),nrow=4,ncol=4)
> Chrom_cross
     [,1] [,2] [,3] [,4]
[1,]   58   29   49   56
[2,]  198  211  151  214
[3,]  184  102  169  162
[4,]    0    1    0    1
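In the solution the two offspring are typed in by hand. The sketch below shows one way to derive them, assuming a standard single-point crossover in which the genes after the crossover position are swapped between the two parents. The helper single_point_crossover is hypothetical and does not appear in the original solution.

```r
# Hypothetical helper: single-point crossover of two equal-length chromosomes.
# Genes 1..point stay with their parent; genes after 'point' are exchanged.
single_point_crossover <- function(parent_a, parent_b, point) {
  n <- length(parent_a)
  stopifnot(length(parent_b) == n, point >= 1, point < n)
  tail_idx <- (point + 1):n
  list(child_a = c(parent_a[1:point], parent_b[tail_idx]),
       child_b = c(parent_b[1:point], parent_a[tail_idx]))
}

# Reordered chromosomes 3 and 4 (age, weight, heartrate, gender).
c3 <- c(49, 214, 162, 1)
c4 <- c(56, 151, 169, 0)

children <- single_point_crossover(c3, c4, point = 1)
print(children$child_a)
print(children$child_b)
```

With the crossover position equal to 1, each child keeps its parent's age and takes weight, heartrate and gender from the other parent, which gives the same new_c3 and new_c4 as in the transcript.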
d.) (2 points) If the mutation probability is 0.20 what are the mutation positions (use a seed
of 8989)
Here the number of genes = 4 individuals × 4 attributes = 16.
Number of genes that will mutate = mutation probability × number of genes, rounded down to
the nearest whole number = floor(0.20 × 16) = floor(3.2) = 3.
The genes will mutate at position 6 (weight for individual 2), position 13 (age for
individual 4) and position 9 (age for individual 3).
> #mutation
> tot_len_gene<-4*4
> mut<-0.20
> mut_gen<-floor(tot_len_gene*mut)
> mut_gen
[1] 3
> #Finding the position of the genes to be mutated
> set.seed(8989)
> floor(runif(3, min=1, max=16))
[1] 6 13 9
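The mapping from a position number to an attribute and an individual follows R's column-major matrix indexing: positions 1 to 4 are the first column (individual 1), positions 5 to 8 the second, and so on. The sketch below illustrates this using base R's arrayInd(). The positions are copied from the output above, and the names attr_names and mutations are introduced only for this example.

```r
n_attr <- 4   # rows: age, weight, heartrate, gender
n_ind  <- 4   # columns: individuals
mut    <- 0.20

# Number of genes to mutate, rounded down as in the solution.
n_mut <- floor(n_attr * n_ind * mut)

# Positions copied from the transcript (seed 8989), not regenerated here.
positions <- c(6, 13, 9)

# Convert linear positions to (row, column) pairs of a 4 x 4 matrix.
idx <- arrayInd(positions, .dim = c(n_attr, n_ind))
attr_names <- c("age", "weight", "heartrate", "gender")
mutations <- data.frame(position   = positions,
                        attribute  = attr_names[idx[, 1]],
                        individual = idx[, 2],
                        stringsAsFactors = FALSE)
print(mutations)
```

One caveat on the sampling call in the solution: floor(runif(3, min=1, max=16)) can return positions 1 through 15 only, so position 16 is never selected unless the upper bound is raised to 17.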