Assignment_GA_solution: Laerd Statistics. Multiple Regression Analysis using Stata. URL https://statistics.laerd.com/statatutorials/multiple-regression-using-stata.php

A health researcher wants to be able to predict "VO2max", an indicator of fitness and health. Normally, to perform this procedure requires expensive laboratory equipment, as well as requiring individuals to exercise to their maximum (i.e., until they can no longer continue exercising due to physical exhaustion). This can put off individuals who are not very active/fit and those who might be at higher risk of ill health (e.g., older unfit subjects). For these reasons, it has been desirable to find a way of predicting an individual's VO2max based on attributes that can be measured more easily and cheaply.

To this end, a researcher recruited 100 participants to perform a maximum VO2max test, but also recorded their "age", "weight", "heart rate" and "gender". Heart rate is the average of the last 5 minutes of a 20 minute, much easier, lower workload cycling test. The researcher's goal is to be able to predict VO2max based on these four attributes: age, weight, heart rate and gender.

Once the analysis was completed, the estimated multiple linear regression equation was found to be:

$$\widehat{\mathrm{VO_2max}} = 87.83 - 0.165 \times \mathrm{age} - 0.385 \times \mathrm{weight} - 0.118 \times \mathrm{heartrate} + 13.208 \times \mathrm{gender}$$

where
20 < age < 65
130 < weight < 230 (in pounds)
100 < heartrate < 185
gender = 0 for female and 1 for male

The goal of this assignment is to implement the steps which will lead to the evolution of the first generation of variable combination which will optimize the predicted VO2max.

Complete the following problems below. Within each part, please include your R program output with code inside of it and any additional information needed to explain your answer.

a.)
(2 points) Generate a population with 4 individuals/chromosomes using a seed of 9999.

The population of 4 individuals/chromosomes is as follows:

              Age  Weight  Heartrate  Gender
Individual 1   58     198        184  Female
Individual 2   49     214        162  Male
Individual 3   56     151        169  Female
Individual 4   29     211        102  Male

> set.seed(9999)
> a<-floor(runif(4, 20, 65))
> w<-floor(runif(4, 130, 230))
> h<-floor(runif(4, 100, 185))
> g<-round(runif(4, 0, 1),0)
>
> pop_chrom<-matrix(c(a,w,h,g),byrow = TRUE,nrow=4,ncol=4) #population units
> pop_chrom # generated population
     [,1] [,2] [,3] [,4]
[1,]   58   49   56   29
[2,]  198  214  151  211
[3,]  184  162  169  102
[4,]    0    1    0    1

Each column of pop_chrom is one individual; the rows are age, weight, heart rate and gender.

b.) (2 points) Reorder the population according to their fitness probabilities. Use a seed of 4554.

Chromosome 1: Since the generated probability for Chromosome 1 is less than the corresponding fitness probability, Chromosome 1 is not reordered.

Chromosome 2: Since the generated probability for Chromosome 2 is greater than the corresponding fitness probability, Chromosome 2 is reordered. We check the other fitness probabilities to see whether any is greater than the generated probability for Chromosome 2. Since there are none, we replace Chromosome 2 with Chromosome 4, which was the last chromosome compared.

Chromosome 3: Since the generated probability for Chromosome 3 is greater than the corresponding fitness probability, Chromosome 3 is reordered. We check the other fitness probabilities to see whether any is greater than the generated probability for Chromosome 3. Since there are none, we replace Chromosome 3 with Chromosome 2, which was the last chromosome compared.

Chromosome 4: Chromosome 4 is replaced by the only chromosome still available, Chromosome 3.

Reordered Pop  Age  Weight  Heartrate  Gender
Individual 1    58     198        184  Female
Individual 2    29     211        102  Male
Individual 3    49     214        162  Male
Individual 4    56     151        169  Female

What is the fit() function here? Its definition is not shown in this transcript, but the printed fitness values match the estimated regression equation evaluated on each column of pop_chrom. For individual 1, for example, 87.83 - 0.165(58) - 0.385(198) - 0.118(184) + 13.208(0) = -19.682.

> fitness<-fit(pop_chrom)
> fitness # evaluated values of the fitness function
        [,1]   [,2]  [,3]  [,4]
[1,] -19.682 -8.553 0.513 2.982
> obj<-abs(fitness)
> total<-sum(obj)
> prob<-obj/total
> set.seed(4554)
> gen_prob<-runif(4, min=0, max=1)
> data.frame(t(prob), gen_prob)
     t.prob.   gen_prob
1 0.62029625 0.03689196
2 0.26955563 0.84700668
3 0.01616766 0.27576226
4 0.09398046 0.87378774
> newC1<-pop_chrom[,1]
> newC2<-pop_chrom[,4]
> newC3<-pop_chrom[,2]
> newC4<-pop_chrom[,3]
> Chrom_reordered<-matrix(c(newC1, newC2, newC3,newC4),nrow=4,ncol=4)
> Chrom_reordered
     [,1] [,2] [,3] [,4]
[1,]   58   29   49   56
[2,]  198  211  214  151
[3,]  184  102  162  169
[4,]    0    1    1    0

c.) (4 points) If the crossover probability is 0.35, find the chromosome(s) that will undergo crossover (use a seed of 5551), check for the position(s) where there will be a crossover, and produce the new chromosome(s), if any. Use a seed of 6661.

Since the 3rd and 4th randomly generated values are less than the crossover probability, chromosome 3 (individual 3) and chromosome 4 (individual 4) will cross over.

> NewChrom<-matrix(NA,nrow=4, ncol=4)
> rho<-0.35
> R<-matrix(NA,nrow=4, ncol=1)
> set.seed(5551)
> for(k in 1:4){
+ R[k]<-runif(1, min=0, max=1)
+ R[k]
+ }
> R
           [,1]
[1,] 0.54010553
[2,] 0.78905902
[3,] 0.07633008
[4,] 0.31832900

The crossover will take place in position 1. That is, chromosomes 3 and 4 each keep their first gene (age) and swap the remaining three genes.

> set.seed(6661)
> cross<-floor(runif(1, min=1, max=3))
> cross
[1] 1
>
> new_c1<-newC1
> new_c2<-newC2
> new_c3<-c(49,151,169,0)
> new_c4<-c(56,214,162,1)
> Chrom_cross<-matrix(c(new_c1, new_c2, new_c3,new_c4),nrow=4,ncol=4)
> Chrom_cross
     [,1] [,2] [,3] [,4]
[1,]   58   29   49   56
[2,]  198  211  151  214
[3,]  184  102  169  162
[4,]    0    1    0    1

d.) (2 points) If the mutation probability is 0.20, what are the mutation positions? (Use a seed of 8989.)

Here, # of genes = 16.
Number of genes that will mutate = mutation probability * # of genes (rounded down to the nearest whole number) = floor(0.20 * 16) = 3.

The genes will mutate at position 6 (the weight of individual 2), position 13 (the age of individual 4) and position 9 (the age of individual 3). Positions are counted down the columns of the 4 x 4 population matrix, so positions 1 to 4 belong to individual 1, positions 5 to 8 to individual 2, and so on.

> #mutation
> tot_len_gene<-4*4
> mut<-0.20
> mut_gen<-floor(tot_len_gene*mut)
> mut_gen
[1] 3
> #Finding the position of the genes to be mutated
> set.seed(8989)
> floor(runif(3, min=1, max=16))
[1]  6 13  9
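
The transcript calls fit() without showing its definition, and the mutation positions in part d are translated into genes by hand. The R sketch below is one possible way to fill both gaps; it is not the author's code. It assumes that fit() simply applies the estimated regression equation to each column of the population matrix (an assumption that reproduces the fitness values printed in part b), and it introduces a hypothetical helper, decode_position(), that maps a column-major position to a gene and an individual. The population is typed in from the output of part a rather than regenerated from the seed.

```r
# Population from part a: rows are genes, columns are individuals.
pop_chrom <- matrix(c(58, 198, 184, 0,
                      49, 214, 162, 1,
                      56, 151, 169, 0,
                      29, 211, 102, 1),
                    nrow = 4, ncol = 4,
                    dimnames = list(c("age", "weight", "heartrate", "gender"),
                                    NULL))

# Assumed definition of fit(): the estimated regression equation,
# evaluated once per column (individual). Returns a 1 x n matrix.
fit <- function(pop) {
  coefs <- c(age = -0.165, weight = -0.385, heartrate = -0.118,
             gender = 13.208)
  87.83 + t(coefs) %*% pop
}

fitness <- fit(pop_chrom)
print(round(fitness, 3))

# Hypothetical helper: translate column-major positions (1..16) into
# the gene (row) and the individual (column) they refer to.
decode_position <- function(pos, pop) {
  idx <- arrayInd(pos, dim(pop))
  data.frame(position   = pos,
             gene       = rownames(pop)[idx[, 1]],
             individual = idx[, 2])
}

print(decode_position(c(6, 13, 9), pop_chrom))
```

Called with the positions from part d, decode_position() labels them as the weight of individual 2, the age of individual 4 and the age of individual 3, which agrees with the reading given above.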