Biometrical Genetics
QIMR workshop 2013
Slides from Lindon Eaves

Biometrical Genetics from a BG point of view
• Focus is on individual differences and variation in a population
• Why are people different?
• Why are family members similar?

[Path diagram. Labels: Genotype (G), Environment (E), r, h, e; latent variables G and E; measured variable P (Phenotype)]

The Basic Model
Phenotype = Genotype + Environment
P = G + E {+ f(G,E)}
f(G,E) = Genotype-environment interaction and correlation

GENES (G)
• Contribution (“Heritability”)
• Type of Action (“Additive”, “Dominant”, “Epistatic”)
• Number, location and function

Basic Model for Effects of a Single Gene on a Quantitative Trait
Think about a variant with two alleles A and a…
[Figure: scale of genotypic effects running from decreasing to increasing, centred on the mid-homozygote. Labels: - Homozygous effect, + Homozygous effect, Dominance deviation]

Fisher (1918): Basic Ideas
• Continuous variation is caused by lots of genes (“polygenic inheritance”)
• Each gene follows Mendel’s laws
• Environment smooths out genetic differences
• Genes may show different degrees of “dominance”
• Genes may have many forms (“multiple alleles”)
• Mating may not be random (“assortative mating”)
• Showed that correlations obtained by e.g. Pearson and Lee were explained well by polygenic inheritance

Mendelian Basis of Continuous Variation? Experimental Breeding Experiments
[Figure, three panels:
 a. Distribution of scores produced by two genes (N = 1000 subjects)
 b. The "smoothing" effect of the environment (N = 1000 subjects, 2 gene model)
 c. Continuous distribution of polygenic trait (100 genes with small cumulative effects)]

Environment “E”
• Contribution (“1-heritability”)
• Type (shared by family, unique to individual, remote, proximal, short-, long-term)
• Non-genetic inheritance
• Identification

Interactions and Correlations f(G,E)
• Mating system, population structure
• GxE interaction
• Multiple variables: Genetic and Environmental Correlation
• Direction of Causation and Causal networks
• G x E interaction
• G – E correlation
• Remembering, Forgetting, Development (GxAge, G x Time etc.)

Francis Galton (1822-1911)
1869: Hereditary Genius
1883: Inquiries into Human Faculty and its Development
1884-5: Anthropometric Laboratory at “National Health Exhibition”

[Figure: Hereditary Genius (1869, p 317)]
[Figure: Galton’s Anthropometric Laboratory]

Karl Pearson (1857-1936)
1903: On the Laws of Inheritance in Man: I Physical Characteristics (with Alice Lee)
1904: II Mental and Moral Characteristics
1914: The Life, Letters and Labours of Francis Galton

Pearson and Lee’s diagram for measurement of “span” (finger-tip to finger-tip distance)
[Figure: From Pearson and Lee (1903) p.378]
[Figure: From Pearson and Lee (1903) p.378]
[Figure: From Pearson and Lee (1903) p.387]
[Figure: From Pearson and Lee (1903) p.373]

Modern Data
The Virginia 30,000 (N=29691)
The Australia 22,000 (N=20480)

ANZUS 50K: Extended Kinships of Twins
[Diagram. Labels: Parents of Twins, Siblings of Twins, Spouses of Twins, Twins, Offspring of Twins]
© Lindon Eaves, 2009

Overall sample sizes
Relationship        # of pairs
Parent-offspring    25018
Siblings            18697
Spouses             8287
DZ Twins            5120
MZ Twins            4623

Nuclear Family Correlations for Stature (Virginia 30,000 and OZ 22,000)
[Figure: bar chart of correlations, US and Australia; vertical axis 0 to 0.5]

Nuclear Family Correlations for Liberalism/Conservatism (Virginia 30,000 and Australia 22,000)
[Figure: bar chart of correlations, US and Australia; vertical axis 0 to 0.8]

The (Really!) BIG Problem
Families are a mixture of genetic and social factors

Galton’s Solution: Twins
(Though Augustine may have got there first – 5th cent.)

One (ideal) solution
Twins separated at birth
But separated MZs are rare

An easier alternative:
Identical and non-identical twins reared together: Galton (Again!)

IDENTICAL TWINS
• MONOZYGOTIC: Have IDENTICAL genes (G)
• Come from the same family (C)
• Have unique experiences during life (E)

FRATERNAL TWINS
• DIZYGOTIC: Have DIFFERENT genes (G)
• Come from the same family (C)
• Have unique experiences during life (E)

[Figure: Scatterplot for corrected MZ stature; axes HTDEV1 and HTDEV2; r=0.924]
Data from the Virginia Twin Study of Adolescent Behavioral Development

[Figure: Scatterplot for age and sex corrected stature in DZ twins; axes HTDEV1 and HTDEV2; r=0.535]
Data from the Virginia Twin Study of Adolescent Behavioral Development

Genotype Frequencies in Randomly Mating Population
“Hardy-Weinberg Equilibrium” frequencies
What is the mean expected to be?
Note: Effects measured from mid-homozygote (“m”)
With equal allele frequencies (easier!) put u=v= ½
And the mean is expected to be….
How does A/a affect the variance?
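One way to work through the questions above, as a sketch rather than the slides' own derivation. The notation is assumed from the R code later in these slides: u is the frequency of the increasing allele A, v = 1 - u, d is the homozygous effect, h is the dominance deviation, and m is the mid-homozygote.

```latex
% Genotype frequencies under random mating, and effects measured from m
\[
\begin{array}{lccc}
\text{Genotype}                & AA  & Aa  & aa  \\
\text{Frequency}               & u^2 & 2uv & v^2 \\
\text{Effect (from } m\text{)} & +d  & h   & -d
\end{array}
\]

% Expected mean: frequency-weighted sum of effects, using u + v = 1
\begin{align}
\mu &= m + u^2 d + 2uv\,h - v^2 d \;=\; m + (u - v)\,d + 2uv\,h \\
% Equal allele frequencies, u = v = 1/2
\mu &= m + \tfrac{1}{2}h \\
% Variance at u = v = 1/2: frequency-weighted squared deviations from the mean
\sigma^2 &= \tfrac{1}{4}\left(d - \tfrac{1}{2}h\right)^2
          + \tfrac{1}{2}\left(\tfrac{1}{2}h\right)^2
          + \tfrac{1}{4}\left(-d - \tfrac{1}{2}h\right)^2
          \;=\; \tfrac{1}{2}d^2 + \tfrac{1}{4}h^2
\end{align}
```

The term in d² does not involve h and the term in h² does not involve d, which is what allows the variance to be split into two separate components at equal allele frequencies.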
Equal allele frequencies u=v= ½
The variance contributed by A/a splits into two parts:
• Additive component (the term in d²)
• Dominance component (the term in h²)

Q: What happens with lots of genes?
A: The effects of the individual genes add up.
IF… the genes are independent (“linkage equilibrium”)
Requires random mating, complete admixture
So:
• Additive Genetic Variance (VA): sum of the additive components over genes
• Dominance Genetic Variance (VD): sum of the dominance components over genes

Additive and Dominance Components: Unequal allele frequencies
Can show (see e.g. Mather, 1949), for a single gene:
VA = 2uv[d + (v - u)h]²
VD = 4u²v²h²
Q: What happens when u=v?
Bottom line: With unequal allele frequencies can still separate VA and VD but their definitions change

Plotting Effect of Allele frequency on Genetic Variance Components (“R”)

    d <- 1                          # Homozygous effect ("additive")
    h <- 1                          # Heterozygous deviation ("dominance")
    u <- seq(0.01, 0.99, by = .01)  # Vector of frequencies of increasing allele
    v <- 1 - u                      # Frequencies of decreasing allele
    VA <- 2*u*v*(d + (v - u)*h)^2   # Additive genetic variance
    VD <- 4*u*u*v*v*h*h             # Dominance genetic variance
    VP <- VA + VD                   # Total (genetic) variance
    # Plot results
    plot(u, VP, type = "l",
         main = "VA (red) and VD (green) as function of increasing allele frequency",
         xlab = "Frequency of increasing allele", ylab = "Variance component")
    # Add line for VA
    lines(u, VA, col = "red")
    # Add line for VD
    lines(u, VD, col = "green")

[Figure: VA (red), VD (green) and VA+VD as function of increasing allele frequency; x-axis: Frequency of increasing allele; y-axis: Variance component]

What about the environment???
Two main sources of environment
• Individual experiences – not shared with siblings: VE
• “Family” environment – shared with siblings: VC
So: the TOTAL variance (Genes + Environment) is:
VP = VA+VD+VE+VC

“Heritability”
“Broad” heritability: h²b = (VA+VD)/VP
Proportion of total variance explained by genes
“Narrow” heritability: h²n = VA/VP
Proportion of total variance explained by additive (homozygous) genetic effects (predicts response to selection – Fisher, 1930)

So far: have looked at effects on total variance…
How do VA and VD affect the correlations between relatives?
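Before turning to relatives, the two heritability definitions above can be checked with numbers. This is a minimal sketch for a single gene: the function name `genetic_components` is hypothetical, and the environmental variances VE and VC are illustrative values assumed here, not taken from the slides.

```r
# Hypothetical helper: variance components for one gene with two alleles.
# d = homozygous effect, h = dominance deviation,
# u = frequency of the increasing allele (same formulas as the plotting code above)
genetic_components <- function(d, h, u) {
  v <- 1 - u
  VA <- 2 * u * v * (d + (v - u) * h)^2  # additive genetic variance
  VD <- 4 * u^2 * v^2 * h^2              # dominance genetic variance
  c(VA = VA, VD = VD)
}

g  <- genetic_components(d = 1, h = 1, u = 0.5)
VE <- 0.5    # assumed: unique environmental variance (illustrative only)
VC <- 0.25   # assumed: shared family environmental variance (illustrative only)

VP        <- g[["VA"]] + g[["VD"]] + VE + VC   # total variance
h2_broad  <- (g[["VA"]] + g[["VD"]]) / VP      # broad heritability
h2_narrow <- g[["VA"]] / VP                    # narrow heritability

print(round(c(VA = g[["VA"]], VD = g[["VD"]], VP = VP,
              h2_broad = h2_broad, h2_narrow = h2_narrow), 3))
```

With complete dominance (h = d) the broad and narrow values differ, because VD counts toward the first but not the second.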
Contribution of genes to correlation between relatives (r):
r = C/VP
Where C = Covariance between relative pairs
“C” depends on the kind of relationship (sibling, parent-offspring, MZ twin etc.)
But can also be expressed in terms of VA and VD

Approach
1. For a given relationship, work out expected frequencies of each type of pair (AA, aa etc.)
2. Write phenotypes of each type of relative
3. Compute cross-products of phenotypes of members of each type of pair
4. Multiply each cross-product by the corresponding frequency
5. Add the result of “4” across all pair types
The answer is the covariance you want (if you have done the algebra right!)

For equal allele frequencies….
Contribution of one gene to covariance:
Notice that terms in d² and h² are separated, but their coefficients change as a function of relationship
Can add over all genes to get total contribution to covariance
Cov(MZ) = VA + VD
Cov(DZ) = ½VA + ¼VD
Cov(U) = 0
Can use the same approach for other relationships

Contributions of VA and VD to covariances between relatives (ignoring environment)

                        Contribution to Covariance
Relationship            VA      VD
Total variance          1       1
Sibling (DZ twin)       ½       ¼
MZ twin                 1       1
Half-sibling            ¼       0
First cousin            1/8     0
Parent-offspring        ½       0
Avuncular               ¼       0
Grand-parent            ¼       0
Unrelated               0       0

Adding effects of Environment
VP = VA + VD + VE + VC
Cov(MZ) = VA + VD + VC
Cov(DZ) = ½VA + ¼VD + VC
Cov(UT) = VC
Etc.

To get the expected correlations
Just divide expectations by expected total variance
Results are proportional contributions of VA, VD etc. to total variance

Practice (paper and pencil)
• Pick a “d” and “h” (e.g. d=1,h=1; d=1,h=0)
• Pick a frequency for the increasing (A) allele (e.g. u=0.2, u=0.7)
• Work out VA and VD
• Tabulate on board
Vienna
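The paper-and-pencil practice can be checked afterwards with a short script. This is a sketch only: the function names `components` and `direct_variance` are hypothetical, and the VE and VC values in the last part are assumed for illustration. It tabulates VA and VD for the suggested values of d, h and u, compares their sum with the genetic variance computed directly from the Hardy-Weinberg genotype frequencies, and then gives expected MZ and DZ correlations for one case.

```r
# VA and VD from the single-gene formulas used in the plotting code
components <- function(d, h, u) {
  v <- 1 - u
  c(VA = 2 * u * v * (d + (v - u) * h)^2,
    VD = 4 * u^2 * v^2 * h^2)
}

# Genetic variance computed directly from genotype frequencies and effects
direct_variance <- function(d, h, u) {
  v <- 1 - u
  freq  <- c(AA = u^2, Aa = 2 * u * v, aa = v^2)  # Hardy-Weinberg frequencies
  value <- c(AA = d,   Aa = h,         aa = -d)   # effects measured from m
  mu <- sum(freq * value)                         # mean, as deviation from m
  sum(freq * (value - mu)^2)
}

# All combinations suggested in the practice slide
grid <- expand.grid(d = 1, h = c(1, 0), u = c(0.2, 0.7))
tab <- t(mapply(function(d, h, u) {
  c(d = d, h = h, u = u, components(d, h, u),
    direct = direct_variance(d, h, u))
}, grid$d, grid$h, grid$u))
print(round(tab, 4))   # the "direct" column should equal VA + VD in every row

# Expected twin correlations for one case, with assumed environmental variances
VE <- 0.5; VC <- 0.25                  # assumed illustrative values
comp <- components(d = 1, h = 1, u = 0.7)
VP   <- sum(comp) + VE + VC
rMZ  <- (comp[["VA"]] + comp[["VD"]] + VC) / VP
rDZ  <- (comp[["VA"]] / 2 + comp[["VD"]] / 4 + VC) / VP
print(round(c(rMZ = rMZ, rDZ = rDZ), 3))
```

The direct calculation covers only the total genetic variance, so it confirms that VA and VD add up correctly but not how the total is split between them.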