

Stephanie Rosse

BIOST 516/EPI 516/PHG 519: Autumn Quarter 2010

Homework 4

Due Wednesday, November 17, 2010

1. [35 points] Consider two linked SNPs. SNP 1 has alleles A and a, and SNP 2 has alleles B and b.

(a) For the two linked SNPs, obtain the theoretical range of the linkage disequilibrium coefficient D and its absolute value |D| under the following scenarios:

i. P(A) = ½, P(B) = ½

ii. P(A) = .95, P(B) = .95

iii. P(A) = .95, P(B) = .05

iv. P(A) = ½, P(B) = .95

Max(-P_A P_B, -P_a P_b) ≤ D_AB ≤ Min(P_a P_B, P_A P_b)

i. D: -1/4 ≤ D_AB ≤ 1/4

|D|: 0 ≤ |D_AB| ≤ 1/4

ii. D: Max(-.9025, -.0025) ≤ D_AB ≤ Min(.0475, .0475), so -0.0025 ≤ D_AB ≤ 0.0475

|D|: 0 ≤ |D_AB| ≤ 0.0475

iii. D: -0.0475 ≤ D_AB ≤ 0.0025

|D|: 0 ≤ |D_AB| ≤ 0.0475

iv. D: -0.025 ≤ D_AB ≤ 0.025

|D|: 0 ≤ |D_AB| ≤ 0.025
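The four ranges can be checked numerically. The sketch below applies the bound Max(-P_A P_B, -P_a P_b) ≤ D ≤ Min(P_a P_B, P_A P_b) to each scenario; the function name `d_bounds` and the scenario labels are illustrative choices, not part of the assignment.

```python
def d_bounds(p_a, p_b):
    """Return (lower, upper) bounds on D given allele frequencies P(A) and P(B)."""
    q_a, q_b = 1 - p_a, 1 - p_b          # P(a) and P(b)
    lower = max(-p_a * p_b, -q_a * q_b)  # most negative D allowed
    upper = min(q_a * p_b, p_a * q_b)    # most positive D allowed
    return lower, upper


# Scenarios i-iv from part (a), as (P(A), P(B)) pairs.
scenarios = {"i": (0.5, 0.5), "ii": (0.95, 0.95), "iii": (0.95, 0.05), "iv": (0.5, 0.95)}

for label, (p_a, p_b) in scenarios.items():
    lower, upper = d_bounds(p_a, p_b)
    abs_max = max(-lower, upper)         # largest attainable |D|
    print(f"{label}: {lower:.4f} <= D <= {upper:.4f}; 0 <= |D| <= {abs_max:.4f}")
```

The upper limit on |D| is simply the larger of the two bound magnitudes, which is why scenarios ii and iii share the same |D| range even though their D ranges are mirror images.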

(b) Under what circumstances might D reach its theoretical maximum value, i.e., D = P(a)P(B) or D = P(A)P(b)? Explain what this implies and why this makes sense.

P(a) = 1 - P(A)

P(b) = 1 - P(B)

D = P(AB) - P(A)P(B)

Case 1: D = P(a)P(B)

P(a)P(B) = P(AB) - P(A)P(B)

[1 - P(A)]P(B) = P(AB) - P(A)P(B)

P(B) - P(A)P(B) = P(AB) - P(A)P(B)

P(B) = P(AB)

Case 2: D = P(A)P(b)

P(A)P(b) = P(AB) - P(A)P(B)

P(A)[1 - P(B)] = P(AB) - P(A)P(B)

P(A) - P(A)P(B) = P(AB) - P(A)P(B)

P(A) = P(AB)

If both bounds are attained at once, P(A) = P(B) = P(AB).

The algebra shows that D reaches its theoretical maximum when the frequency of the AB haplotype equals an allele frequency: P(AB) = P(B) when D = P(a)P(B), and P(AB) = P(A) when D = P(A)P(b). In other words, every copy of the less frequent of the two alleles A and B is carried on an AB haplotype. When both conditions hold, the allele frequency of A equals the allele frequency of B, which equals the haplotype frequency. This means that either the SNPs are in complete LD or that the SNPs lie on top of one another. This makes sense because two loci are said to be in linkage disequilibrium (LD) if their respective alleles do not associate independently: D is largest when the alleles at the two SNPs always occur together on the same haplotype.
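As a numerical illustration of this argument, the sketch below builds the haplotype frequencies for which P(AB) equals the smaller of P(A) and P(B), then confirms that D lands exactly on its upper bound. The allele frequencies 0.7 and 0.4 are hypothetical values chosen for the example, and `max_ld_haplotypes` is an illustrative name.

```python
def max_ld_haplotypes(p_a, p_b):
    """Haplotype frequencies when P(AB) is as large as the allele frequencies allow."""
    p_ab_hap = min(p_a, p_b)             # P(AB) cannot exceed P(A) or P(B)
    return {
        "AB": p_ab_hap,
        "Ab": p_a - p_ab_hap,            # remaining A alleles pair with b
        "aB": p_b - p_ab_hap,            # remaining B alleles pair with a
        "ab": 1 - p_a - p_b + p_ab_hap,
    }


p_a, p_b = 0.7, 0.4                      # hypothetical allele frequencies
haps = max_ld_haplotypes(p_a, p_b)

d = haps["AB"] - p_a * p_b               # D = P(AB) - P(A)P(B)
d_max = min((1 - p_a) * p_b, p_a * (1 - p_b))

print(haps)
print(f"D = {d:.4f}, theoretical maximum = {d_max:.4f}")
```

In this example the aB haplotype has frequency zero: every B allele sits on a haplotype with A, which is the P(B) = P(AB) case from the derivation.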

2. [65 points] A sample of 260 independent trios (pedigrees with both parents and a single offspring) from Newfoundland were genotyped for two SNPs on chromosome 14. The two SNPs are linked, and SNP 1 has alleles A1 and A2, and SNP 2 has alleles B1 and B2. The haplotypes for all 260 offspring were phased using the parental genotypes, and the observed paired haplotype counts for the offspring are as follows:

(a) Here n = number of individuals in the sample, and P(A1) = probability that a random allele in the population is type A1.

SNP 1 (alleles A1, A2)

Estimation of P_A1:

P_A1 ≈ P̂_A1 = [2(n_A1A1) + n_A1A2] / (2n) = [2(115) + 119] / (2*260) = 0.6712

Genotype    Observation               Expected
A1A1        36+62+17 = 115            n(P_A1)^2 = 117.13
A1A2        5+28+24+62 = 119          2n(P_A1)(1-P_A1) = 114.76
A2A2        6+11+9 = 26               n(1-P_A1)^2 = 28.11

X^2 = (115-117.13)^2/117.13 + (119-114.76)^2/114.76 + (26-28.11)^2/28.11 = 0.3538

There are 3 genotype classes and one free parameter is estimated from the data (P_A1, since P_A2 = 1 - P_A1), so we perform the chi-square test with 3 - 1 - 1 = 1 degree of freedom. For our test statistic of 0.3538, the p-value is 0.552.

SNP 2 (alleles B1, B2)

Estimation of P_B1:

P_B1 ≈ P̂_B1 = [2(47) + 125] / (2*260) = 0.42115

Genotype    Observation               Expected
B1B1        36+5+6 = 47               n(P_B1)^2 = 46.13
B1B2        62+28+24+11 = 125         2n(P_B1)(1-P_B1) = 126.77
B2B2        17+62+9 = 88              n(1-P_B1)^2 = 87.10

X^2 = (47-46.13)^2/46.13 + (125-126.77)^2/126.77 + (88-87.10)^2/87.10 = 0.050421

p-value = 0.8224

For both SNPs we fail to reject the null hypothesis of Hardy-Weinberg equilibrium (HWE).
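The two HWE tests can be reproduced with a short script. This is a sketch under stated assumptions: `hwe_chi_square` is an illustrative name, and the p-value uses the identity that a chi-square variable with 1 degree of freedom has survival function erfc(sqrt(x/2)), so only the standard library is needed. Because the allele frequency is not rounded before computing expected counts, the statistics may differ from the hand calculation in the third or fourth decimal place.

```python
import math


def hwe_chi_square(n_hom1, n_het, n_hom2):
    """Chi-square goodness-of-fit test for HWE at a biallelic SNP (1 df)."""
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)   # estimated frequency of allele 1
    observed = (n_hom1, n_het, n_hom2)
    expected = (n * p**2, 2 * n * p * (1 - p), n * (1 - p) ** 2)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # P(chi-square with 1 df > chi2)
    return p, chi2, p_value


# Genotype counts from part (a): (hom allele 1, het, hom allele 2).
snps = {"SNP 1": (115, 119, 26), "SNP 2": (47, 125, 88)}

for name, counts in snps.items():
    p, chi2, p_value = hwe_chi_square(*counts)
    print(f"{name}: allele freq = {p:.4f}, X^2 = {chi2:.4f}, p-value = {p_value:.4f}")
```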

(b)

Haplotype   Count                              P
A1B1        2(36) + 62 + 5 + 28 = 167          167/[2(260)] = .3212
A1B2        62 + 17(2) + 24 + 62 = 182         0.35
A2B1        5 + 24 + 12 + 11 = 52              0.1
A2B2        28 + 62 + 11 + 18 = 119            0.2288

Allele Frequency

P_A1 = .3212 + .35 = 0.6712

P_A2 = .1 + .2288 = 0.3288

P_B1 = .3212 + .1 = 0.4212

P_B2 = .35 + .2288 = 0.5788

Expected Haplotype Counts

A1B1 = 520(.6712)(.4212) = 147.01

A1B2 = 520(.6712)(.5788) = 202.02

A2B1 = 520(.3288)(.4212) = 72.02

A2B2 = 520(.3288)(.5788) = 98.96

X^2 = (167-147.01)^2/147.01 + (182-202.02)^2/202.02 + (52-72.02)^2/72.02 + (119-98.96)^2/98.96 = 14.325

p-value = 0.000154

We reject the null and conclude that there is strong evidence that these SNPs are in Linkage Disequilibrium (p<.01).
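The haplotype-based test can be sketched the same way. Expected counts are the products of the marginal allele frequencies times the 520 haplotypes, and the reported p-value corresponds to a chi-square distribution with 1 degree of freedom, computed here with the standard-library identity erfc(sqrt(x/2)). The function name `ld_chi_square` is illustrative, and small differences from the hand calculation come from not rounding the allele frequencies.

```python
import math

# Observed haplotype counts reported in part (b); 2 * 260 = 520 haplotypes.
hap_counts = {"A1B1": 167, "A1B2": 182, "A2B1": 52, "A2B2": 119}


def ld_chi_square(counts):
    """Chi-square test of independence between alleles at the two SNPs (1 df)."""
    total = sum(counts.values())
    p_a1 = (counts["A1B1"] + counts["A1B2"]) / total
    p_b1 = (counts["A1B1"] + counts["A2B1"]) / total
    allele_freq = {"A1": p_a1, "A2": 1 - p_a1, "B1": p_b1, "B2": 1 - p_b1}

    chi2 = 0.0
    for hap, observed in counts.items():
        # Expected count under linkage equilibrium: product of allele frequencies.
        expected = total * allele_freq[hap[:2]] * allele_freq[hap[2:]]
        chi2 += (observed - expected) ** 2 / expected

    p_value = math.erfc(math.sqrt(chi2 / 2))  # P(chi-square with 1 df > chi2)
    return chi2, p_value


chi2, p_value = ld_chi_square(hap_counts)
print(f"X^2 = {chi2:.3f}, p-value = {p_value:.6f}")
```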

(c)

D = P_A1B1 - P_A1 P_B1 = .3212 - (.6712)(.4212) = .03849

D' = D / Min(P_A2 P_B1, P_A1 P_B2) = .03849 / Min(.3288 * .4212, .6712 * .5788) = .03849 / .1385 = .2779

r^2 = D^2 / (P_A1 P_A2 P_B1 P_B2) = .03849^2 / (.6712 * .3288 * .4212 * .5788) = .0275
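The three LD measures can be computed directly from the haplotype counts. This is a minimal sketch, with `ld_measures` as an illustrative name. It takes D' as |D| divided by the maximum magnitude D can reach given its sign, which is one common convention and is assumed here rather than taken from the assignment.

```python
# Observed haplotype counts reported in part (b).
hap_counts = {"A1B1": 167, "A1B2": 182, "A2B1": 52, "A2B2": 119}


def ld_measures(counts):
    """Return (D, D', r^2) for two biallelic SNPs from haplotype counts."""
    total = sum(counts.values())
    p_a1 = (counts["A1B1"] + counts["A1B2"]) / total
    p_b1 = (counts["A1B1"] + counts["A2B1"]) / total
    p_a2, p_b2 = 1 - p_a1, 1 - p_b1

    d = counts["A1B1"] / total - p_a1 * p_b1
    if d >= 0:
        d_max = min(p_a2 * p_b1, p_a1 * p_b2)   # upper bound on D
    else:
        d_max = min(p_a1 * p_b1, p_a2 * p_b2)   # magnitude of lower bound on D
    d_prime = abs(d) / d_max
    r2 = d**2 / (p_a1 * p_a2 * p_b1 * p_b2)
    return d, d_prime, r2


d, d_prime, r2 = ld_measures(hap_counts)
print(f"D = {d:.5f}, D' = {d_prime:.4f}, r^2 = {r2:.4f}")
```

A D' well below 1 together with a small r^2 indicates that the association, although statistically significant, is far from complete.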
