final2014 Ellen Rim

GENE 210: Personalized Genomics and Medicine
Spring 2014 Final Exam
Due Monday, May 26 2014 at midnight.
Stanford University Honor Code
The Honor Code is the University’s statement on academic integrity written by
students in 1921. It articulates University expectations of students and faculty in
establishing and maintaining the highest standards in academic work:
• The Honor Code is an undertaking of the students, individually and collectively:
– that they will not give or receive aid in examinations; that they will not
give or receive unpermitted aid in class work, in the preparation of reports,
or in any other work that is to be used by the instructor as the basis of
grading;
– that they will do their share and take an active part in seeing to it that
others as well as themselves uphold the spirit and letter of the Honor
Code.
• The faculty on its part manifests its confidence in the honor of its students by
refraining from proctoring examinations and from taking unusual and
unreasonable precautions to prevent the forms of dishonesty mentioned
above. The faculty will also avoid, as far as practicable, academic procedures
that create temptations to violate the Honor Code.
• While the faculty alone has the right and obligation to set academic
requirements, the students and faculty will work together to establish optimal
conditions for honorable academic work.
Signature
I attest that I have not given or received aid in this examination, and that I have
done my share and taken an active part in seeing to it that others as well as
myself uphold the spirit and letter of the Stanford University Honor Code.
Name: Ellen Rim
SUNet ID: 05519955
Signature: Ellen Rim
Some questions may have multiple reasonable answers: if you are unsure,
provide a justification based in genetics and cite your sources (SNPedia is fine,
journals are better); as long as the justification is sound, you will receive full
credit.
If you are unsure which SNP(s) are associated with a trait, you may consult any
reference you like.
A family of 3 (mother/father/daughter) has come to you to find out what they can
learn from their genotypes. The parents were both adopted, so they do not know
any of their family history. You have sent their DNA to LabCorp, which ran their
genotypes on a custom 1M OmniQuad array, and they’ve returned the results at:
http://www.stanford.edu/class/gene210/files/final/final_patients.zip (X points)
1. A mislabeling in the lab has caused the samples to be shuffled around and
they are simply labeled: ‘patient1.txt,’ ‘patient2.txt,’ and ‘patient3.txt.’ Determine
which sample is the mother’s, the father’s and the daughter’s. (15 points)
Father: Patient 3
Mother: Patient 1
Daughter: Patient 2
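The answer above does not state how the samples were assigned. One common approach, sketched here as an assumption rather than the method actually used, is to count opposite-homozygote sites: a child can never be AA where a parent is GG, so the daughter is the sample with (almost) no such conflicts against either of the other two. The parents can then be told apart by sex, for example by the presence of Y-chromosome calls. The genotypes below are toy values, not the patient data, and the function names are hypothetical.

```python
def opposite_homozygote_rate(g1, g2):
    """Fraction of shared SNPs where the two samples are opposite homozygotes."""
    shared = 0
    conflicts = 0
    for snp, a in g1.items():
        b = g2.get(snp)
        # Skip SNPs missing in the other sample or without a two-allele call.
        if b is None or len(a) != 2 or len(b) != 2:
            continue
        shared += 1
        if a[0] == a[1] and b[0] == b[1] and a != b:
            conflicts += 1
    return conflicts / shared if shared else 0.0


def find_child(samples):
    """Return the sample whose worst conflict rate against the others is lowest."""
    def worst(name):
        return max(
            opposite_homozygote_rate(samples[name], samples[other])
            for other in samples
            if other != name
        )
    return min(samples, key=worst)


# Toy genotypes (hypothetical), keyed by rsID.
toy_samples = {
    "patient1": {"rs1": "AA", "rs2": "GG", "rs3": "CT"},
    "patient2": {"rs1": "AG", "rs2": "GG", "rs3": "CT"},
    "patient3": {"rs1": "GG", "rs2": "GG", "rs3": "CC"},
}
child = find_child(toy_samples)
```

On real array data a small nonzero conflict rate from genotyping error is expected, so the comparison is relative rather than a strict zero test.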
2. What can you tell about the ancestry of the parents? (15 points)
Mother: European (GENOtation chromosome painting and PCA analysis), light skin
pigmentation (rs1426654 from SNPedia). Based on GENOtation chromosome painting
using HapMap3, the haplotype blocks were most similar to Italian Toscani, with blocks
observed in Mexican and Gujarati populations mixed in.
Father: Northern European (GENOtation chromosome painting and PCA analysis), light
skin pigmentation (rs1426654 from SNPedia). Similar to the mother, HapMap3
chromosome painting showed haplotype blocks most similar to Italian Toscani, with
blocks observed in Mexican and Gujarati populations mixed in.
3. The parents are concerned about their daughter’s chance for getting breast
cancer. You investigate the genomes of the father, mother and the daughter and
provide genetic counseling for the family. (15 points total)
A. What is the lifetime risk for breast cancer for the overall population of
Europeans?
According to a 2010 study, the lifetime risk for all breast cancer subtypes is
13.8% among European women (Kurian et al., Breast Cancer Research 2010).
B. Does the genotype of the mother or daughter (at rs77944974) alter their
risk of breast cancer? Explain briefly, providing data on the most
important risk alleles and their effect on risk for breast cancer.
The deletion variant at rs77944974, also known as 185delAG, lies within exon 2 of BRCA1
and was observed in four unrelated breast cancer patients in a 1998 study. D is the risk
allele, and this deletion is predicted to cause early truncation at the zinc finger domain
of the protein, increasing the lifetime risk of breast cancer to about 60% with two deletion
alleles (Simard et al., Nature Genetics 1994 and Li et al., Frontiers in Bioscience 2013).
Since the mother has one deletion allele at rs77944974, her genotype here increases
her predicted risk of breast cancer. The daughter does not have any deletion alleles and
her risk of breast cancer is not affected by this SNP.
C. Briefly outline what advice you would give to the mother about her risk for
breast cancer, based on your analysis?
In addition to a deletion allele at rs77944974, the mother had two risk alleles among eleven
BRCA1 and BRCA2 SNPs (rs1799950, rs4986850, rs2227945, rs16942, rs1799966,
rs766173, rs144848, rs4987117, rs1799954, rs11571746, rs4987047) that have been
associated with breast cancer susceptibility (SNPedia and Johnson et al., Human
Molecular Genetics 2007). The odds ratio of the risk allele at rs1799950 was calculated
to be 1.72. Homozygotes for risk allele at rs144848 have 1.31-fold greater risk of breast
cancer (Healey et al., Nature Genetics 2000). Considering that she’s not homozygous
for these mutations, and that 97% of women with 0-2 minor risk alleles out of 25
susceptibility-associated SNPs did not develop breast cancer, I would advise her to get
regular checkups but not a preventive mastectomy.
D. Briefly outline what advice you would give to the daughter about her risk
for breast cancer, based on your analysis?
I would also advise the daughter to get regular checkups but not preventive mastectomy.
The daughter has three risk alleles among the eleven BRCA1 and BRCA2 SNPs
(Johnson et al., Human Molecular Genetics 2007). However, she’s not homozygous for
these mutations either.
4. Weeks later, the father (a 42 year old, 185 cm in height, 80 kg in weight, not
taking any other medication) is rushed to the hospital with a stroke. What dose of
warfarin would be given from a clinic that does not perform genetic testing?
What dose of warfarin would be given from a clinic that does perform genetic
testing? Explain the genetic basis for modifying the warfarin dose of the father
given his genotype. (8 points)
Without genetic information, http://www.warfarindosing.org suggested a clinical dose of
5.5 mg/day and GENOtation suggested 39.4 mg/week. A clinic that performs genetic testing
would prescribe 24.7 mg/week. The father’s genotype is TT at rs9923231 and CT at
rs1799853. These genotypes affect the warfarin target VKORC1 and warfarin metabolism by
CYP2C9, respectively, and render him more sensitive to warfarin. To prevent bleeding, his
warfarin dose should be lowered based on this information.
5. In her next visit, you observe that the mother has high cholesterol. Would you
prescribe simvastatin (Zocor) to the mother? Why or why not? (7 points)
No. Her genotype at rs4149056 is CC, which compromises the activity of OATP1B1, a
transporter that mediates hepatic uptake of statins, and leads to worse side effects upon
statin use. The CC genotype has been shown to increase the risk of statin-induced
myopathy 16.9-fold (SEARCH Collaborative Group, NEJM 2008).
6. You counsel the family about the risk for type 2 diabetes for their daughter.
You analyze the daughter’s genome on genotation.com. You need to explain the
results to the family, and how this influences the daughter’s risk for Type 2
diabetes. (15 points total)
A. What is the likelihood of type 2 diabetes prior to genetic testing?
23.7% according to GENOtation. The lifetime risk among females in the US is 38.5%
according to the Centers for Disease Control and Prevention.
B. What is the likelihood of type 2 diabetes following analysis of the
daughter’s genotype using Genotation?
44.2%
C. How many SNPs were used to assess the risk for type 2 diabetes?
15 SNPs
D. How were the SNPs combined to give the overall score? Which SNP had
the greatest influence on diabetes risk? Explain briefly.
The prior T2D probability (the starting point, determined by ethnicity) was
converted to odds, which were sequentially multiplied by the T2D likelihood ratio
of each SNP. The resulting odds were then converted back to a probability, giving
the end point of 44.2%. Rs9465871 had the greatest influence; this is a SNP with a
large odds ratio of 2.17 for homozygotes, and the daughter is homozygous for the
risk allele (Wellcome Trust Case Control Consortium, Nature 2007).
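The sequential update described above can be sketched in a few lines. This is a minimal illustration of the general prior-odds-times-likelihood-ratio calculation, not GENOtation's actual code; the function names are hypothetical, and the per-SNP likelihood ratios are not listed in this answer, so only the combined ratio implied by the reported 23.7% prior and 44.2% end point is computed.

```python
def posterior_probability(prior, likelihood_ratios):
    """Update a prior probability with a sequence of likelihood ratios."""
    odds = prior / (1.0 - prior)          # probability -> odds
    for lr in likelihood_ratios:
        odds *= lr                        # one multiplication per SNP
    return odds / (1.0 + odds)            # odds -> probability


def implied_combined_lr(prior, posterior):
    """Product of all likelihood ratios implied by a prior and a posterior."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = posterior / (1.0 - posterior)
    return posterior_odds / prior_odds


# Combined effect of the 15 SNPs implied by the reported numbers.
combined_lr = implied_combined_lr(0.237, 0.442)
check = posterior_probability(0.237, [combined_lr])
```

Because the update is a product, the order in which the SNPs are applied does not change the end point; the SNP with the likelihood ratio furthest from 1 moves the estimate the most.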
E. What advice can you provide to the family to help mitigate the chance of
their daughter developing type 2 diabetes?
Since her T2D risk is above average, I would advise her to engage in physical
activity on a regular basis, keep her body weight in check, and consume less fat and
sugar, even though T2D is usually late-onset. It would also be good to check her
glucose level and blood pressure regularly as she gets older.
7. The following two SNPs were shown to be associated with risk for type 2
diabetes in two GWAS studies. (15 points total)
SNP         Odds ratio   P-value         Cases    Controls
rs4402960   1.14         8.9 x 10^-16    14586    17968
rs7754840   1.28         3.5 x 10^-7     1921     1622
A. Which SNP has a larger effect size on risk for type 2 diabetes? Explain
your answer.
rs7754840. Its odds ratio for developing T2D (1.28) is greater than that of
rs4402960 (1.14), so the risk genotype at this SNP raises the odds of disease more.
B. Which SNP is most statistically significant for risk for type 2 diabetes; i.e.
which SNP is most likely to have a true association?
rs4402960. The much smaller p-value and bigger sample size indicate greater
statistical significance.
C. Is the SNP with the biggest effect size on risk for type 2 diabetes always
going to be the SNP that is most statistically significant? Why or why not?
No. A “biggest effect” or large odds ratio sometimes results from a small sample
size or from a very rare SNP, where the estimate is imprecise. Therefore, SNPs that
are not the most statistically significant can yield large odds ratios or predicted
effect sizes.
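The two SNPs in the table illustrate this point numerically. As a rough sketch, assuming each p-value comes from a two-sided normal (Wald) test on the log odds ratio, the z-score and the approximate standard error can be backed out from the reported odds ratio and p-value. That test assumption and the function names are mine, not something stated in the exam.

```python
import math
from statistics import NormalDist


def z_from_two_sided_p(p_value):
    """Absolute z-score corresponding to a two-sided normal p-value."""
    return -NormalDist().inv_cdf(p_value / 2.0)


def approx_se_log_or(odds_ratio, p_value):
    """Approximate standard error of log(OR), assuming z = log(OR) / SE."""
    return abs(math.log(odds_ratio)) / z_from_two_sided_p(p_value)


# Values taken from the table in question 7.
z_rs4402960 = z_from_two_sided_p(8.9e-16)
z_rs7754840 = z_from_two_sided_p(3.5e-7)
se_rs4402960 = approx_se_log_or(1.14, 8.9e-16)
se_rs7754840 = approx_se_log_or(1.28, 3.5e-7)
```

Under these assumptions rs4402960 has the smaller effect but a much smaller standard error, which is what the larger case and control counts would be expected to produce, and so it ends up with the larger z-score.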
D. rs7754840 is a SNP that lies within the CDKAL1 gene. This SNP was
identified because it was contained on the Illumina Chip used for
genotyping in the GWAS study. Does this result indicate that rs7754840
is the causal mutation? Does this result indicate that CDKAL1 is involved
in type 2 diabetes? Explain why or why not.
No. The SNP could simply be in linkage disequilibrium with the causal variant, which may
lie in CDKAL1 or in a nearby gene, so neither conclusion follows directly. It is also
possible that this mutation is a result of a compensatory response to the disease
phenotype.
8. The two parents are considering having another child. You analyze their
genomes and then counsel them on their chance of having a child with one of the
following diseases: hemochromatosis (rs1800562), Alzheimer’s disease
(specifically, look for APOE4 status), breast cancer (BRCA1 status; rs77944974),
cystic fibrosis (rs113993960) and sickle cell anemia (rs334).
For each of these five diseases, what is the chance that the child will have that
disease? Briefly explain your answer. (15 points total)
Hemochromatosis: there is no chance that the child will be homozygous AA (the
hemochromatosis genotype), but there is a 50% probability that s/he will be a carrier.
Carriers may have mild symptoms of hemochromatosis (SNPedia).
Alzheimer’s: since both parents have CC genotype at rs7412, the child will also have CC
genotype. At rs429358 of the ApoE gene, the child will have either CT or CC genotype,
with 50% probability each. Therefore, the child will either have 2x increased risk (CT at
rs429358) or 11x increased risk (CC at rs429358) according to SNPedia.
Breast cancer: there is 50% chance that the child will inherit the deletion mutation from
mother at rs77944974. This will lead to greater breast cancer risk if the child is a girl.
Cystic fibrosis: both parents carry one copy of the deletion mutation at rs113993960.
Therefore, there is 25% probability that the child will have cystic fibrosis and 50%
probability s/he will be a symptom-free carrier.
Sickle cell anemia: mother and father are both homozygous for the normal Hb A allele at
rs334 so the child will not have sickle cell anemia.
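The probabilities above follow from a simple Punnett square: each parent passes on one of their two alleles with equal chance. A minimal sketch is given below. The allele letters are illustrative ("D" for the deletion and "I" for the intact allele at rs113993960, "A" for the hemochromatosis risk allele at rs1800562), and which parent carries the hemochromatosis allele is an assumption, since the answer only implies that exactly one of them does.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_distribution(mother, father):
    """Genotype probabilities for a child, given two-letter parental genotypes."""
    # Each of the four allele pairings is equally likely.
    counts = Counter("".join(sorted(pair)) for pair in product(mother, father))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Cystic fibrosis: both parents carry one deletion allele.
cf = offspring_distribution("DI", "DI")

# Hemochromatosis: one carrier parent (assumed here to be the mother).
hemo = offspring_distribution("AG", "GG")
```

Here `cf["DD"]` is 1/4 (affected) and `cf["DI"]` is 1/2 (carrier), while `hemo` contains no "AA" entry and gives 1/2 for "AG".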
9. Prenatal genetic diagnosis (15 points total)
A) A pregnant woman seeks non-invasive prenatal genetic testing and provides a
sample of plasma. You isolate the cell-free DNA (cfDNA) from the maternal
plasma and determine that 10% of it is derived from the fetus. You perform whole
genome sequencing on genomic DNA samples from the mother and father. Next
you perform whole genome sequencing on the cfDNA isolated from maternal
plasma. For each of the sites below, you obtain 100X coverage (i.e., 100 reads
for each site). Fill in the expected read counts in the tables below. Use the
parental genotypes below and the observed allele counts for the cfDNA
sequencing to infer the genotype of the fetus at each of three sites and fill them
in the table.
Site 1 (A reads expected)
If mother transmits A: 55
If mother transmits G: 50
Site 2 (A reads expected)
If mother transmits A: 55
If mother transmits G: 50
Site 3 (T reads expected)
If mother transmits T: 55
If mother transmits C: 50
Observed reads in cell-free DNA
Site 1: 59 A reads
Site 2: 52 A reads
Site 3: 49 T reads
Infer fetal genotype:
Site 1: AA
Site 2: AG
Site 3: TC
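The expected counts of 55 and 50 can be reproduced with a short calculation. The parental genotype table did not survive in this copy, so the sketch below assumes what those numbers imply: the mother is heterozygous at each site and the father is homozygous for the counted allele, so the fetus has two copies of that allele if the mother transmits it and one copy otherwise. The function names are hypothetical.

```python
def expected_reads(maternal_copies, fetal_copies, fetal_fraction=0.10, coverage=100):
    """Expected reads carrying the counted allele in maternal plasma cfDNA."""
    maternal_part = (1.0 - fetal_fraction) * maternal_copies / 2.0
    fetal_part = fetal_fraction * fetal_copies / 2.0
    return coverage * (maternal_part + fetal_part)


def infer_fetal_copies(observed, maternal_copies, candidates=(1, 2)):
    """Pick the fetal allele count whose expected read count is closest."""
    return min(
        candidates,
        key=lambda copies: abs(observed - expected_reads(maternal_copies, copies)),
    )


# Heterozygous mother (1 copy of the counted allele), 10% fetal fraction, 100 reads.
if_transmitted = expected_reads(1, 2)      # fetus homozygous for the counted allele
if_not_transmitted = expected_reads(1, 1)  # fetus heterozygous
calls = [infer_fetal_copies(obs, 1) for obs in (59, 52, 49)]
```

With the observed counts 59, 52 and 49 this nearest-expectation rule gives two, one and one copies, matching the AA, AG and TC calls above. The 52-read site sits only 2 reads from one expectation and 3 from the other, which is why that call is the least certain.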
B) You worry that your call at site 2 might not be accurate. In order to improve
the accuracy of your fetal genotyping, you use parental haplotype blocks. Re-evaluate
your fetal genotype inference based on the maternal haplotypes below.
Re-evaluated fetal genotype inference:
Site 1: AA
Site 2: AA
Site 3: TC
10. Extra credit question available at
http://www.stanford.edu/class/gene210/web/html/extracredit.html (12 pts).