Carl Vangestel

Landscape genomics in sugar pines (Pinus
lambertiana)
Exploring patterns of adaptive genetic
variation along environmental gradients.
Spatial Genomics
Why associations with measures of aridity?
• Drought stress is a common cause of mortality and annual yield loss
• Shortage of water is one of the strongest environmental
constraints and abiotic selective forces in trees
• Geography directly affects water availability → clinal variation
in adaptive traits
Why associations with measures of aridity?
• Future climate change
→ affects local abiotic conditions and the distribution of trees
→ higher temperatures and increased variability in
precipitation in the southwestern US
→ increase in frequency and intensity of drought
Why sugar pine?
• Sugar pines are less tolerant to drought stress than other conifer
species
→ expected to show strong clinal patterns in adaptive genetic
variation along aridity gradient
→ very sensitive to future climate changes: alterations in
current distribution range
• One of the most diverse genomes among conifers
→ average heterozygosity of specific genes was 26 percent
(upper range of pines studied so far)
Climate Change
[Figure: map panels for current, 2030, 2060 and 2090]
- Different scenarios
- Hadley Climate Scenario
(Source: USDA Forest Service, RMRS, Moscow Forestry Science Laboratory)
Detailed knowledge of adaptive variation may become crucial
to mitigate the impact of global climate change
How adaptive variation is distributed over the range of
environments is largely unknown
Goal of this study:
• identify adaptive SNPs associated with variation in temperature,
precipitation, aridity index (precipitation / potential evapotranspiration) and elevation
• functionally annotate these genes
• explore both neutral and adaptive variation across the sugar pine’s
range
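The aridity index named in the goals is a simple ratio, so a short sketch may help. The site names and values below are hypothetical and only illustrate the arithmetic; lower values of the index correspond to more arid sites.

```python
def aridity_index(precip_mm, pet_mm):
    """Aridity index = precipitation / potential evapotranspiration.

    Both inputs must be in the same units (here: mm per year).
    """
    if pet_mm <= 0:
        raise ValueError("potential evapotranspiration must be positive")
    return precip_mm / pet_mm


# Hypothetical (precipitation, PET) values per site, for illustration only.
sites = {"site_A": (1200.0, 800.0), "site_B": (400.0, 1000.0)}

for name, (precip, pet) in sites.items():
    print(name, round(aridity_index(precip, pet), 2))
```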
N = 338 individuals
• Transcriptome assembly: Sanger, 454 (pool) and Illumina (3 ind)
• Candidate SNP selection
  - Literature
  - SNP quality
• MYB proteins (stomatal closure, etc ...)
• heat shock proteins (prevention of protein denaturation during cellular dehydration)
• Trehalose-6-phosphate synthase (osmotic protection cell membranes during
dehydration)
• LEA proteins (membrane and protein stabilisers, etc ...)
• ...
• First screening: 67 genes selected
• Second screening: 109 under review
Multi-analytical approach
• Generalized linear models
• Fst Outlier Analysis
• Bayesian Environmental analyses
Neutral SNP
• Gene Flow (IBD)
• Genetic Drift
• “Separate” neutral patterns from selective ones
• Explore adaptive patterns while accounting for neutral population
structure
‘Neutral SNP’
‘Adaptive SNP’
Generalized linear models
For each SNP j:
$$\log\left(\frac{\pi_{ij}}{1-\pi_{ij}}\right) = \beta_{int} + \beta_0\,ENV_i + \beta_1 q_{1i} + \dots + \beta_{12} q_{12i}$$
ENVi = environmental value for tree i
q1i ... q12i = first 12 principal components of the Q-matrix for tree i
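A minimal sketch of how one such per-SNP model could be fitted. Several things here are assumptions, not the authors' method: the response is taken to be the number of copies (0, 1 or 2) of one allele per tree, treated as binomial with two trials; only one principal component is used instead of twelve to keep the example short; the fit uses plain gradient ascent; and all data values are hypothetical.

```python
import math


def fit_binomial_logit(X, y, n_copies=2, lr=0.05, n_iter=5000):
    """Fit logit(pi_i) = X_i . beta by gradient ascent on the binomial log-likelihood.

    X: list of predictor rows (first entry 1.0 for the intercept).
    y: allele counts per tree (0..n_copies). Returns the coefficient list.
    """
    k = len(X[0])
    n = len(X)
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        for xi, yi in zip(X, y):
            eta = sum(b * x for b, x in zip(beta, xi))
            pi = 1.0 / (1.0 + math.exp(-eta))
            resid = yi - n_copies * pi  # derivative of the log-likelihood w.r.t. eta
            for j in range(k):
                grad[j] += resid * xi[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


# Hypothetical data for 8 trees: standardised environmental value,
# one principal component of the Q-matrix, and genotype at SNP j.
env = [-1.5, -1.0, -0.5, -0.2, 0.2, 0.5, 1.0, 1.5]
q1 = [0.3, -0.2, 0.1, -0.4, 0.4, -0.1, 0.2, -0.3]
geno = [0, 0, 1, 0, 1, 2, 1, 2]

X = [[1.0, e, q] for e, q in zip(env, q1)]
beta = fit_binomial_logit(X, geno)
print("intercept, ENV, q1 coefficients:", [round(b, 3) for b in beta])
```

The coefficient on ENV is the quantity of interest; the principal component term is there so that an association with the environment is assessed after accounting for neutral population structure.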
Fst Outlier Analysis
Arlequin
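For orientation, a sketch of the quantity that outlier scans are built on: Fst for a single biallelic SNP, computed here as (H_T - H_S) / H_T from population allele frequencies with equal population weights. This is only the textbook definition; the tools named on these slides use their own estimators and null models, which this does not reproduce. The frequencies are hypothetical.

```python
def fst_biallelic(freqs):
    """Fst = (H_T - H_S) / H_T for one biallelic SNP.

    freqs: allele frequency of the same allele in each population
    (populations weighted equally).
    """
    p_bar = sum(freqs) / len(freqs)
    h_t = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0.0:
        return 0.0  # monomorphic SNP: no differentiation to measure
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)  # mean within-population
    return (h_t - h_s) / h_t


# Hypothetical frequencies in two populations.
print(fst_biallelic([0.5, 0.5]))  # identical frequencies
print(fst_biallelic([0.2, 0.8]))  # differentiated populations
```

SNPs whose Fst is unusually high relative to the genome-wide distribution are the candidates an outlier analysis flags.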
Fst Outlier Analysis
BayeScan
[Figure: fst plotted against log10(q value) in three panels, FDR=0.2, FDR=0.05 and FDR=0.001; labelled SNPs include 9, 10, 11 and 66]
[Figure: posterior distributions (density) of alpha for SNP10, SNP11 and SNP66]
HPDI:
SNP10 (Alpha10): [0.68,2.35]
SNP11 (Alpha11): [0.92,2.52]
SNP66 (Alpha66): [0.00,2.20]
Bayesian Environmental Analysis
[Diagram: ancestral allele frequency εl (f_ancestral); population allele frequencies xl1 ... xl5; transformed variables g(θl1) ... g(θl5)]
Drift: f_population deviates from f_ancestral
Gene flow: deviations covary
Transformed variable g(θli)
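A small sketch of the transformation, assuming g is the truncation used in the model of Coop et al. (2010): the unbounded variable θ is mapped to an allele frequency by clipping it to the interval [0, 1]. The input values are hypothetical.

```python
def g(theta):
    """Map the unbounded variable theta to an allele frequency in [0, 1].

    Values below 0 become 0 (allele lost), values above 1 become 1 (allele fixed).
    """
    return min(1.0, max(0.0, theta))


# Hypothetical theta values for five populations at one SNP.
thetas = [-0.2, 0.1, 0.3, 0.8, 1.4]
print([g(t) for t in thetas])
```

Working with θ instead of the frequency itself is what allows the deviations among populations to be modelled as multivariate normal.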
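Bayesian Environmental Analysis
Heat map of var-cov matrix (Coop et al., 2010)
$$\Omega = \begin{pmatrix}
1 & \rho & \rho^2 & \rho^3 & \rho^4 \\
\rho & 1 & \rho & \rho^2 & \rho^3 \\
\rho^2 & \rho & 1 & \rho & \rho^2 \\
\rho^3 & \rho^2 & \rho & 1 & \rho \\
\rho^4 & \rho^3 & \rho^2 & \rho & 1
\end{pmatrix}
\begin{matrix}
\leftarrow \text{pop1} \\ \leftarrow \text{pop2} \\ \leftarrow \text{pop3} \\ \leftarrow \text{pop4} \\ \leftarrow \text{pop5}
\end{matrix}$$
Structure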
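The matrix on this slide has entries ρ^|i−j|: neighbouring populations covary most, and covariance decays with the number of steps between them. A minimal sketch of building it, with ρ = 0.5 chosen arbitrarily for illustration:

```python
def stepping_stone_cov(n_pops, rho):
    """Return the n_pops x n_pops matrix with entries rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n_pops)] for i in range(n_pops)]


# Five populations, hypothetical rho.
omega = stepping_stone_cov(5, 0.5)
for row in omega:
    print(row)
```

In the actual analysis this matrix is estimated from the data, not assumed; the regular pattern here is only the schematic case shown on the slide.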
Bayesian Environmental Analysis
• Selected 1 SNP per gene for var-cov matrix (excluded putative selective genes)
Correlation matrix BayEnv
Pairwise Fst matrix
Bayesian Environmental Analysis
• Formulate null model: drift/gene flow
• Alternative model: drift/gene flow + selection
Null model:
P(θl|Ω, εl) ~ N(εl, εl(1- εl) Ω)
Alternative model:
P(θl|Ω, εl, β) ~ N(εl + βY, εl(1- εl) Ω)
• Bayes Factor (BF): ratio of the posterior probability under the alternative
model to that under the null model
• A high BF is indicative of selection
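A heavily simplified sketch of comparing the two models for one SNP. It is not the MCMC procedure of the actual software: θl and εl are treated as known, the alternative model is averaged over a uniform grid of β values (an assumed prior), and Ω, Y and θl are hypothetical. Ω is taken as 0.05 · ρ^|i−j| so that the drift variance is small.

```python
import math


def mvn_logpdf(x, mean, cov):
    """Log density of a multivariate normal, via a Cholesky factor of cov."""
    n = len(x)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    d = [xi - mi for xi, mi in zip(x, mean)]
    z = []  # solves L z = d by forward substitution
    for i in range(n):
        s = sum(L[i][k] * z[k] for k in range(i))
        z.append((d[i] - s) / L[i][i])
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + sum(v * v for v in z))


def log_lik(theta, eps, omega, beta, env):
    """log N(theta | eps + beta * Y, eps * (1 - eps) * Omega)."""
    mean = [eps + beta * y for y in env]
    scale = eps * (1.0 - eps)
    cov = [[scale * v for v in row] for row in omega]
    return mvn_logpdf(theta, mean, cov)


def bayes_factor(theta, eps, omega, env, beta_grid):
    """Average likelihood over the beta grid, divided by the null (beta = 0)."""
    null = log_lik(theta, eps, omega, 0.0, env)
    ratios = [math.exp(log_lik(theta, eps, omega, b, env) - null) for b in beta_grid]
    return sum(ratios) / len(ratios)


# Hypothetical inputs for five populations.
omega = [[0.05 * 0.5 ** abs(i - j) for j in range(5)] for i in range(5)]
env = [-1.5, -0.5, 0.0, 0.5, 1.5]           # standardised environmental variable Y
beta_grid = [i * 0.05 for i in range(-6, 7)]  # assumed uniform prior support
eps = 0.5

bf_clinal = bayes_factor([0.2, 0.4, 0.5, 0.6, 0.8], eps, omega, env, beta_grid)
bf_flat = bayes_factor([0.5, 0.45, 0.55, 0.5, 0.5], eps, omega, env, beta_grid)
print("clinal SNP BF:", bf_clinal)
print("flat SNP BF:", bf_flat)
```

A SNP whose frequencies track the environmental variable beyond what Ω explains gets a BF well above 1, while one with no such trend is penalised for the extra parameter and falls below 1.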
Bayesian Environmental Analysis