
PBG/MCB 622
11/26/2012 Exercise
Association Mapping using TASSEL
Open TASSEL and load the genotypic and phenotypic data by going to Data -> Load and choosing
“I will make my best guess and try”.
Calculation of Linkage Disequilibrium:
We will calculate LD for chromosome 1 only.
First we will filter our data set, keeping only markers on chromosome 1 with a minor allele
frequency above 5% and less than 5% missing data.
Select the genotypic data (“Genotype_AM”), go to Data->Sites, and set:
• Minimum count: 97
• Minimum frequency: 0.05
• Start Position: 0
• End Position: 236
• Check “Remove minor SNP states”
• Click Filter
Select the filtered data set and go to Analysis->Linkage disequilibrium
You can plot the results by selecting the output LD file and going to Results->LD plot
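To make the filter settings and the LD statistic concrete, here is a minimal Python sketch. It is not TASSEL's own code. It assumes inbred lines coded 0/1 at each site with `nan` for missing calls; the function names `filter_sites` and `r_squared` and the toy matrix are hypothetical. The statistic shown is the standard squared allelic correlation, r² = D² / (p_A (1 - p_A) p_B (1 - p_B)); how TASSEL treats values exactly at a threshold is an assumption here.

```python
import numpy as np

def filter_sites(geno, min_count, min_freq):
    """Keep sites with enough non-missing calls and a high enough minor allele frequency.

    geno: taxa x sites array of 0/1 allele codes, np.nan for missing calls.
    Returns the filtered matrix and the boolean mask of kept sites.
    """
    called = ~np.isnan(geno)
    count = called.sum(axis=0)                               # non-missing calls per site
    freq1 = np.nansum(geno, axis=0) / np.maximum(count, 1)   # frequency of allele "1"
    maf = np.minimum(freq1, 1.0 - freq1)                     # minor allele frequency
    keep = (count >= min_count) & (maf >= min_freq)          # ">=" is an assumption
    return geno[:, keep], keep

def r_squared(a, b):
    """r^2 between two sites, using only taxa called at both sites."""
    ok = ~np.isnan(a) & ~np.isnan(b)
    a, b = a[ok], b[ok]
    p_a, p_b = a.mean(), b.mean()
    d = (a * b).mean() - p_a * p_b                           # D = p_AB - p_A * p_B
    return d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Toy data: 6 taxa x 3 sites; the third site is monomorphic with one missing call.
geno = np.array([
    [0, 0, 0],
    [0, 0, 0],
    [1, 1, 0],
    [1, 1, np.nan],
    [1, 0, 0],
    [0, 1, 0],
], dtype=float)

filtered, keep = filter_sites(geno, min_count=5, min_freq=0.05)
ld = r_squared(filtered[:, 0], filtered[:, 1])
print(keep, ld)
```

In the exercise itself the thresholds would be the ones listed above (Minimum count 97, Minimum frequency 0.05); the toy matrix uses a smaller count only because it has six taxa.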
Calculation of Population Structure Using Principal Components Analysis (Q Matrix)
Make sure TASSEL is in Data mode. Highlight the genotype and click Sites. Set the Minimum
frequency to 0.05 and the Minimum count to 97, and check “Remove minor SNP states”. Click Filter.
Numericalization: Highlight the filtered genotype and click Transform. Use the default option of
“Collapse non major alleles.” Click Create data set.
Imputation of missing values: Highlight the numerical genotype, click Transform, and then
click the Impute tab. Use the default options. Click Create data set.
PCA: Highlight the imputed numerical genotype, click Transform, and then click the PCA tab. Change
the default setting to three components: choose Components and type 3 in the text box. Click
Create data set.
You can plot the results by selecting the output PCA file and going to Results->Chart
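The impute-then-PCA steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reproduction of TASSEL's output: the genotype is assumed to be already numeric (0/1 per site, `nan` for missing), missing values are filled with the site mean as a simple stand-in (TASSEL's default imputation method may differ), and the names `impute_site_mean` and `pca_scores` are hypothetical.

```python
import numpy as np

def impute_site_mean(x):
    """Replace each missing value with the mean of its site (column)."""
    x = x.copy()
    col_mean = np.nanmean(x, axis=0)
    rows, cols = np.where(np.isnan(x))
    x[rows, cols] = col_mean[cols]
    return x

def pca_scores(x, n_components=3):
    """Principal component scores of the taxa, via SVD of the column-centered matrix."""
    centered = x - x.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]   # one row per taxon
    explained = s ** 2 / np.sum(s ** 2)               # proportion of variance per component
    return scores, explained[:n_components]

# Toy numeric genotype: 10 taxa x 6 sites with two missing calls.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=(10, 6)).astype(float)
x[0, 1] = np.nan
x[3, 4] = np.nan

imputed = impute_site_mean(x)
scores, explained = pca_scores(imputed, n_components=3)
print(scores.shape, explained)
```

The three score columns play the role of the Q matrix covariates used later in the Q and Q+K models.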
Calculation of the Kinship Matrix (K Matrix) using SNP data
Remove monomorphic sites: Highlight the genotype and click Sites in Data mode. Set the
threshold on MAF to 0.05 and the Minimum count to 97, check “Remove minor SNP states,” then click
Filter.
Estimate kinship: Highlight the filtered genotype and click Kinship in Data mode. A kinship matrix
will be added to the data tree under the Matrix category.
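As a rough picture of what a SNP-based kinship matrix contains, the sketch below computes the proportion of sites at which each pair of taxa carries the same allele (a simple identity-by-state similarity). This is only one of several possible estimators and is an assumption for illustration; TASSEL's Kinship function may use a different formula and scaling. The function name `ibs_kinship` and the 0/1 coding with `nan` for missing calls are hypothetical.

```python
import numpy as np

def ibs_kinship(geno):
    """Taxa x taxa proportion of shared alleles.

    geno: taxa x sites array of 0/1 allele codes, np.nan for missing calls.
    Only sites called in both taxa contribute to each pairwise value.
    """
    called = ~np.isnan(geno)
    c = called.astype(float)
    ones = np.where(called, geno, 0.0)       # 1 where the call is allele "1"
    zeros = (1.0 - ones) * c                 # 1 where the call is allele "0"
    both_called = c @ c.T                    # sites called in both taxa
    shared = ones @ ones.T + zeros @ zeros.T # sites where both carry the same allele
    return shared / np.maximum(both_called, 1)

geno = np.array([
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 0, np.nan, 1],
], dtype=float)

k = ibs_kinship(geno)
print(k)
```

The result is symmetric with ones on the diagonal, and larger off-diagonal values indicate more closely related lines.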
Association analysis using GLM (Least squares fixed effects linear model)
1) Naïve Model: Flowering time = Marker effect + residual
Remove monomorphic sites: Highlight the genotype and click Sites in Data mode. Set the
threshold on MAF to 0.05 and the Minimum count to 97, then click Filter.
Joining data: Highlight the two data sets (filtered genotype and phenotype) by holding the
Control key while selecting each one. Then click Intersection (∩) Join in Data mode to
create a combined data set.
Association analysis: Highlight the joint data set, then click GLM in Analysis mode to perform
association analysis. Two reports will be added to the data tree.
Visualize results by selecting the “GLM_Marker_test” results file and clicking Results->Manhattan Plot
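The naïve model above amounts to one least-squares regression per marker. A minimal Python sketch of that scan follows; it is not TASSEL's implementation. It assumes 0/1 marker codes with no missing values, and the function name `marker_f_scan` and the toy phenotype are hypothetical. It returns F statistics and their degrees of freedom; a p-value for each marker would come from the F distribution with those degrees of freedom.

```python
import numpy as np

def marker_f_scan(y, geno):
    """F statistic per site for the model y = mean + marker effect + residual.

    y: length-n phenotype vector.
    geno: n x m array of 0/1 marker codes (no missing values in this sketch).
    Returns (f_stats, df_marker, df_resid).
    """
    n, m = geno.shape
    ss_total = np.sum((y - y.mean()) ** 2)        # residual SS of the mean-only model
    f_stats = np.empty(m)
    for j in range(m):
        design = np.column_stack([np.ones(n), geno[:, j]])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        ss_resid = np.sum((y - design @ beta) ** 2)
        f_stats[j] = (ss_total - ss_resid) / (ss_resid / (n - 2))
    return f_stats, 1, n - 2

# Toy data: the first marker separates early and late lines, the second does not.
y = np.array([10.0, 11.0, 12.0, 20.0, 21.0, 22.0])
geno = np.array([
    [0, 0],
    [0, 1],
    [0, 0],
    [1, 1],
    [1, 0],
    [1, 1],
], dtype=float)

f_stats, df_marker, df_resid = marker_f_scan(y, geno)
print(f_stats, df_marker, df_resid)
```

Plotting the per-marker significance against marker position gives the kind of picture shown by the Manhattan plot.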
2) Q Model: Flowering time = Population Structure + Marker effect + residual
Remove monomorphic sites: Highlight the genotype and click Sites in Data mode. Set the
threshold on MAF to 0.05 and the Minimum count to 97, then click Filter.
Joining data: Highlight the three data sets (the PCA output “PC_genotype_for_AM”, the filtered
genotype, and the phenotype) by holding the Control key while selecting each one. Then click
Intersection (∩) Join in Data mode to create a combined data set.
Association analysis: Highlight the joint data set, then click GLM in Analysis mode to perform
association analysis. Two reports will be added to the data tree.
Visualize results by selecting the “GLM_Marker_test” results file and clicking Results->Manhattan Plot
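To see why adding population structure matters, the following Python sketch tests a marker after first fitting the Q covariates (a partial F test between nested least-squares models). It is an illustration, not TASSEL's code; the function name `marker_f_given_q`, the helper `_rss`, and the toy data are hypothetical. In the toy data the phenotype is driven by a single structure covariate, and the marker is merely correlated with that covariate.

```python
import numpy as np

def _rss(design, y):
    """Residual sum of squares of a least-squares fit."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return np.sum((y - design @ beta) ** 2)

def marker_f_given_q(y, marker, q):
    """Partial F statistic for a marker after fitting the mean and the columns of q.

    q: n x k matrix of structure covariates (k may be 0 for the naive model).
    Returns (f_stat, df_marker, df_resid).
    """
    n = len(y)
    reduced = np.column_stack([np.ones(n), q])      # mean + population structure
    full = np.column_stack([reduced, marker])       # ... + marker effect
    rss_reduced, rss_full = _rss(reduced, y), _rss(full, y)
    df_resid = n - full.shape[1]
    return (rss_reduced - rss_full) / (rss_full / df_resid), 1, df_resid

# Two subpopulations (q = -1 or +1); the phenotype follows q, not the marker.
q = np.array([-1, -1, -1, -1, 1, 1, 1, 1], dtype=float).reshape(-1, 1)
marker = np.array([0, 0, 0, 0, 1, 1, 1, 0], dtype=float)
y = np.array([-4.9, -5.2, -4.7, -5.1, 5.2, 4.7, 5.1, 4.9])

f_naive, _, _ = marker_f_given_q(y, marker, np.empty((len(y), 0)))
f_q, _, df_resid = marker_f_given_q(y, marker, q)
print(f_naive, f_q)
```

Without the covariate the marker looks strongly associated; once the structure term is in the model its F statistic drops to well below 1, which is the kind of spurious signal the Q model is meant to remove.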
Association analysis using MLM (Mixed linear model: Individuals and the residual
are fit as random effects. The other terms are treated as fixed effects.)
3) Q+K Model: Flowering time = Population Structure + Marker effect + Individuals + residual
Association analysis: Highlight both the joint data set created for the Q model
(phenotype + genotype + PCA) and the kinship matrix, holding the Control key while selecting them.
Click MLM in Analysis mode and accept all the default settings.
Visualize results by selecting the marker statistics results file produced by MLM and clicking
Results->Manhattan Plot
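The Q+K model can be written compactly in matrix form. The notation below is a common way of writing a mixed linear model and is an assumption for illustration (the symbols do not come from the handout): y is the vector of flowering times, Q holds the principal components, s is the numeric code of the marker being tested, Z assigns observations to individuals, and K is the kinship matrix. Some formulations scale K by a factor of 2.

```latex
\begin{aligned}
\mathbf{y} &= \mathbf{1}\mu + \mathbf{Q}\boldsymbol{\nu} + \mathbf{s}\,\alpha + \mathbf{Z}\mathbf{u} + \mathbf{e}, \\
\mathbf{u} &\sim N\!\left(\mathbf{0},\ \sigma_g^{2}\,\mathbf{K}\right), \qquad
\mathbf{e} \sim N\!\left(\mathbf{0},\ \sigma_e^{2}\,\mathbf{I}\right).
\end{aligned}
```

Here the mean, the structure effects and the marker effect (μ, ν, α) are fixed effects, while the individual effects u and the residual e are random, matching the description in the section heading. Dropping the Zu term gives the Q model, and dropping both Zu and Qν gives the naïve model.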