Carsten Rosenow, Ph.D.
Linkage Analysis Programs
PROGRAMS FOR LINKAGE ANALYSIS: dChipSNP
1. Generating the Pedigree file:
• For linkage analysis, a pedigree file must be generated. This file contains
all the information needed to reconstruct the relationships between the
individuals in a family. It contains a minimum of five items (columns A-E)
plus the phenotype information (column F):
1. Family identifier (numeric, column A). This identifier is the
same for every member of a family, including spouses.
2. Individual identifier (numeric, column B). Each individual in
a pedigree is assigned a unique ID.
3. Father ID (numeric, column C). This column contains the
individual ID of this person's father. If the father is
unknown, put zero in this column (in this case the person is a
founder). [dChip requires a founder to have both parents
missing. If only one parent is missing, add that parent as
an individual whose own parents are both 0 and whose data
are missing for all markers.]
4. Mother ID (numeric, column D). This column contains the
individual ID of this person's mother. If the mother is
unknown, put zero in this column.
5. Sex (numeric, column E). Put the sex of the person in this
column (1: male; 2: female).
6. Phenotype information. The next columns contain information
on phenotypes for discrete and quantitative traits. Disease
status is usually encoded in column F with 1: unaffected; 2:
affected; 0: missing phenotype. Quantitative traits can be
added in the following columns using numerical values. [dChip
accepts only one "Affected" column and no quantitative traits.]
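The column rules above can be checked before the file is loaded into dChip. The following is a minimal sketch, not part of dChip or GDAS: the function name check_pedigree and the example family are hypothetical, and the check that every non-zero parent ID has a row of its own is an assumption drawn from the bracketed dChip note in item 3.

```python
def check_pedigree(rows):
    """Return a list of problems found in pedigree rows.

    Each row is (family, individual, father, mother, sex, affected),
    i.e. columns A-F as described above, all given as integers.
    """
    problems = []
    ids = {(fam, ind) for fam, ind, *_ in rows}
    if len(ids) != len(rows):
        problems.append("duplicate individual ID within a family")
    for fam, ind, father, mother, sex, affected in rows:
        who = f"family {fam}, individual {ind}"
        # dChip requires a founder to have BOTH parents missing (0 and 0).
        if (father == 0) != (mother == 0):
            problems.append(f"{who}: only one parent is missing")
        # Assumption: every named parent must appear as an individual.
        for parent in (father, mother):
            if parent != 0 and (fam, parent) not in ids:
                problems.append(f"{who}: parent {parent} has no row of their own")
        if sex not in (1, 2):
            problems.append(f"{who}: sex must be 1 (male) or 2 (female)")
        if affected not in (0, 1, 2):
            problems.append(f"{who}: affected status must be 0, 1 or 2")
    return problems


# Hypothetical nuclear family: two unaffected founders and one affected son.
example = [
    (1, 1, 0, 0, 1, 1),
    (1, 2, 0, 0, 2, 1),
    (1, 3, 1, 2, 1, 2),
]
print(check_pedigree(example))  # prints []
```

An empty list means that none of the rules listed above were violated; it does not guarantee that dChip will accept the file.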
• Save this file as a pedigree file (extension .ped):
i. > File > Save as > Save as type > Text (Tab delimited) > file name
XXX.ped > Save
2. Generating the GDAS output:
• Analyze the genotype data from the Mapping 10K array using GCOS/GDAS.
• Export ONLY the Affy SNP ID column and the data columns. Save as a
tab-delimited text file.
• Export the GDAS table into your experiment folder in Excel format.
• Open the file in Excel.
• Select all column headers that contain the names of the experiment files.
• > Edit > Copy
• Save the GDAS file in text format:
Confidential
Page 1
2/15/2016
> File > Save as > Save as type > text (Tab delimited) > file
name XXX.ped > save
• Open the pedigree file that was generated earlier.
• Paste the headers of the genotype columns as the 6th column of the pedigree
file. Each individual in the pedigree file now has a genotype column assigned
that links them to the GDAS file.
> Edit > Paste Special > check Transpose > click OK
MAKE SURE EACH PERSON IN THE PEDIGREE FILE IS
ASSOCIATED WITH THE RIGHT COLUMN IN THE GDAS
OUTPUT. ANY PERSON WHO HAS NO GENOTYPE DATA
SHOULD HAVE THE ENTRY "NA" IN THIS COLUMN.
• Save the pedigree file in text format:
> File > Save as > Save as type > Text (Tab delimited) > file name
XXX.ped > Save
Summary: You should now have two files. One is the pedigree file, which holds all
the family information (this file can contain multiple families). The other is the
GDAS file, which contains the genotype information. The two files are linked by a
common key: the name of your experiment file, which ties each person in the
pedigree file to that person's genotype column in the GDAS file. Both files need
to be in tab-delimited text format.
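Since a mismatched label silently breaks the link between the two files, it can help to check the labels before starting dChip. The sketch below is an illustration only: the function name unmatched_labels, the header text "SNP ID" and the experiment names are hypothetical, and it assumes that the genotype label is the last column of each pedigree row and that the first GDAS column holds the SNP identifier.

```python
import csv
import io


def unmatched_labels(ped_text, gdas_text):
    """Return pedigree genotype labels that are missing from the GDAS header.

    Both arguments are the contents of tab-delimited text files.
    """
    gdas_header = next(csv.reader(io.StringIO(gdas_text), delimiter="\t"))
    available = set(gdas_header[1:])  # assumes column 1 is the SNP ID
    missing = []
    for row in csv.reader(io.StringIO(ped_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        label = row[-1]  # assumes the genotype label is the last column
        # "NA" marks a person without genotype data, so it needs no match.
        if label != "NA" and label not in available:
            missing.append(label)
    return missing


# Hypothetical file contents; real files would be read from disk.
ped = (
    "1\t1\t0\t0\t1\t1\tNA\n"
    "1\t2\t0\t0\t2\t1\tmother_chip\n"
    "1\t3\t1\t2\t1\t2\tson_chip\n"
)
gdas = "SNP ID\tmother_chip\tchild_chip\nSNP_A\tAB\tAA\n"
print(unmatched_labels(ped, gdas))  # prints ['son_chip']
```

Here the label son_chip is reported because the GDAS header calls that experiment child_chip; the father is skipped because his entry is NA.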
3. dChipSNP
The following files need to be located in the dChipSNP folder. All of these files come
with the program.
• SNP Genome Information File
o 11_k_snp_genome_info_hg15_AFAMfreq.xls: This file is used for
African American populations. It contains the allele frequencies
for this population group.
o 11_k_snp_genome_info_hg15_asian.xls: This file is used for
Asian populations. It contains the allele frequencies for this
population group.
o 11_k_snp_genome_info_hg15_Caufreq.xls: This file is used for
Caucasian populations. It contains the allele frequencies for this
population group.
• cytoBand hg15.txt: Contains cytoband information
• hu refGene hg15.xls: Contains gene annotation information
In addition, you need the GDAS data file (with the genotype information and
SNP identifiers) and the pedigree file.
Once all of these files are in the folder, you can start using the dChipSNP
program. Double-click the dChip.exe icon in the folder to open the user
interface. The steps below walk through a linkage analysis.
• Analysis > Get External Data > fill in the name of the study (group name) >
select the Data file (GDAS file) > select the SNP tab > click OK
• The screen shows general information about your experiment. Make sure
all information is correct before you continue.
• Analysis > Chromosome > Genome information file: select the genome info
file > refGene file (optional): if you want gene information in your output,
select the refGene file > Cytoband file (optional): if you want Cytoband
information in your file select the Cytoband file > Analysis method: Linkage
analysis > click ok
• The program now reads all the information and generates the Chromosome
view.
• Chromosome > Linkage analysis > select the pedigree file > select dominant
or recessive disease based on your disease model > if you want simulation,
select the simulation tab > if you want to run a Mendelian error check, select
the Detect Mendelian inheritance error tab > for sib-pair analysis, select the
sib-pair tab > click OK
[Sib-pair is an experimental function that analyzes every pair of siblings by
parametric linkage analysis. It is not the general sib-pair analysis, so it is
better not to mention it here.]
• If one chromosome is selected in the Chromosome view, the LOD scores for
that chromosome are calculated. If you check Chromosome > Show all, the LOD
scores for all chromosomes are calculated. This might take a while, depending
on the size of your pedigree. The maximum pedigree size is about 17 bits,
based on the formula 2N - F < 18, where N is the number of non-founders and
F is the number of founders.
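The size limit can be estimated from the pedigree file before a long run is started. The following is a rough sketch under two assumptions that the text does not state explicitly: the limit is applied to each family separately, and a founder is an individual whose father and mother IDs are both 0. The function name pedigree_bits and the example family are hypothetical.

```python
from collections import defaultdict


def pedigree_bits(rows):
    """Return {family_id: 2N - F} for rows of (family, individual, father, mother).

    N counts non-founders and F counts founders, where a founder is an
    individual whose father and mother IDs are both 0.
    """
    founders = defaultdict(int)
    non_founders = defaultdict(int)
    for fam, _ind, father, mother in rows:
        if father == 0 and mother == 0:
            founders[fam] += 1
        else:
            non_founders[fam] += 1
    families = set(founders) | set(non_founders)
    return {fam: 2 * non_founders[fam] - founders[fam] for fam in families}


# Hypothetical family: two founders and three children, so 2*3 - 2 = 4 bits.
rows = [
    (1, 1, 0, 0),
    (1, 2, 0, 0),
    (1, 3, 1, 2),
    (1, 4, 1, 2),
    (1, 5, 1, 2),
]
bits = pedigree_bits(rows)
print(bits)  # prints {1: 4}
print(all(b < 18 for b in bits.values()))  # prints True
```

A family whose value reaches 18 or more would exceed the limit quoted above.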
CONGRATULATIONS
You have just finished your first linkage analysis using dChipSNP. Please familiarize
yourself with additional features in the program and different visualization options.