Sweep Commands

To run a command, go to the Temp/ folder and type
    ../Other/Ilya_Other/sweep/scripts/run-sweep <command name> [<arguments>]

AnalyzeCores
LRH Test
    ./run AnalyzeCores [<core options>] [<lrh test options>] [--background <file>] <project> <outfile> (<pop> <chr>)*
Runs the LRH test on all the SNPs of chromosome <chr> genotyped in population <pop>, using the data in <project>. More than one pop/chr pair can be specified; if none are given, reads a tab-separated list of pop/chr pairs from standard input. Outputs its results to <outfile>. Ignores alleles with frequencies < 5% or > 95%.

Core Options (defaults in bold, ‘include’ options have ‘exclude’ analogs):
    --core-window-size N: (e.g., N = 1000000) Maximum number of bases to examine in each direction from the core. (Normally other criteria will be limiting, e.g. EHH will drop below the 0.04 threshold, but this hard limit is set just in case.)
    --include-mono-core-snps: (default: true) Include monomorphic SNPs in cores.
    --include-multi-core-snps: (default: false) Include multi-allelic SNPs in cores (incompatible with Gabriel et al. core selection).
    --include-mono-side-snps: (default: true) Include monomorphic SNPs on both sides of the core in the analysis.
    --include-multi-side-snps: (default: true) Include multi-allelic SNPs on both sides of the core in the analysis.
    --core-size MIN-MAX or --core-size N: (e.g., --core-size 3-10) Core size limits.
    --match-side-snp-density N bases/kb/mb: (default: don’t match) Pick side SNPs at a density of 1 every N bases/kb/mb.
    --match-side-snp-density X cM: (default: don’t match) Pick side SNPs at a density of 1 every X cM.
    --dont-match-side-snp-density: (default: don’t match) Don't filter side SNPs to attain a fixed density.

LRH Test Options (defaults in bold):
    --match-markers-at (AllEHH X | distance X (bases | Kb | Mb | cM) | EHHBar X): (e.g. --match-markers-at AllEHH 0.04) For each core allele, select representative markers where AllEHH is approximately X, where the SNP lies X bases/Kb/Mb/cM from the core, or where EHHBar is approximately X. The actual marker's AllEHH/distance/EHHBar has to be within 25% of X.

Other options:
    --background <file>: Annotate ln(EHH) and ln(REHH) scores with significance using the given background file (see below).

CalcLRHBackground
LRH Test Background
    ./run CalcLRHBackground <bin_count> <lrh_file_1> ... <lrh_file_N> <out_background_file>
Measures mean and std. dev. of ln(EHH) and ln(REHH) scores in the given <lrh_files>, grouped into <bin_count> allele frequency bins, and outputs the results into <out_background_file>. Excludes alleles that have REHH of 0 or 100, since these are unreliable.

CalcLRHSignificance
LRH Test Significance Annotator
    ./run CalcLRHSignificance <background_file> <lrh_file_1> ... <lrh_file_N> <out_sig_file>
Rescales all the ln(EHH) and ln(REHH) scores in the <lrh_files> according to the background file so as to have zero mean and unit variance within small allele frequency bins. Outputs the results to <out_sig_file>.

iHS
iHS Test
    ./run iHS [<ihs test options>] <project> <outfile> (<pop> <chr>)*
Runs the iHS test on all the biallelic SNPs with ancestral data (except if --use-minor-allele-freq is used) of chromosome <chr> genotyped in population <pop>, using the data in <project>. More than one pop/chr pair can be specified; if none are given, reads a tab-separated list of pop/chr pairs from standard input. Outputs its results to <outfile>. Ignores core SNPs with frequencies < 5% or > 95%. The iHH integral centered on each SNP is computed in three ways: from the SNP to the left side, from the SNP to the right side, and from the SNP extending in both directions.

Options (defaults in bold):
    --integrate-to EHH X: (e.g. --integrate-to EHH 0.05) Set upper bound of iHH integral (both ancestral and derived) to the point where the EHH drops to X.
    --integrate-from AllEHH X: (e.g. --integrate-from AllEHH 1.0) Set lower bound of iHH integral (both ancestral and derived) to the point where AllEHH is X.
    --integrate-wrt cM | bases: Set the “x”-axis with respect to which to integrate.
    --allow-integrate-to-edge or --disallow-integrate-to-edge: If EHH doesn’t drop to the level specified by --integrate-to, simply integrate as far out as possible; this is useful for simulations, where the 1MB simulation window size is comparable to the length over which EHH decays to 0.05.
    --use-ancestral-allele-freq: Report the frequency of the ancestral allele; the frequency is used by CalcIHSBackground and CalcIHSSignificance to see what bin to place data in.
    --use-minor-allele-freq: Report the frequency of the minor allele; the frequency is used by CalcIHSBackground and CalcIHSSignificance to see what bin to place data in. iHH_A refers to the minor allele, while iHH_D refers to the major one. Core SNPs without ancestral data can thus be assigned iHS values.

CalcIHSBackground
iHS Test Background
    ./run CalcIHSBackground [--one-sided | --two-sided] <bin_count> <ihs_file_1> ... <ihs_file_N> <out_background_file>
Measures mean and std. dev. of either one-sided or two-sided unstandardised iHS scores in the given <ihs_files>, grouped into <bin_count> allele frequency bins, and outputs the results into <out_background_file>.

CalcIHSSignificance
iHS Test Significance Annotator
    ./run CalcIHSSignificance <background_file> <ihs_file_1> ... <ihs_file_N> <out_sig_file>
Rescales all the unstandardised iHS scores in the <ihs_files> according to the background file so as to have zero mean and unit variance within small allele frequency bins. Outputs the results to <out_sig_file>.

CrossPopAllEHH
XPop Test
    ./run CrossPopAllEHH [<options>] <project> <outfile> (<pop1> <pop2> <chr>)*
Runs the XPop test on all the SNPs of chromosome <chr> genotyped in both populations <pop1> and <pop2>, using the data in <project>.
More than one pop1/pop2/chr trio can be specified; if none are given, reads a tab-separated list of pop1/pop2/chr trios from standard input. Outputs its results to <outfile>. Ignores alleles with frequencies < 5% or > 95%.

Options:
    --extends-to AllEHH X: (e.g. --extends-to AllEHH 0.04) Set the right bound of integration to the point where the pop1+pop2 AllEHH drops to X.
    --integrate-wrt cM | distance | delta: Set the “x”-axis with respect to which to integrate; the ‘delta’ option stands for ‘instead of integrating EHH, simply report the value of EHH for each population at the integration right bound’.

CalcXPopBackground
XPop Test Background
    ./run CalcXPopBackground <xpop_file_1> ... <xpop_file_N> <out_background_file>
Measures mean and std. dev. of the AllEHH integral log-ratios in the given <xpop_files>. Outputs the results into <out_background_file>.

CalcXPopSignificance
XPop Test Significance Annotator
    ./run CalcXPopSignificance <background_file> <xpop_file_1> ... <xpop_file_N> <out_sig_file>
Rescales all the AllEHH integral log-ratios in the <xpop_files> according to the background file so as to have zero mean and unit variance within small allele frequency bins. Outputs the results to <out_sig_file>.

ExportGenes
Find the RefSeq genes in certain genome windows
    ./run ExportGenes <project> <species> <baseIndiv> <pos_file>
Using the species-wide data in <project> (or downloading the data from the UCSC genome website if missing), finds the genes of <species> and <baseIndiv> (e.g. “human” and “hg17”) in the given genome regions. The <pos_file> should have a header line followed by tab-separated lines. The first three columns should be Chromosome / Start / Stop. The program writes its results in an extra final column called “Genes in region”.
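
The <pos_file> layout described above can be produced by hand or by a small script. The following is a minimal Python sketch, not part of Sweep itself. It assumes that the header labels are literally "Chromosome", "Start" and "Stop" and that chromosomes are written as bare numbers; the region coordinates and the function name format_pos_file are hypothetical and serve only as illustration.

```python
import csv
import io

# Hypothetical regions as (chromosome, start, stop). The coordinates are
# illustrative only, and the chromosome naming Sweep expects is an assumption.
REGIONS = [("2", 136000000, 137000000), ("15", 46000000, 46500000)]


def format_pos_file(regions):
    """Return pos_file text: a header line, then one tab-separated line per region."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    # Assumed header labels, taken from the column description in the text.
    writer.writerow(["Chromosome", "Start", "Stop"])
    for chrom, start, stop in regions:
        if start > stop:
            raise ValueError(f"start {start} exceeds stop {stop}")
        writer.writerow([chrom, start, stop])
    return buf.getvalue()


if __name__ == "__main__":
    # Print to standard output so the result can be redirected into a file.
    print(format_pos_file(REGIONS), end="")
```

Redirecting the script's output to a file gives something that could be passed as <pos_file>; whether ExportGenes accepts additional columns or other chromosome labels is not stated here and would need checking against the tool.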

ExportSnpPhase
Export the SNP data of a certain genome region in Sweep1’s .snp and .phase format
    ./run ExportSnpPhase <project> <pop> <chr> <minPos> <maxPos>
Exports the genotyped data for <pop> stored in <project>, in the specified region, to the files “region.snp” and “region.phase”.

ExportLRHFor
Export EHH / EHHBar / REHH decay curves for a particular SNP
    ./run ExportLRHFor <project> <pop> <chr> <core_pos>
Export the EHH / EHHBar / REHH decay curves extending to 1MB on both sides for the SNP in chromosome <chr>, position <core_pos>, using the genotype data for population <pop> stored in <project>.

CreateCompoundPopulation
DumpMarkerH
ExtractData
Import
ImportHapMapAncestral
ImportSteveSims
ListCores
LocateSelection
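
As a closing illustration of the background and significance commands documented earlier (CalcLRHBackground with CalcLRHSignificance, and their iHS analogs), the sketch below shows one way to rescale scores to zero mean and unit variance within allele frequency bins. It is not Sweep's code: the equal-width binning, the use of the population standard deviation, and the names bin_index, compute_background and standardise are all assumptions made for the example, and the real background file format is not reproduced.

```python
import math
from collections import defaultdict


def bin_index(freq, bin_count):
    """Map an allele frequency in [0, 1] to one of bin_count equal-width bins (assumed scheme)."""
    return min(int(freq * bin_count), bin_count - 1)


def compute_background(records, bin_count):
    """Given (allele_frequency, score) pairs, return {bin: (mean, std)} per frequency bin."""
    grouped = defaultdict(list)
    for freq, score in records:
        grouped[bin_index(freq, bin_count)].append(score)
    background = {}
    for b, scores in grouped.items():
        mean = sum(scores) / len(scores)
        # Population variance (divide by n); an assumption, the tool may differ.
        var = sum((s - mean) ** 2 for s in scores) / len(scores)
        background[b] = (mean, math.sqrt(var))
    return background


def standardise(records, background, bin_count):
    """Rescale each score by the mean and std of its own frequency bin."""
    rescaled = []
    for freq, score in records:
        mean, std = background[bin_index(freq, bin_count)]
        z = (score - mean) / std if std > 0 else float("nan")
        rescaled.append((freq, z))
    return rescaled


if __name__ == "__main__":
    # Illustrative scores only, not real LRH or iHS output.
    data = [(0.10, 1.0), (0.15, 3.0), (0.80, 5.0), (0.85, 9.0)]
    bg = compute_background(data, bin_count=2)
    for freq, z in standardise(data, bg, bin_count=2):
        print(f"{freq:.2f}\t{z:+.2f}")
```

Binning by allele frequency before rescaling means a score is compared only against scores from alleles of similar frequency, which is the grouping that <bin_count> controls in the background commands.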