GEIRA: Gene-Environment and Gene-Gene Interaction Research Application
R version 1.0
User’s manual

Henrik Källberg
Institute of Environmental Medicine
Karolinska Institutet
171 77 Stockholm, Sweden
June 1, 2010

Freely available for download at: http://www.epinet.se

GEIRAD    Gene Environmental Interaction Research Application Dominant model

Description

This is an application for automatically calculating interaction effects on extensive genome-wide or dense gene data. The application can be used to calculate gene-environment or gene-gene interaction. Both additive and multiplicative interaction are calculated, with options for the dominant genetic model. The attributable proportion due to interaction is calculated as a measure of additive interaction. The underlying measure is the odds ratio, which is calculated through generalized linear models with a logistic link function.

Usage

GEIRAD(dname, dname2, dname3, start, stop, covar, envi, FILE, id)

Arguments

dname     Location, name and format of the tped data file containing the SNP information (tped.txt-file).
dname2    Location, name and format of the tfam data file containing the disease and id information (tfam-file).
dname3    Location, name and format of the data file containing the covariate and environmental factor information (tfam-file).
start     The first column (variable) at which the calculations start.
stop      The number of variables, counted from the end, that are rejected (not included in the calculations).
covar     Name of the covariate.
envi      Name of the interaction variable.
FILE      Location and name of the output data file.
id        Name of the id variable.

Details

It is important to supply the data sets in the order given in the function. If a different order is used, the wrong data set is used for the calculations and an error message results.
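As a rough illustration of the measures named in the Description, the sketch below fits a logistic regression to simulated data and derives the three odds ratios, the attributable proportion due to interaction (AP) and the p-value of the multiplicative interaction term. It is not GEIRA's own code: the simulated data, the variable names (risk, envi, oi, io, ii) and the helper ap_from_or are assumptions made for this example, and the confidence limits and p-value for AP that GEIRAD reports are left out.

```r
# Toy data for illustration only; none of this is GEIRA's own code.
set.seed(1)
n <- 2000
risk <- rbinom(n, 1, 0.4)  # 1 = carrier of the risk allele (dominant coding)
envi <- rbinom(n, 1, 0.3)  # 1 = exposed to the environmental factor
case <- rbinom(n, 1, plogis(-1.5 + 0.4 * risk + 0.5 * envi + 0.6 * risk * envi))

# Indicator variables; individuals with neither factor form the reference group.
d <- data.frame(
  case = case, risk = risk, envi = envi,
  oi = as.integer(risk == 1 & envi == 0),  # risk allele only
  io = as.integer(risk == 0 & envi == 1),  # envi factor only
  ii = as.integer(risk == 1 & envi == 1)   # both
)

# Odds ratios with Wald 95% confidence limits from a logistic regression.
fit <- glm(case ~ oi + io + ii, family = binomial(link = "logit"), data = d)
or <- exp(coef(fit))[c("oi", "io", "ii")]
or_ci <- exp(confint.default(fit))[c("oi", "io", "ii"), ]

# Attributable proportion due to interaction:
# AP = (ORii - ORio - ORoi + 1) / ORii
ap_from_or <- function(or_ii, or_io, or_oi) (or_ii - or_io - or_oi + 1) / or_ii
ap <- unname(ap_from_or(or["ii"], or["io"], or["oi"]))

# Multiplicative interaction: p-value of the product term.
fit_mult <- glm(case ~ risk * envi, family = binomial(link = "logit"), data = d)
mult_p <- coef(summary(fit_mult))["risk:envi", "Pr(>|z|)"]
```

With odds ratios of 4 for the combination and 2 for each factor alone, ap_from_or(4, 2, 2) gives 0.25, meaning a quarter of the effect in the doubly exposed group would be attributed to the interaction.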

Value

chr             The chromosome the SNP belongs to.
snp             The rs number of the SNP.
ORii            The odds ratio for the combination of risk allele and the envi factor.
ORiiL           The lower limit of the 95% confidence interval for the odds ratio for the combination of risk allele and the envi factor.
ORiiH           The upper limit of the 95% confidence interval for the odds ratio for the combination of risk allele and the envi factor.
ORio            The odds ratio for the envi factor by itself.
ORioL           The lower limit of the 95% confidence interval for the odds ratio for the envi factor by itself.
ORioH           The upper limit of the 95% confidence interval for the odds ratio for the envi factor by itself.
ORoi            The odds ratio for the risk allele by itself.
ORoiL           The lower limit of the 95% confidence interval for the odds ratio for the risk allele by itself.
ORoiH           The upper limit of the 95% confidence interval for the odds ratio for the risk allele by itself.
AP              The attributable proportion due to interaction.
AP.low          The lower limit of the 95% confidence interval for the attributable proportion due to interaction.
AP.high         The upper limit of the 95% confidence interval for the attributable proportion due to interaction.
AP.p            The p-value for the attributable proportion due to interaction.
Mult            The p-value for the multiplicative interaction coefficient.
Ind10_1         Number of cases with risk allele and no envi exposure.
Ind10_0         Number of controls with risk allele and no envi exposure.
Ind01_1         Number of cases with envi exposure and no risk allele.
Ind01_0         Number of controls with envi exposure and no risk allele.
Ind11_1         Number of cases with envi exposure and two risk alleles.
Ind11_0         Number of controls with envi exposure and two risk alleles.
minor           Minor allele.
major           Major allele.
risk            Risk allele.
cc_minor        Minor allele frequency.
cc_major        Major allele frequency.
cc_minor_ctrl   Minor allele frequency, controls.
cc_minor_case   Minor allele frequency, cases.
cc_major_ctrl   Major allele frequency, controls.
cc_major_case   Major allele frequency, cases.

References

Cardon, L.R. and Bell, J.I. (2001) Association study designs for complex diseases, Nat Rev Genet, 2, 91-99.

Källberg, H., Ahlbom, A. and Alfredsson, L. (2006) Calculating measures of biological interaction using R, Eur J Epidemiol, 21, 571-573.

Moore, J.H. (2003) The ubiquitous nature of epistasis in determining susceptibility to common human diseases, Hum Hered, 56, 73-82.

Plenge, R.M., Seielstad, M., Padyukov, L., Lee, A.T., Remmers, E.F., Ding, B., Liew, A., Khalili, H., Chandrasekaran, A., Davies, L.R., Li, W., Tan, A.K., Bonnard, C., Ong, R.T., Thalamuthu, A., Pettersson, S., Liu, C., Tian, C., Chen, W.V., Carulli, J.P., Beckman, E.M., Altshuler, D., Alfredsson, L., Criswell, L.A., Amos, C.I., Seldin, M.F., Kastner, D.L., Klareskog, L. and Gregersen, P.K. (2007) TRAF1-C5 as a risk locus for rheumatoid arthritis - a genomewide study, N Engl J Med, 357, 1199-1209.

Purcell, S. et al. (2007) PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses, Am J Hum Genet, 81, 559-575.

Rothman, K.J. (2002) Epidemiology. An introduction. Oxford University Press, New York.

Sing, C.F. et al. (2004) Dynamic relationships between the genome and exposures to environments as causes of common human diseases, World Rev Nutr Diet, 93, 77-91.

Thornton-Wells, T.A. et al. (2004) Genetics, statistics and human disease: analytical retooling for complexity, Trends Genet, 20, 640-647.

Example

Basic function call without specification of data sets and variables:

GEIRAD(dname, dname2, dname3, start, stop, covar, envi, FILE, id)

Function call for running the sample data set when the data sets are located in a directory called R under program on the local hard drive c:

GEIRAD("c:/program/R/samples.tped", "C:/Program/R/sample.tfam", "C:/Program/R/SampleCov.txt", 1, 0, covar="cov", envi="env", FILE="c:/program/R/wgi2.r", id="indid")
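
To make the tped input and the dominant model more concrete, the sketch below reads a tiny tped-style table and codes each individual as a carrier (1) or non-carrier (0) of a chosen risk allele. It assumes the PLINK transposed layout, with four leading columns (chromosome, SNP id, genetic distance, base-pair position) followed by two allele columns per individual. The toy genotypes, the helper dominant_carrier and the choice of risk allele are hypothetical; how GEIRAD itself selects the risk allele and handles missing genotypes is not described in this manual.

```r
# Toy tped content for three individuals (illustration only).
tped_text <- "1 rs0001 0 1000 A A A G G G
1 rs0002 0 2000 C T C C T T"

# Read every column as character so that the allele "T" is not parsed as TRUE.
tped <- read.table(text = tped_text, colClasses = "character")

# Dominant coding: 1 if at least one of the two alleles is the risk allele.
# Missing genotypes are not handled in this sketch.
dominant_carrier <- function(alleles, risk) {
  a1 <- alleles[seq(1, length(alleles), by = 2)]  # first allele per individual
  a2 <- alleles[seq(2, length(alleles), by = 2)]  # second allele per individual
  as.integer(a1 == risk | a2 == risk)
}

# Drop the four leading columns and code the first SNP with "G" as risk allele.
geno_snp1 <- unlist(tped[1, -(1:4)], use.names = FALSE)
carrier_snp1 <- dominant_carrier(geno_snp1, risk = "G")

# Second SNP with "T" as risk allele.
geno_snp2 <- unlist(tped[2, -(1:4)], use.names = FALSE)
carrier_snp2 <- dominant_carrier(geno_snp2, risk = "T")
```

A carrier indicator of this kind, combined with the envi variable and the disease status from the tfam file, is what the counts Ind10_1 to Ind11_0 in the Value section tabulate.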