# Note: code copied from these slides may be broken, depending on your PowerPoint version. Copy-friendly code can be found in README.rst: https://bitbucket.org/charade/swan/src

SWAN Installation: TUTORIAL AND TIPS FOR LINUX AND OSX
Charlie Xia, 10/2/15

Contents

SWAN Requirements
    SWAN Requirements Overview
    Know your GCC compiler
    Know your R Installation
    Know CRAN and Bioconductor
Ubuntu
    Setup C++
    Setup R and devtools
CentOS
    Setup C++
    Setup R and devtools
OSX
    Setup C++
    Setup R and devtools
SWAN Install and Test
    Install R dependencies
    Install SWAN
    Test single sample analysis
    Test case-control analysis
    Download and test using Virtual Machines
Extra
    SWAN Reference & Resources

[Flowchart, left: how to use this doc]

SWAN Requirements Overview (Tested Platforms)

Requirements | Tested Versions
OS (OSX or Linux) | Ubuntu 14.04; CentOS 7; OS X 10.10.5 (Yosemite)
GCC (>4.3) | 4.8.4 (apt-get); 4.8.3 (yum); 4.7.4 (macports); 4.7.4 (homebrew)
R (>3.1) | 3.2.2; 3.1.1
R devtools (any) | 1.9.1
R_LIBS_USER (if no sudoer) | Set R_LIBS_USER in your shell profile only if R's default is not working for you AND you have no permission to write to the system R library
CRAN Packages (all current versions) | RcppArmadillo (source), Rcpp (source); BH, data.table, devtools, digest, hash, methods, optparse, parallel, plyr, robustbase, sets, stringr, zoo
Bioconductor Packages (all current versions) | Biobase, Biostrings, BSgenome, GenomeInfoDb, GenomicRanges, IRanges, Rsamtools
Samtools (>0.1.19) | 0.1.19; 1.2

Note: SWAN may still be installable even if some of these version requirements are not met. Please consult the Wiki or FAQ, or contact the authors, for help. You can also use SWAN on the pre-installed virtual machines without any installation.

Know Your C++ Environment

Determine your GCC version
    $ gcc --version

Tip: using GCC
    GCC is available on Ubuntu through apt-get and on CentOS through yum.
    GCC is available on Mac OSX through either macports or homebrew. We have installed SWAN successfully with the GCC shipped by either one; please see the OSX slides for details. Use only one of macports or homebrew. You CANNOT mix them.
    When using gcc on OS X, the best practice for switching R's compiler is to set CC=gcc and CXX=g++ in the ~/.R/Makevars user configuration file.
    A common compiler-related error is 'symbol not found' during linking. The most frequent cause is that libstdc++ or other library files belonging to the compiler are not found. To fix it, add the paths containing the proper libraries to the user's or system's $LD_LIBRARY_PATH.

Know Your R

Know R version
    $ R --version

Know R environmental variables
    R> Sys.getenv()   # check R_LIBS_USER, which is the default place for the SWAN install

Tip: $R_LIBS_USER
    NOTE: by default SWAN puts its binaries/scripts in $R_LIBS_USER/swan/bin.
    R generates a default R_LIBS_USER when you first try to install R packages without root permission.
    You need to know this path; overriding it is NOT recommended unless you know how.
    The path has to stay the same from installation onward; otherwise SWAN cannot find the installed R packages.
    You can always move the binaries/scripts elsewhere AFTER installation, then update $SWAN_BIN accordingly and include it in your $PATH.

Know CRAN and Bioconductor

Know How to Install CRAN Packages
    R> local({r <- getOption("repos"); r["CRAN"] <- "http://cran.us.r-project.org"; options(repos=r)})
    R> install.packages("package")

Tip: Online Documents for R install.packages
    https://stat.ethz.ch/R-manual/R-devel/library/utils/html/install.packages.html
    CRAN is the main package archive for the R community. Most R dependencies are here!
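The version requirements from the requirements overview (GCC >4.3, R >3.1) can be checked from the shell before any packages are installed. The following is a minimal sketch, not part of SWAN: the helper names version_gt and check_tool are made up for illustration, and it assumes that `gcc -dumpversion` and the first line of `R --version` report the version in the usual dotted form.

```shell
# Sketch: compare installed GCC and R versions against the stated minimums.

version_gt() {
    # Succeeds when dotted version $1 is strictly newer than $2.
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            xi = (i <= n) ? x[i] + 0 : 0   # missing fields count as 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 1                             # equal versions are not "newer"
    }'
}

check_tool() {
    # $1 = label, $2 = version found, $3 = exclusive minimum from the table
    if version_gt "$2" "$3"; then
        echo "$1 $2: OK (> $3)"
    else
        echo "$1 $2: too old (need > $3)"
    fi
}

# Only query tools that are actually on the PATH.
if command -v gcc >/dev/null 2>&1; then
    check_tool GCC "$(gcc -dumpversion)" 4.3
fi
if command -v R >/dev/null 2>&1; then
    check_tool R "$(R --version | awk 'NR == 1 { print $3 }')" 3.1
fi
```

On OS X, a "too old" result for GCC may simply mean that gcc still points to the system compiler rather than to the macports or homebrew GCC.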

Know How to Install Bioconductor Packages
    R> source("http://bioconductor.org/biocLite.R")
    R> biocLite("package")

Tip: Online Documents for R biocLite
    https://www.bioconductor.org/install/#why-biocLite
    Bioconductor is the main package archive for biologist R users. Most bio-related R dependencies are here!

Ubuntu - Setup C++

Install GCC
    $ sudo apt-get install gcc-4.8   # other versions similar

Tip: Online tutorials to install GCC on Ubuntu
    Ubuntu 14+ ships with GCC 4.8. To upgrade GCC on earlier Ubuntu releases, see tutorials like this one:
    http://askubuntu.com/questions/271388/how-to-install-gcc-4-8

Ubuntu - Setup R and devtools

Install R and devtools on Ubuntu
    $ sudo apt-get install r-base-dev
    $ sudo apt-get -y build-dep libcurl4-gnutls-dev
    $ sudo apt-get -y install libcurl4-gnutls-dev
    $ sudo apt-get -y install libxml2-dev
    R> install.packages("devtools", dependencies=TRUE)

Tip: Online tutorials to install R 3.2 on Ubuntu
    To use the latest R 3.2 versions you need to add a non-official apt-get source to /etc/apt/sources.list. Tutorial #1 is very good and up to date. Use tutorials #2 and #3 as references if needed; they cover the older R 3.1.
    Tutorial #1: https://pythonandr.wordpress.com/2015/04/27/upgrading-to-r-3-2-0-on-ubuntu/
    Tutorial #2: http://sysads.co.uk/2014/06/install-r-base-3-1-0-ubuntu-14-04/
    Tutorial #3: https://www.digitalocean.com/community/tutorials/how-to-set-up-r-on-ubuntu-14-04

CentOS - Setup C++

Install GCC
    $ sudo yum install devtoolset-2-gcc-4.8.2 devtoolset-2-binutils devtoolset-2-gcc-c++-4.8.2   # other versions similar

Tip: Online tutorials to install GCC on CentOS
    CentOS 7 ships with GCC 4.8.
    To upgrade GCC on earlier CentOS releases, see tutorials like this one:
    http://superuser.com/questions/381160/how-to-install-gcc-4-7-x-4-8-x-on-centos

CentOS - Setup R and devtools

Install R and devtools on CentOS
    $ sudo yum -y install epel-release
    $ sudo yum -y install R-core-devel
    $ sudo yum -y install libcurl-devel
    $ sudo yum -y install libxml2-devel
    $ sudo yum -y install openssl-devel
    R> install.packages("devtools", dependencies=TRUE)

Tip: Online tutorials to install R 3.2 on CentOS
    To use the latest R 3.2 versions, install R with yum from the epel-release repository. yum is standard on CentOS 7. For older CentOS releases, please find online tutorials for installing yum and epel-release. http://pkgs.org also has installation instructions for its individual packages.
    http://www.rackspace.com/knowledge_center/article/install-epel-and-additional-repositories-on-centos-and-red-hat
    http://pkgs.org/centos-7/epel-x86_64/R-3.2.2-1.el7.x86_64.rpm.html

OSX - Setup C++

Install GCC
    $ sudo port install gcc47   # if using macports
    $ brew install gcc-4.8      # if using homebrew; other gcc versions similar

Tip: switching C/C++ compilers
    $ sudo port select gcc mp-gcc47              # selecting gcc & g++ 4.7 for macports
    $ ln -s /usr/local/gcc-4.7 /usr/local/gcc    # selecting gcc 4.7 for homebrew
    $ ln -s /usr/local/g++4.7 /usr/local/g++     # selecting g++ 4.7 for homebrew
    $ echo -e "CC=gcc\nCXX=g++" > ~/.R/Makevars  # note: save your own version first

OSX - Setup R and devtools

Install R 3.2 and devtools on OSX
    Download and install R 3.2 for OSX from https://cran.r-project.org
    R> install.packages("devtools")

Tip: Online resources to install R 3.1+ on OSX
    https://cran.r-project.org/doc/manuals/r-patched/R-admin.html
    https://cran.r-project.org/bin/macosx/

Install R Dependencies

# Note: if R throws a write permission error during any install.packages() or devtools::install_bitbucket() call, add the option lib=Sys.getenv('R_LIBS_USER') to the following function calls and try again.
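Before retrying with lib=Sys.getenv('R_LIBS_USER'), it can help to confirm that the per-user library directory exists and is writable, because R only adds it to its library search path when the directory is present. A minimal shell sketch under that assumption follows. The function name ensure_user_lib is hypothetical, and the path is obtained from Rscript rather than hard-coded.

```shell
# Sketch: create the per-user R library directory if it is missing.

ensure_user_lib() {
    # $1 = directory to use as the per-user R library
    [ -n "$1" ] || return 1
    mkdir -p "$1" && [ -w "$1" ] && echo "user library ready: $1"
}

# Ask R where its user library is; path.expand() resolves a leading "~".
if command -v Rscript >/dev/null 2>&1; then
    ensure_user_lib "$(Rscript -e 'cat(path.expand(Sys.getenv("R_LIBS_USER")))')" ||
        echo "could not prepare the user library"
fi
```

Once the directory exists, a new R session should list it in .libPaths(), and the lib= option from the note above will have a writable target.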
SWAN FAQ: http://bitbucket.org/charade/swan/wiki/FAQ

    # First disable the slow Tcl/Tk mirror prompts
    R> options(menu.graphics=FALSE)
    # Some Rcpp packages have to be installed from source, otherwise they may cause a 'segfault'
    R> install.packages(pkgs=c("Rcpp", "RcppArmadillo"), type="source")
    # If the command above fails with "-lgfortran" or "-lquadmath" not found, please see the FAQ entry for a fix
    # Other CRAN packages
    R> install.packages(pkgs=c("BH", "data.table", "devtools", "digest", "hash", "methods", "optparse", "parallel", "plyr", "robustbase", "sets", "stringr", "zoo"))
    # Bioconductor
    R> source("http://bioconductor.org/biocLite.R")
    # Upgrade your Bioconductor to the latest version compatible with your R
    R> biocLite("BiocUpgrade")
    # If you get "Error in unloadNamespace(package)" after "preparing package for lazy loading", please see the FAQ entry for a fix
    # Other Bioconductor packages
    R> biocLite(pkgs=c("Biobase", "Biostrings", "BSgenome", "GenomeInfoDb", "GenomicRanges", "IRanges", "Rsamtools"))

Install SWAN

Install Samtools (optional)
    # If you don't have Samtools >0.1.19, go to the current Samtools webpage and install it:
    http://www.htslib.org/download/

Install SWAN
    R> devtools::install_bitbucket("charade/swan", dependencies=TRUE, clean=TRUE)
    # If you got a permission error:
    R> devtools::install_bitbucket("charade/swan", dependencies=TRUE, clean=TRUE, lib=Sys.getenv('R_LIBS_USER'))

Locate SWAN
    Look for the SWAN binaries in the path printed after "SWAN Binaries can be found at:" in the install log.
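If the install output was saved to a file, the path can be pulled out of the log and used to set $SWAN_BIN and $PATH. This is only a sketch under assumptions that the slides do not confirm: the log file name swan_install.log and the function name find_swan_bin are hypothetical, and the path is assumed to follow the marker text on the same line.

```shell
# Sketch: read the SWAN binaries path from a saved install log.

find_swan_bin() {
    # $1 = saved install log; prints the text after the marker on the
    # last matching line (assumes marker and path share one line).
    sed -n 's/.*SWAN Binaries can be found at:[[:space:]]*//p' "$1" | tail -n 1
}

LOG="${1:-swan_install.log}"   # hypothetical log file name
if [ -f "$LOG" ]; then
    SWAN_BIN="$(find_swan_bin "$LOG")"
    if [ -n "$SWAN_BIN" ]; then
        export SWAN_BIN
        export PATH="$SWAN_BIN:$PATH"
        echo "SWAN_BIN=$SWAN_BIN"
    else
        echo "marker line not found in $LOG"
    fi
fi
```

If the log is not available, the default location named earlier in this document, $R_LIBS_USER/swan/bin, is the first place to look.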
    [Screenshot: example install log showing the SWAN binaries path]

Single Sample Analysis Pipeline

One Sample One Lib Analysis Example
    # read swan/README.rst for more instructions
    $ test> $SWAN_BIN/single.sh one >single.one.log

One Sample Multiple Libs Analysis Example
    $ test> $SWAN_BIN/single.sh two >single.two.log
    # We give two libs as an example here; more libs can be handled the same
    # way by appending additional unmerged lib bam files to the ","-separated list

Tip: where to download the swan source and swan_test.tgz
    $ wget https://bitbucket.org/charade/swan/get/master.zip
    $ wget http://meta.usc.edu/softs/swan/swan_test.tgz
    # unzip the downloaded swan source package into "swan"
    # change into "swan/test" and untar swan_test.tgz there
    # we are only testing the functionality of SWAN here; the data has no scientific value

Paired Sample Analysis Pipeline

Two Sample One Lib Analysis Example
    # read swan/README.rst for instructions
    $ test> $SWAN_BIN/paired.sh one >paired.one.log

Two Sample Multiple Libs Analysis Example
    $ test> $SWAN_BIN/paired.sh two >paired.two.log
    # We give two libs as an example here; more libs can be handled the same
    # way by appending additional unmerged lib bam files to the ","-separated list

Tip: download the swan source and swan_test.tgz as described for the single sample pipeline above.

Download and Use Authors' Testing Virtual Machines

Download Ubuntu/CentOS Virtual Machines with SWAN
    http://meta.usc.edu/softs/vbox/CentOS_7_SWAN.vdi.gz   # CentOS 7
    http://meta.usc.edu/softs/vbox/Ubuntu_14_SWAN.vdi.gz  # Ubuntu 14

Install Oracle VirtualBox and Import .vdi VM
    # Go to the Oracle VirtualBox downloads page below. Download and install.
    https://www.virtualbox.org/wiki/Downloads
    # Unzip the image, install VirtualBox, create a Linux machine and import the downloaded .vdi file into VirtualBox; see the YouTube tutorial:
    https://www.youtube.com/watch?v=1P_l7iVKfgs

Note: the password of a VM username is ALWAYS the username itself.

References

Useful SWAN resources
    README: https://bitbucket.org/charade/swan/src
    Homepage: https://bitbucket.org/charade/swan/wiki/Home
    FAQ: https://bitbucket.org/charade/swan/wiki/FAQ
    Manual: https://bitbucket.org/charade/swan/wiki/Manual
    Examples: https://bitbucket.org/charade/swan/wiki/Example
    This doc: http://bitbucket.org/charade/swan/wiki/doc/SWAN_Installation.pptx
    Author's complete install steps from clean Ubuntu, CentOS and OSX: https://bitbucket.org/charade/swan/wiki/Example

Thank you for using SWAN!

Flowchart: how to use this doc
    1. SWAN Requirements Overview
    2. Have a testing machine?
       N: Get a virtual machine (Download)
       Y: continue
    3. Good GCC?
       N: Install/Upgrade GCC (Ubuntu, CentOS, OS X)
       Y: continue
    4. Good R and devtools package?
       N: Install/Upgrade R and devtools (Ubuntu, CentOS, OS X)
       Y: continue
    5. Install R Dependencies
    6. Install SWAN and Samtools
    7. Testing your installation
    8. Additional Resources

Create Your Own Testing Virtual Machines

Download and Install Oracle VirtualBox
    # Go to the Oracle VirtualBox downloads page below. Download and install.
    https://www.virtualbox.org/wiki/Downloads

Download Ubuntu DVD image
    # Go to the Ubuntu desktop downloads page below. Download the Ubuntu iso file.
    http://www.ubuntu.com/download/desktop

Download CentOS DVD image
    # Go to the CentOS downloads page below. Download the CentOS iso file.
    https://www.centos.org/download/

Create Ubuntu/CentOS Virtual Machines
    # Here are good YouTube videos.
    https://www.youtube.com/watch?v=QkJmahizwO4   # Ubuntu 14
    https://www.youtube.com/watch?v=Eb-FetgKB6k   # CentOS 7
    # For ease of use of CentOS, during installation select "Gnome Desktop" in "software selection" together with its "Development Tools", "Gnome Applications", "Internet Applications" and "Office Suites" for the GUI, and turn on "network" in "Network and Hostname".

Download and Use Authors' Testing Virtual Machines

Download Clean Ubuntu/CentOS Virtual Machines
    http://meta.usc.edu/softs/vbox/CentOS_7_clean.vdi.gz   # CentOS 7
    http://meta.usc.edu/softs/vbox/Ubuntu_14_clean.vdi.gz  # Ubuntu 14

Download Ubuntu/CentOS Virtual Machines with R
    http://meta.usc.edu/softs/vbox/CentOS_7_R.vdi.gz   # CentOS 7
    http://meta.usc.edu/softs/vbox/Ubuntu_14_R.vdi.gz  # Ubuntu 14

Download Ubuntu/CentOS Virtual Machines with SWAN
    http://meta.usc.edu/softs/vbox/CentOS_7_SWAN.vdi.gz   # CentOS 7
    http://meta.usc.edu/softs/vbox/Ubuntu_14_SWAN.vdi.gz  # Ubuntu 14

Install VirtualBox and import the downloaded .vdi file as described under "Install Oracle VirtualBox and Import .vdi VM" above.

Note: the password of a VM username is ALWAYS the username itself.
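The unzip-and-import step can also be done from the command line instead of the VirtualBox GUI shown in the video. The sketch below is an alternative that the slides do not describe: the function names, the VM name swan-ubuntu14 and the 2048 MB memory size are illustrative assumptions, and it presumes VBoxManage (installed with VirtualBox) is on the PATH.

```shell
# Sketch: unpack a downloaded image and attach it to a new VirtualBox VM.

unpack_vdi() {
    # $1 = downloaded .vdi.gz; writes the .vdi next to it and prints its path
    vdi_path="${1%.gz}"
    gzip -t "$1" && gunzip -c "$1" > "$vdi_path" && echo "$vdi_path"
}

register_vm() {
    # $1 = VM name, $2 = VirtualBox OS type (e.g. Ubuntu_64, RedHat_64), $3 = .vdi file
    VBoxManage createvm --name "$1" --ostype "$2" --register &&
    VBoxManage modifyvm "$1" --memory 2048 &&
    VBoxManage storagectl "$1" --name SATA --add sata &&
    VBoxManage storageattach "$1" --storagectl SATA --port 0 --device 0 \
        --type hdd --medium "$3"
}

IMAGE="${1:-Ubuntu_14_SWAN.vdi.gz}"
# Do nothing unless the image was downloaded and VirtualBox is installed.
if [ -f "$IMAGE" ] && command -v VBoxManage >/dev/null 2>&1; then
    register_vm swan-ubuntu14 Ubuntu_64 "$(unpack_vdi "$IMAGE")"
fi
```

For a CentOS image, RedHat_64 would be the closest OS type; the VM can then be started from the VirtualBox window as usual.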