Tutorial - Bitbucket

# Note: code copied and pasted from these slides may be broken, depending on PowerPoint.
Copy-friendly code can be found in README.rst:
https://bitbucket.org/charade/swan/src
SWAN Installation
Tutorial and Tips for Linux and OSX
Charlie Xia
10/2/15
Contents
SWAN Requirements
  SWAN Requirements Overview
  Know your GCC compiler
  Know your R Installation
  Know CRAN and Bioconductor
Ubuntu
  Setup C++
  Setup R and devtools
CentOS
  Setup C++
  Setup R and devtools
OSX
  Setup C++
  Setup R and devtools
SWAN Install and Test
  Install R dependencies
  Install SWAN
  Test single sample analysis
  Test case-control analysis
  Download and test using Virtual Machines
Extra
  SWAN Reference & Resources
Flowchart: How to use this doc.
SWAN Requirements Overview (Tested Platforms)
Requirements | Tested Versions
OS (OSX or Linux) | Ubuntu 14.04; CentOS 7; OS X 10.10.5 (Yosemite)
GCC (>4.3) | 4.8.4 (apt-get); 4.8.3 (yum); 4.7.4 (macports); 4.7.4 (homebrew)
R (>3.1) | 3.2.2; 3.1.1
R devtools (any) | 1.9.1
R_LIBS_USER (if not a sudoer) | Set R_LIBS_USER in your shell profile only if R's default is not working for you AND you have no permission to write to the system R lib
CRAN Packages (all current versions) | RcppArmadillo (source), Rcpp (source); BH, data.table, devtools, digest, hash, methods, optparse, parallel, plyr, robustbase, sets, stringr, zoo
Bioconductor Packages (all current versions) | Biobase, Biostrings, BSgenome, GenomeInfoDb, GenomicRanges, IRanges, Rsamtools
Samtools (>0.1.19) | 0.1.19; 1.2
Note: SWAN may still be installable even if some of these version requirements are not met.
Please consult the Wiki or FAQ, or contact the authors for help. You can also use
SWAN on the pre-installed virtual machines without any installation.
Know Your C++ Environment
Determine your GCC version
$ gcc --version
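The version reported above can be compared against the ">4.3" requirement from the overview. The following is only a sketch: the helper names version_gt and check_gcc_version are made up for illustration, and it assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is strictly greater than version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_gt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Hypothetical helper: report whether a GCC version string meets the ">4.3" requirement.
check_gcc_version() {
    if version_gt "$1" "4.3"; then
        echo "GCC $1: OK"
    else
        echo "GCC $1: too old, need >4.3"
    fi
}

# Check the locally installed compiler, if there is one.
if command -v gcc >/dev/null 2>&1; then
    check_gcc_version "$(gcc -dumpversion)"
fi
```

If the check reports a version that is too old, the Setup C++ slide for your platform describes how to install a newer GCC.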
Tip: using GCC
GCC is available on Ubuntu through apt-get and on CentOS through yum.
GCC is available on Mac OSX through either macports or homebrew.
We have installed SWAN successfully with the GCC shipped by either macports or
homebrew. Please see the other slides for details. Use either macports or homebrew,
not both. You CANNOT mix them.
When using gcc on OS X systems, the best practice for switching R's compiler is to set
CC=gcc and CXX=g++ in the ~/.R/Makevars user configuration file.
A common compiler-related error is 'symbol not found' at link time. The most frequent
cause is that libstdc++ or other library files corresponding to the compiler are
not found. To fix this, add the paths containing the proper libraries to the user's or
system's $LD_LIBRARY_PATH.
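One way to add a library directory to $LD_LIBRARY_PATH without creating duplicate entries is sketched below. The function name prepend_ld_library_path and the directory /usr/local/lib64 are assumptions for illustration only; substitute the directory that actually holds your compiler's libstdc++.

```shell
#!/bin/sh
# Hypothetical helper: prepend directory $1 to LD_LIBRARY_PATH unless it is already listed.
prepend_ld_library_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;   # already present, nothing to do
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Example call; the directory is an assumption, not a SWAN requirement.
prepend_ld_library_path /usr/local/lib64
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

To make the setting permanent, the same call can be placed in your shell profile.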
Know Your R
Know R version
$ R --version
Know R environment variables
R> Sys.getenv()
# check R_LIBS_USER, which is the default place for the SWAN install
Tip: $R_LIBS_USER
NOTE: $R_LIBS_USER/swan/bin is where SWAN puts its binaries/scripts by default. R will
generate a default R_LIBS_USER when you first try to install R packages without root
permission. You need to know this path; overriding it is NOT recommended
unless you know how. The path must stay consistent from installation onward;
otherwise SWAN cannot find the installed R packages.
You can always move the binaries/scripts elsewhere AFTER installation, then
update $SWAN_BIN accordingly and include it in your $PATH.
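As a sketch of the default layout described above, the shell lines below derive $SWAN_BIN from R_LIBS_USER and put it on $PATH. The helper name swan_bin_dir is hypothetical, and the snippet assumes Rscript is on your PATH; if you moved the binaries, set SWAN_BIN to the new location instead.

```shell
#!/bin/sh
# Hypothetical helper: print the default SWAN binary directory for an R user library $1.
swan_bin_dir() {
    printf '%s/swan/bin\n' "${1%/}"
}

# Ask R for R_LIBS_USER when Rscript is available; path.expand() resolves a leading "~".
if command -v Rscript >/dev/null 2>&1; then
    r_libs_user=$(Rscript -e 'cat(path.expand(Sys.getenv("R_LIBS_USER")))')
    if [ -n "$r_libs_user" ]; then
        SWAN_BIN=$(swan_bin_dir "$r_libs_user")
        PATH="$SWAN_BIN:$PATH"
        export SWAN_BIN PATH
        echo "SWAN_BIN=$SWAN_BIN"
    fi
fi
```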
Know CRAN and Bioconductor
Know How to Install CRAN Packages
R> local({r <- getOption("repos"); r["CRAN"] <- "http://cran.us.r-project.org"; options(repos=r)})
R> install.packages("package")
Tip: Online Documents for R install.packages
https://stat.ethz.ch/R-manual/R-devel/library/utils/html/install.packages.html
CRAN is the main package archive for the R community. Most R dependencies are here!
Know How to Install Bioconductor Packages
R> source("http://bioconductor.org/biocLite.R")
R> biocLite("package")
Tip: Online Documents for R biocLite
https://www.bioconductor.org/install/#why-biocLite
Bioconductor is the main package archive for biologist R users.
Most bio-related R dependencies are here!
Ubuntu – Setup C++
Install GCC
$ sudo apt-get install gcc-4.8  # other versions are similar
Tip: Online tutorials for installing GCC on Ubuntu
Ubuntu 14+ ships with GCC 4.8. To upgrade GCC on earlier
Ubuntu distros, see tutorials like this one:
http://askubuntu.com/questions/271388/how-to-install-gcc-4-8
Ubuntu - Setup R and devtools
Install R and devtools on Ubuntu
$ sudo apt-get install r-base-dev
$ sudo apt-get -y build-dep libcurl4-gnutls-dev
$ sudo apt-get -y install libcurl4-gnutls-dev
$ sudo apt-get -y install libxml2-dev
R> install.packages("devtools",dependencies=TRUE)
Tip: Online tutorials for installing R 3.2 on Ubuntu
To use the latest R 3.2 versions you need to add a non-official
apt-get source to /etc/apt/sources.list. Tutorial #1 is
very good and up to date. Use tutorials #2 and #3 as
references if needed, as they cover the older R version 3.1.
#1 https://pythonandr.wordpress.com/2015/04/27/upgrading-to-r-3-20-on-ubuntu/
#2 http://sysads.co.uk/2014/06/install-r-base-3-1-0-ubuntu-14-04/
#3 https://www.digitalocean.com/community/tutorials/how-to-set-up-r-on-ubuntu-14-04
CentOS – Setup C++
Install GCC
$ sudo yum install devtoolset-2-gcc-4.8.2 devtoolset-2-binutils devtoolset-2-gcc-c++-4.8.2
# other versions are similar
Tip: Online tutorials for installing GCC on CentOS
CentOS 7 ships with GCC 4.8. To upgrade GCC on earlier
CentOS distros, see tutorials like this one:
http://superuser.com/questions/381160/how-to-install-gcc-4-7-x-4-8-x-on-centos
CentOS - Setup R and devtools
Install R and devtools on CentOS
$ sudo yum -y install epel-release
$ sudo yum -y install R-core-devel
$ sudo yum -y install libcurl-devel
$ sudo yum -y install libxml2-devel
$ sudo yum -y install openssl-devel
R> install.packages("devtools",dependencies=TRUE)
Tip: Online tutorials for installing R 3.2 on CentOS
To use the latest R 3.2 versions you need to install R with yum
from the epel-release repository. yum is standard with CentOS 7. For older
CentOS distros, please find online tutorials for installing yum and
epel-release. http://pkgs.org also has installation instructions
for its individual packages.
http://www.rackspace.com/knowledge_center/article/install-epel-and-additional-repositories-on-centos-and-red-hat
http://pkgs.org/centos-7/epel-x86_64/R-3.2.2-1.el7.x86_64.rpm.html
OSX – Setup C++
Install GCC
$ sudo port install gcc47  # if using macports
$ brew install gcc-4.8  # if using homebrew
# other gcc versions are similar
Tip: switching C/C++ compilers:
$ sudo port select gcc mp-gcc47  # selecting gcc & g++ 4.7 for macports
$ ln -s /usr/local/gcc-4.7 /usr/local/gcc # selecting gcc 4.7 for homebrew
$ ln -s /usr/local/g++4.7 /usr/local/g++ # selecting g++ 4.7 for homebrew
$ echo -e "CC=gcc\nCXX=g++" > ~/.R/Makevars # note: save your own version first
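The note about saving your own version first can be sketched as a small shell function that backs up an existing Makevars before writing the compiler settings. The function name write_makevars and the Makevars.bak backup name are assumptions for illustration; printf is used instead of echo -e for portability.

```shell
#!/bin/sh
# Hypothetical helper: write CC/CXX settings to $1/Makevars,
# keeping a backup copy of any existing file as $1/Makevars.bak.
write_makevars() {
    dir=$1
    mkdir -p "$dir"
    if [ -f "$dir/Makevars" ]; then
        cp "$dir/Makevars" "$dir/Makevars.bak"   # save your own version first
    fi
    printf 'CC=gcc\nCXX=g++\n' > "$dir/Makevars"
}

# Typical call, commented out so the sketch does not touch your real configuration:
# write_makevars "$HOME/.R"
```

Restoring the previous settings is then a matter of copying Makevars.bak back over Makevars.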
OSX - Setup R and devtools
Install R 3.2 and devtools on OSX
Download and Install R 3.2 for OSX from https://cran.r-project.org
R> install.packages("devtools")
Tip: Online resources to install R 3.1+ on OSX
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html
https://cran.r-project.org/bin/macosx/
Install R Dependencies
# Note: if R throws a write permission error during any install.packages() or
# devtools::install_bitbucket() call, add the option lib=Sys.getenv('R_LIBS_USER')
# to the following function calls and try again.
SWAN FAQ: http://bitbucket.org/charade/swan/wiki/FAQ
# First disable the slow Tcl/Tk mirror prompts
R> options(menu.graphics=FALSE)
# Some Rcpp packages have to be installed from source; otherwise they may cause a 'segfault'
R> install.packages(pkgs=c("Rcpp","RcppArmadillo"),type="source")
# If the commands above fail with "-lgfortran" or "-lquadmath" not found, please
# see the FAQ entry for the fix
# Other CRAN packages
R> install.packages(pkgs=c("BH", "data.table", "devtools", "digest", "hash", "methods", "optparse", "parallel", "plyr", "robustbase", "sets", "stringr", "zoo"))
# Bioconductor
R> source("http://bioconductor.org/biocLite.R")
R> biocLite("BiocUpgrade") # Upgrade your Bioc to the latest version compatible with your R
# If you get "Error in unloadNamespace(package)" after "preparing package for lazy
# loading", please see the FAQ entry for the fix
# Other Bioconductor packages
R> biocLite(pkgs=c("Biobase", "Biostrings", "BSgenome", "GenomeInfoDb", "GenomicRanges", "IRanges", "Rsamtools"))
Install SWAN
Install Samtools (optional)
# if you don't have Samtools >0.1.19, go to the current Samtools webpage and install it:
http://www.htslib.org/download/
Install SWAN
R> devtools::install_bitbucket("charade/swan",dependencies=TRUE,clean=TRUE)
# if you got a permission error:
R> devtools::install_bitbucket("charade/swan",dependencies=TRUE,clean=TRUE,lib=Sys.getenv('R_LIBS_USER'))
Locate SWAN
Look for the SWAN binaries in the path printed after "SWAN Binaries can be found at:" in the install log.
See example screen shot below:
Single Sample Analysis Pipeline
One Sample One Lib Analysis Example
# read the swan/README.rst for more instructions
$ test> $SWAN_BIN/single.sh one >single.one.log
One Sample Multiple Libs Analysis Example
$ test> $SWAN_BIN/single.sh two >single.two.log
# We are giving two libs as example here, more libs can be handled the same
# way by appending additional unmerged lib bam files to the "," separated list
Tip: where to download the swan source and swan_test.tgz
$ wget https://bitbucket.org/charade/swan/get/master.zip
$ wget http://meta.usc.edu/softs/swan/swan_test.tgz
# unzip the downloaded swan source package into "swan"
# change into "swan/test" and untar swan_test.tgz there
# we are only testing SWAN's functionality here; the data has no scientific value
Paired Sample Analysis Pipeline
Two Sample One Lib Analysis Example
# read the swan/README.rst for instructions
$ test> $SWAN_BIN/paired.sh one >paired.one.log
Two Sample Multiple Libs Analysis Example
$ test> $SWAN_BIN/paired.sh two >paired.two.log
# We are giving two libs as example here, more libs can be handled the same
# way by appending additional unmerged lib bam files to the "," separated list
Download and Use Authors' Testing Virtual Machines
Download Ubuntu/CentOS Virtual Machines with SWAN
http://meta.usc.edu/softs/vbox/CentOS_7_SWAN.vdi.gz  # CentOS 7
http://meta.usc.edu/softs/vbox/Ubuntu_14_SWAN.vdi.gz  # Ubuntu 14
Install Oracle VirtualBox and Import .vdi VM
# Go to the Oracle VirtualBox downloads page below. Download and install.
https://www.virtualbox.org/wiki/Downloads
# Unzip the downloaded .vdi.gz file, create a Linux machine in VirtualBox and import the .vdi file;
# see the YouTube tutorial:
https://www.youtube.com/watch?v=1P_l7iVKfgs
Note: the password for a VM username is ALWAYS the username itself
References
Useful SWAN resources
• README: https://bitbucket.org/charade/swan/src
• Homepage: https://bitbucket.org/charade/swan/wiki/Home
• FAQ: https://bitbucket.org/charade/swan/wiki/FAQ
• Manual: https://bitbucket.org/charade/swan/wiki/Manual
• Examples: https://bitbucket.org/charade/swan/wiki/Example
• This doc: http://bitbucket.org/charade/swan/wiki/doc/SWAN_Installation.pptx
• Author's complete install steps from clean Ubuntu, CentOS and OSX: https://bitbucket.org/charade/swan/wiki/Example
Thank you for using SWAN!
How to use this doc (flowchart):
SWAN Requirements Overview
Have a testing machine?
  N: Get a virtual machine (Download)
  Y: continue
Good GCC?
  N: Install/upgrade GCC (Ubuntu, CentOS, OS X)
  Y: continue
Good R and devtools package?
  N: Install/upgrade R and devtools (Ubuntu, CentOS, OS X)
  Y: continue
Install R Dependencies
Install SWAN and Samtools
Testing your installation
Additional Resources
Create Your Own Testing Virtual Machines
Download and Install Oracle VirtualBox
# go to the oracle VirtualBox downloads below. Download and install.
https://www.virtualbox.org/wiki/Downloads
Download Ubuntu DVD image
# go to the ubuntu desktop downloads below. Download the Ubuntu iso file.
http://www.ubuntu.com/download/desktop
Download CentOS DVD image
# go to the CentOS desktop downloads below. Download the CentOS iso file.
https://www.centos.org/download/
Create Ubuntu/CentOS Virtual Machines
# Here are good YouTube videos.
https://www.youtube.com/watch?v=QkJmahizwO4  # Ubuntu 14
https://www.youtube.com/watch?v=Eb-FetgKB6k  # CentOS 7
# For ease of use of CentOS, during installation select "Gnome Desktop" in "software selection"
# together with its "Development Tools", "Gnome Applications", "Internet Applications" and
# "Office Suites" for the GUI, and turn on "network" in "Network and Hostname".
Download and Use Authors' Testing Virtual Machines
Download Clean Ubuntu/CentOS Virtual Machines
http://meta.usc.edu/softs/vbox/CentOS_7_clean.vdi.gz  # CentOS 7
http://meta.usc.edu/softs/vbox/Ubuntu_14_clean.vdi.gz  # Ubuntu 14
Download Ubuntu/CentOS Virtual Machines with R
http://meta.usc.edu/softs/vbox/CentOS_7_R.vdi.gz  # CentOS 7
http://meta.usc.edu/softs/vbox/Ubuntu_14_R.vdi.gz  # Ubuntu 14