BioCloud Developer Documentation
The BioCloud environment consists of two modules: the BioCloud Portal (BP) and the
BioCloud Workflow Manager (BWM). A single BP instance serves multiple BWMs, each
of which is associated with a single virtual organization (VO) in the environment.
The BioCloud architecture is illustrated in Figure 1.
Figure 1 BioCloud architecture with the portal and the workflow managers.
BioCloud Portal
BP is a web application developed with Java EE and JAX-RS RESTful web services.
BP provides two interfaces: an API for programmatic access, and a user interface
for managing the system. We first explain how to set up the BP environment and
configure the system for developers, and then describe the use of the interfaces.
Configuration
To compile and run the system, a Java runtime environment must be installed first.
During development we employed Maven (MVN) to compile and manage the source code.
Using MVN is optional but recommended. Installation of the Java environment and
MVN is out of the scope of this document; the rest of the document assumes MVN
is available.
1. Create a web application project using MVN.
a. mvn archetype:generate -DgroupId=osu.hpc -DartifactId=portalWebApplication -DarchetypeArtifactId=maven-archetype-webapp -DinteractiveMode=false
b. Copy the source code of the project under portalWebApplication
c. Update pom.xml file as below:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>osu.hpc</groupId>
  <artifactId>portalWebApplication</artifactId>
  <packaging>war</packaging>
  <version>1.0</version>
  <name>portalWebApplication Maven Webapp</name>
  <url>http://maven.apache.org</url>
  <repositories>
    <repository>
      <id>jboss</id>
      <url>http://repository.jboss.org/maven2</url>
    </repository>
  </repositories>
  <dependencies>
    <!-- JUnit support -->
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <!-- Core library -->
    <dependency>
      <groupId>org.jboss.resteasy</groupId>
      <artifactId>resteasy-jaxrs</artifactId>
      <version>2.3.1.GA</version>
    </dependency>
    <dependency>
      <groupId>net.sf.scannotation</groupId>
      <artifactId>scannotation</artifactId>
      <version>1.0.2</version>
    </dependency>
    <!-- JAXB provider -->
    <dependency>
      <groupId>org.jboss.resteasy</groupId>
      <artifactId>resteasy-jaxb-provider</artifactId>
      <version>2.3.1.GA</version>
    </dependency>
    <!-- Multipart support -->
    <dependency>
      <groupId>org.jboss.resteasy</groupId>
      <artifactId>resteasy-multipart-provider</artifactId>
      <version>2.3.1.GA</version>
    </dependency>
    <!-- Jackson support -->
    <dependency>
      <groupId>org.jboss.resteasy</groupId>
      <artifactId>resteasy-jackson-provider</artifactId>
      <version>2.3.1.GA</version>
    </dependency>
    <!-- For better I/O control -->
    <dependency>
      <groupId>commons-io</groupId>
      <artifactId>commons-io</artifactId>
      <version>2.0.1</version>
    </dependency>
    <!-- For database control -->
    <dependency>
      <groupId>org.postgresql</groupId>
      <artifactId>postgresql</artifactId>
      <version>9.3-1101-jdbc41</version>
    </dependency>
    <!-- For deltacloud control -->
    <dependency>
      <groupId>osu.hpc</groupId>
      <artifactId>clientDeltaCloud</artifactId>
      <version>0.2.1</version>
    </dependency>
    <!-- To get the IP address of the requests -->
    <dependency>
      <groupId>javax.servlet</groupId>
      <artifactId>servlet-api</artifactId>
      <version>2.5</version>
    </dependency>
    <!-- WEKA, data mining -->
    <dependency>
      <groupId>nz.ac.waikato.cms.weka</groupId>
      <artifactId>weka-dev</artifactId>
      <version>3.7.10</version>
    </dependency>
  </dependencies>
  <build>
    <finalName>portalWebApplication</finalName>
  </build>
</project>
d. MVN assumes the packages are available online. We use a custom
deltacloud jar file; therefore, this jar file should be deployed to the local
repository using the following command:
mvn install:install-file -Dfile=jar_file_name.jar -DgroupId=osu.hpc -DartifactId=clientDeltaCloud -Dversion=0.2.1 -Dpackaging=jar
e. Compile the source code using the following command:
mvn package
f. The web application is now ready for deployment. Check for the war file under
the target folder (a quick check is sketched below), then install the war file to a web server.
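A quick, optional check that the build produced the artifact (the war file name follows the finalName set in pom.xml):
# Build and confirm the war file exists
mvn package
ls -lh target/portalWebApplication.war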
2. Web server installation
a. Download apache-tomcat
wget http://archive.apache.org/dist/tomcat/tomcat-7/v7.0.53/bin/apache-tomcat-7.0.53.tar.gz
b. Untar the file
tar -zxvf apache-tomcat-7.0.53.tar.gz
c. Configure tomcat users
i. Open the tomcat users configuration file and define roles for
the tomcat user
ii. Define roles and users in the following file:
_location_of_tomcat_/apache-tomcat-7.0.53/conf/tomcat-users.xml
1. Add roles:
a. <role rolename="tomcat"/>
b. <role rolename="manager-gui"/>
2. Add users:
a. <user username="tomcat"
password="your_password"
roles="tomcat,manager-gui"/>
d. Start the application server
i. sh _location_of_tomcat_/apache-tomcat-7.0.53/bin/startup.sh
e. Deploying application
i. Open http://localhost:8080 from the internet browser and
click Manager App
ii. Login to the tomcat admin panel (use the username and password
defined in step 2.c)
iii. In the WAR file to deploy section, choose the generated war file
and click deploy
iv. The deployed application contains the following scripts under
the /files directory:
1. headconfig.sh - To create the cluster configuration on the
head node
2. clusterconfig.sh - To create the cluster configuration on the
head as well as the compute nodes.
3. remove.sh - To remove the details of deleted nodes from
the cluster configuration.
4. free_nodes.sh - To identify the free nodes in a cluster.
5. freespace.sh - To identify the free space available in an
instance.
6. storage.sh - To attach an additional volume to an
instance.
7. check_pbs - To check the status of the PBS scheduler.
f. Now you can start using the application
i. Navigate to http://localhost:8080/portalWebApplication/. BP
requires registration before the system can be used, which in turn
requires the installation and configuration of the database (see Step 3).
ii. The RESTful web service is accessible at
http://localhost:8080/portalWebApplication/rest-ws/. Check the
controller/PortalHandler.java file for the available functions.
iii. The log file of the portal is accessible at
http://localhost:8080/portalWebApplication/log.txt. A quick sanity
check is sketched below.
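As a quick sanity check of the deployment, the URLs above can be probed from the command line (assuming the portal is deployed on localhost:8080 as described):
# The portal front page and the log file should both respond with 200
curl -i http://localhost:8080/portalWebApplication/
curl -i http://localhost:8080/portalWebApplication/log.txt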
3. Database installation
a. Download PostgreSQL
wget http://ftp.postgresql.org/pub/source/v9.3.4/postgresql-9.3.4.tar.gz
b. Untar the file
i. tar -zxvf postgresql-9.3.4.tar.gz
c. Installation
i. cd postgresql-9.3.4/
ii. ./configure --prefix=/home/ubuntu/usr/local
iii. make
iv. make install
d. Initialize the database
i. pg_ctl init -D src/postgresql-9.3.4/data/portal/
ii. pg_ctl start -D src/postgresql-9.3.4/data/portal -l
src/postgresql-9.3.4/data/portal/logfile
iii. createdb portaldb
e. Test database
i. psql portaldb
f. Create the tables as follows:
CREATE TABLE vo (
    id          serial PRIMARY KEY,
    name        varchar(100) NOT NULL,
    email       varchar(40) NOT NULL UNIQUE,
    passwd      varchar(200) NOT NULL,
    wfmid       integer,
    signup_time timestamp DEFAULT current_timestamp
);
CREATE TABLE users (
    id          serial PRIMARY KEY,
    name        varchar(100) NOT NULL,
    email       varchar(40) NOT NULL UNIQUE,
    passwd      varchar(200) NOT NULL,
    vo_id       integer,
    signup_time timestamp DEFAULT current_timestamp
);
CREATE TABLE session (
    uid       serial PRIMARY KEY,
    is_admin  varchar(40) NOT NULL,
    token     varchar(200),
    init_time timestamp DEFAULT current_timestamp
);
CREATE TABLE clusters (
    id            serial PRIMARY KEY,
    vo_id         integer NOT NULL,
    name          varchar(100) NOT NULL,
    url           varchar(100) NOT NULL,
    password      varchar(200) NOT NULL,
    addition_time timestamp DEFAULT current_timestamp
);
CREATE TABLE cresources (
    id            serial PRIMARY KEY,
    vo_id         integer NOT NULL,
    name          varchar(100) NOT NULL,
    access_id     varchar(100) NOT NULL,
    access_key    varchar(100) NOT NULL,
    addition_time timestamp DEFAULT current_timestamp
);
CREATE TABLE wfm (
    id        serial PRIMARY KEY,
    rtype     varchar(20) NOT NULL,
    iid       integer NOT NULL,
    init_time timestamp DEFAULT current_timestamp
);
CREATE TABLE cinstances (
    id             serial PRIMARY KEY,
    rid            integer NOT NULL,
    instance_id    varchar(40),
    cluster_status varchar(40),
    num_slave_req  integer,
    start_time     timestamp DEFAULT current_timestamp
);
CREATE TABLE slaves (
    id          serial PRIMARY KEY,
    ciid        integer NOT NULL,
    instance_id varchar(40),
    start_time  timestamp DEFAULT current_timestamp
);
CREATE TABLE workflow (
    id          serial PRIMARY KEY,
    vo_id       integer NOT NULL,
    uid         integer NOT NULL,
    name        varchar(100) NOT NULL,
    submit_time timestamp DEFAULT current_timestamp
);
CREATE TABLE wfsteps (
    id           serial PRIMARY KEY,
    wfid         integer NOT NULL,
    step_id      integer,
    tool_id      varchar(100) NOT NULL,
    target       integer NOT NULL,
    profile      varchar(100) NOT NULL,
    num_cpu_req  integer NOT NULL,
    depth        integer NOT NULL,
    dependency   varchar(100) NOT NULL,
    start_time   timestamp,
    e_runtime    integer,
    is_evaluated varchar(100) NOT NULL
);
CREATE TABLE tool_profiles (
    id         serial PRIMARY KEY,
    tool_id    varchar(100),
    iid        integer NOT NULL,
    inputfs    float NOT NULL,
    outputfs   float NOT NULL,
    node_count integer NOT NULL,
    cpu_count  integer,
    memPeak    float NOT NULL,
    cpuPeak    float NOT NULL,
    exec_time  bigint NOT NULL
);
CREATE TABLE tools (
    id        serial PRIMARY KEY,
    tool_id   varchar(100) NOT NULL,
    can_part  varchar(40) NOT NULL,
    can_scale varchar(40) NOT NULL,
    cpu_req   integer NOT NULL
);
i. Initialize the tools and tool_profiles tables by visiting:
http://localhost:8080/portalWebApplication/rest-ws/ittp
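With the schema in place, a quick check that the tables created in step f exist (psql's \dt meta-command lists them):
# List the tables in portaldb
psql portaldb -c '\dt'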
4. DeltaCloud installation
a. Follow the instructions to install deltacloud server
https://deltacloud.apache.org/install-deltacloud.html
b. To enable a single DeltaCloud server invocation for all cloud
drivers (avoiding multiple invocations of the deltacloud server), download
these three files from the repository:
i. URLConnectionTransport$1.class
ii. URLConnectionTransport.class
iii. URLConnectionTransport$HttpError.class
and update them in client-0.2.0-SNAPSHOT.jar (in the directory
org/apache/deltacloud/client/transport/), which is obtained at the
end of Step (a).
c. Start deltacloud using the following command (for AWS cloud)
i. deltacloudd -i ec2 -r 0.0.0.0 -p 3001 -P us-west-2 --daemon
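To verify the server is running, query the DeltaCloud API entry point (the /api path is DeltaCloud's standard entry point; the port matches the -p flag above):
# The response lists the available collections (instances, images, ...)
curl http://localhost:3001/api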
5. Service initialization script
a. The service initialization script is under src/scripts in the repository. Copy
this file to /etc/init.d/biocloudportal-init on the instance (together with
postgresql-stop, which is copied to the directory given in the init
script) and configure it accordingly, as sketched below. This will start
deltacloud and the database servers automatically. This script does
not start apache on bootup; run apache manually.
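A minimal sketch of installing and activating the script on a Debian/Ubuntu instance (the update-rc.d step is an assumption about the target distribution):
# Copy the init script into place, make it executable, and register it for boot
sudo cp src/scripts/biocloudportal-init /etc/init.d/biocloudportal-init
sudo chmod +x /etc/init.d/biocloudportal-init
sudo update-rc.d biocloudportal-init defaults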
BioCloud Workflow Manager
BWM consists of a Galaxy server to manage and run the workflows, a database manager
to store local information regarding the Galaxy server, and a proxy web server to host
Galaxy and handle user authentication.
1. Database installation
a. Download PostgreSQL
i. wget http://ftp.postgresql.org/pub/source/v9.3.4/postgresql-9.3.4.tar.gz
b. Untar the file
i. tar -zxvf postgresql-9.3.4.tar.gz
c. Installation
i. cd postgresql-9.3.4/
ii. ./configure --prefix=/home/ubuntu/usr/local
iii. make
iv. make install
d. Initialize the database
i. pg_ctl init -D src/postgresql-9.3.4/data/galaxy/
ii. pg_ctl start -D src/postgresql-9.3.4/data/galaxy -l
src/postgresql-9.3.4/data/galaxy/logfile
iii. createdb galaxydb
e. Test database
i. psql galaxydb
f. Create tables
i. The tables will be created automatically by Galaxy in the first
run
2. Proxy server installation
a. Install apache server
i. sudo apt-get update
ii. sudo apt-get install apache2
b. Enable rewrite, proxy_http, proxy, and header modules
i. sudo a2enmod rewrite
ii. sudo a2enmod proxy_http
iii. sudo a2enmod proxy
iv. sudo a2enmod headers
c. Restart apache to activate modules
i. sudo service apache2 restart
d. Define the Galaxy server as a valid proxy. Change the proxy configuration file
i. sudo emacs /etc/apache2/mods-enabled/proxy.conf
ii. Add the following lines, assuming Galaxy will run on port 9002
#Define Galaxy as a valid Proxy
<Proxy http://localhost:9002>
Order deny,allow
Allow from all
</Proxy>
e. Configure proxy rules
i. sudo emacs /var/www/.htaccess
ii. Add the following lines, assuming Galaxy will be installed
under /opt/sw/galaxy/MERCURIAL/galaxy-dist/ and run on
port 9002
RewriteEngine on
# Take the $REMOTE_USER environment variable and set it as a header in the proxy request.
#RequestHeader unset Cookie
AuthType Basic
AuthName Galaxy
AuthUserFile /opt/sw/galaxy/MERCURIAL/galaxy-dist/.htpasswd
Require valid-user
RewriteCond %{IS_SUBREQ} ^false$
RewriteCond %{LA-U:REMOTE_USER} (.+)
#RewriteRule . - [E=RU:%1]
RewriteRule .* - [E=RU:%1]
RequestHeader set REMOTE_USER %{RU}e
#RequestHeader add X-Forwarded-User %{RU}e
RewriteRule ^/static/style/(.*) /opt/sw/galaxy/MERCURIAL/galaxy-dist/static/june_2007_style/blue/$1 [L]
RewriteRule ^/static/scripts/(.*) /opt/sw/galaxy/MERCURIAL/galaxy-dist/static/scripts/packed/$1 [L]
RewriteRule ^/static/(.*) /opt/sw/galaxy/MERCURIAL/galaxy-dist/static/$1 [L]
RewriteRule ^/favicon.ico /opt/sw/galaxy/MERCURIAL/galaxy-dist/static/favicon.ico [L]
RewriteRule ^/robots.txt /opt/sw/galaxy/MERCURIAL/galaxy-dist/static/robots.txt [L]
RewriteRule ^(.*) http://localhost:9002/$1 [P]
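Until the automatic credential sync of the next step is in place, the AuthUserFile referenced above can be seeded by hand for testing (the username and password here are placeholders):
# Create the .htpasswd file with a single test user
htpasswd -b -c /opt/sw/galaxy/MERCURIAL/galaxy-dist/.htpasswd testuser testpass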
f. Obtain user credentials automatically from the portal
i. Create the script file
emacs /opt/sw/galaxy/MERCURIAL/galaxy-dist/check_users.py
ii. Add the following lines to the file
#!/usr/bin/python
import subprocess
import requests

# Read the portal address from the ip_conf file.
f = open('/opt/sw/galaxy/MERCURIAL/galaxy-dist/ip_conf', 'r')
for line in f:
    if line.split(" ")[0] == 'portal':
        s = line.split(" ")[1]
        wsurl = s + ':8080/portalWebApplication/rest-ws/getcrdntls'
f.close()

# Fetch the user credentials from the portal's REST interface.
headers = {'content-type': 'application/json'}
r = requests.get(wsurl, headers=headers)
for i in range(0, len(r.json())):
    uname = (r.json()[i])['accessid']
    passwd = (r.json()[i])['accesskey']
    # The first user creates the .htpasswd file (-c); later users are appended.
    if i == 0:
        cmd = 'htpasswd -b -c /opt/sw/galaxy/MERCURIAL/galaxy-dist/.htpasswd ' + uname + ' ' + passwd
    else:
        cmd = 'htpasswd -b /opt/sw/galaxy/MERCURIAL/galaxy-dist/.htpasswd ' + uname + ' ' + passwd
    p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    for line in p.stdout.readlines():
        print line,
    retval = p.wait()
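The script can also be run once by hand to verify that credentials are fetched and written (assuming the portal is reachable at the address stored in ip_conf):
# Run the sync manually and inspect the resulting password file
cd /opt/sw/galaxy/MERCURIAL/galaxy-dist
python check_users.py
cat .htpasswd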
iii. Change the Galaxy startup script
emacs /opt/sw/galaxy/MERCURIAL/galaxy-dist/run.sh
iv. Add the following line to the file
python check_users.py
3. Galaxy server installation
a. Install required packages
sudo apt-get install build-essential
gcc --version
sudo apt-get update
sudo apt-get install gcc
sudo apt-get install make
sudo apt-get install libsqlite3-dev
sudo apt-get install libncurses5-dev
sudo apt-get install tk-dev
sudo apt-get install libdb5.1-dev   # or the generic libdb-dev
sudo apt-get install libgdbm-dev
sudo apt-get install libssl-dev
sudo apt-get install libreadline6-dev
b. Install zLib, bzip2, and Python under the /home/ubuntu/usr/local prefix
zLib:
wget http://zlib.net/zlib-1.2.8.tar.gz
./configure --prefix=/home/ubuntu/usr/local
make test
make install
bzip2:
wget http://www.bzip.org/1.0.6/bzip2-1.0.6.tar.gz
make -f Makefile-libbz2_so
make
make install PREFIX=/home/ubuntu/usr/local
Python (from the unpacked Python source tree):
./configure --prefix=/home/ubuntu/usr/local --enable-shared   # --enable-shared is optional
make
make install
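Because the libraries and Python above are installed under the non-standard prefix /home/ubuntu/usr/local, the shell environment must be able to find them; a minimal sketch (where to persist these lines, e.g. ~/.bashrc, is left to the reader):
# Make the locally installed binaries and shared libraries visible
export PATH=/home/ubuntu/usr/local/bin:$PATH
export LD_LIBRARY_PATH=/home/ubuntu/usr/local/lib:$LD_LIBRARY_PATH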
c. Install docutils and mercurial
Docutils
python setup.py install
Mercurial
make PREFIX=/home/ubuntu/usr/local install
d. Install Galaxy server
hg clone https://bitbucket.org/galaxy/galaxy-dist/
cd galaxy-dist
hg update stable
e. Install Torque/OpenPBS
apt-get install -y build-essential
apt-get install -y autoconf
apt-get install -y automake
apt-get install -y pkg-config chkconfig
apt-get install -y libtool
apt-get install -y libboost-all-dev
apt-get install -y openssl
apt-get install -y gcc g++
apt-get install -y libxml2-dev
apt-get install -y git
git clone https://github.com/adaptivecomputing/torque.git -b 4.5.0 /opt/4.5.0
cd /opt/4.5.0
export CXXFLAGS="$CXXFLAGS -fPIC"
./autogen.sh
./configure --with-server-home=/opt/torque
make;make install
cp contrib/init.d/debian.trqauthd /etc/init.d/trqauthd
chkconfig --add trqauthd
echo /usr/local/lib > /etc/ld.so.conf.d/torque.conf
ldconfig
service trqauthd start
QMGR="/opt/torque/qmgr.conf"
echo "master" > /opt/torque/server_name
/usr/local/sbin/pbs_server -t create
echo "# Create queues and set their attributes" > $QMGR
echo "create queue workq" >> $QMGR
echo "set queue workq queue_type = Execution" >> $QMGR
echo "set queue workq max_queuable = 50" >> $QMGR
echo "set queue workq enabled = True" >> $QMGR
echo "set queue workq started = True" >> $QMGR
echo "set server scheduling = True" >> $QMGR
echo "set server default_queue = workq" >> $QMGR
echo "set server log_events = 511" >> $QMGR
echo "set server mail_from = adm" >> $QMGR
echo "set server query_other_jobs = True" >> $QMGR
echo "set server resources_default.neednodes = 1" >> $QMGR
echo "set server resources_default.nodect = 1" >> $QMGR
echo "set server resources_default.nodes = 1" >> $QMGR
echo "set server scheduler_iteration = 600" >> $QMGR
echo "set server node_check_rate = 150" >> $QMGR
echo "set server tcp_timeout = 6" >> $QMGR
echo "set server node_pack = False" >> $QMGR
/usr/local/bin/qmgr < $QMGR
qterm
service pbs_server restart
cat > /opt/torque/server_priv/nodes << EOL
`cat /root/machinefile`
EOL
cp contrib/init.d/debian.pbs_mom /etc/init.d/pbs_mom
update-rc.d pbs_mom defaults
cp contrib/init.d/debian.pbs_sched /etc/init.d/pbs_sched
update-rc.d pbs_sched defaults
make packages
wget -P /root http://cz.archive.ubuntu.com/ubuntu/pool/universe/p/pssh/pssh_2.1.1-1_all.deb
apt-get install -y python-support
dpkg -i /root/pssh_2.1.1-1_all.deb
parallel-scp -h /root/machinefile /opt/4.5.0/torque-package-clients-linux-x86_64.sh /tmp
parallel-scp -h /root/machinefile /opt/4.5.0/torque-package-mom-linux-x86_64.sh /tmp
parallel-ssh -h /root/machinefile /tmp/torque-package-clients-linux-x86_64.sh --install
parallel-ssh -h /root/machinefile /tmp/torque-package-mom-linux-x86_64.sh --install
parallel-scp -h /root/machinefile /opt/4.5.0/contrib/init.d/debian.pbs_mom /etc/init.d/pbs_mom
parallel-ssh -h /root/machinefile update-rc.d pbs_mom defaults
/etc/init.d/pbs_server restart
/etc/init.d/pbs_mom restart
/etc/init.d/pbs_sched restart
parallel-ssh -h /root/machinefile service pbs_mom restart
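Once the server, scheduler, and MoMs are running, the Torque setup can be smoke-tested with the standard client commands (a sketch; it assumes the install above put them on the PATH):
# All nodes should eventually report state = free
pbsnodes -a
# The workq queue configured above should be listed
qstat -q
# Submit a trivial job and watch it run to completion
echo "sleep 30" | qsub
qstat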
f. Install PBS_Python
Download pbs_python from https://oss.trac.surfsara.nl/pbs_python/ and compile it
against the Torque installation. Then place the resulting Python egg:
LIBTORQUE_DIR=/path/to/libtorque python scripts/scramble.py -e pbs_python
g. Modify the main configuration file
emacs /opt/sw/galaxy/MERCURIAL/galaxy-dist/universe_wsgi.ini
Change the port number to 9002
Change the database connection to postgres://ubuntu:@localhost:5432/galaxydb
Set new_file_path to /opt/sw/galaxy/MERCURIAL/tmp
Set cluster_files_directory to database/pbs
Set collect_outputs_from to job_working_directory
Set remote_user to True
Set job_config_file to job_conf.xml
h. Modify the job configuration file
i. Add the PBS job runner. A sample file can be similar to this
<?xml version="1.0"?>
<job_conf>
    <plugins workers="4">
        <plugin id="pbs" type="runner" load="galaxy.jobs.runners.pbs:PBSJobRunner" workers="2"/>
        <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner"/>
    </plugins>
    <handlers>
        <handler id="main"/>
    </handlers>
    <destinations>
        <destination id="pbs_default" runner="pbs">
            <!-- <param id="Resource_List">nodes=1:ppn=2,walltime=72:00:00</param> -->
            <param id="Resource_List">nodes=1:ppn=1</param>
        </destination>
        <destination id="pbs_longjobs" runner="pbs">
            <!-- Define parameters that are native to the job runner plugin. -->
            <param id="Resource_List">walltime=72:00:00</param>
        </destination>
        <destination id="pbs_8" runner="pbs">
            <param id="Resource_List">nodes=1:ppn=8,walltime=72:00:00</param>
        </destination>
        <destination id="local_default" runner="local"/>
    </destinations>
    <tools>
        <tool id="gops_join_1" destination="local_default"/>
        <tool id="Show tail1" destination="pbs_8"/>
    </tools>
</job_conf>
i. Add the tools to Galaxy
i. Change tool configuration file
emacs /opt/sw/galaxy/MERCURIAL/galaxy-dist/tool_conf.xml
ii. Add the following lines to the file
<section id="rna_seq_assembly" name="RNA-SEQ Assembly">
<tool file="assemblyWF/transAbyss.xml"/>
<tool file="assemblyWF/SOAPdenovo_Trans.xml"/>
<tool file="assemblyWF/IDBA_tran.xml"/>
<tool file="assemblyWF/Trinity.xml"/>
<tool file="assemblyWF/VelvetOases.xml"/>
<tool file="assemblyWF/VelvetOasses.xml"/>
<tool file="assemblyWF/Trimmomatic.xml"/>
<tool file="assemblyWF/TGICL.xml"/>
<tool file="assemblyWF/Merger.xml"/>
<tool file="assemblyWF/BlastX_1.xml"/>
<tool file="assemblyWF/BlastX_2.xml"/>
<tool file="assemblyWF/BlastX_3.xml"/>
<tool file="assemblyWF/Blastx_NR.xml"/>
<tool file="assemblyWF/Blastx_Uniport.xml"/>
<tool file="assemblyWF/Blastx_Swiss.xml"/>
<tool file="assemblyWF/Parse_blast.xml"/>
<tool file="assemblyWF/Fastq_to_Fasta.xml"/>
<tool file="assemblyWF/Fastq_to_fasta.xml"/>
<tool file="assemblyWF/Fasta_Filter.xml"/>
</section>
<section name="Bio-CloudBroker Tools" id="mTools">
<tool file="myTools/mpiformatdb.xml" />
<tool file="myTools/mpiblast.xml" />
<tool file="myTools/partitionFileRef.xml" />
<tool file="myTools/mergeFileDynRef.xml" />
</section>
<section name="ExomeSeq WF" id="ExomeSeqWF">
<tool file="exomeSeqWF/bwa-0.7.5_aln.xml" />
<tool file="exomeSeqWF/bwa-0.7.5_sampe.xml" />
<tool file="exomeSeqWF/sam2bam_sort.xml" />
<tool file="exomeSeqWF/picard-1.94-markduplicates.xml" />
<tool file="exomeSeqWF/picard-1.94-addreadgroups.xml" />
<tool file="exomeSeqWF/gatk-2.8.1_realign.xml" />
<tool file="exomeSeqWF/gatk-1.4_somatic_indel_detector.xml" />
</section>
<section name="ExomeSeq WF by Ref" id="ExomeSeqRefWF">
<tool file="exomeSeqRefWF/bwa-0.7.5_aln.xml" />
<tool file="exomeSeqRefWF/bwa-0.7.5_sampe.xml" />
<tool file="exomeSeqRefWF/sam2bam_sort.xml" />
<tool file="exomeSeqRefWF/picard-1.94-markduplicates.xml" />
<tool file="exomeSeqRefWF/picard-1.94-addreadgroups.xml" />
<tool file="exomeSeqRefWF/gatk-2.8.1_realign.xml" />
<tool file="exomeSeqRefWF/gatk-1.4_somatic_indel_detector.xml" />
</section>
iii. Copy the tools under myTools, exomeSeqWF, exomeSeqRefWF,
and assemblyWF to /opt/sw/galaxy/MERCURIAL/galaxy-dist/tools/ as sketched below
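A sketch of the copy, assuming the four tool directories sit in the current working directory (for example, a checkout of the repository):
# Copy the tool wrappers into the Galaxy tools directory
cp -r myTools exomeSeqWF exomeSeqRefWF assemblyWF /opt/sw/galaxy/MERCURIAL/galaxy-dist/tools/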
j. Change the IP address of the BP
i. sudo emacs /opt/sw/galaxy/MERCURIAL/galaxy-dist/ip_conf
ii. A record should be similar to this, without a newline character
at the end
portal http://ec2-ip.address.compute.amazonaws.com
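One way to write the record without a trailing newline is printf, which (unlike echo) does not append one; the hostname below is the placeholder from the example record:
# Write the portal record with no trailing newline
printf 'portal http://ec2-ip.address.compute.amazonaws.com' > /opt/sw/galaxy/MERCURIAL/galaxy-dist/ip_conf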
k. Change Galaxy code to improve the handling of submitted workflows
i. emacs /opt/sw/galaxy/MERCURIAL/galaxy-dist/lib/galaxy/webapps/galaxy/controllers/workflow.py
ii. emacs /opt/sw/galaxy/MERCURIAL/galaxy-dist/lib/galaxy/tools/actions/__init__.py
iii. Modified code is marked with "ifs". Search for this term to
find the modified locations in the repository, and then update
the Galaxy code in these two files.
l. Change Galaxy code to intervene in the job submission process and run
the jobs based on the BP decision
i. emacs /opt/sw/galaxy/MERCURIAL/galaxy-dist/lib/galaxy/jobs/handler.py
ii. Modified code is marked with "ifs". Search for this term to
find the modified locations in the repository, and then update
the Galaxy code.
m. Change Galaxy code so that BP can track the job completion process
i. emacs /opt/sw/galaxy/MERCURIAL/galaxy-dist/lib/galaxy/jobs/__init__.py
ii. Modified code is marked with "ifs". Search for this term to
find the modified locations in the repository, and then update
the Galaxy code.
n. Change Galaxy code to collect the profiling information
i. emacs /opt/sw/galaxy/MERCURIAL/galaxy-dist/lib/galaxy/jobs/runners/pbs.py
ii. Modified code is marked with "ifs". Search for this term to
find the modified locations in the repository, and then update
the Galaxy code.
iii. emacs /opt/sw/galaxy/MERCURIAL/galaxy-dist/lib/galaxy/jobs/runners/util/job_script/DEFAULT_JOB_FILE_TEMPLATE.sh
iv. Modified code is not marked in this file; change it accordingly.
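Since the modifications in steps k through n are tagged the same way, they can be located in one pass from the galaxy-dist root (a sketch; it searches the files listed above):
# Find the modified locations marked with "ifs"
cd /opt/sw/galaxy/MERCURIAL/galaxy-dist
grep -rn 'ifs' lib/galaxy/webapps/galaxy/controllers/workflow.py lib/galaxy/tools/actions/__init__.py lib/galaxy/jobs/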
o. The Galaxy server contains the following scripts under the /opt directory:
i. getPID1.sh - To collect the profile information
ii. checkpbs.sh - To check the status of the PBS scheduler
iii. freenodes.sh - To identify the free nodes in the cluster
iv. client.py - To send the collected information to the profiler in BP
p. Service initialization script
i. Copy biocloudwfm-init under src/scripts in the repository to
/etc/init.d and activate it, as sketched below.
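A minimal sketch of the activation, mirroring the portal-side init script in Step 5 of the BP section (the update-rc.d step is an assumption about the target distribution):
# Install and register the workflow-manager init script
sudo cp src/scripts/biocloudwfm-init /etc/init.d/biocloudwfm-init
sudo chmod +x /etc/init.d/biocloudwfm-init
sudo update-rc.d biocloudwfm-init defaults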