IBM SmartCloud Enterprise getting started guide

1. Overview

Access link: https://www-147.ibm.com/cloud/enterprise/dashboard

We are going to work in IBM SmartCloud Enterprise (SCE). The first thing to do is sign in. The dashboard is divided into four parts:

• Overview. View and access the main options to be managed.
• Control panel. Add images, create instances and control the storage.
• Account. Manage users inside instances and their permissions.
• Support. Help and guidance for using the IBM Cloud.

Overview panel
This window gives us information about our instances and our recent activity in SCE.

Control panel
Here we can create, delete or reboot instances, and so on. We can also define prepared images and specific storage for large data sets.

Account
These tools allow us to manage fixed IPs and the keys used to connect to the machines.

Support

2. Create and access instances

You can create different kinds of instances, from predefined images or from your own images. This document focuses on preinstalled images and how to configure them. First of all, we have to learn how to create and manage instances from the command line. A link with a couple of examples:

http://www.ibm.com/developerworks/cloud/library/cl-commandlinelx/

These examples show how to create and describe instances. We can also delete them or query most of the information shown in the dashboard described above. When we create an instance we have to choose a private key. I recommend using the same key for all the instances, so that the instances can connect to each other with the same key. This key also lets us connect to the instance from client software; for that, download the key from the dashboard when you create a private key in the Account window.

If we want to connect by SSH (e.g. with PuTTY) the steps are: first, generate a .ppk file with PuTTYgen by clicking the Load button and selecting the private key downloaded from the dashboard. After that, in PuTTY, enter the instance's IP, enable the option "Allow attempted changes of username in SSH-2" and select the private key (*.ppk) previously created with PuTTYgen.

3. Configure instances

3.1. Use MPI in the instances

http://www.ibm.com/developerworks/systems/library/es-hpcluster1/

The only thing we need to do (after installing MPI on one machine in the usual way) is to copy the installation folder to the other machines and set the PATH environment variable. The steps for each part are described in the last pages of this document.

Linux command line
If we want to manage the cloud from the command line we have to download the command-line tools and set up the Java environment. The following link helps:

https://www-147.ibm.com/cloud/enterprise/ram/assetDetail/generalDetails.faces?guid={F1466F46-A4AB-3879D883-1A26A43BF046}&v=1.4.1&fid=&tid=&post=

To access your IBM account from the command line you have to create a file with a passphrase. There is a command to create it:

ic-create-password -u|--username username -p|--password password -w|--passphrase passphrase -g|--file-path filepath [-h|--help]

See the documentation for the command lines needed to manage the IBM Cloud.
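For example, the passphrase file (pass.txt) used by the other commands in this guide could be generated like this. This is only a sketch based on the syntax above; the username, password, passphrase and file path are placeholder values:

ic-create-password -u myuser@example.com -p myIBMpassword -w myPassphrase -g /home/idcuser/keys/pass.txt

The resulting file is the one passed later as -g pass.txt (together with -w passphrase) to commands such as ic-create-instance.sh.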
Pseudo-code for managing instances using MPI:

1. Create the Master instance:
   ./ic-create-instance.sh
   or add the instance through the dashboard web interface.

2. Configure the Master instance:
   a. Copy the MPI directory (mpich) and the "command line tools" directory.
   b. Set the PATH and JAVA_HOME environment variables for MPI and Java:
      export PATH=$PATH:/home/idcuser/mpich/bin
      export JAVA_HOME=/usr/lib/jvm/jre-1.6.0-openjdk
      export PATH=$PATH:$JAVA_HOME/bin
   c. Copy the keys directory, containing key.priv and pass.txt, for accessing IBM. key.priv allows the Master to connect to any worker the first time (by SSH); we already have this file (downloaded earlier from the dashboard). pass.txt is the file generated with the command-line tool to access the IBM Cloud (see the previous section on creating it).
   d. Stop iptables: sudo /sbin/service iptables stop
   e. Generate a public key for SSH: ssh-keygen -t rsa. This produces a file called id_rsa.pub, which we use to allow the Master to connect to the workers. Set the permissions needed for SSH and authorize the key:
      chmod 700 .ssh
      chmod 600 key.priv
      cat .ssh/id_rsa.pub > .ssh/authorized_keys2
      chmod 600 .ssh/*

3. Create the Worker instances:
   while nPE < Max do
      ./ic-create-instance.sh -u user -w passphrase -g pass.txt -t instanceType (e.g. COP32.1/2048/60) -k imageID (see the dashboard information) -L region (e.g. 61 for Germany) -n "instance name" -d "description" > instance.txt
      ./getID instance.txt newID.txt
      (getID is a small program that writes a file containing only the ID of the new instance; this file is used later to configure each instance created from the Master.)
      nPE = nPE + 1
   end while

4. Configure the workers from the Master instance:
   a. Copy the mpich directory and set the PATH environment variable.
   b. Copy the id_rsa.pub key into the file authorized_keys2 to allow the Master to connect to this machine. We can do this with scp using the -i parameter; key.priv, the key file downloaded from the dashboard, can be sent from the Master machine. An example:
      scp -r -i key.priv "masterIP":/home/…/.ssh/id_rsa.pub /home/…/.ssh/authorized_keys2
      Don't forget to change the key.priv permissions: chmod 600 key.priv
   c. Change the permissions for SSH:
      chmod 700 .ssh
      chmod 600 .ssh/*
   d. Stop iptables: sudo /sbin/service iptables stop
   e. Create the directory for MrCirrus: software, data, etc. It is also necessary to create a small script called "sendResult.sh", because when MrCirrus-Worker finishes it executes this script to send the partial result back to the Master machine. If you use an earlier version of MrCirrus, this step is not necessary.
   f. Create a directory with the same name as the tools directory of the IBM command-line tools. This is necessary because mpirun changes the execution directory (step e) and the worker machine must have the same directories. Example: the Master has the following directories:
      - cmdTools, with the command-line tools and the software needed to prepare and send the configuration to the workers.
      - testMPI, with the MrCirrus software, the sequential application and the data.
      The worker machine must have directories with the same names.
   g. Once you have configured the machines, you can test that everything is right. Try to connect via SSH from the Master machine (ssh "IPworker", see the sketch below) and also check that the data and directories are right.
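A minimal way to check the setup from the Master. This sketch is not part of the original guide; it assumes the worker IPs are listed one per line in a hypothetical file called workerIPs.txt and that the MPICH bin directory is already in the PATH as set in step 2b:

#!/bin/bash
# Verify passwordless SSH and the expected directory layout on every worker.
for ip in $(cat workerIPs.txt); do
    echo "checking $ip ..."
    ssh -o BatchMode=yes -o StrictHostKeyChecking=no idcuser@$ip "hostname; ls -d mpich cmdTools testMPI"
done
# Optional smoke test: launch a trivial command on all hosts through MPI.
# Depending on the MPICH version, the host-file flag is -machinefile or -f.
mpirun -np 4 -machinefile workerIPs.txt hostname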
3.2. Hadoop

In Hadoop (hadoop-streaming in this case) you have to create a script that uploads and downloads the files needed to run your application; there are command-line tools for all of this. The script depends on the software. This document includes an example that prepares a BLAST execution for Hadoop; that script must be sent to all Hadoop machines (slaves) before running the job.

Some of the commands used to manage hadoop-streaming are:

Manage files:
   hadoop dfs -get "Hadoop-directory/file" "local directory"
   hadoop dfs -put "local directory" "Hadoop-directory/file"
   hadoop dfs -rm "Hadoop-directory/file"
   hadoop dfs -rmr "Hadoop-directory"

Execute hadoop-streaming:
   hadoop jar /mnt/biginsights/opt/ibm/biginsights/IHC/Contrib/streaming/hadoop-0.20.2-streaming.jar -mapper "scriptCreated" -input "script-map" -output "Hadoop directory"

where scriptCreated is the script shown in this document and script-map is a file with one execution line per task.

4. Script examples

Create instances:

Example: ./createInstances.sh 4 testIBM /home/…/passFile

#!/bin/bash
nproc=$1
prefix=$2   # prefix for the names of the machines created
pass=$3     # path of the passphrase file created with the command-line tool

while [ $nproc -ne 0 ]
do
   # create instances according to the number of processors
   echo "creating instance..."
   ./ic-create-instance.sh -u ots@bioinf.jku.at -w bitlab -g $pass -t COP32.1/2048/60 -k 20032564 -c keyRHEL -L 61 -n "$prefix$nproc" -d "Instance created by command line" > instances.txt
   nproc=`expr $nproc - 1`
   echo "instance created... $nproc remaining."
   cat instances.txt >> first.txt
   ./getIDs instances.txt IDsAux.txt
   cat IDsAux.txt >> newIDs.txt
done

Configure instances:

First of all, I send this script to each worker and execute it by SSH from the Master node (see the sketch after the script).

Example: ./scriptClient.sh "IPMaster" "Directory of MrCirrus (only the name)"

#!/bin/bash
# Copy the MPI directory from the Master
miIP=$1     # Master IP (first argument)
dirMPI=$2   # name of the MrCirrus directory (second argument)
scp -r -i key2.priv -o StrictHostKeyChecking=no $miIP:/home/idcuser/mpich/ /home/idcuser/
chmod 777 mpich/bin/*
echo "export PATH=/home/idcuser/mpich/bin:$PATH" >> .bash_profile
scp -i key2.priv -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no $miIP:/home/idcuser/.ssh/id_rsa.pub /home/idcuser/.ssh/authorized_keys2
chmod 700 .ssh/
chmod 600 .ssh/*
sudo /sbin/service iptables stop
# once the machine is prepared, copy the data
mkdir cmdTools
mkdir $dirMPI
scp -r -i key2.priv -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no $miIP:/home/idcuser/$dirMPI/soft /home/idcuser/$dirMPI
scp -r -i key2.priv -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no $miIP:/home/idcuser/$dirMPI/data /home/idcuser/$dirMPI
# When the worker finishes, MrCirrus executes a script called sendResult.sh.
# This script depends on the application; in this case AllFrag is the application used.
echo "scp -i /home/idcuser/key2.priv -o StrictHostKeyChecking=no /home/idcuser/$dirMPI/bin* $miIP:/home/idcuser/$dirMPI/" > /home/idcuser/$dirMPI/sendResult.sh
echo "ls -l > ls.txt" >> /home/idcuser/$dirMPI/sendResult.sh
echo "rm bin*" >> /home/idcuser/$dirMPI/sendResult.sh
chmod 777 /home/idcuser/$dirMPI/sendResult.sh
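To run scriptClient.sh on every worker from the Master, something like the following could be used. This sketch is not part of the original guide; workerIPs.txt is a hypothetical file with one worker IP per line (the IPs can be looked up in the dashboard, or with the describe-instances command mentioned earlier, from the IDs stored in newIDs.txt), and the Master IP and directory name are placeholders:

#!/bin/bash
masterIP=192.0.2.10   # placeholder: public IP of the Master instance
dirMPI=testMPI        # placeholder: name of the MrCirrus directory
for ip in $(cat workerIPs.txt); do
    # send the configuration script and the private key, then run the configuration remotely
    scp -i key2.priv -o StrictHostKeyChecking=no scriptClient.sh key2.priv idcuser@$ip:/home/idcuser/
    ssh -i key2.priv -o StrictHostKeyChecking=no idcuser@$ip "cd /home/idcuser && chmod +x scriptClient.sh && ./scriptClient.sh $masterIP $dirMPI"
done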
Hadoop script

#!/usr/bin/env bash
# map_Hadoop_blast.sh: running a C program using Hadoop Streaming
# Written by Oscar, 2011

# Each input line has the form:
#   <application> <algorithm> -d <db> -i <query> -o <output> -b <b> -v <v>
read offset CMD
echo "offset = $offset, CMD = $CMD" >&2

#bucket=jlc-uma
folder=dirPrueba
app=blastall64
APP_PATH=`pwd`

# Split CMD into individual words (whitespace delimiter)
set $CMD
app=$offset
algorithm=$1
db=$3
seq2=$5
output_file=$7
paramb=${9}
paramv=${11}
#S3_PREFIX='$bucket.s3.amazonaws.com'

# Check directory contents
ls >&2

# Download the application, the BLAST database files and the query from HDFS if missing
if [ ! -f $app ]
then
   echo "soft get $app" >&2
   hadoop dfs -get $folder/$app $app
fi
if [ ! -f $db ]
then
   echo "hadoop dfs -get $folder/$db $db" >&2
   hadoop dfs -get $folder/$db $db
fi
if [ ! -f $db.nhr ]
then
   echo "hadoop dfs -get $folder/$db.nhr $db.nhr" >&2
   hadoop dfs -get $folder/$db.nhr $db.nhr
fi
if [ ! -f $db.nin ]
then
   echo "hadoop dfs -get $folder/$db.nin $db.nin" >&2
   hadoop dfs -get $folder/$db.nin $db.nin
fi
if [ ! -f $db.nsq ]
then
   echo "hadoop dfs -get $folder/$db.nsq $db.nsq" >&2
   hadoop dfs -get $folder/$db.nsq $db.nsq
fi
if [ ! -f $seq2 ]
then
   echo "hadoop dfs -get $folder/$seq2 $seq2" >&2
   hadoop dfs -get $folder/$seq2 $seq2
fi

# Run BLAST
echo "Launching: $app -p $algorithm -d $db -i $seq2 -o $output_file -b $paramb -v $paramv" >&2
chmod 777 $APP_PATH/$app
$APP_PATH/$app -p $algorithm -d $db -i $seq2 -o $output_file -b $paramb -v $paramv
ls >&2

# Upload the result back to HDFS
echo "Uploading to IBM $output_file $folder" >&2
hadoop dfs -put $output_file $folder/$output_file

# Write something to standard output
echo "File $seq2 processed over the DB $db. Output file is $output_file !!"

# Script reports status
exit 0
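An end-to-end launch from the machine running BigInsights could then look roughly like this. This sketch is not part of the original guide: the HDFS folder (dirPrueba), the database (ecoli.nt), the query files and the output names are placeholder values, and the format of the script-map lines simply follows the variable assignments at the top of the mapper script above:

#!/bin/bash
# Upload the BLAST binary, the formatted database and the query sequences to HDFS.
for f in blastall64 ecoli.nt ecoli.nt.nhr ecoli.nt.nin ecoli.nt.nsq query1.fa query2.fa; do
    hadoop dfs -put $f dirPrueba/$f
done
# Build the map input: one execution line per task.
cat > script-map <<EOF
blastall64 blastn -d ecoli.nt -i query1.fa -o out_1 -b 10 -v 10
blastall64 blastn -d ecoli.nt -i query2.fa -o out_2 -b 10 -v 10
EOF
hadoop dfs -put script-map dirPrueba/script-map
# Launch the streaming job: each line of script-map is fed to map_Hadoop_blast.sh on a slave,
# assuming the mapper script has already been copied to all slaves as described above.
hadoop jar /mnt/biginsights/opt/ibm/biginsights/IHC/Contrib/streaming/hadoop-0.20.2-streaming.jar \
    -mapper map_Hadoop_blast.sh -input dirPrueba/script-map -output dirPrueba/out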