Cloud Computing
Mick Watson
Director of ARK-Genomics
The Roslin Institute

Structure
• What is a computer
  – Desktops / servers / clusters
  – Clients / servers
• Virtualisation
• The Cloud
• Accessing the Amazon cloud
• Costs etc

What is a computer?

What is a server?

What is a cluster?
• A cluster is a connected set of computers (nodes)
• Any job can be run on any of the nodes
• A server may be a single PC, or it may be a cluster

An operating system?
• An OS is the software that runs the computer:
  – E.g. Windows
  – E.g. Linux
  – E.g. Mac OS X
  – E.g. Android
  – E.g. Solaris
  – E.g. iOS

Virtualisation
• You can run an entire computer inside a computer
• Take the OS, the data and all running processes
• Create an “image”
• Recreate that image inside another PC
• Access it as if it were a normal (physical) PC
• http://en.wikipedia.org/wiki/Virtualization

What is “the cloud”?
• There is not just one!
  – Amazon EC2
  – Rackspace
  – Etc
• “The Cloud” refers to a large cluster of computers, in which you can create, for a fee, as many virtual computers as you like

AMAZON EC2

We will use Amazon EC2
• Terminology:
  – EC2 = “Elastic Compute Cloud”
  – Image: a preconfigured computer image, like a template
  – Instance: a running virtual copy of an image that you can log in to and use
• Select a pre-configured Amazon Machine Image (AMI) to get up and running immediately.
• Or create an AMI containing your applications, libraries, data, and associated configuration settings.
• Configure security and network access
• Choose which instance type(s) you want, then start, terminate, and monitor as many instances as you like.
• Pay only for the resources that you actually consume, like instance-hours or data transfer
• Shut down the instance(s) when finished

Linux
• Linux refers to an entire family of operating systems:
  – Red Hat
  – Ubuntu
  – Debian etc
• Linux is free
• Many of the computers that power the internet run Linux
• Almost all bioinformaticians use it
• Powerful, extendable and open

The power is the command line
• Don’t panic!

BioLinux
• http://nebc.nerc.ac.uk/tools/bio-linux/bio-linux-6.0
• A Linux operating system designed for bioinformatics
• More than 500 bioinformatics software programs installed on top of an Ubuntu 10.04 base
• There are BioLinux AMIs on EC2: CloudBioLinux
• We have created our own AMI based on CloudBioLinux

How powerful?
• Once you have selected the “type” of computer (AMI) you must then select the size and power
• M1 Small Instance (Default): 1.7 GiB of memory, 1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit), 160 GB of local instance storage, 32-bit or 64-bit platform
• M1 Medium Instance: 3.75 GiB of memory, 2 EC2 Compute Units (1 virtual core with 2 EC2 Compute Units), 410 GB of local instance storage, 32-bit or 64-bit platform
• M1 Large Instance: 7.5 GiB of memory, 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each), 850 GB of local instance storage, 64-bit platform
• M1 Extra Large Instance: 15 GiB of memory, 8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each), 1690 GB of local instance storage, 64-bit platform
• M3 Extra Large Instance: 15 GiB of memory, 13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each), EBS storage only, 64-bit platform
• M3 Double Extra Large Instance: 30 GiB of memory, 26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each), EBS storage only, 64-bit platform

How powerful?
• Micro Instance: 613 MiB of memory, up to 2 ECUs (for short periodic bursts), EBS storage only, 32-bit or 64-bit platform (free)
• High-Memory Extra Large Instance: 17.1 GiB of memory, 6.5 EC2 Compute Units (2 virtual cores with 3.25 EC2 Compute Units each), 420 GB of local instance storage, 64-bit platform
• High-Memory Double Extra Large Instance: 34.2 GiB of memory, 13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each), 850 GB of local instance storage, 64-bit platform
• High-Memory Quadruple Extra Large Instance: 68.4 GiB of memory, 26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each), 1690 GB of local instance storage, 64-bit platform
• High-CPU Medium Instance: 1.7 GiB of memory, 5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units each), 350 GB of local instance storage, 32-bit or 64-bit platform
• High-CPU Extra Large Instance: 7 GiB of memory, 20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each), 1690 GB of local instance storage, 64-bit platform

How powerful?
• Cluster Compute Quadruple Extra Large: 23 GiB of memory, 33.5 EC2 Compute Units, 1690 GB of local instance storage, 64-bit platform, 10 Gigabit Ethernet
• Cluster Compute Eight Extra Large: 60.5 GiB of memory, 88 EC2 Compute Units, 3370 GB of local instance storage, 64-bit platform, 10 Gigabit Ethernet
• Cluster GPU Quadruple Extra Large: 22 GiB of memory, 33.5 EC2 Compute Units, 2 x NVIDIA Tesla “Fermi” M2050 GPUs, 1690 GB of local instance storage, 64-bit platform, 10 Gigabit Ethernet
• High I/O Quadruple Extra Large: 60.5 GiB of memory, 35 EC2 Compute Units, 2 x 1024 GB of SSD-based local instance storage, 64-bit platform, 10 Gigabit Ethernet

THE PRACTICAL
• http://www.ark-genomics.org/events-online-training/eu-training-course
• In this course you will start a new Amazon EC2 instance and begin to learn some of the essentials of the Linux command line
• Don’t panic.
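
To make the “How powerful?” slides concrete, here is a minimal Python sketch of choosing an instance size for a job. The memory and EC2 Compute Unit figures are copied from the slides. The names `InstanceType`, `INSTANCE_TYPES` and `smallest_fit` are hypothetical helpers made up for this illustration and are not part of any Amazon tool. The rule for “smallest” (least memory, then fewest compute units) is also an assumption: the slides do not list prices, so this is not a cost comparison. The Micro instance is left out because its compute figure only applies to short bursts.

```python
from typing import List, NamedTuple, Optional


class InstanceType(NamedTuple):
    """One row of the instance list (hypothetical helper type)."""
    name: str
    memory_gib: float
    ecu: float  # EC2 Compute Units


# Figures copied from the "How powerful?" slides (Micro omitted: burst-only ECUs).
INSTANCE_TYPES: List[InstanceType] = [
    InstanceType("M1 Small", 1.7, 1),
    InstanceType("M1 Medium", 3.75, 2),
    InstanceType("M1 Large", 7.5, 4),
    InstanceType("M1 Extra Large", 15, 8),
    InstanceType("M3 Extra Large", 15, 13),
    InstanceType("M3 Double Extra Large", 30, 26),
    InstanceType("High-Memory Extra Large", 17.1, 6.5),
    InstanceType("High-Memory Double Extra Large", 34.2, 13),
    InstanceType("High-Memory Quadruple Extra Large", 68.4, 26),
    InstanceType("High-CPU Medium", 1.7, 5),
    InstanceType("High-CPU Extra Large", 7, 20),
    InstanceType("Cluster Compute Quadruple Extra Large", 23, 33.5),
    InstanceType("Cluster Compute Eight Extra Large", 60.5, 88),
    InstanceType("Cluster GPU Quadruple Extra Large", 22, 33.5),
    InstanceType("High I/O Quadruple Extra Large", 60.5, 35),
]


def smallest_fit(min_memory_gib: float, min_ecu: float) -> Optional[InstanceType]:
    """Return the smallest listed type meeting both requirements, or None.

    "Smallest" here means least memory, with ties broken by fewest ECUs.
    This ordering is an assumption of the sketch, not an Amazon rule.
    """
    candidates = [
        t for t in INSTANCE_TYPES
        if t.memory_gib >= min_memory_gib and t.ecu >= min_ecu
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda t: (t.memory_gib, t.ecu))


if __name__ == "__main__":
    # Example: a job that needs at least 16 GiB of memory and 4 compute units.
    choice = smallest_fit(16, 4)
    print(choice.name if choice else "No listed instance type is big enough")
```

For a job needing at least 16 GiB of memory and 4 compute units, the sketch picks the High-Memory Extra Large Instance, since it is the listed type with the least memory that still meets both limits.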