Scientific Computing in the Consumer Digital Infrastructure
David P. Anderson
Space Sciences Lab
University of California, Berkeley
The Austin Forum
November 7, 2013

Science needs computing power
● High-performance computing
● High-throughput computing
  – Thousands or millions of independent jobs
  – What matters is the rate of job completion, not the turnaround time of individual jobs

High-throughput computing applications
● Physical simulation
  – particle collision
  – atomic/molecular (bio, nano)
  – Earth climate system
● Compute-intensive data analysis
  – particle physics (LHC)
  – astrophysics (radio, gravitational)
  – genomics
● Bio-inspired optimization
  – genetic algorithms, flocking, ant colony etc.

Approaches to HTC
● Cluster computing
  – lots of commodity or rack-mounted PCs in a room
● Grid computing
  – share clusters between organizations
● Cloud computing
  – rent cluster nodes, e.g. Amazon EC2
● Volunteer computing
  – use computers owned by consumers

The Consumer Digital Infrastructure
● Computing devices
  – Desktop and laptop computers
  – Mobile devices: tablets, smartphones
  – Game consoles
  – Set-top boxes, DVRs
  – Appliances
● Commodity Internet
  – Cable, DSL, fiber to the home, cell networks

Measures of computing speed
● Floating-point operation (FLOP)
● GigaFLOPS (10^9/sec): 1 Central Processing Unit (CPU)
● TeraFLOPS (10^12/sec): 1 Graphics Processing Unit (GPU)
● PetaFLOPS (10^15/sec): 1 supercomputer
● ExaFLOPS (10^18/sec): current Holy Grail

CDI performance potential
● 1 billion desktop/laptop PCs
  – CPUs: 10 ExaFLOPS
  – GPUs: 1,000 ExaFLOPS
● 2.5 billion smartphones
  – CPUs: 10 ExaFLOPS

Volunteer computing
● Consumers donate computing capacity to
  – support science
  – be in a community
  – compete
● History
  – 1997: GIMPS, distributed.net
  – 1999: SETI@home, Folding@home
  – 2003: BOINC

Limiting factors
● Volunteership
  – Study of college students [Toth 2006]
    ● 5% would “definitely participate”
    ● 10% would “possibly participate”
● PC availability
  – 65% average availability [Kondo 2008]
  – 35% of PCs are available 24/7

Other limiting factors
● Network bandwidth (client, server)
  – Commodity Internet
● Memory, disk usage
  – new PCs average 6 GB RAM

BOINC: middleware for volunteer computing
● Supported by NSF since 2002
● Open source (LGPL)
● Based at University of California, Berkeley
● http://boinc.berkeley.edu

Volunteer computing with BOINC
[Diagram: volunteers, attachments, projects (LHC@home, CPDN, WCG)]

How to volunteer
[Screenshots: Choose projects; Configure; Community]

Creating a BOINC project
● Install BOINC server software on a Linux box
● Compile apps for Windows/Mac/Linux
● Attract volunteers
  – develop web site
  – generate publicity
  – communicate with volunteers

Volunteer computing today
● 500,000 active computers
● 50 projects
● 15 PetaFLOPS average

Some BOINC-based projects
● IBM World Community Grid
● Einstein@home
● Climateprediction.net
● LHC@home
● Rosetta@home

Cost
The cost of 10 TeraFLOPS for 1 year:
● CPU cluster: $1.5M
● Amazon EC2: $4M
  – 5,000 small instances
● Volunteer: ~$0.1M

How BOINC works
[Diagram: BOINC client (home PC) and BOINC server (project), connected over HTTP. Steps: get jobs; download data, executables; compute; upload outputs]

Issues handled by BOINC
● Heterogeneous computers
● Untrusted, anonymous computers
  – Result validation
    ● replication, adaptive replication
● Credit: amount of work done
● Consumer-friendly client

Using GPUs
● BOINC detects and schedules GPUs
  – NVIDIA, AMD, Intel
  – multiple/mixed GPUs
  – various language systems (CUDA, OpenCL, CAL)
● Issues
  – non-preemptive GPU scheduling
  – no paging of GPU memory

Multicore apps
● Next-generation PCs may have 100 cores
● BOINC supports multi-core apps
  – OpenMP, MPI
  – OpenCL CPU apps

Using VM technology
● CDI platforms:
  – 85% Windows
  – 7% Linux
  – 7% Mac OS X
● Developing and maintaining versions for different platforms is hard
● Even making a portable Linux executable is hard

Virtual machines
[Diagram: application; guest operating system; host operating system]

Virtual machines
[Diagram: application; Debian Linux 2.6; Windows 7]

BOINC VM support
● Create a VM image for your favorite environment
● Create executables for that environment
[Diagram: BOINC client, Vbox wrapper, VirtualBox executive, VM instance; shared directory: executable, input and output files]

VM advantages
● Develop in your favorite environment
  – No need for multiple versions
● A VM is a strong “sandbox”
  – Can run untrusted applications
● Free “checkpointing”

BOINC on Android
● New GUI
● Battery-related issues
● Released July 2013
  – Google, Amazon App Stores
  – ~50K active devices

Why hasn’t volunteer computing gained traction?
● “Ecosystem of projects” model
  – Lots of competing projects
● Problems with this model
  – Creating/operating a project is too hard and risky
  – Volunteers need simplicity
  – No coherent PR; too many brands

Umbrella projects
● One project serves many scientists
● Examples
  – CAS@home (Chinese Academy of Sciences)
  – World Community Grid (IBM)
  – U. of Westminster (desktop grid)
  – Ibercivis (Spanish consortium)

Integrating BOINC
● HTCondor (U. of Wisconsin)
  – Goal: BOINC-based back end for Open Science Grid or any Condor pool
[Diagram: HTCondor node (grid manager, BOINC GAHP); job submission; BOINC server]

Integrating BOINC
● HUBzero (Purdue)
  – Goal: BOINC-based back end for science portals such as nanoHUB
[Diagram: Hub, BOINC server, PCs, projects]

Proposal: Science@home
● Single “brand” for volunteer computing
● Volunteers register for science areas rather than projects
● How to allocate computing power?
  – Involve the HPC, scientific funding communities

Implementing Science@home
● BOINC “account manager” architecture
[Diagram: BOINC client, Science@home, projects]

Summary
● Volunteer computing is
  – Usable for most HTC applications
  – A path to ExaFLOPS computing
  – A way to popularize science
● BOINC provides the software infrastructure
● Barriers are largely organizational

Contacts
● http://boinc.berkeley.edu
● davea@ssl.berkeley.edu
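
The Summary calls volunteer computing "a path to ExaFLOPS computing". A rough way to see why is to combine the PC figures from "CDI performance potential" with the two percentages from "Limiting factors". The sketch below does that arithmetic in Python. Multiplying potential by participation and availability is an assumption made here for illustration, not a model given in the talk, and the function and constant names are hypothetical.

```python
# Back-of-envelope estimate built from figures quoted on the slides.
# ASSUMPTION (not from the talk): reachable capacity is modeled as
#   potential x participation x availability.

EXA = 1e18  # FLOPS in one ExaFLOPS

# Desktop/laptop potential from the "CDI performance potential" slide.
PC_POTENTIAL = {
    "PC CPUs": 10 * EXA,
    "PC GPUs": 1000 * EXA,
}

PARTICIPATION = 0.05  # share who would "definitely participate" [Toth 2006]
AVAILABILITY = 0.65   # average PC availability [Kondo 2008]


def reachable_flops(potential, participation=PARTICIPATION,
                    availability=AVAILABILITY):
    """Scale a potential FLOPS figure by the two limiting factors."""
    return potential * participation * availability


if __name__ == "__main__":
    for name, flops in PC_POTENTIAL.items():
        print(f"{name}: {reachable_flops(flops) / EXA:.3f} ExaFLOPS")
```

Under these assumptions the estimate is roughly 0.325 ExaFLOPS from PC CPUs and 32.5 ExaFLOPS from PC GPUs, against the 15 PetaFLOPS (0.015 ExaFLOPS) reported under "Volunteer computing today". The survey covered college students only, so the participation rate may not carry over to the general population.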
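
"Issues handled by BOINC" lists replication as the way to validate results from untrusted, anonymous computers. The idea can be sketched as follows: the same job is sent to several hosts, and a result is accepted only if enough of them agree. This is a minimal illustration, not BOINC's validator code. The function name, the `quorum` parameter, and the exact-equality comparison are all assumptions made here.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def canonical_result(results: Sequence[Hashable],
                     quorum: int = 2) -> Optional[Hashable]:
    """Return the value reported by at least `quorum` hosts, else None.

    Hypothetical helper for illustration only. Results are compared for
    exact equality; numerical applications often need a tolerance-based
    comparison instead.
    """
    if not results:
        return None
    # Most frequently reported value and how many hosts reported it.
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None


if __name__ == "__main__":
    # Two hosts agree, one disagrees: the agreed value is accepted.
    print(canonical_result(["a1f3", "a1f3", "ffff"]))
    # No two hosts agree: nothing is accepted yet.
    print(canonical_result(["a1f3", "ffff"]))
```

When no value reaches the quorum, the function returns `None`. In a real system that would be the point at which another replica of the job is issued.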