Slide 1
UCSC 100 Gbps Science DMZ – 1 Year 9 Month Update
Brad Smith & Mary Doyle

Slide 2
Goal 1 – 100 Gbps DMZ – Complete!
[Diagram: the new 100 Gb/s Science DMZ router and DTN (dtn.ucsc.edu) sit alongside the existing 10 Gb/s SciDMZ, linking the campus border, core, and distribution layers to CENIC HPR (global research networks) and CENIC DC (the global Internet).]

Slide 3
Goal 2 – Collaborate with users to use it!
• MCD biologist doing brain wave imaging
• SCIPP analyzing LHC ATLAS data
• HYADES cluster doing astrophysics visualizations
• CBSE Cancer Genomics Hub

Slide 4
Exploring mesoscale brain wave imaging data
James Ackman, Assistant Professor, Department of Molecular, Cell, & Developmental Biology, University of California, Santa Cruz
1. Record brain activity patterns
2. Analyze cerebral connectivity
   – external computing
   – local computing (connected via the Science DMZ)
• Acquires 60 TIFF images of 2.1 GB per day (≈120 GB/day total).
• Initial transfers ran at 20 Mbps = 12-15 min/TIFF = 15 hrs/day!
• With the Science DMZ, 354 Mbps = 1 min/TIFF = 1 hr/day!
• Expected to grow 10x over the near term.
(The transfer-time arithmetic behind these figures is sketched after Slide 10.)

Slide 5
SCIPP Network Usage for Physics with ATLAS
Ryan Reece, ryan.reece@cern.ch
Santa Cruz Institute for Particle Physics

Slide 6
ATLAS Detector
[Image: the ATLAS detector, with humans and a T. rex shown for scale; the colliding proton (p+) beams meet at its center.]

Slide 7
Data Volume
• The LHC run of 2009-2012 produced ~100 PB – currently ~10 PB/year.
• SCIPP processes and skims that data on the LHC computing grid and brings ~10 TB to SCIPP each year.
  – 12-hr transfer times impact the ability to provide input for the next experiment.
• Expect ≈4 times the data volume in the next run, 2015-2018.
• Our bottleneck is downloading the skimmed data to SCIPP.
• Current download rate: ~a few TB every few weeks.

Slide 8
Throughput: 1 Gbps – 400 Mbps
[Diagram: the SCIPP cluster – atlas01 (headprv) bridges the public and private networks; atlas02 (int0prv) and atlas04 (int1prv) are interactive nodes; atlas03 (nfsprv) serves ≈20 TB over NFS; worker nodes wrk0prv–wrk7prv (128 CPUs, ≈20 TB) carry the XROOTD data flow. Everything runs over 1 Gb links through a 2007-era Dell 6248 switch to the campus network, and grid downloads reach ~400 Mbps.]

Slide 9
Throughput: 10 Gbps – 400 Mbps?!
[Diagram: the same cluster with 10 Gb links to atlas03, atlas04, and the campus network; download throughput stays at ~400 Mbps.]

Slide 10
With help from ESnet! Offload Dell switch – 1.6 Gbps
[Diagram: with ESnet's help, the grid-download path is moved off the 2007 Dell 6248 switch onto the 10 Gb links; throughput rises to 1.6 Gbps.]
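As an aside on the numbers above: the throughput figures on Slides 4 and 7-11 follow from straightforward transfer-time arithmetic. The minimal Python sketch below reproduces them from figures quoted in the deck, assuming decimal units and no protocol overhead; the ~2 TB batch size used for the SCIPP download times is an inference from those figures, not a number stated on the slides.

```python
# Transfer-time arithmetic behind the figures on Slides 4 and 7-11.
# Sizes and rates are taken from the slides; decimal units are assumed
# (1 GB = 8,000 Mb), and protocol overhead is ignored.

def transfer_hours(size_gb: float, rate_mbps: float) -> float:
    """Hours to move size_gb gigabytes over a link running at rate_mbps."""
    return size_gb * 8_000 / rate_mbps / 3_600

# Slide 4: 60 TIFF images of 2.1 GB per day (~120 GB/day).
tiff_gb, tiffs_per_day = 2.1, 60
for rate_mbps in (20, 354):                      # before / after the Science DMZ
    per_file_min = transfer_hours(tiff_gb, rate_mbps) * 60
    per_day_hr = per_file_min * tiffs_per_day / 60
    print(f"{rate_mbps:>3} Mbps: {per_file_min:4.1f} min/TIFF, {per_day_hr:4.1f} hr/day")
# -> ~14 min/TIFF and ~14 hr/day at 20 Mbps; ~0.8 min and ~0.8 hr at 354 Mbps,
#    matching the "12-15 min / 15 hr" and "1 min / 1 hr" figures on Slide 4.

# Slides 8-11: a ~2 TB SCIPP download batch ("a few TB every few weeks").
batch_gb = 2_000                                 # inferred batch size, not from the slides
for label, rate_mbps in (("1 Gbps path, ~400 Mbps achieved", 400),
                         ("Dell switch offloaded, 1.6 Gbps", 1_600),
                         ("10 Gbps potential", 10_000)):
    print(f"{label}: {transfer_hours(batch_gb, rate_mbps):4.1f} hr")
# -> ~11 hr, ~2.8 hr, ~0.4 hr: in line with the 12 hr -> 3 hr -> ~30 min
#    progression summarized on Slide 11.
```

The gap between the 1.6 Gbps achieved and the 10 Gbps potential is what the "possible problems" list on the next slide tries to account for.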
Slide 11
SCIPP Summary
• Quadrupled throughput – reduced download time from 12 hrs to 3 hrs.
• Still a long way from the 10 Gbps potential – ~30 min (a factor of 8).
• Probably not going to be enough for the new run – ~4x the data volume.
• Possible problems:
  – atlas03 storage (not enough spindles)
  – WAN or protocol problems
  – the 6-year-old Dell switch
• Investigating a GridFTP solution and a new LHC data access node from SDSC.
• We are queued up to help them when they're ready…

Slide 12
Hyades
• Hyades is an HPC cluster for computational astrophysics.
• Funded by a $1 million grant from NSF in 2012.
• Users come from the departments of Astronomy & Astrophysics, Physics, Earth & Planetary Sciences, Applied Math & Statistics, Computer Science, etc.
• Many are also users of national supercomputers.

Slide 13
Hyades Hardware
• 180 compute nodes
• 8 GPU nodes
• 1 MIC node
• 1 big-memory analysis node
• 1 3D visualization node
• Lustre storage, providing 150 TB of scratch space
• 2 FreeBSD file servers, providing 260 TB of NFS space
• 1 petabyte cloud storage system, using Amazon S3 protocols

Slide 14
[Image only; no recoverable text.]

Slide 15
Data Transfer
• 100+ TB between Hyades and NERSC.
• 20 TB between Hyades and NASA Pleiades; in the process of moving 60+ TB from Hyades to NCSA Blue Waters.
• 10 TB from Europe to Hyades.
• Shared 10 TB of simulation data with collaborators in Australia, using the Huawei Cloud Storage.

Slide 16
Remote Visualization
• Ein is a 3D visualization workstation located in an Astronomy office (200+ yards from Hyades).
• Connected to Hyades via a 10G fiber link.
• The fast network enables remote visualization in real time:
  – graphics processing locally on Ein
  – data storage and processing remotely, either on Hyades or on NERSC supercomputers

Slide 17
CBSE CGHub
• NIH/NCI archive of cancer genomes.
• 10/2014 – 1.6 PB of genomes uploaded.
• 1/2014 – 1 PB/month downloaded(!)
• Located at SDSC… managed from UCSC.
• Working with CGHub to explore L2/"engineered" paths.

Slide 18
Innovations…
• "Research Data Warehouse" – DTN with long-term storage.
• Whitebox switches (see the sketch after the deck):
  – on-chip packet buffer – 12 MB
  – 128 10 Gb/s SERDES… so 32 40-gig ports
  – SOC… price leader, uses less power
  – use at the network edge

Slide 19
Project Summary
• 100 Gbps Science DMZ completed.
• Improved workflow for a number of research groups.
• Remaining targets:
  – extend the Science DMZ to more buildings
  – further work with SCIPP… when they need it
  – L2 ("engineered") paths with CBSE (genomics)
  – SDN integration
• Innovations:
  – "Research Data Warehouse" – DTN as long-term storage
  – whitebox switches

Slide 20
Questions?
Brad Smith
Director, Research & Faculty Partnerships, ITS
University of California Santa Cruz
brad@ucsc.edu
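As a footnote on the whitebox-switch bullets of Slide 18 (the sketch referenced there): the 32-port figure follows directly from the SERDES count, and a bandwidth-delay-product estimate suggests why a 12 MB on-chip buffer points toward edge deployment. This is a rough illustration only; the 50 ms round-trip time is an assumed cross-country WAN value, not a number from the slides.

```python
# Back-of-the-envelope numbers for the whitebox-switch bullets on Slide 18.
# The 12 MB buffer and 128 x 10 Gb/s SERDES figures come from the slide;
# the 50 ms round-trip time is an assumed WAN value for illustration.

SERDES_LANES = 128        # 10 Gb/s serializer/deserializer lanes on the switch chip
LANE_GBPS = 10
BUFFER_MB = 12            # shared on-chip packet buffer
RTT_S = 0.050             # assumed coast-to-coast round-trip time

aggregate_gbps = SERDES_LANES * LANE_GBPS
print(f"Aggregate capacity: {aggregate_gbps} Gb/s = {aggregate_gbps // 40} x 40 Gb/s ports")
# -> 1280 Gb/s = 32 x 40 Gb/s ports, as stated on Slide 18.

# Bandwidth-delay product: roughly the data a single long-haul TCP flow keeps in flight.
for gbps in (1, 10, 40):
    bdp_mb = gbps * 1e9 * RTT_S / 8 / 1e6
    verdict = "within" if bdp_mb <= BUFFER_MB else "well beyond"
    print(f"{gbps:>2} Gb/s x {RTT_S * 1e3:.0f} ms RTT -> {bdp_mb:6.1f} MB in flight "
          f"({verdict} the {BUFFER_MB} MB buffer)")
# A single 10 Gb/s WAN flow can keep ~62 MB in flight, far more than the shallow
# on-chip buffer can absorb during a burst -- one reason the slide positions these
# switches at the network edge rather than in the wide-area data path.
```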