Clemson NextNet
SDN Use Cases for Life Sciences Research
Kuang-Ching “KC” Wang
Associate Professor, Clemson University
July 17 2013
Sponsored by NSF grant OCI-1245936

Clemson NextNet: An NSF CC-NIE Project
Objectives:
• Direct access to the Internet2 (I2) 100G Innovation Platform
• Science DMZ from anywhere, without manual plumbing
• Campus production, end-to-end support
• Flexible, optimized 10-40G access to resources on campus and at other universities
• Software defined network (SDN)

What is the Fuss About SDN?
Network researchers: [Figure: traditional network vs. SDN]
Industry: Traditional networks are getting unmanageable (not about bandwidth)!
[Two embedded industry slides, both titled "Traditional Networking"; figure labels: VLAN 1, VLAN 2, VLAN 3]
• IP address → Server ID
  – Each network device knows the complete L2/L3 network
• Legacy protocols converge slowly
  – Not keeping pace with network trends
• Virtual environments becoming unmanageable
  – Maintaining server/network coherency very labor intensive
• Service provider networks grafted on
  – No provision for provider-to-provider bridging
• Traffic segregation methods were introduced to help increase scale
  – What about QoS and SLAs in a virtual environment?
• Still no global controller; switches growing in complexity
We’re reaching the practical limits of cost and complexity

What Do Our (Life Sciences) Folks Need?
Two Clemson life sciences researchers in attendance today:
• Alex Feltus
  – Associate Professor in Genetics & Biochemistry
  – Faculty Consultant in Clemson University Genomics Institute
  – Research: Rapid crop design with massive gene interaction networks
• David Kwartowitz
  – Assistant Professor in Bioengineering
  – Research: Rapid processing of stereo laparoscopic data for real-time pre- and intra-surgery support
[Figure labels: … Data Store N, Palmetto HPC Cluster, Real-time medical imaging]

The Feltus Lab Builds Massive Gene Interaction Networks Using RNA Expression Profiles From Next-Generation Sequence (NGS) and Microarray Experiments
[Figure: Rice (Oryza sativa)]
Goal: Rapidly design new crop varieties for a specific environment, including “old” environments with a changed climate… Personalized Agriculture
Slide prepared by Alex Feltus

Massive amounts of DNA/RNA/Genetic Data in Databases
1.64 Quadrillion base pairs in 5 yrs!
http://www.ncbi.nlm.nih.gov/Traces/sra/
Slide prepared by Alex Feltus

NGS Biomarker Example Datasets

RAW DATA (uncompressed)
Size   File
5.7G   Sample_Feltus1_L006_R1.cat.fastq
5.7G   Sample_Feltus1_L006_R2.cat.fastq
5.8G   Sample_Feltus1_L007_R1.cat.fastq
5.8G   Sample_Feltus1_L007_R2.cat.fastq
6.7G   Sample_Feltus2_L006_R1.cat.fastq
6.7G   Sample_Feltus2_L006_R2.cat.fastq
6.8G   Sample_Feltus2_L007_R1.cat.fastq
6.8G   Sample_Feltus2_L007_R2.cat.fastq
6.5G   Sample_Feltus3_L006_R1.cat.fastq
6.5G   Sample_Feltus3_L006_R2.cat.fastq
6.6G   Sample_Feltus3_L007_R1.cat.fastq
6.6G   Sample_Feltus3_L007_R2.cat.fastq
7.3G   Sample_Feltus4_L006_R1.cat.fastq
7.3G   Sample_Feltus4_L006_R2.cat.fastq
7.4G   Sample_Feltus4_L007_R1.cat.fastq
7.4G   Sample_Feltus4_L007_R2.cat.fastq
5.6G   Sample_Feltus5_L006_R1.cat.fastq
5.6G   Sample_Feltus5_L006_R2.cat.fastq
5.7G   Sample_Feltus5_L007_R1.cat.fastq
5.7G   Sample_Feltus5_L007_R2.cat.fastq
8.8G   Sample_Feltus6_L006_R1.cat.fastq
8.8G   Sample_Feltus6_L006_R2.cat.fastq
8.9G   Sample_Feltus6_L007_R1.cat.fastq
8.9G   Sample_Feltus6_L007_R2.cat.fastq

PROCESSED DATA (compressed)
Size   File
2.4G   Sample_Feltus1_L007_R1.MERGED.BAM
2.4G   Sample_Feltus1_L007_R1.MERGED.BAM
2.7G   Sample_Feltus2_L006_R1.MERGED.BAM
2.7G   Sample_Feltus2_L007_R1.MERGED.BAM
2.6G   Sample_Feltus3_L006_R1.MERGED.BAM
2.6G   Sample_Feltus3_L007_R1.MERGED.BAM
3.0G   Sample_Feltus4_L006_R1.MERGED.BAM
3.0G   Sample_Feltus4_L007_R1.MERGED.BAM
2.2G   Sample_Feltus5_L006_R1.MERGED.BAM
2.2G   Sample_Feltus5_L006_R1.MERGED.BAM
2.9G   Sample_Feltus6_L006_R1.MERGED.BAM
2.9G   Sample_Feltus6_L007_R1.MERGED.BAM

6 RNA Samples in Duplicate
163.6 GB (raw) + 31.8 GB (processed) = 195.4 GB of critical data files (<6 hours to process on cluster)
Does not include:
• Intermediate processing files
• Reference genome (0.72 GB)
Slide prepared by Alex Feltus

The CUTTERS (Kwartowitz) lab is working to enable remote processing of stereo laparoscopic data for real-time feedback with surgical robot systems at partner sites (Vanderbilt, Mayo Clinic)
[Map labels: Mayo Clinic, MN; Vanderbilt, TN; Palmetto HPC Cluster, Clemson, SC]

How Does It Work Today
[Network diagram labels: Data Center, Research Network, Campus Network, ISP 1, ISP 2, Internet, R&E net]
Down the road:
• Compliance
• User-specific privileges
• Access control

What Are We Building NOW
CC-NIE/Science DMZ: stage 1
[Network diagram. Recoverable labels:]
• External connection: I2 AL2S
• Devices: Brocade MLX-e, S4810, Z9000, Pica 8 3920, MRV Optical MUX, Big Switch Controller
• Sites: Palmetto (~180 nodes), Biotech, Sirrine, ITC, Riggs, Poole, Rhodes, McAdams, End Users
• Link types: 10gig sr, 40gig lr, 100gig, DWDM link, 10gig to 40gig fanout
• OpenFlow control traffic via current production network
• Current Production Network

Porting GENI Research Prototype to Production
SOS: Seamless Large Data Transport
Steroid OpenFlow Service (SOS) by Aaron Rosen and KC Wang
• Seamless TCP throughput upgrade, e.g., 2.5 Mbps → 120 Mbps
• Multipath support
• Automatic site agent detection
[Diagram labels: SOS Controller, SOS agent, SOS pipe, SOS-enabled switch, TCP, perceived point-to-point or multi-point connection; steps 1, 2, 3.1, 3.2, 4.1, 4.2]
Upcoming demos of SOS:
• NSF 12th GENI conference, Kansas City, MO
• Supercomputing 2011, Seattle, WA
[Map labels: SOS SCinet, SOS Stanford, SOS UW-Madison, SOS Clemson, GENI core]

Condo of Condos: Connecting Campus HPC with SDN
• NSF grant to expand Clemson’s condominium HPC model to a national scale

Significance of IT Support Team to Bootstrap Researcher Use of HPC and SDN
[Chart: New Palmetto Cluster Users; axis: Number of Users]
May 2010: Galen joins CITI and begins recruiting & training users

And to Create a Transformative University
• A unique coalition among academia, IT, and industry partners within and beyond Clemson
[Diagram center: CC-NIE & Research Center; IT, Research, Teaching]
IT partners
• University IT (condo of condos …)
• Internet2
• ESnet
Academic partners
• Researchers
• National labs
• SDN research groups
Government agencies
• GENI
• US Ignite
• DoD, DoT, DoE
• State DHHS
Industry partners
• Companies
• SDN R&D Labs
• Synergy with other university research centers: Cyberinstitute, ICAR, and Watts Innovation Center

Synergy with Cross-Community Momentum
[Diagram labels: Research Communities, Companies, Universities, Open Source Communities, IT Communities, ...]

FURTHER QUESTIONS
KWANG@CLEMSON.EDU
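
Worked example (illustrative, not from the original slides): a rough estimate of how long the 195.4 GB NGS example dataset would take to move at the rates that appear in this deck, namely the 2.5 Mbps and 120 Mbps figures from the SOS slide and a single 10 Gbps link. The sketch assumes decimal units (1 GB = 10^9 bytes), ideal sustained throughput, and no protocol overhead. The function and variable names are hypothetical.

```python
def transfer_time_hours(size_gb: float, rate_mbps: float) -> float:
    """Ideal time in hours to move size_gb gigabytes at rate_mbps megabits/s.

    Assumes decimal units (1 GB = 1e9 bytes, 1 Mbps = 1e6 bits/s) and a
    perfectly sustained rate with no protocol overhead.
    """
    bits = size_gb * 1e9 * 8           # gigabytes -> bits
    seconds = bits / (rate_mbps * 1e6)  # bits / (bits per second)
    return seconds / 3600.0


# Raw + processed totals taken from the NGS example dataset slide.
DATASET_GB = 163.6 + 31.8

# Rates mentioned in the deck; the labels are descriptive only.
RATES_MBPS = {
    "2.5 Mbps (SOS example, before)": 2.5,
    "120 Mbps (SOS example, after)": 120.0,
    "10 Gbps (single 10gig link)": 10_000.0,
}

if __name__ == "__main__":
    for label, rate in RATES_MBPS.items():
        hours = transfer_time_hours(DATASET_GB, rate)
        print(f"{label}: {hours:.2f} h")
```

Under these assumptions the dataset needs roughly 174 hours at 2.5 Mbps, about 3.6 hours at 120 Mbps, and under 3 minutes at 10 Gbps. Real transfers would be slower because of TCP behavior, storage speed, and shared links.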