Advanced Computing Resources for IU Researchers
Anurag Shankar
University Information Technology Services
Indiana University
March 2, 2012

University Information Technology Services July 17, 2016

Outline
• IU’s Research Cyberinfrastructure - People, hardware, software, services, training & support
• Research LifeCycle Support - Solutions for your research workflows
• Scope & Scale - Boundaries & growth
• Costs - Who pays what
• Benefits - The difference it makes

What is Cyberinfrastructure (CI)?
• Federal funding agencies define “Cyberinfrastructure” as a technological and sociological solution to the problem of efficiently connecting laboratories, data, computers, and people, with the goal of enabling the derivation of novel scientific theories and knowledge

Cyberinfrastructure …
• Consists of research environments that support advanced data acquisition, storage, management, integration, mining, visualization, and other computing and information processing services distributed over the network, either within a single institution (local CI) or beyond it (global CI)

IU’s Research Cyberinfrastructure
• Provided by the Research Technologies (RT) division of UITS http://uits.iu.edu/page/avel
• Dedicated to supporting research, no matter the area
• Strives to help you get recognition & funding (“make you shine”)

People
• RT consists of around 90 staff
• Staff are both base and grant funded
• RT is divided into thirteen logical units
• Units cover generic to very specific areas
• Staff are distributed across IUB and IUPUI
• We are available at a moment’s notice

Research Technologies Units
• Advanced IT Core
• Advanced Visualization Lab
• Biomedical Applications
• Computational Biology
• Core Services
• Digital Library Program
• High Performance Applications
• High Performance File Systems
• High Performance Systems
• Open Science Grid
• Research Storage
• Science Gateways Group
• Statistical & Mathematical Computing

RT Units …
• Advanced IT Core - An IU School of Medicine service core to assist researchers in using RT resources
• Advanced Visualization Laboratory - Consulting and support for scientific visualization, virtual reality, high-end computer graphics, and visual tele-collaboration
• Biomedical Applications - Consultation on access to and management of biomedical data
• Computational Biology - Application/technical support and software development for biological computing (particularly in genomics, cell biology, and molecular biology)
• Core Services - Applications/hosting for researchers & RT
• Digital Library Program - Services for digital library development

RT Units …
• High Performance Applications - Programming support for IU’s parallel supercomputers Big Red and Quarry
• High Performance File Systems - High-speed, parallel file systems (Data Capacitor)
• High Performance Systems - Administration of central research computing systems
• Open Science Grid - Large-scale, high-throughput grid computing for science
• Research Storage - Data storage resources for research
• Science Gateways Group - Software development & consulting for grid and cloud applications
• Statistical and Mathematical Computing - Support and licenses for statistical, mathematical, and GIS software

RT Hardware
• Data Analysis, Simulation & Storage - supercomputers, disk & tape-based storage systems, database servers
• Data Publishing - web and application servers
• Data Visualization - virtual reality theater, high-resolution display wall, stereo wall, 3D scanner, 3D printer, haptics, stereo video rigs
RT Software
• Statistics - Amos, EViews, GAUSS, HLM, LISREL, Minitab, Mplus, R, RATS, SAS, SPSS, Stata, S-PLUS
• Mathematics - Maple, MATLAB, Mathematica, Geometer’s Sketchpad, LINDO, LINGO
• GIS - Alta4, ESRI, ERDAS, Pictometry
• Other - AutoCAD, NVivo, SigmaPlot, Scientific WorkPlace, Stat/Transfer

Software
• Databases - Oracle, MySQL, Oracle Application Express, phpMyAdmin
• High Performance Computing - ATLAS, BLAS, CERN Program Library, Chaco, Charm, CMake, FFT, FDPR, Global Arrays, GNU Compiler Collection (GCC), Hadoop, IMSL/C, MPA library, Science library, HDF5, LAPACK, NAG Fortran library, NetCDF, Octave, PyACTS, R, ScaLAPACK, SPRNG, SVDPACK, SWIG
• Debuggers - TotalView, Valgrind, VampirTrace

Software
• Graphing/Visualization - BoxShade, Grace, Graphviz, Mesa 3D, Qt, Raster3D, Visualization Toolkit
• Chemistry - Amber, AutoDock, AutoDock Vina, CPMD, GAMESS, Molpro, MolScript, NAMD, Procheck, Quantum ESPRESSO, Siesta, STRIDE, VMD
• Physics/Engineering - ANSYS, FLUENT, GAMBIT, LS-DYNA, Trilinos

Software
• Biology - BioPerl, BLASTZ, CAP3, ClustalW, EMBOSS/wEMBOSS, FASTA, fastDNAml, GARLI, IMPUTE, MACH, MCMCcoal, MEME, MIGRATE, MPI-HMMER, MrBayes, MULTICLUSTAL, MUMMER, NCBI, PAML, PAUP, PHYLIP, Phrap, PLINK, RepeatMasker, structure, TREE-PUZZLE

Users can also request software installation on IU supercomputers. See http://kb.iu.edu/data/bade.html#request

Services
• Data Storage - Research File System (RFS), Scientific Data Archive (SDA), Research Database Complex (RDC)
• Data Sharing & Management - Alfresco Share, Research Electronic Data Capture (REDCap)
• Data Publishing - Discern Application Server, RDCWeb, Virtual Servers
• Data Visualization - Advanced Visualization Lab (AVL), GIS

Services …
• Simulation & Data Analysis - Big Red and Quarry supercomputers, high-performance I/O, XSEDE & OSG national computational grids, Stat/Math Center
• Software Development - Custom software development
• Consulting - Requirements gathering, database administration, data management, data visualization, parallel programming, computational biology, grid computing, & more

RT Services (more info at http://kb.iu.edu)
Service Type | Service | Details
Data Storage | RFS | Ubiquitous, disk-based storage for in-work files
Data Storage | SDA | Cached, tape-based, large-scale archival storage
Database Storage | RDC | Oracle & MySQL databases; remote access from applications, Unix command line, ODBC, SQL*Plus/Aqua Data Studio/Oracle client (Oracle), phpMyAdmin (MySQL)
Data Sharing | Alfresco Share | Document sharing with IU and non-IU collaborators
Data Management | REDCap | Creation and management of online surveys and databases; DBs from spreadsheets; exports to Excel, SAS, R, etc.; data sharing with IU and non-IU users
Data Publishing | Discern | Serving PHP/Java applications via Apache/Tomcat
Data Publishing | RDCWeb | Serving PHP/Java applications accessing databases on RDC via Apache/Tomcat
Visualization | AVL | Provides AVL hardware for visualizing data

RT Services (more info at http://kb.iu.edu)
Service Type | Service | Details
Data Analysis & Simulation | Big Red (IBM) | 1025 nodes (2048 PowerPC 970 CPUs), 8 GB RAM/node, 73 GB disk, high-speed Myricom interconnect
Data Analysis & Simulation | Quarry (IBM) | 140 nodes (560 Intel Xeon 5410 CPUs), 8/16 GB RAM/node, 73 GB disk, 340 TB scratch, Gigabit Ethernet interconnect
Data Analysis & Simulation | Quarry (IBM) | 230 nodes (460 Intel Xeon 5410 CPUs), 16 GB RAM/node, 160 GB disk, 360 TB scratch, Gigabit Ethernet interconnect
Data Analysis & Simulation | Extreme Science and Engineering Discovery Environment (XSEDE) | A national grid of 16 supercomputers, data storage & visualization resources (time is allocated through formal proposal submission, but startup allocations are easy to get)
Data Analysis & Simulation | Open Science Grid (OSG) | An
international grid of a large number of compute resources for large-scale grid computing

Training & Support
• Customized training (virtually, at IUB/IUPUI, or on campus)
• General support for all research IT issues
• Software support for RT installed/developed software
• Database (DBA) support
• System administration support for virtual servers
• Grant support

Research LifeCycle Support
RT can support you comprehensively, at every step of your research workflow, by leveraging its own and/or national resources to provide or build solutions ranging from simple to complex, from modest to immense, and from local to national to international

Data LifeCycle
From RT’s perspective, the research lifecycle is a data lifecycle: the sequence from data creation to archival/disposal

The Research/Data LifeCycle
Pre-Grant
• Preliminary Investigation
• Cyberinfrastructure Design
Proposal
• Proposal Prep
• Budget
• Funding
Execution
• Data Acquisition
• Data Analysis
• Data Management
• Data Sharing
• Data Visualization
Post-Grant
• Data Publishing
• Data Archival
• Data Disposal

Research LifeCycle Support - Pre-Grant
• Preliminary Investigation:
  - Support your preliminary investigation (for little or no cost)
  - Make you aware of how RT can help the project
  - Help you explore technology options

Research LifeCycle Support - Grant Proposal
• Proposal Preparation:
  - Actively partner as co-investigators
  - Provide support letters
  - Provide text on how IU’s massive cyberinfrastructure will support your specific project, or a standard RT template
  - Develop a budget
• Cyberinfrastructure Design:
  - Initial consulting
  - Gather requirements
  - Research options
  - Design solutions

Research LifeCycle Support - Grant Execution
• Data Acquisition:
  - Provide a data capture application (REDCap)
  - Optimize data flows
• Data Analysis:
  - Migrate your analysis or other compute-intensive application to IU supercomputers, with potentially significant (say, 10x) speedups
  - Accelerate your application further by parallelizing it
  - Supply optimized (parallel) versions, if available, of commonly used libraries & tools

Grant Execution …
• Data Storage:
  - Provide secure, regularly backed-up disk storage of, and ubiquitous access to, data in work (RFS)
  - Provide massive amounts of disaster-proof (IUB & IUPUI mirrored), secondary, archival storage (SDA)
  - Provide large spaces and support for Oracle and MySQL databases (RDC)
• Data Management:
  - Help you manage your data using professional data management paradigms such as metadata and tags, data provenance, and data dictionaries (REDCap)
  - Design data management systems/applications

Grant Execution …
• Data Sharing:
  - Provide collaborative spaces that you can create instantly and use to invite others, at IU or outside IU, to share work documents or structured data over the web (Alfresco Share)
• Data Visualization:
  - Help you explore and use visualization to assist your analysis and/or decision making
  - Provide a place for you and your students to experience the latest high-end visualization technologies
  - Present your visualization at conferences

Research LifeCycle Support - Post-Grant
• Data Publishing:
  - Design of data dissemination environments: web application design, development, and hosting; web publishing of data in Oracle/MySQL databases; web portals
  - Hassle-free virtual servers for you to run your own applications, with system administration for these servers
• Data Archival:
  - Long-term (decades), massive (TBs), web-accessible secondary
storage
  - Help you consolidate ALL your data in a SINGLE location
• Data Disposal:
  - Dispose of your data securely and professionally

Scope & Scale
• Project scope is bounded only by funding & resource limits, but both can be raised
• IU’s massive cyberinfrastructure allows us to help you plan for future growth, be it in data storage, analysis, or dissemination
• Supported projects currently scale up to petabytes of storage (1 PB is a million GB), months of running analysis applications on supercomputers, etc.
• Geographically, we have built and run highly distributed cyberinfrastructure for national and international collaborations

Costs
• Nearly all of the RT services described here (supercomputers, storage, visualization resources) are provided free of charge to any IU researcher
• Projects that require significant RT commitment (for instance, software development or system administration) are provided at cost; such projects usually write RT into their grants to cover these costs
• Consulting up to a reasonable limit, especially in preliminary phases to help you attract funding, is free

Benefits
• There is evidence that including RT in grant proposals has enhanced funding prospects
• Speedups in the research workflow allow you to accomplish more and to think bigger
• You don’t need local hardware or system administration resources
• RT solutions are built professionally and for production use
• RT provides high security for your data; it has done the hard work to align itself with HIPAA

Contact
Your single point of contact for all things RT:
Anurag Shankar
ashankar@iu.edu
812-325-8629

Local Contact:
Carol Wood
cfwood@iun.edu
219-980-7758
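
The Data Analysis support described under Grant Execution mentions speedups from moving an application to a supercomputer and from parallelizing it. As a rough illustration of how far parallelizing alone can go, the sketch below uses Amdahl's law. This model is our choice for illustration and is not something the slides prescribe; the function name and the 95% parallel fraction are hypothetical, not measurements from any IU system.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Estimate speedup under Amdahl's law.

    parallel_fraction: share of the runtime that can run in parallel (0.0 to 1.0).
    workers: number of processors used for the parallel part (>= 1).
    Both inputs are assumptions supplied by the caller, not measured values.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unchanged; the parallel part is divided among workers.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical application in which 95% of the runtime parallelizes.
    for workers in (1, 8, 64, 512):
        print(f"{workers:4d} workers: {amdahl_speedup(0.95, workers):.1f}x")
```

Under this model the serial portion caps the achievable gain: with 5% serial work, the speedup stays below 20x no matter how many processors are added. This is one reason to ask RT about optimized parallel libraries and code restructuring, and not only about more nodes.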