Advanced Computing Resources
for IU Researchers
Anurag Shankar
University Information Technology Services
Indiana University
March 2, 2012
University Information Technology Services
July 17, 2016
Outline
• IU’s Research Cyberinfrastructure
- People, hardware, software, services, training & support
• Research LifeCycle Support
- Solutions for your research workflows
• Scope & Scale
- Boundaries & growth
• Costs
- Who pays what
• Benefits
- The difference it makes
What is Cyberinfrastructure (CI)?
• Federal funding agencies define
“Cyberinfrastructure” as a technological
and sociological solution to the problem
of efficiently connecting laboratories,
data, computers, and people with the
goal of enabling derivation of novel
scientific theories and knowledge
Cyberinfrastructure …
• Consists of research environments that
support advanced data acquisition,
storage, management, integration,
mining, visualization and other computing
and information processing services
distributed over the network within (local
CI) or beyond a single institution (global
CI)
IU’s Research Cyberinfrastructure
• Provided by the Research Technologies
(RT) division of UITS
http://uits.iu.edu/page/avel
• Dedicated to supporting research, no
matter the area
• Strives to help you get recognition &
funding (“make you shine”)
People
• RT consists of around 90 staff
• Staff are both base and grant funded
• RT is divided into thirteen logical units
• Units cover generic to very specific areas
• Staff are distributed across IUB and IUPUI
• We are available at a moment’s notice
Research Technologies Units
• Advanced IT Core
• Advanced Visualization Lab
• Biomedical Applications
• Computational Biology
• Core Services
• Digital Library Program
• High Performance Applications
• High Performance File Systems
• High Performance Systems
• Open Science Grid
• Research Storage
• Science Gateways Group
• Statistical & Mathematical Computing
RT Units …
• Advanced IT Core – An IU School of Medicine service core to assist researchers in using RT resources
• Advanced Visualization Laboratory – Consulting and support for scientific visualization, virtual reality, high-end computer graphics, and visual tele-collaboration
• Biomedical Applications – Consultation on access and management of biomedical data
• Computational Biology – Application/technical support and software development for biological computing (particularly in genomics, cell biology, and molecular biology)
• Core Services – Applications/hosting for researchers & RT
• Digital Library Program – Services for digital library development
RT Units …
• High Performance Applications – Programming support for IU’s parallel supercomputers Big Red and Quarry
• High Performance File Systems – High-speed, parallel file systems (Data Capacitor)
• High Performance Systems – Administration of central research computing systems
• Open Science Grid – Large-scale, high-throughput grid computing for science
• Research Storage – Data storage resources for research
• Science Gateways Group – Software development & consulting for grid and cloud applications
• Statistical and Mathematical Computing – Support and licenses for statistical, mathematical, and GIS software
RT Hardware
• Data Analysis, Simulation & Storage – supercomputers, disk & tape-based
storage systems, database servers
• Data Publishing – web and application
servers
• Data Visualization – virtual reality theater,
high-resolution display wall, stereo wall,
3D scanner, 3D printer, haptics, stereo
video rigs
RT Software
• Statistics – Amos, Eviews, GAUSS, HLM,
LISREL, Minitab, Mplus, R, RATS, SAS, SPSS,
Stata, S-PLUS
• Mathematics – Maple, MATLAB, Mathematica,
Geometer’s Sketchpad, LINDO, LINGO
• GIS – Alta4, ESRI, ERDAS, Pictometry
• Other – AutoCAD, Nvivo, SigmaPlot, Scientific
WorkPlace, Stat/Transfer
Software
• Databases – Oracle, MySQL, Oracle Application
Express, phpMyAdmin
• High Performance Computing – ATLAS, BLAS,
CERN Program Library, Chaco, Charm, CMake,
FFT, FDPR, Global Arrays, GNU Compiler
Collection, Hadoop, IMSL/C, MPA library, Science
library, HDF5, LAPACK, NAG Fortran library,
NetCDF, Octave, PyACTS, R, ScaLAPACK,
SPRNG, SVDPACK, SWIG
• Debuggers – TotalView, Valgrind, VampirTrace
Software
• Graphing/Visualization – BoxShade, Grace,
Graphviz, Mesa 3D, Qt, Raster3D, Visualization
Toolkit
• Chemistry – Amber, AutoDock, AutoDock Vina, CPMD, GAMESS, Molpro, MolScript, NAMD,
Procheck, Quantum ESPRESSO, Siesta, STRIDE,
VMD
• Physics/Engineering – ANSYS, FLUENT,
GAMBIT, LS-DYNA, Trilinos
Software
• Biology – BioPerl, BLASTZ, CAP3, ClustalW,
EMBOSS/wEMBOSS, FASTA, fastDNAml, GARLI,
IMPUTE, MACH, MCMCcoal, MEME, MIGRATE,
MPI-HMMER, MrBayes, MULTICLUSTAL,
MUMmer, NCBI, PAML, PAUP, PHYLIP, Phrap,
PLINK, RepeatMasker, structure, TREE-PUZZLE
Users can also request software installation
on IU supercomputers
See http://kb.iu.edu/data/bade.html#request
Services
• Data Storage – Research File System (RFS),
Scientific Data Archive (SDA), Research Database
Complex (RDC)
• Data Sharing & Management – Alfresco Share,
Research Electronic Data Capture (REDCap)
• Data Publishing – Discern Application Server,
RDCWeb, Virtual Servers
• Data Visualization – Advanced Visualization
Lab (AVL), GIS
Services …
• Simulation & Data Analysis – Big Red,
Quarry supercomputers, high-performance I/O,
XSEDE & OSG national computational grids,
Stat/Math Center
• Software Development – Custom software
development
• Consulting – Requirements gathering,
database administration, data management,
data visualization, parallel programming,
computational biology, grid computing, & more
RT Services (more info at http://kb.iu.edu)
Service Type | Service | Details
Data Storage | RFS | Ubiquitous, disk-based storage for in-work files
Data Storage | SDA | Cached, tape-based, large-scale archival storage
Database Storage | RDC | Oracle & MySQL databases; remote access from applications, Unix command line, ODBC, SQL*Plus/Aqua Data Studio/Oracle client (Oracle), phpMyAdmin (MySQL)
Data Sharing | Alfresco Share | Document sharing with IU and non-IU collaborators
Data Management | REDCap | Creation and management of online surveys and databases; DBs from spreadsheets; exports to Excel, SAS, R, etc.; data sharing with IU and non-IU users
Data Publishing | Discern | Serving PHP/Java applications via Apache/Tomcat
Data Publishing | RDCWeb | Serving PHP/Java applications accessing databases on RDC via Apache/Tomcat
Visualization | AVL | Provides AVL hardware for visualizing data
RT Services (more info at http://kb.iu.edu)
Service Type | Service | Details
Data Analysis & Simulation | Big Red (IBM) | 1025 nodes (2048 PowerPC 970 CPUs), 8GB RAM/node, 73GB disk, high-speed Myricom interconnect
Data Analysis & Simulation | Quarry (IBM) | 140 nodes (560 Intel Xeon 5410 CPUs), 8/16GB RAM/node, 73GB disk, 340TB scratch, GB ethernet interconnect
Data Analysis & Simulation | Quarry (IBM) | 230 nodes (460 Intel Xeon 5410 CPUs), 16GB RAM/node, 160GB disk, 360TB scratch, GB ethernet interconnect
Data Analysis & Simulation | Extreme Science and Engineering Discovery Environment (XSEDE) | A national grid of 16 supercomputers, data storage & visualization resources (time allocation through formal proposal submission, but startup allocations are easy to get)
Data Analysis & Simulation | Open Science Grid (OSG) | An international grid of a large number of compute resources for large-scale grid computing
Training & Support
• Customized training (virtually, at IUB/IUPUI,
or on campus)
• General support for all research IT issues
• Software support for RT installed/developed
software
• Database (DBA) support
• System administration support for virtual
servers
• Grant support
Research LifeCycle Support
RT can support you comprehensively, at
every step of your research workflow, by
leveraging its own and/or national
resources to provide or build solutions
ranging from simple to complex, from
modest to immense, and from local to
national to international
Data LifeCycle
The research lifecycle from RT’s
perspective consists of a data lifecycle,
representing the sequence from data
creation to archival/disposal
The Research/Data LifeCycle
Pre-Grant
• Preliminary Investigation
• Cyberinfrastructure Design
Proposal
• Proposal Prep
• Budget
• Funding
Execution
• Data Acquisition
• Data Analysis
• Data Management
• Data Sharing
• Data Visualization
Post-Grant
• Data Publishing
• Data Archival
• Data Disposal
Research LifeCycle Support –
Pre-Grant
• Preliminary Investigation:
- Support your preliminary investigation (for little or no cost)
- Make you aware of how RT can help the project
- Help you explore technology options
Research LifeCycle Support –
Grant Proposal
• Proposal Preparation:
- Actively partner as co-investigators
- Provide support letters
- Provide text on how IU’s massive cyberinfrastructure will support your specific project or a standard RT template
- Develop a budget
• Cyberinfrastructure Design:
- Initial consulting
- Gather requirements
- Research options
- Design solutions
Research LifeCycle Support –
Grant Execution
• Data Acquisition
- Provide data capture application (REDCap)
- Optimize data flows
• Data Analysis
- Migrate your analysis or other compute-intensive application to IU supercomputers, with potentially significant (say 10X) speedups
- Accelerate your application further by parallelizing it
- Supply optimized (parallel) versions (if available) of commonly used libraries & tools
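How much further parallelizing helps depends on how much of the runtime can actually run in parallel. A rough sketch using Amdahl's law (this is not an RT tool; the function name and the 95% parallel fraction are illustrative assumptions) shows why a figure like 10X is plausible but not guaranteed:

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of a program parallelizes."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # Serial part runs at full cost; parallel part is divided among workers.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical application in which 95% of the runtime parallelizes.
    for workers in (1, 8, 64, 512):
        print(workers, round(amdahl_speedup(0.95, workers), 1))
```

Under this assumption the remaining 5% of serial work caps the speedup below 20X no matter how many processors are used, which is why optimizing the serial portion matters as much as adding nodes.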
Grant Execution …
• Data Storage:
- Provide secure, regularly backed up disk storage of and ubiquitous access to data in work (RFS)
- Provide massive amounts of disaster-proof (IUB & IUPUI mirrored), secondary, archival storage (SDA)
- Provide large spaces and support for Oracle and MySQL databases (RDC)
• Data Management:
- Help you manage your data using professional data management paradigms such as metadata and tags, data provenance, data dictionaries (REDCap)
- Design data management systems/applications
Grant Execution …
• Data Sharing:
- Provide collaborative spaces you can create instantly to invite others at IU or outside IU to share work documents or structured data over the web (Alfresco Share)
• Data Visualization:
- Help you explore and use visualization to assist your analysis and/or decision making
- Provide a place to introduce you and your students to the latest, high-end visualization technologies
- Present your visualization at conferences
Research LifeCycle Support –
Post-Grant
• Data Publishing:
- Design of data dissemination environments – web application design, development, hosting, web publishing of data in Oracle/MySQL databases, web portals
- Hassle-free virtual servers for you to run your own applications, system administration for these servers
• Data Archival:
- Long-term (decades), massive (TBs), web-accessible secondary storage
- Help you consolidate ALL your data in a SINGLE location
• Data Disposal:
- Dispose of your data securely and professionally
Scope & Scale
• Project scope is bounded only by funding & resource
limits, but both can be raised
• IU’s massive cyberinfrastructure allows us to help
you plan for future growth, be it data storage,
analysis, or dissemination
• Supported projects currently scale up to petabytes (a
million GB) of storage, months of running analysis
applications on supercomputers, etc.
• Geographically, we have built and run highly
distributed cyberinfrastructure for national and
international collaborations
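To put the petabyte figure in perspective, here is a minimal sketch of the unit arithmetic. It assumes decimal (SI) units, matching the "a million GB" gloss above; the helper name is illustrative only.

```python
GB_PER_TB = 1_000  # decimal units assumed: 1 TB = 1,000 GB
TB_PER_PB = 1_000  # decimal units assumed: 1 PB = 1,000 TB


def petabytes_to_gigabytes(petabytes: float) -> float:
    """Convert petabytes to gigabytes using decimal (SI) prefixes."""
    return petabytes * TB_PER_PB * GB_PER_TB


if __name__ == "__main__":
    print(petabytes_to_gigabytes(1))
```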
Costs
• Nearly all of the RT services described here
(supercomputers, storage, visualization resources)
are provided free of charge to any IU researcher
• Work that requires significant RT commitment,
for instance software development or system
administration, is provided at cost; such projects
usually write RT into their grants to cover it
• Consulting up to a reasonable limit, especially in
preliminary phases to help you attract funding, is
free
Benefits
• There is evidence that including RT in grant
proposals has enhanced funding prospects
• Speedups in the research workflow allow you
to accomplish more and to think bigger
• You don’t need local hardware or system
administration resources
• RT solutions are built professionally and for
production use
• RT provides high security for your data – it has
done the hard work to align itself with HIPAA
Contact
Your single point of contact for all things RT:
Anurag Shankar
ashankar@iu.edu
812-325-8629
Local Contact:
Carol Wood
cfwood@iun.edu
219-980-7758