DDN & iRODS at ICBR
By Alex Oumantsev

History of ICBR
Campus-wide Interdisciplinary Center for Biotechnology Research Core Facility
Funded by the state of Florida in 1987
On average 58 staff
  • 22% faculty
  • 45% full-time staff
  • 33% postgrad

Current Cores at ICBR
Bioinformatics
Cytometry
Electron Microscopy
Gene Expression & Genotyping
Monoclonal Antibody
NextGen DNA Sequencing
Proteomics & Mass Spectrometry
Sanger Sequencing

Diverse Environment
Over 400 services provided
Multiple diverse platforms
Diverse user base
Varying analysis pipelines
Wide range of data
Growing data set sizes

Computational Challenges
Data storage
Data processing
Data delivery

Current Computational Environment
Several storage silos with NFS/SMB, ~600TB
Mix of 10 and 1 GbE
Various size compute systems, ~1000 cores
Workstations connected over 1GbE

Current Storage
Segmented
Slow
Not high availability

Current Data Delivery Methods
Hardware-encrypted USB drives
University-provided file transfer service
  • 5GB max single file size
Client personal USB drives for self-service instruments
Various unsupported options…

iRODS at ICBR
Set up a system that can store all instrument data
Maintain archive of all of the instrument data
Check out data for analysis
Check results back in
Manage permissions
Electronic data delivery

DDN at ICBR
Rapid growth of instrument output dataset size
SFA 12KXE
  • Fast and scalable storage
  • Ability to run custom images on storage
  • GPFS
  • Ability to run some compute tasks directly on storage
  • High availability

iRODS on DDN
Scalable, high performance, high availability
Set up iRODS on the VMs that run on SFA 12KXE
All of the VMs see a common storage namespace
A pool of iRODS resource servers running on the VMs
  • Each has full view of the namespace
  • Microservices take advantage of built-in compute
Use SSD from the SFA 12KXE to store iCAT
  • Run on MySQL Cluster edition
  • Prevents single point of failure
  • Distributed for performance

iRODS on DDN
VM-0          VM-1          VM-2          VM-3
iRODS         iRODS         iRODS         iRODS
iDORP         iDORP         iDORP         iDORP
iCAT          iCAT          iCAT          iCAT
GPFS          GPFS          GPFS          GPFS
SFA Driver    SFA Driver    SFA Driver    SFA Driver
RAID DISK                   RAID SSD

iRODS at ICBR
Automated instrument data and metadata ingestion into iRODS
  • Set up most used instruments
  • Set up new instruments as they arrive
Provide download link to clients via some LIMS
Create custom Web front end
  • Uniform look
  • Data portal with identical interface for all of the Cores
  • Custom views supporting mobile platforms
Have option to transfer client data to other University compute resources
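
Sketch: planning an automated ingestion run
The slides do not show how instrument data and metadata would be ingested, so the following is only a minimal sketch of one possible first step: walking an instrument's output directory and building a manifest of files, checksums, and metadata to register in iRODS. Everything named here is an assumption for illustration, including the zone path "/icbrZone/home/instruments", the metadata keys, and the functions build_manifest and sha256_of. The upload itself is left out; it would be handled by whatever iRODS client the site uses.

```python
import hashlib
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class IngestRecord:
    """One file to be registered, with the metadata to attach to it."""
    local_path: Path
    logical_path: str          # hypothetical target path inside the iRODS zone
    size_bytes: int
    sha256: str
    metadata: dict = field(default_factory=dict)


def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in chunks so large instrument files are not read at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(run_dir, core, instrument,
                   zone_root="/icbrZone/home/instruments"):  # assumed layout
    """Walk one instrument run directory and describe every file found in it."""
    run_dir = Path(run_dir)
    records = []
    for path in sorted(p for p in run_dir.rglob("*") if p.is_file()):
        relative = path.relative_to(run_dir).as_posix()
        logical = f"{zone_root}/{core}/{instrument}/{run_dir.name}/{relative}"
        records.append(IngestRecord(
            local_path=path,
            logical_path=logical,
            size_bytes=path.stat().st_size,
            sha256=sha256_of(path),
            # Hypothetical metadata keys; a real deployment would define its own.
            metadata={"core": core, "instrument": instrument,
                      "run": run_dir.name},
        ))
    return records


if __name__ == "__main__":
    import tempfile

    # Demonstrate on a throwaway directory standing in for instrument output.
    with tempfile.TemporaryDirectory() as tmp:
        run = Path(tmp) / "run_001"
        run.mkdir()
        (run / "reads.fastq").write_bytes(b"ACGT")
        for record in build_manifest(run, "NextGen", "seq01"):
            print(record.logical_path, record.size_bytes, record.metadata)
```

Recording a checksum at ingestion time gives the archive copy something to be verified against later, for example when data is checked out for analysis and results are checked back in.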