http://www.grid-support.ac.uk http://www.ngs.ac.uk The Storage Resource Broker and the NGS Slides from Wayne Schroeder, SDSC and Peter Berrisford, RAL http://www.nesc.ac.uk/ http://www.pparc.ac.uk/ http://www.eu-egee.org/ Acknowledgements • This tutorial selects slides from several sources, specifically from talks given by Wayne Schroeder (SDSC) and Peter Berrisford (RAL) 2 Goal • Introduces a subset of the SRB functionality – Distributed file management for NGS users – This is the focus of the practical that follows • NOTE: SRB has many facets! – Wayne Schroeder: “SRB does so much, people tend to learn subsets and are often unaware of useful features” – So explore further! • http://www.sdsc.edu/srb/ • For a full SRB tutorial, see: http://www.niees.ac.uk/events/srb 3 What is SRB? (1 of 3) • The SDSC Storage Resource Broker (SRB) is client-server middleware that provides a uniform interface for connecting to heterogeneous data resources over a network and accessing unique or replicated data objects. • SRB, in conjunction with the Metadata Catalog (MCAT), provides a way to access data sets and resources based on their logical names or attributes rather than their names and physical locations. 4 What is SRB? (2 of 3) • The SDSC SRB system is a comprehensive distributed data management solution, with features to support the management, collaborative (and controlled) sharing, publication, and preservation of distributed data collections. • The SRB also serves as middleware via a rich set of APIs available to higher-level applications and by providing a management layer on top of a wide variety of storage systems. 5 What is SRB? (3 of 3) The SRB is an integrated solution which includes: – – – – – – – – – – a logical namespace, interfaces to a wide variety of storage systems, high performance data movement (including parallel I/O), fault-tolerance and fail-over, WAN-aware performance enhancements (bulk operations), storage-system-aware performance enhancements ('containers' to aggregate files), metadata ingestion and queries (a MetaData Catalog (MCAT)), user accounts, groups, access control, audit trails, GUI administration tool data management features, replication user tools (including a Windows GUI tool (inQ), a set of SRB Unix commands, and Web (mySRB)), and APIs (including C, C++, Java, and Python). SRB Scales Well (many millions of files, terabytes) Supports Multiple Administrative Domains / MCATs (srbZones) And includes SDSC Matrix: SRB-based data grid workflow management system to create, access and manage workflow process pipelines. 6 SRB Projects • • • • • • • • Digital Libraries – – UCB, Umich, UCSB, Stanford,CDL NSF NSDL - UCAR / DLESE NASA Information Power Grid Astronomy – – National Virtual Observatory 2MASS Project (2 Micron All Sky Survey) Particle Physics – – – Particle Physics Data Grid (DOE) GriPhyN SLAC Synchrotron Data Repository Medicine – Digital Embryo (NLM) Earth Systems Sciences – – ESIPS LTER Persistent Archives – – NARA LOC Neuro Science & Molecular Science – – TeleScience/NCMIR, BIRN SLAC, AfCS, … Over 90 Tera Bytes in 16 million files 7 SRB Scalability Storage Resource Broker (SRB) Data brokered by SDSC instances of SRB** As of 7/24/2003 Data_size (in Project Instance Count (files) GB) NPACI 6,050.00 2,317,368 Digsky 46,100.00 5,719,025 DigEmbryo 720.00 45,365 HyperLter 215.00 5,097 Hayden 7,078.00 59,399 Portal 968.00 27,250 SLAC 1,790.00 254,974 NARA/Collection 52.80 79,195 NSDL/SIO Exp 232.00 15,809 TRA 90.60 2,385 LDAS/SALK 498.00 9,858 BIRN 121.00 237,283 AfCS 95.30 18,762 UCSDLib 1,084.00 138,415 NSDL/CI 278.00 993,886 SCEC 12.60 18,660 TeraGrid 623.00 36,508 TOTAL 66,008.30 9,979,239 Users 367 68 23 27 142 316 43 51 23 25 60 138 20 29 113 38 1,978 3,461 As of 9/12/2003 Data_size (in Count (files) GB) 8,350.00 2,903,386 46,100.00 5,719,025 720.00 45,365 215.00 5,097 7,130.00 59,781 1,133.00 31,717 1,790.00 254,974 53.10 79,112 315.00 27,170 92.00 2,378 737.00 12,898 273.00 675,531 99.00 19,714 1,085.00 138,421 379.00 2,596,090 7,561.00 1,249,144 1,664.00 47,644 77,696.10 13,867,447 Users 372 68 23 28 158 339 43 55 23 26 66 145 20 29 114 39 1,942 As of 10/01/2003 As of 11/14/2003 Data_size (in Data_size (in Count Count (files) Users GB) GB) (files) 8,570.00 2,946,526 375 8,822.00 2,995,432 46,100.00 5,719,025 68 42,786.00 6,076,982 720.00 45,365 23 720.00 45,365 215.00 5,097 28 215.00 5,097 7,830.00 59,983 158 7,835.00 60,001 1,141.00 32,396 342 1,244.00 34,094 1,790.00 254,974 43 2,108.00 294,149 53.90 79,118 55 67.00 82,031 418.00 39,253 23 603.00 87,191 92.00 2,387 26 92.00 2,387 767.00 12,906 66 824.00 13,016 382.00 2,048,132 156 389.00 1,084,749 102.00 20,260 20 107.00 21,295 1,085.00 138,421 29 1,085.00 138,421 379.00 2,596,090 114 465.00 2,948,903 9,680.00 1,561,396 40 12,274.00 1,721,241 1,745.00 49,106 2,073 10,603.00 433,938 3,490 81,069.90 66 TB 9.97 million 3 thousand 77 TB 13.9 million 3 thousand ** Does not cover data brokered by SRB spaces administered outside SDSC. Does not cover databases; covers only files stored in file systems and archival storage systems Does not cover shadow- 81 TB 15,610,435 3,639 90,239.00 16,044,292 15.6 million3 thousand 90 TB 8 Users 377 69 23 28 168 352 43 56 26 26 66 167 21 29 114 43 2,229 3,837 16 million 3 thousand What is SRB? • Storage Resource Broker (SRB) is a software product developed by the San Diego Supercomputing Centre (SDSC). • Allows users to access files and database objects across a distributed environment. • Actual physical location and way the data is stored is abstracted from the user • Allows the user to add user defined metadata describing the scientific content of the information 9 How SRB Works • 4 major components: MCAT Database c d MCAT Server b e f SRB A Server SRB B Server – The Metadata Catalogue (MCAT) – The MCAT-Enabled SRB Server – The SRB Storage Server – The SRB Client g a SRB Client 10 SRB Client Tools • Provide a user interface to send requests to the SRB server. • 4 main interfaces: – – – – Command line (S-Commands) MS Windows (InQ) Web based (MySRB). Java (JARGON) • Web Services (MATRIX) 11 Planned Deployment on NGS Disk Farm Database Servers @ Manchester MCAT DB1 SRB Server DB n MCAT Server @ Manchester Online Replication Failover link User MCAT Server @ RAL MCAT DB1 SRB Server DB n Database Servers @ RAL SRB server @ Leeds Resource Driver Disk Farm SRB server @ RAL Resource Driver Disk Farm SRB server @ Oxford Resource Driver SRB server @ HPCX Resource Driver Disk Farm Disk Farm 12 Summary • SRB provides NGS users with – – – – – a virtual filesystem Accessible from all core nodes and from the “UI” / desktop (will provide) redundancy – mirrored catalogue server Replica files Support for application metadata associated with files 13 http://www.grid-support.ac.uk http://www.ngs.ac.uk SRB Tutorial Guy Warner NeSC Training Team http://www.nesc.ac.uk/ http://www.pparc.ac.uk/ http://www.eu-egee.org/ Overview • Use of the Scommands – Commands for unix based access to srb – Strong analogy to unix file commands • Accessing files from multiple (two) sites 15 Getting Started • Launch two “putty” connections to pub-234 – One for commands run on pub-234 – One for connecting to grid-data.rl.ac.uk and running commands from there • Open browser at http://homepages.nesc.ac.uk/~gcw/NGS/srb.html • Follow the instructions from there. • Your srb name is the same as your account on pub234 16 SRB practical 17