TeraGrid Science Support
Nancy Wilkins-Diehr, San Diego Supercomputer Center, Area Director for Science Gateways

Talk Outline
• User Support
  – User Engagements
  – Advanced Support for TeraGrid Applications (ASTA)
• Science Gateways
  – Initial projects
  – Deployment strategies
  – Preparation for expansion
• Education, Outreach and Training (EOT)

Questions answered in this presentation
• Has user engagement been effective?
  – How are user requirements investigated and defined?
  – How is uncertainty and change in user requirements managed?
  – How is usability evaluated, e.g., formatively and summatively?
  – How are applications prioritized for implementation?
  – What refinements or changes to 2.1-4 are envisaged?
• Has outreach been effective?
  – How is the potential for the wider take-up of applications assessed?
  – How are applications being adapted for use by wider user communities?
• Has training been effective?
  – How is effectiveness being assessed?
  – What quality control measures are in place for training materials?
  – What refinements or changes to 4.1-2 are envisaged?
• Science Gateways: The TeraGrid report refers to a document, a Science Gateway primer, that describes the general strategy for portal deployment. The reference given is http://wg.teragrig.org/Gateways, but this site is private (a password is needed). Please forward a copy of this document.
• We would like to be able to assess the maturity of the Science Gateways activities. Please provide appropriate information during the presentations.
• Are effective science portal building environments available to the user community? If so, what is available, i.e., what science portals that invoke simulations and/or manage massive data sets are in operation across TeraGrid and used by discipline science communities? If not, what is the progress toward this?
• Has a Grid/Web Services environment been established? To what extent is it used by the science community?
• What cross connections / resource sharing have been made with other Grids? How much effort and funding has been/will be invested in developing and testing inter-grid interoperability?

TeraGrid User Services
Sergiu Sanielevici, Pittsburgh Supercomputing Center, Area Director for User Services

Components of User Support
• 24/7 help desk integrating all sites
• Training and tutorials
• Extensive documentation
• TeraGrid User Portal
• User contact team
• Intensive support
  – ASTA
  – Science Gateways
• User Survey

TeraGrid User Portal Vision
• Integrate important user capabilities in one place:
  – Information services
    • Documentation, training, real-time consulting
    • Notification (news, MOTDs, next downtimes, etc.)
    • Resource information, calendars, cross-site run scheduling
    • Network information
  – Account services
    • Allocation requests
    • Allocation management and usage reporting
    • Account management (including setting up grid credentials)
  – Interactive services (a minimal sketch follows this slide)
    • Job launching
    • File transfers
    • Linear workflow
    • Data mining
  – Listing of and access to data collections
  – Remote visualization (interactive), and eventually collaborative
• With personalization and customization, the portal can be a foundation for application portals and (some) science gateways
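The interactive services listed above (job launching, file transfers) amount to thin wrappers around grid middleware already deployed on TeraGrid hosts. The sketch below is only a minimal illustration of that idea, not TeraGrid User Portal code: it assumes the Globus Toolkit command-line clients (globus-url-copy, globus-job-run) are on the PATH and that a valid grid proxy already exists (e.g., from grid-proxy-init or MyProxy); the host names, contact strings, and paths are placeholders.

```python
"""Minimal sketch of portal-side "interactive services": launching a job and
moving a file via Globus Toolkit command-line clients (hypothetical hosts/paths).
Assumes a valid grid proxy already exists (grid-proxy-init / MyProxy)."""
import subprocess


def transfer_file(src_url: str, dest_url: str) -> None:
    # GridFTP transfer, e.g.
    # gsiftp://tg-login.example.org/home/user/data.nc -> file:///tmp/data.nc
    subprocess.run(["globus-url-copy", src_url, dest_url], check=True)


def launch_job(gram_contact: str, executable: str, *args: str) -> str:
    # Pre-WS GRAM submission via globus-job-run; returns the job's stdout.
    result = subprocess.run(
        ["globus-job-run", gram_contact, executable, *args],
        check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Placeholder contact string; a real portal would read resource contacts
    # from its catalog and credentials from the user's session.
    print(launch_job("tg-login.example.org/jobmanager-pbs", "/bin/date"))
```

A production portal would add credential management (e.g., MyProxy retrieval) and asynchronous job tracking on top of wrappers like these.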
Proactive Approach to Discovering and Meeting User Requirements
• User Contact team for each allocated LRAC/MRAC project
• Results in the ability to understand, track and anticipate the evolving needs of users
• Codes specifically written or requested by allocated users receive highest installation priority
  – Optimization, scaling, I/O, ETF network utilization, workflow mapping
  – Overcoming application-level obstacles to portability and interoperation
  – Resolving third-party package issues
• Intensive support for selected projects: ASTA

Program Plans for 2006
• Improve reach and quality of the personalized, proactive user support system
• Improve tracking and logging of staff-user interactions
• Improve User Survey content, administration, and follow-up
• Work with external evaluators
• Consider new tools, e.g., a User Forum

Advanced Support for TeraGrid Applications (ASTA)
• Inaugurated 6/1/05; 10 projects now underway
• Has already produced remarkable new science using TG-deployed software, including the SC05 Analytics Challenge winner
• Helps users to:
  – Achieve their science objectives
  – Utilize TeraGrid resources in interesting and effective ways
• Improves the quality of the TeraGrid infrastructure
  – Provides feedback to staff when testing, piloting and exercising TeraGrid capabilities
• Selection by TG staff and NSF, from PIs willing and able to assign developer time from within their project

Simulation of Blood Flow in the Human Arterial Tree on the TeraGrid
Supported by NSF and TeraGrid
Team members
• Brown University: S. Dong, L. Grinberg, A. Yakhot, G.E. Karniadakis
• Imperial College, London: S.J. Sherwin
• Argonne National Lab: N.T. Karonis, J. Insley, J. Binns, M. Papka
• ASTA: D.C. O'Neal, C. Guiang, J. Lim
[Figure: Simulating and visualizing the human arterial tree: computation at US TeraGrid sites, visualization on ANL viz servers, viewer client in the UK; SC05, Seattle, WA]

What ASTA Helps With
• NekTar development and porting
• MPICH-G2 on heterogeneous platforms
• Cross-platform access and "firefighting"
• Visualization
• Project coordination

CMS on the TeraGrid
Compact Muon Solenoid experiment, Large Hadron Collider
PI: Harvey Newman, Caltech
• The CMS experiment is searching for the Higgs particle, thought to be responsible for mass, and for supersymmetry, a necessary element of string theory. It is currently running event simulations and reconstructions to validate methods before experimental data become available.
• "Using the NSF TeraGrid for Parametric Sweep CMS Applications", to appear in Proceedings of the International Symposium on Nuclear Electronics and Computing (NEC'2005), Sofia, Bulgaria, September 2005
• TeraGrid ASTA team: Tommy Minyard, Edward Walker, Kent Milfeld, Jeff Gardner
• Simulations run simultaneously across multiple TeraGrid sites (SDSC, NCSA and TACC) using the grid middleware tool GridShell
• Complex workflow consisting of multiple execution stages running a large number of serial jobs (thousands), with very large datasets stored on SDSC HPSS and staged to local sites prior to job runs
• Used 420K CPU hours on TeraGrid systems last year; usage is expected to increase this year and in coming years

What TeraGrid Staff Helped With (pre-ASTA)
• GridShell development allows the TeraGrid to be used as a personal Condor pool (a sketch of the staging-and-fan-out pattern follows this slide)
  – Condor jobs scheduled across multiple sites
  – No need for shared architectures or queuing systems
  – Makes use of TeraGrid protocols for data transfer
  – Fits into the existing TeraGrid software stack
• CMS production chain run through this system
  – 40,000 jobs
  – SC05 demo
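The CMS work described above follows a common parametric-sweep pattern: stage a large shared dataset from the archive to local scratch once, then fan out thousands of independent serial jobs. The sketch below is not GridShell or the CMS production chain; it only illustrates that pattern against a single local batch system, and the archive URL, scratch path, executable name, and job count are placeholders.

```python
"""Sketch of a parametric sweep in the style described above: stage input data
once, then submit many serial jobs to a local batch system. Illustrative only;
'qsub', the paths, and the executable are placeholders for site-specific tools."""
import subprocess
from pathlib import Path

SCRATCH = Path("/scratch/cms_sweep")                      # hypothetical scratch area
DATASET = "gsiftp://archive.example.org/cms/events.tar"   # hypothetical archive URL


def stage_input() -> None:
    # One-time staging of the shared dataset before the sweep starts.
    SCRATCH.mkdir(parents=True, exist_ok=True)
    subprocess.run(["globus-url-copy", DATASET,
                    f"file://{SCRATCH}/events.tar"], check=True)


def submit_serial_job(index: int) -> None:
    # Each job processes one slice of the parameter space.
    script = SCRATCH / f"job_{index:04d}.sh"
    script.write_text(
        "#!/bin/sh\n"
        f"cd {SCRATCH}\n"
        f"./cms_simulate --slice {index} --input events.tar\n")  # placeholder executable
    subprocess.run(["qsub", str(script)], check=True)


if __name__ == "__main__":
    stage_input()
    for i in range(1000):        # thousands of serial jobs per execution stage
        submit_serial_job(i)
```

GridShell's contribution, as described above, was to make this kind of fan-out work across multiple TeraGrid sites without requiring a shared scheduler.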
Current ASTA Projects Span Disciplines
• Cellulose and cellulase interactions using CHARMM, PI Brady: port, scale and optimize code (Molecular Dynamics; ends 3/31/2006)
• MD Data Repository, PI Jakobsson (Molecular Dynamics; ends 3/31/2006)
• Liquid Rocket Engine Coaxial Injector Modeling, PI Heister: computational model development and implementation (Computational Fluid Dynamics; ends 3/31/2006)
• NekTar Arterial Tree Simulations, PI Karniadakis: code porting and optimization; MPICH-G2 and visualization support (Computational Fluid Dynamics; ends 3/31/2006)
• Vortonics: CFD with Vortex Degrees of Freedom, PI Boghosian: MPICH-G2 and visualization support (Computational Fluid Dynamics; ends 3/31/2006)
• SPICE Non-Equilibrium Simulations, PIs Coveney and Boghosian: code deployment, grid and steering implementation support (DNA Modeling; ends 3/31/2006)
• ENZO Cosmic Simulator, PI Norman: code optimization and scaling, network data handling and archiving (Cosmology; ends 3/31/2006)
• SCEC TeraShake-2 and CyberShake, PI Olsen: develop and optimize codes; TG data handling and archiving; map task flows to TG (Seismology; ends 3/31/2006)
• CIG: Cyberinfrastructure for Geodynamics, PI Gurnis: develop software framework, repository, portal and training (Geophysics; ends 5/31/2006)
• BIRN (Biomedical Informatics Research Network), PI Ellisman: implementation of architectural components (Biomedical Imaging; ends 9/30/2006)

Proposed ASTA Candidates
• LEAD: Storm-Scale Forecasts and Library (atmospheric modeling)
• CERN LHC support: CMS and ATLAS (high energy physics)
• BNL RHIC experiment: STAR (high energy physics)
• nanoHUB: NEMO-3D (nanotechnology)
• NAMD-G (molecular dynamics)
• PPM: Turbulent Astrophysical Flows, interactive simulations (astrophysics)

TeraGrid Science Gateways
Nancy Wilkins-Diehr, San Diego Supercomputer Center, Area Director for Science Gateways

Science Gateways: A New Initiative for the TeraGrid
• Increasing investment by communities in their own cyberinfrastructure, but heterogeneous in:
  – Resources
  – Users, from expert to K-12
  – Software stacks, policies
• Science Gateways
  – Provide "TeraGrid Inside" capabilities
  – Leverage community investment
• Three common forms:
  – Web-based portals (a minimal sketch follows the diagram note below)
  – Application programs running on users' machines but accessing services in TeraGrid
  – Coordinated access points enabling users to move seamlessly between TeraGrid and other grids
[Figure: OGCE-based portal architecture, showing OGCE portlets with a container service API, Apache Jetspeed, internal services, grid service stubs, local portal services and remote content services, plus a workflow composer and the Java CoG Kit speaking grid protocols to grid resources]
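A web-based gateway is, at its core, an HTTP front end that turns a portal user's request into work submitted to TeraGrid on that user's behalf. The sketch below is a hypothetical, stripped-down illustration of this "TeraGrid Inside" idea using only the Python standard library; it is not OGCE, Jetspeed, or any production gateway, and submit_for_user is a placeholder for a real submission path such as GRAM or a web service.

```python
"""Hypothetical, minimal 'TeraGrid Inside' front end: a web endpoint that takes a
portal user's request and turns it into a job submission. Not OGCE/Jetspeed;
just an illustration using only the Python standard library."""
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs


def submit_for_user(portal_user: str, app: str, params: dict) -> str:
    # Placeholder: a real gateway would map the portal user onto a community
    # account/credential and submit through GRAM or a web service.
    return f"submitted {app} for {portal_user} with {params}"


class GatewayHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        user = query.get("user", ["anonymous"])[0]
        app = query.get("app", ["demo"])[0]
        status = submit_for_user(user, app, query)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(status.encode())


if __name__ == "__main__":
    HTTPServer(("localhost", 8080), GatewayHandler).serve_forever()
```

A production gateway would add authentication, map the portal user onto a community credential, and record the submission for auditing, as discussed later in this presentation.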
• Build standard portals to meet the domain requirements of the biology communities
• Develop federated databases to be replicated and shared across TeraGrid
[Figure: OGCE Science Portal technical approach, connecting the science portal to grid services and open source tools, and to remote content servers over HTTP]

Initial Focus on 10 Gateways Listed in Program Plan
• Linked Environments for Atmospheric Discovery (LEAD): atmospheric science; science partner Droegemeier (OU); TeraGrid liaisons Gannon (IU), Pennington (NCSA)
• National Virtual Observatory (NVO): astronomy; science partner Szalay (Johns Hopkins); TeraGrid liaison Williams (Caltech)
• Network for Computational Nanotechnology (NCN) and "nanoHUB": nanotechnology; science partner Lundstrom (PU); TeraGrid liaison Goasguen (PU)
• Open Life Sciences Gateway: biomedicine and biology; science partners Schneewind (UC), Osterman (Burnham/UCSD), DeLong (MIT), Dusko (INRA); TeraGrid liaison Stevens (UC/Argonne)
• Biology and Biomedical Science Gateway: biomedicine and biology; science partners Cunningham (Duke), Magnuson (UNC); TeraGrid liaisons Reed (UNC), Blatecky (UNC)
• Neutron Science Instrument Gateway: physics; science partner Cobb (ORNL); TeraGrid liaison Cobb (ORNL)
• Grid Analysis Environment: high-energy physics; science partner Newman (Caltech); TeraGrid liaison Bunn (Caltech)
• Transportation System Decision Support: homeland security; science partner Stephen Eubanks (LANL); TeraGrid liaison Beckman (Argonne)
• Groundwater/Flood Modeling: environmental science; science partners Wells (UT-Austin), Engel (ORNL); TeraGrid liaison Boisseau (TACC)
• Science Grid [GriPhyN/iVDGL/Grid3]: multiple disciplines; science partners Pordes (FNAL), Huth (Harvard), Avery (UFlorida); TeraGrid liaisons Foster (UC/Argonne), Kesselman (USC-ISI), Livny (UW)

Proposed Supplemental Activity: Empowering Science, Research, and Discovery
Russ Miller, Mark Green, University at Buffalo
• Enabling scientific and engineering domain applications using Grid-enabling Application Templates (GATs); a loose illustrative sketch follows
• Porting 16 applications per year, as well as providing support by training 20-30 research groups per year
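As a loose illustration of the idea behind Grid-enabling Application Templates, the sketch below shows the kind of metadata such a template might capture about a domain code and how it could be rendered into a batch submit script. It is not the actual GAT design from the Buffalo proposal; every field name and the PBS rendering are assumptions made for illustration.

```python
"""Hypothetical sketch of what a 'grid-enabling application template' might
capture: enough metadata about a domain code to generate a submit script.
Illustrative only; this does not reflect the actual GAT design."""
from dataclasses import dataclass, field


@dataclass
class AppTemplate:
    name: str
    executable: str                              # path to the domain application
    input_files: list = field(default_factory=list)
    cpus: int = 1
    walltime: str = "01:00:00"

    def to_pbs_script(self) -> str:
        # Render a simple PBS submit script from the template fields.
        return "\n".join([
            "#!/bin/sh",
            f"#PBS -N {self.name}",
            f"#PBS -l nodes=1:ppn={self.cpus},walltime={self.walltime}",
            f"{self.executable} {' '.join(self.input_files)}",
        ])


if __name__ == "__main__":
    tmpl = AppTemplate(name="demo", executable="./my_app",
                       input_files=["in.dat"], cpus=4)
    print(tmpl.to_pbs_script())
```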
So How Will We Meet All These Needs?
• With RATs! (Requirements Analysis Teams)
• Collection, analysis and consolidation of requirements to jump-start the work
  – Interviews with 10 Gateways
  – Common user models, accounting needs, scheduling needs
• Summarized requirements for each TeraGrid working group
  – Accounting, Security, Web Services, Software
• Areas for further study identified
• Primer outline for new Gateways in progress
• And milestones

Implications for TeraGrid Working Groups
• Accounting
  – Support for accounts with differing capabilities
  – Ability to associate a compute job with an individual portal user
  – Scheme for portal registration and usage tracking
  – Support for OSG's Grid User Management System (GUMS)
  – Dynamic accounts
• Security
  – Community account privileges
  – Need to identify the human responsible for a job for incident response
  – Acceptance of other grids' certificates
  – TG-hosted web servers, cgi-bin code
• Web Services
  – Initial analysis completed 12/05
  – Some Gateways (LEAD, Open Life Sciences) have immediate needs
  – Many will build on capabilities offered by GT4, but interoperability could be an issue
  – Web service security
  – Interfaces to scheduling and account management are common requirements
• Software
  – Interoperability of software stacks between TG and peer grids
  – Software installations for gateways across all TG sites
  – Community software areas
  – Management (Pacman, other options)

Significant Progress in CY2005
• January-March
  – Initial Gateway interviews and requirements analysis completed
• April
  – Internal web page: project descriptions, RAT reports, staffing, milestones, email archives, presentations
• May
  – Biweekly calls begin: variety of issues discussed, special presentations
  – Accounts for all developers
  – Progress tracking for all gateways
  – Special presentations: Edward Walker, GridShell; Lee Liming, GT4
  – Recommendations addressed to and from tgacctmgmt and security-wg
  – Three new RATs: Portal technology (John Cobb), Web services (Ivan Judson), OSG (Stuart Martin)
• June
  – International Science Gateways workshop at GGF14
• August
  – Repository area for software exchanges; JDBC SQL for accounting queries to be the first piece of contributed code
• September
  – security-wg provides requirements for community accounts
• October
  – Gateways provide means to collect required information; expanded user responsibilities form for community accounts in production
  – Production community accounts in use (nanoXX, bioportal)
  – Discussions with security-wg about portal hosting within TG (NVO, HEP)
  – SC05 preparation begins: demos, posters, movie clips, images, booth scheduling
  – Web Services recommendations complete
  – "How to become a gateway" at www.teragrid.org
  – User-friendly listing of gateways
• November
  – SC05 focus continues
  – GT4 deployment evaluation; Mike Showerman joins call
  – Special presentations: GridChem; PURSE and GAMA
  – Call with Roy Williams and security-wg to discuss the "weak cert" concept
  – Gateway plans collected for the Program Plan
• December
  – Finalize Program Plan input
  – Outline plans for next quarter

Early CY2006 Plans
• CI Channel presentation (March)
• Montana State workshop sponsored by Lariat (March)
  – How Grid Computing Can Accelerate Research
  – Special talks on bioinformatics and the Grid
• Portal Technology RAT, John Cobb
• Account management through the User Portal, Eric Roberts
• Audit trails for community accounts (see the sketch after this list)
• Begin implementation of TG- and Gateway-provided web services
• Complete further analysis of scheduling requirements and implementation ideas
• Full-day training session at the TG AHM
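Several of the accounting and security items above reduce to one requirement: when a gateway submits jobs under a shared community account, there must still be an audit trail tying each job back to the individual portal user. The sketch below illustrates one minimal way to keep such a trail; it uses SQLite rather than the JDBC/SQL accounting tooling mentioned above, and the schema, field names, and example account and resource names are assumptions for illustration.

```python
"""Minimal sketch of a community-account audit trail: record which portal user
is behind each job submitted under a shared gateway account. SQLite stands in
for whatever accounting database a site actually uses; the schema is illustrative."""
import sqlite3
import time


def init_db(path: str = "gateway_audit.db") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("""CREATE TABLE IF NOT EXISTS job_audit (
                        job_id TEXT, community_account TEXT,
                        portal_user TEXT, resource TEXT, submitted_at REAL)""")
    return conn


def record_submission(conn, job_id, community_account, portal_user, resource):
    # Called by the gateway at submission time, before the job enters the queue.
    conn.execute("INSERT INTO job_audit VALUES (?, ?, ?, ?, ?)",
                 (job_id, community_account, portal_user, resource, time.time()))
    conn.commit()


def usage_by_portal_user(conn):
    # The kind of per-user usage report the accounting working group asks for.
    return conn.execute("""SELECT portal_user, COUNT(*) FROM job_audit
                           GROUP BY portal_user""").fetchall()


if __name__ == "__main__":
    conn = init_db(":memory:")
    record_submission(conn, "job-0001", "nanohub", "alice", "example-cluster")
    print(usage_by_portal_user(conn))
```

A record like this also answers the security working group's incident-response question: given a job ID run under a community account, the responsible individual can be identified.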
Gateways Under the Hood: Open Life Science Gateway and Web Services
• OLSG integrates four components:
  – Tools from the National Microbial Pathogen Data Resource (http://www.nmpdr.org) and TheSeed (http://theseed.uchicago.edu/FIG/index.cgi)
  – Open bioinformatics tools and data
  – Web services
  – TeraGrid resources
• Providing:
  – Web-based access for account administration, trivial access to resources, and documentation
  – Web-service-based access to tools, including:
    • Taverna, Kepler, and other workflow tools
    • the Microsoft development environment
    • open source web service toolkits: SOAP::Lite [Perl], ZSI [Python], Apache Axis [C/Java]
    • bioinformatics toolkits such as BioPerl and BioPython
  – Data access
• TeraGrid presentation requested for a February NIH meeting
• http://lsgw.mcs.anl.gov/

OLSG Helps Define TG-wide Policies
• Q1 FY06 accomplishments
  – Web-service-enabled the SEED software
  – Developed the Life Science Gateway architecture
  – Led the Web Services RAT, working to develop the right model for Gateways with respect to TeraGrid resources, security, and the user model
• Q2 FY06 plans
  – Deploy prototype web/grid-services-based, TeraGrid-hosted access to community-developed computational phylogeny tools (e.g., the PHYLIP suite)
  – Develop a strategy for supporting the large-scale computing needs of the National Centers for Biomedical Computing (i.e., the BISTI Centers)

Gateways Under the Hood: LEAD, Workflows and Web Services
• Providing the tools needed to make accurate predictions of tornadoes and hurricanes
• Data exploration and Grid workflow: log in and see your MyLEAD space
• Creating a workflow for data mining using ADaM services from UAH (a minimal sketch of this chaining pattern appears at the end of this section):
  – NEXRAD II radar data → 3DMesocyclone Detection (feature extraction service) → ESML_Converter with an ESML descriptor (data transformation service) → MinMaxNormalizer (data normalization service) → BayesClassifying (classification service) → visualization
• Monitor results in real time
• Large workflows can be composed
• Educational resources

Gateways Under the Hood: OSG and Grid Interoperation
• OSG RAT led by Stuart Martin
  – Implementation of grid service interoperability
    • Deploying and supporting common grid services and protocols
    • Creating OSG gateways
  – Basic grid interoperability services
    • Authentication / Authorization / Accounting (AAA)
    • Information services
    • Job execution
    • Data handling
  – User- and application-level grid interoperability services
    • Resource discovery / selection
    • Resource brokering
    • Job submission and bookkeeping
    • Data management
  – Interoperability quality assessment
    • User support and troubleshooting
    • Application performance
• Grid Interoperability working group formed 12/05

Grid Interoperation
• TeraGrid/OSG interoperation work (Stuart Martin et al.) drove the organization of a multi-grid interoperation initiative begun in 2005.
• Leaders from TeraGrid, OSG, EGEE, APAC, NAREGI, DEISA, PRAGMA, UK NGS and KISTI will lead an interoperation initiative in 2006.
• Six international "RATs" will meet for the first time at GGF-16 in February 2006:
  – Application Use Cases (Bair/TeraGrid, Alessandrini/DEISA)
  – Authentication/Identity Management (Skow/TeraGrid)
  – Job Description Language (Newhouse/UK NGS)
  – Data Location/Movement (Pordes/OSG)
  – Information Schemas (Matsuoka/NAREGI)
  – Testbeds (Arzberger/PRAGMA)
• Leaders from nine grid initiatives met at SC05 to plan an application-driven "Interop Challenge" in 2006.
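The LEAD data-mining workflow described earlier in this section chains several services over radar data: feature extraction, ESML transformation, normalization, and Bayesian classification. The sketch below illustrates only that chaining pattern with stand-in functions; it is not ADaM, LEAD, or a real web-service interface, and the stage names are borrowed from the slide purely as labels.

```python
"""Illustrative sketch of chaining data-mining stages in the LEAD style.
Each stage is a stand-in function, not a real ADaM/LEAD service call."""


def mesocyclone_detection(radar_data):      # feature extraction stand-in
    return {"features": f"detected in {radar_data}"}


def esml_transform(features):               # data transformation stand-in
    return {"transformed": features}


def min_max_normalize(data):                # normalization stand-in
    return {"normalized": data}


def bayes_classify(data):                   # classification stand-in
    return {"class": "mesocyclone", "input": data}


def run_workflow(radar_data):
    # Compose the stages in order, passing each result to the next service.
    stages = [mesocyclone_detection, esml_transform,
              min_max_normalize, bayes_classify]
    result = radar_data
    for stage in stages:
        result = stage(result)
    return result


if __name__ == "__main__":
    print(run_workflow("NEXRAD II volume scan"))
```

In the real gateway, each stage would be a remote web-service invocation composed through a workflow tool, with results monitored in real time as noted above.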
TeraGrid Education, Outreach and Training (EOT) and External Relations (ER)
Scott Lathrop, Argonne National Laboratory, Director for EOT

Mission, Goals, and Strategies
• The mission is to engage larger and more diverse communities of researchers, educators and learners in discovering, using, and contributing to TeraGrid.
• The goals are to:
  – Enable awareness of and access to TeraGrid resources
  – Provide education and training for all disciplines and all stages of learning (K-12 through professional)
  – Promote diversity among all TeraGrid activities
  – Expand the community of TeraGrid users
• The strategies are to:
  – Work with TeraGrid Science Gateways, User Support and the core program
  – Leverage strategic external partnerships
  – Assess the community impact

EOT and ER Team Members
Using the User Support model, the GIG coordinates a TG-wide EOT and ER program with an enthusiastic group of RP and Core/CIP staff.
• Argonne/UChicago: Scott Lathrop, Ray Bair, Joe Insley
• Caltech: Sarah Bunn
• Indiana: Craig Stewart, Julie Wernert
• NCSA: Sandie Kappes, Edee Wiziecki, Mike Freemon, Bill Bell, Trish Barker
• ORNL: John Cobb, Betsy Riley
• PSC: Sergiu Sanielevici, Beverly Clayton, Cheryl Begandy, Mike Schneider, Sean Fulton
• Purdue: Sebastien Goasguen, Gary Bertoline, Krishna Madhavan, Steve Dunlop
• SDSC: Diane Baxter, Ange Mason, Don Frederick, Ashley Wood, Greg Lund, Diana Diehl, Tim Gumto
• TACC: Stephenie McLean, Faith Singer-Villalobos

Education Plans and Effectiveness
Plans
• Professional development for and with undergraduate faculty and secondary school teachers
• Development and dissemination of resources including software, curricular materials, and lesson plans
• Mentoring of students in using cyberinfrastructure to learn math and science, and in pursuing advanced studies
Effectiveness
• Leading the SC Education Programs, SC05-SC06
• nanoHUB used by 10 universities in dozens of undergraduate and graduate courses
• Scaling up successful EOT-PACI/EPIC projects (e.g., TeacherTECH)
• External partnerships: EPIC, the NSDL Computational Science Education Reference Desk, the National Computational Science Institute, and CIP

SC Education Program Plans and Effectiveness
• Purdue is leading the SC05 and SC06 Education Program, including summer workshops
• The TeraGrid team has been asked to propose a multi-year Education Program starting with SC07-09
  – The goal is to provide greater continuity and broader, sustained integration of computational science education for undergraduate education
  – A proposal is being made to the SC Steering Committee next week to initiate the program in 2006 to prepare for SC07
  – The program engages a large national planning team representing multiple state and national programs that can help leverage and sustain it

Outreach Plans and Effectiveness
Plans
• Raise awareness of TeraGrid's impact on research and education
• Engage under-represented people in TeraGrid development and use, with a focus on MSI college faculty and students
• Outreach to new communities that have not traditionally been users of cyberinfrastructure and grid computing
Effectiveness
• New Science Gateways: Telescience, BIRN and NEES
• Community engagement leading to applications, via professional society meetings, conferences, and workshops; usage has increased
• External partnerships: the Minority Serving Institution Network and the humanities, arts, and social sciences communities (HASTAC and CHASS)

Training Plans and Effectiveness
Plans
• Hands-on training for researchers on topics from introductory to advanced applications of grid computing
• Training venues include live workshops, Access Grid sessions, and online WebCT courses
• Coordination of training opportunities across TeraGrid
Effectiveness
• Review of training materials by experts in the field
• Post-workshop surveys completed by participants to assess quality
• Tracking of WebCT course usage for enhancement
• User surveys provide feedback on quality and needs
• Identification of needs by ASTA, Science Gateways, and User Support
• Joint workshops and training activities by the GIG, RPs, and CIP
• PSC is investigating a Standardized User Monitoring Suite
• Established partnerships: NMI, the National Microbial Pathogen Data Resource (NMPDR), and CIP

External Relations Plans and Effectiveness
Plans
• Promote TeraGrid use and adoption via publicity
• Organize public relations efforts
• Highlight TeraGrid's value via communications
• Communicate technology changes to support smooth transitions
• Provide internal communications strategies for all of TG
Effectiveness
• Press releases, news stories, science nuggets
• Publications: TeraGrid brochure, user publication lists
• Website: increased usage
• Presentations: multiple venues and multiple events
• Event management and logistics (e.g., SCxx)
• External partnerships: OSG, ASCRIBE, GridToday, HPCwire