GITS - A Software Toolkit to Enable Operational Monitoring & Grid Integration

David Baker, Research Computation Support Services, University of Southampton
Mark Mc Keown, Grid Support Centre, University of Manchester

How do we know the Grid is functioning properly?
• The UK e-Science Grid is a large, very heterogeneous Grid.
• Each site involved has different firewalls, security requirements, batch systems, architectures and operating systems.
• A question initially addressed by e-Science participants at So'ton.

Solution
• A script that can be run at one site to test the Grid functionality at another site.
• Must be easy to use.
• Each site should run the script regularly against all other sites to create a matrix of results.

Grid Integration Test Script
• Written in Perl.
• Released under a GPL license.
• Possibly one of the most comprehensive integration test scripts available.
• Available at www.soton.ac.uk/~djb1/gits.html

GITS Script
• The script is written in Perl and requires one extra module, Proc::Reliable.
• There are two GUI versions: one written with Perl/Tk (http://www.soton.ac.uk/~djb1/tkgits), the other a C++/Qt executable that acts as a front end to the script (http://vermont.mvc.mcc.ac.uk/qgits).
• The script can output its results in HTML format, which can be put on a Web server.

Running the Script
• Before running the script, some site-specific parameters should be set within the script:
    $timeout
    $local_giis
    $local_vo_name
• These may become command line arguments in the future.

Command Line Arguments
• gits [-t abcdefghijklmn] [-h htmlfile] [-x XMLfile] contact1 [contact2 ...]
• contact1 etc. are in the same format as used by globusrun and globus-job-run, e.g.
    vermont.mvc.mcc.ac.uk
    vermont.mvc.mcc.ac.uk:fork
• If no jobmanager is defined then fork is assumed; this may change.
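
The contact rule on the Command Line Arguments slide (a bare host, or host:jobmanager, with fork assumed when no jobmanager is given) can be sketched in a few lines. This is an illustrative Python sketch, not code from the GITS Perl script: the names `Contact`, `parse_contact` and `DEFAULT_JOBMANAGER` are hypothetical, and it handles only the two forms shown on the slide, not every contact string that globusrun may accept.

```python
from typing import NamedTuple

# Assumption taken from the slide: "If no jobmanager is defined then fork is assumed".
DEFAULT_JOBMANAGER = "fork"


class Contact(NamedTuple):
    """Hypothetical container for the two parts of a contact string."""
    host: str
    jobmanager: str


def parse_contact(contact: str) -> Contact:
    """Split 'host' or 'host:jobmanager' into its parts.

    Only the two forms shown on the slide are handled here.
    """
    host, sep, jobmanager = contact.strip().partition(":")
    if not host:
        raise ValueError(f"missing host in contact {contact!r}")
    if sep and not jobmanager:
        raise ValueError(f"empty jobmanager in contact {contact!r}")
    # Fall back to the default when no ':jobmanager' suffix was given.
    return Contact(host, jobmanager or DEFAULT_JOBMANAGER)


if __name__ == "__main__":
    for text in ("vermont.mvc.mcc.ac.uk", "vermont.mvc.mcc.ac.uk:fork"):
        print(parse_contact(text))
```

Keeping the default in one named constant means that if the fork default changes, as the slide warns it may, only one line needs updating.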
Tests - 1
• a Ping Test (if this test fails no other tests are run)
    globusrun -a -r contact
• b RSL-Hello
    globusrun -o -r contact '&(executable=/bin/echo)(arguments="Hello World")'
• c Hello World
    globus-job-run contact /bin/echo "Hello World"

Tests - 2
• d Stage
    globus-job-run contact -s testscript
• e RSL-Shell
    globusrun -o -r contact '&(executable=$GLOBUS_LOCATION/bin/globus-sh-exec)(arguments= -e ${GLOBUS_SH_UPTIME})'

Tests - 3
• f Batch tests (Batch-Submit, Batch-Query and Batch-Cancel)
    globus-job-submit contact /bin/sleep 600
    globus-job-status
    globus-job-clean -force
• g Batch-Retrieve
    globus-job-submit contact /bin/echo hello
    globus-job-get-output

Tests - 4
• h GASS
    globusrun -s -r contact '&(executable=$GLOBUS_LOCATION/bin/globus-url-copy)(arguments= $GLOBUS_GASS_URL/blah file:/tmp/blah)(environment=(LD_LIBRARY_PATH $GLOBUS_LOCATION/lib))'

Tests - 5
• i GridFTP
    globus-url-copy file:/tmp/blah gsiftp://contact/tmp/blah
• j GRIS
    grid-info-search -h contact -x
• k UK GIIS
    grid-info-search -nowrap ginfo.grid-support.ac.uk -x -b "UK eScience,o=grid" -s sub "(objectclass=MdsComputer)"

Tests - 6
• l Local GIIS
    grid-info-search -nowrap $local_giis -x -b "$local_vo_name,o=grid" -s sub "(objectclass=MdsComputer)"
• m Jobmanagers - checks for jobmanagers reported in UK GIIS
• n Comparison - compares results from UK GIIS and Local GIIS

Tests - 7
• j gsissh
    gsissh -p 2222 -o "BatchMode yes" host /bin/echo "ETF gsissh test"
Note: for the L2G the gsissh server should run on port 2222, which is not a restricted port.

Script Output
• STDOUT/STDERR
• XML format (using the -x option). Required to upload results to the GITS Web Service.
• HTML - should be put on a Web server with a link from https://www.grid-support.ac.uk/etf/wg/integrationtests.html

GITS HTML output

Operational Monitoring
• Results from each site are published on the Web and monitored by the Grid Support Centre at RAL.
• Any problems are reported to the sites concerned.
• Difficult and time-consuming process!
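
Two rules from the test slides lend themselves to a short illustration: every command is bounded by a timeout (the $timeout parameter), and if the ping test (a) fails no other tests are run. The Python sketch below is one possible way to express that gating and is not taken from the GITS script, which is written in Perl and uses Proc::Reliable. The function names `run_command` and `run_tests`, the use of the exit status as the pass criterion, and the choice to record gated tests as "skipped" are all assumptions; only tests a, b and c are included to keep it short.

```python
import subprocess
from typing import Callable, Dict, List

# A runner takes an argument vector and a timeout and reports success or failure.
Runner = Callable[[List[str], float], bool]


def run_command(argv: List[str], timeout: float) -> bool:
    """Return True if the command exits with status 0 within `timeout` seconds.

    Assumption: exit status 0 means the test passed.
    """
    try:
        completed = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Missing executable or a hung command both count as a failure.
        return False
    return completed.returncode == 0


def run_tests(contact: str, timeout: float = 60.0,
              runner: Runner = run_command) -> Dict[str, str]:
    """Run tests a, b and c against `contact`, gating b and c on the ping test."""
    rsl = '&(executable=/bin/echo)(arguments="Hello World")'
    tests = {
        "a": ["globusrun", "-a", "-r", contact],
        "b": ["globusrun", "-o", "-r", contact, rsl],
        "c": ["globus-job-run", contact, "/bin/echo", "Hello World"],
    }
    results: Dict[str, str] = {}
    ping_ok = runner(tests["a"], timeout)
    results["a"] = "pass" if ping_ok else "fail"
    for letter in ("b", "c"):
        if not ping_ok:
            # Slide rule: if the ping test fails, no other tests are run.
            results[letter] = "skipped"
        else:
            results[letter] = "pass" if runner(tests[letter], timeout) else "fail"
    return results
```

Passing the runner as a parameter lets the gating logic be exercised without a Globus installation, for example `run_tests("vermont.mvc.mcc.ac.uk", runner=lambda argv, t: False)` marks a as failed and the remaining tests as skipped.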
GITS Web Service
• Provides a means of storing and retrieving results from the GITS script at a central location using Web Service technology.
• Uses Tomcat/JAX-RPC/MySQL on the server side and Perl on the client side.
• Simplifies monitoring of the Grid by having a central location for all results.
• Allows historical analysis of Grid status, so users can discover why their job may have failed.
• Allows users/applications to check the status of the Grid before launching a job.

GITS Web Service Clients
• WSDL: http://vermont.mvc.mcc.ac.uk:8080/GITSqueryjaxrpc/GITSquery?WSDL
• Perl client, gqec.pl: requires the Perl module SOAP::Lite installed on the client machine. http://vermont.mvc.mcc.ac.uk/gqec/
• CGI interface: http://vermont.mvc.mcc.ac.uk/gqec/gqec_cgi.html
• C++/Qt/gSOAP: http://vermont.mvc.mcc.ac.uk/qgits/

Access Control
• Only certain users can upload results to the database; downloading results is open to everyone.
• To get permission to upload results to the database, send your UK eScience certificate DN to mark.mckeown@man.ac.uk.

Perl Client
• gqec.pl usage:
• gqec.pl wsdl (returns WSDL of service)
• gqec.pl method parameters
    gqec.pl QueryHostJobDate vermont.mvc.mcc.ac.uk fork 2002-11-15

Uploading Results
• gqec.pl UpLoader file1 file2
• gqec.pl XMLtoHTML file1 file2 ....
• gqec.pl QueryHostJob vermont.mvc.mcc.ac.uk fork | gqec.pl XMLtoHTML

Some Future Challenges
• Further automate operational monitoring using the Web Service.
• Port tools to work with GT3.
• Applications test suite?

Contributors
Work carried out through the UK e-Science ETF with many sites and people contributing:
• Simon Cox - So'ton
• David Baker - So'ton
• Jon Hillier - Oxford
• Mark Mc Keown - Manchester
• Ron Fowler - RAL
• Marko Krznaric - Imperial
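
As a footnote to the Perl Client slide, the two query forms shown there (QueryHostJob and QueryHostJobDate) could be assembled programmatically by an application that wants to check Grid status before launching a job. The Python sketch below only builds the argument vector for gqec.pl; it does not call the Web Service. The helper name `build_query` is hypothetical, and the year-month-day date format is an assumption inferred from the single example on the slide (2002-11-15).

```python
from datetime import datetime
from typing import List, Optional


def build_query(host: str, jobmanager: str = "fork",
                day: Optional[str] = None,
                client: str = "gqec.pl") -> List[str]:
    """Build a gqec.pl argument vector for the two query methods on the slide.

    Without `day` the QueryHostJob method is used; with `day` the
    QueryHostJobDate method is used.
    """
    if day is None:
        return [client, "QueryHostJob", host, jobmanager]
    # Raises ValueError if `day` is not a valid year-month-day date
    # (format assumed from the slide's example).
    datetime.strptime(day, "%Y-%m-%d")
    return [client, "QueryHostJobDate", host, jobmanager, day]


if __name__ == "__main__":
    print(" ".join(build_query("vermont.mvc.mcc.ac.uk", "fork", "2002-11-15")))
```

The resulting list can be handed to a process launcher on a machine where gqec.pl and SOAP::Lite are installed; validating the date locally avoids sending an obviously malformed query to the service.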