Automating Distributed Software Testing
Steven Newhouse, Deputy Director, OMII

OMII Activity
- Integrating software from multiple sources
  - Established open-source projects
  - Commissioned services & infrastructure
- Deployment across multiple platforms
  - Currently: SUSE 9.0 (server & client), WinXP (client)
  - Future: WinXP (server), RHEL
- Verify interoperability between platforms & versions

Distributed Software Testing
- Automated software testing is vital for the Grid
  - Build testing – cross-platform builds
  - Unit testing – local verification of APIs
  - Deployment testing – deploy & run the package
  - Distributed testing – cross-domain operation
  - Regression testing – compatibility between versions
  - Stress testing – correct operation under real loads
- Distributed testbed
  - Needs a breadth & variety of resources, not raw power
  - Needs to be a managed resource – process

Contents
- Experience from ICENI
- Recent ETF activity
- NMI build system
- What next?

In another time, in another place… (thanks to Nathalie Furmento)
- ICENI
  - Daily builds from various CVS tags
  - On a successful build, deploy the software
  - Run tests against the deployed software
- Experience
  - Validate the state of the checked-in code
  - Validate that the software still works!
- On reflection… we probably needed more discipline in the development team & even more testing!

Therefore, several issues…
- Representative resources to build & deploy software
- Software to specify & manage the builds
- Automated, distributed, co-ordinated tests
- Reporting and notification process

Secure Flocked Condor Pool
- Activity within the UK Engineering Task Force
- Collaboration between:
  - Steven Newhouse – OMII (Southampton)
  - John Kewley – CCLRC Daresbury Laboratory
  - Anthony Stell – Glasgow
  - Mark Hayes – Cambridge
  - Andrew Carson – Belfast e-Science Centre
  - Mark Hewitt – Newcastle

Stage 1: Flocked Condor Pools
- Configure flocking between pools:
  - Set FLOCK_TO & FLOCK_FROM
  - Set HOSTALLOW_READ & HOSTALLOW_WRITE
- Firewalls: a reality for most institutions
  - Configure outgoing & incoming firewall traffic
  - Set LOWPORT & HIGHPORT
- Experiences
  - http://www.doc.ic.ac.uk/~marko/ETF/flocking.html
  - http://www.escience.cam.ac.uk/projects/camgrid/documentation.html

Issues
- Good news
  - Well documented & mature code
- Bad news
  - Firewalls: need to open a large port range to many hosts; depending on your site policy this may be a problem!
  - Access policy: need access control mechanisms
  - Scalability

Flocking & firewalls
(Diagram: firewall traffic between the Manager Node and an Execution Node in the flocked pools)

Upcoming Solution: Condor-C
- Condor: submit a job which is managed by the schedd; the schedd discovers a startd through matchmaking and starts the job on a remote resource
- Condor-G: submit a job which is managed by the schedd; the schedd launches the job through a gatekeeper on a remote Globus-enabled resource
- Condor-C: submit a job which is managed by the schedd; the schedd sends the job to a schedd on a remote Condor pool
- This is good: the submission machine needs no direct route to the startd, just to the remote schedd
- http://www.opensciencegrid.org/events/meetings/boston0904/docs/vdt-roy.ppt

Stage 2: Configuring Security
- Use 'standard' Grid authentication
  - X.509 certificates & GSI proxy certificates
- Condor configuration
  - Require authentication via the local filesystem or GSI:
    SEC_DEFAULT_AUTHENTICATION = REQUIRED
    SEC_DEFAULT_AUTHENTICATION_METHODS = GSI, FS
  - Point to the location of the certificate directory (authentication):
    GSI_DAEMON_DIRECTORY = /etc/grid-security
  - Point to the location of the gridmap file (authorisation):
    GRIDMAP = /etc/grid-security/grid-mapfile.condor
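Pulling together the flocking settings from Stage 1 and the security settings above, a local Condor configuration might look roughly like the sketch below; the remote pool name, port range and paths are illustrative placeholders rather than the values used on the ETF testbed.

  # condor_config.local -- illustrative sketch only; the host names,
  # port numbers and paths below are placeholders
  #
  # Stage 1: flock to/from the partner pool and allow its hosts access
  FLOCK_TO        = condor.remote-pool.example.ac.uk
  FLOCK_FROM      = condor.remote-pool.example.ac.uk
  HOSTALLOW_READ  = $(HOSTALLOW_READ), *.remote-pool.example.ac.uk
  HOSTALLOW_WRITE = $(HOSTALLOW_WRITE), *.remote-pool.example.ac.uk

  # Keep Condor's daemon traffic within a port range the firewall permits
  LOWPORT  = 9600
  HIGHPORT = 9700

  # Stage 2: require authentication, via GSI or the local filesystem
  SEC_DEFAULT_AUTHENTICATION         = REQUIRED
  SEC_DEFAULT_AUTHENTICATION_METHODS = GSI, FS
  GSI_DAEMON_DIRECTORY = /etc/grid-security
  GRIDMAP              = /etc/grid-security/grid-mapfile.condor

The Condor daemons pick these settings up after a restart.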
Stage 3: Authorising Access
- List trusted masters (possibly all hosts?)
  - All entries go on one line
  - The UK CA requires each DN in two forms: Email & emailAddress
    GSI_DAEMON_NAME = /C=UK/O=eScience/OU=Southampton/L=SeSC/CN=polaris.ecs.soton.ac.uk/emailAddress=s.newhouse@omii.ac.uk, /C=UK/O=eScience/OU=Southampton/L=SeSC/CN=polaris.ecs.soton.ac.uk/Email=s.newhouse@omii.ac.uk, … OTHER HOST DNs

Stage 4: Controlling Access
- Gridmap file has the same layout as in GT: "DN" USER@DOMAIN
  "/C=UK/O=eScience/OU=Southampton/L=SeSC/CN=polaris.ecs.soton.ac.uk/emailAddress=s.newhouse@omii.ac.uk" host@polaris.ecs.soton.ac.uk
  "/C=UK/O=eScience/OU=Southampton/L=SeSC/CN=polaris.ecs.soton.ac.uk/Email=s.newhouse@omii.ac.uk" host@polaris.ecs.soton.ac.uk
  "/C=UK/O=eScience/OU=Imperial/L=LeSC/CN=steven newhouse" snewhouse@polaris.ecs.soton.ac.uk

Issues
- Good news
  - Authorised access to flocked Condor pools
  - Provides know-how for a UK-wide pool (NGS?)
- Bad news
  - Documentation (will provide feedback)
  - Implemented through a lot of trial & error
- Activity HowTo: http://wiki.nesc.ac.uk/read/sfct

Exploiting the distributed pool
- Simple build portal
  - Upload software, select resources, download binaries

Build Management
- We want to build a package
  - Package may require patching
    - Take the existing source code package
    - Patch before building
  - Building on multiple platforms
    - May have installed dependencies (e.g. compilers)
    - May have build dependencies (e.g. openssl)
  - Move the source to the specified platform
  - Build, package and return the binaries to the host
- Distributed, inter-dependent tasks

Use Condor to manage builds
- Leverage Condor's established features:
  - Execution of a job on a remote resource
  - Persistent job execution
  - Matching of job requirements to resource capability
  - Management of dependent tasks – DAGMan (a sketch of such a DAG follows at the end of this section)
- Integrated into the NMI build system
  - Manages the builds of the NMI releases
  - Declare build parameters, which are converted into Condor jobs

The Future…
- NMI & OMII
  - Building OMII software on the NMI system
  - Rolling changes back into the main software base
  - Integrating OMII builds into the NMI system
- Ongoing activity
  - Adding UK resources into the NMI pool
  - Distributed deployment & testing still to be resolved
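As a concrete, if simplified, illustration of the "distributed, inter-dependent tasks" idea above, the sketch below expresses a patched two-platform build as Condor jobs ordered by DAGMan. The file and script names (build.dag, build.sh, package-src.tar.gz, local.patch) and the platform requirements are assumptions for illustration only; they are not taken from the NMI build system's actual job descriptions.

  # build.dag -- illustrative DAG for a patch-and-build run
  JOB build_linux build_linux.sub
  JOB build_winxp build_winxp.sub
  JOB collect     collect.sub
  # Both platform builds must succeed before the binaries are collected
  PARENT build_linux build_winxp CHILD collect

  # build_linux.sub -- illustrative submit description for one platform
  universe                = vanilla
  executable              = build.sh
  arguments               = local.patch
  transfer_input_files    = package-src.tar.gz, local.patch
  should_transfer_files   = YES
  when_to_transfer_output = ON_EXIT
  requirements            = (OpSys == "LINUX") && (Arch == "INTEL")
  output                  = build_linux.out
  error                   = build_linux.err
  log                     = build.log
  queue

Submitting the DAG with condor_submit_dag build.dag then leaves the ordering, retries and completion of the individual build jobs to DAGMan.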