Automating Distributed Software Testing
Steven Newhouse
Deputy Director, OMII

OMII Activity
- Integrating software from multiple sources
  - Established open-source projects
  - Commissioned services & infrastructure
- Deployment across multiple platforms
  - Currently: SUSE 9.0 (server & client), WinXP (client)
  - Future: WinXP (server), RHEL
- Verify interoperability between platforms & versions

Distributed Software Testing
- Automatic software testing is vital for the Grid:
  - Build Testing – cross-platform builds
  - Unit Testing – local verification of APIs
  - Deployment Testing – deploy & run the package
  - Distributed Testing – cross-domain operation
  - Regression Testing – compatibility between versions
  - Stress Testing – correct operation under real loads
- Distributed Testbed
  - Needs a breadth & variety of resources, not raw compute power
  - Needs to be a managed resource, with a defined process

Contents
- Experience from ICENI
- Recent ETF activity
- NMI build system
- What next?

In another time in another place…
(thanks to Nathalie Furmento)
- ICENI
  - Daily builds from various CVS tags
  - On a successful build, deploy the software
  - Run tests against the deployed software
- Experience
  - Validate the state of the checked-in code
  - Validate that the software still works!
  - On reflection… probably needed more discipline in the development team & even more testing!

Therefore several issues…
- Representative resources to build & deploy software
- Software to specify & manage the builds
- Automated, distributed, co-ordinated tests
- Reporting and notification process

Secure Flocked Condor Pool
- Activity within the UK Engineering Task Force
- Collaboration between:
  - Steven Newhouse - OMII (Southampton)
  - John Kewley - CCLRC Daresbury Laboratory
  - Anthony Stell - Glasgow
  - Mark Hayes - Cambridge
  - Andrew Carson - Belfast e-Science Centre
  - Mark Hewitt - Newcastle

Stage 1: Flocked Condor Pools
- Configure flocking between pools (a config sketch follows below):
  - Set FLOCK_TO & FLOCK_FROM
  - Set HOSTALLOW_READ & HOSTALLOW_WRITE
- Firewalls:
  - A reality for most institutions
  - Configure outgoing & incoming firewall traffic
  - Set LOWPORT & HIGHPORT
- Experiences:
  - http://www.doc.ic.ac.uk/~marko/ETF/flocking.html
  - http://www.escience.cam.ac.uk/projects/camgrid/documentation.html

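A minimal condor_config sketch of the settings above, assuming hypothetical pool and host names (*.local.example.ac.uk, *.remote.example.ac.uk) and an arbitrary port range; the real values depend on the collaborating sites and their firewall policies:

  # Flock jobs out to the remote pool's central manager (hypothetical host)
  FLOCK_TO = manager.remote.example.ac.uk
  # Accept flocked jobs submitted from these remote hosts (hypothetical)
  FLOCK_FROM = schedd1.remote.example.ac.uk, schedd2.remote.example.ac.uk
  # Let both pools read from & write to our daemons
  HOSTALLOW_READ = *.local.example.ac.uk, *.remote.example.ac.uk
  HOSTALLOW_WRITE = *.local.example.ac.uk, *.remote.example.ac.uk
  # Pin Condor to a fixed port range so only a narrow firewall hole is needed
  LOWPORT = 9600
  HIGHPORT = 9700
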
Issues
- Good News
  - Well documented & mature code
- Bad News
  - Firewalls
    - Need to open a large port range to many hosts
    - Depending on your site policy this may be a problem!
  - Access Policy
    - Need access control mechanisms
  - Scalability

Flocking & firewalls
[Diagram: traffic between the Manager Node and an Execution Node crossing institutional firewalls]

Upcoming Solution: Condor-C
- Condor:
  - Submit a job which is managed by the schedd
  - Schedd discovers a startd through matchmaking and starts the job on the remote resource
- Condor-G:
  - Submit a job which is managed by the schedd
  - Schedd launches the job through the gatekeeper on a remote Globus-enabled resource
- Condor-C:
  - Submit a job which is managed by the schedd
  - Schedd sends the job to a schedd on a remote Condor pool
- This is good because:
  - The submission machine needs no direct route to the startd, just to the remote schedd.
- http://www.opensciencegrid.org/events/meetings/boston0904/docs/vdt-roy.ppt

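A sketch of what a Condor-C submission might look like; the remote schedd and central manager names are hypothetical, and the exact keywords (grid_resource vs. the older remote_schedd style) differ between Condor releases, so check the manual for your version:

  # condor-c.sub -- hand the job to a schedd in the remote pool (hypothetical hosts)
  universe      = grid
  grid_resource = condor schedd.remote.example.ac.uk manager.remote.example.ac.uk
  executable    = run_tests.sh
  output        = tests.out
  error         = tests.err
  log           = tests.log
  should_transfer_files   = YES
  when_to_transfer_output = ON_EXIT
  queue
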
Stage 2: Configuring Security
- Use ‘standard’ Grid authentication
  - X.509 certificates & GSI proxy certificates
- Condor configuration
  - Require authentication using the local filesystem or GSI:
      SEC_DEFAULT_AUTHENTICATION = REQUIRED
      SEC_DEFAULT_AUTHENTICATION_METHODS = GSI, FS
  - Point to the location of the certificate directory (authentication):
      GSI_DAEMON_DIRECTORY = /etc/grid-security
  - Point to the location of the gridmap file (authorisation):
      GRIDMAP = /etc/grid-security/grid-mapfile.condor

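Taken together, a minimal condor_config fragment using exactly the values above; note that once GSI is required, users typically need a valid proxy certificate (e.g. created with grid-proxy-init) before condor_submit will authenticate:

  # condor_config fragment -- every connection must authenticate via GSI or the local filesystem
  SEC_DEFAULT_AUTHENTICATION         = REQUIRED
  SEC_DEFAULT_AUTHENTICATION_METHODS = GSI, FS
  # Trusted CA certificates & host credentials (authentication)
  GSI_DAEMON_DIRECTORY = /etc/grid-security
  # Map authenticated DNs to local accounts (authorisation)
  GRIDMAP = /etc/grid-security/grid-mapfile.condor
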
Stage 3: Authorising Access
- List trusted masters (possibly all hosts?)
  - All entries on one line
  - UK CA requires the DN in two forms: Email & emailAddress

    GSI_DAEMON_NAME = /C=UK/O=eScience/OU=Southampton/L=SeSC/CN=polaris.ecs.soton.ac.uk/emailAddress=s.newhouse@omii.ac.uk, /C=UK/O=eScience/OU=Southampton/L=SeSC/CN=polaris.ecs.soton.ac.uk/Email=s.newhouse@omii.ac.uk, … OTHER HOST DNs

Stage 4: Controlling Access
- Gridmap file has the same layout as in GT
  - "DN" USER@DOMAIN

    "/C=UK/O=eScience/OU=Southampton/L=SeSC/CN=polaris.ecs.soton.ac.uk/emailAddress=s.newhouse@omii.ac.uk" host@polaris.ecs.soton.ac.uk
    "/C=UK/O=eScience/OU=Southampton/L=SeSC/CN=polaris.ecs.soton.ac.uk/Email=s.newhouse@omii.ac.uk" host@polaris.ecs.soton.ac.uk
    "/C=UK/O=eScience/OU=Imperial/L=LeSC/CN=steven newhouse" snewhouse@polaris.ecs.soton.ac.uk

Issues
- Good News
  - Authorised access to flocked Condor pools
  - Provides know-how for a UK-wide pool (NGS?)
- Bad News
  - Documentation (will provide feedback)
  - Implemented through a lot of trial & error
- Activity HowTo
  - http://wiki.nesc.ac.uk/read/sfct

Exploiting the distributed pool
- Simple build portal
  - Upload software, select resources, download binaries (a submit-file sketch follows below)

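Behind such a portal, each selected resource could be driven by an ordinary Condor submit file along these lines; the package name, build script and platform requirement are illustrative assumptions, not the portal's actual output:

  # build.sub -- hypothetical job generated by the build portal
  universe   = vanilla
  executable = build.sh                      # unpacks the uploaded source and runs the build
  arguments  = mypackage-1.0.tar.gz
  transfer_input_files    = mypackage-1.0.tar.gz
  should_transfer_files   = YES
  when_to_transfer_output = ON_EXIT          # binaries come back to the submit host
  requirements = (OpSys == "LINUX") && (Arch == "INTEL")
  output = build.out
  error  = build.err
  log    = build.log
  queue
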
Build Management
- We want to build a package
  - Take an existing source code package
- Package may require patching
  - Patch before building
- Building on multiple platforms
  - May have installed dependencies (e.g. compilers)
  - May have build dependencies (e.g. openssl)
  - Move source to the specified platform
  - Build, package and return binaries to the host
- Distributed, inter-dependent tasks

Use Condor to manage builds
- Leverage Condor’s established features:
  - Execution of a job on a remote resource
  - Persistent job execution
  - Matching of job requirements to resource capability
  - Management of dependent tasks – DAGMan (see the sketch below)
- Integrated into the NMI build system
  - Manages the builds of the NMI releases
  - Declare build parameters which are converted to Condor jobs

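To make the dependent-task handling concrete, here is a minimal, hypothetical DAGMan input file for a patch, per-platform build and package workflow like the one sketched on the previous slide; the node names and submit files are illustrative, not the NMI system's actual job descriptions:

  # builds.dag -- hypothetical multi-platform build workflow for DAGMan
  JOB  patch    patch.sub          # apply patches to the source package
  JOB  linux    build-linux.sub    # build on a SUSE/RHEL execution node
  JOB  winxp    build-winxp.sub    # build on a WinXP execution node
  JOB  package  package.sub        # collect the binaries & package them
  PARENT patch        CHILD linux winxp
  PARENT linux winxp  CHILD package

Submitting it with condor_submit_dag builds.dag leaves DAGMan to enforce the ordering and retry failed nodes.
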
The Future… NMI & OMII
- Building OMII software on the NMI system
  - Rolling changes back into the main software base
  - Integrating OMII builds into the NMI system
- Ongoing activity
  - Adding UK resources into the NMI pool
  - Distributed deployment & testing still to be resolved