Environment from the Molecular Level
A NERC eScience testbed project

The UCL Condor Pool Experience
John Brodholt (1), Paul Wilson (3), Wolfgang Emmerich (2) and Clovis Chapman (2)
1. Department of Earth Sciences, University College London, Gower Street, London WC1E 6BT, UK
2. Department of Computer Science, University College London, Gower Street, London WC1E 6BT, UK
3. Anvil Software, London, UK

The UCL Condor Pool
Approximately 946 Windows machines (yesterday)
1 to 2.4 GHz Intel processors
256 to 512 MBytes memory (a few have more)
They are in “open access” student cluster rooms
PCs are all thin client “WTS” machines with network bootable operating systems (Citrix/Bpbatch: hit spacebar to upload a new operating system image)
The pool is very simple: one manager, one submit machine (via ssh)

Q. Why was it someone from an Earth Science Dept. who got it going?
1. Because three years ago, the eScience grants call made me look up “the Grid” on the web, and by chance I came across the Condor web site.
2. I also happened to know how Information Systems at UCL managed their student PCs.
3. I persuaded the Director of UCL’s Education and Information Systems Division (EISD) that I could put it in our eMinerals grant (I think he assumed it wouldn’t get funded).

Key Political Issues
Even though the Director of EISD had agreed for us to put it in the grant, we had to convince Information Systems (IS) themselves.
Numerous meetings ….
IS produced a five page document outlining what they thought their policy on a large Condor cluster would be, i.e. the primary purpose of the student cluster rooms must not be compromised, nor should IS staff use their time on the project … etc.
Needed testing (one cluster, then one image type).
Perhaps the key moment was when UCL presented its eScience projects to Tony Hey and the UCL Provost.

Timescale
Desktop: June 2002 (2 nodes)
Earth Science Student Cluster Room: Oct 2002 (18 nodes)
Physics Department (one WTS image): Jan 2003 (150 nodes)
Campus: October 16th 2003 (930 nodes)
1 millionth hour of CPU: April 2nd, 2004
This matched exactly the timescale we outlined in the eMinerals grant.

Other Issues
Difficult to persuade the scientists to get involved for just a few machines.
Some needed to compile their codes for Windows machines. “It’s simple, just convert them to Java ..” Wolfgang Emmerich, 2002!
Our central manager died a few times when a user submitted a few thousand jobs all at the same time (it took 24 hours to repair the disk with fsck). We now have a manager and a separate submit machine.
Students will do anything to reserve a machine: steal the mouse, put out of order signs on them, and UNPLUG them.
IS themselves also briefly turn off machines in some clusters in order to clear the room. This restricts the length of jobs.

UCL Condor job time fluctuations
[Figure: bar chart of average job time (hours, axis 0 to 18) by month, Oct 2003 to Apr 2004. Dashed line shows the 5 hr recommended maximum job time. Bar labels surviving in the extraction: 15.26, 13.28, 9.73, 6.73, 4.93, 2.23, 2.34; their assignment to months is not recoverable.]

Spikes in user demand:
a) Not many users.
b) Most are using simple schemes to produce lots of initial input files and send them off to the pool. They get results back and then spend a long time processing them, extracting data and planning the next set of inputs.

Drip feeding and interactive steering of a Condor pool using relational databases
Dewi Lewis, Rosie Coates and Sam French, UCL Chemistry / RI
[Diagram: workflow built on existing e-science technology (portal, distributed computing, distributed resources such as Condor pools). Components: User Input (structural model: Si/Al, cation types, [H2O] etc.); Model/Configuration Generator; Jobs Database; Steering Database; Improve generation / model strategy; Analysis (geometry, energy, fit); Analysis Database; User Input (diffraction data, chemical analysis, building units: Si/Al, cation types, [H2O] etc.).]

THE Science
1. Simulation of pollutants in the environment: binding of heavy metals and organic molecules in soils.
2. Studies of materials for long-term nuclear waste encapsulation: radioactive waste leaching through ceramic storage media.
3. Studies of weathering and scaling: mineral/water interface simulations, e.g. oil well scaling.
also
4. The Earth’s core and mantle
Many codes: DL-POLY, GULP, METADISE, CRYSTAL, CASTEP, SIESTA, …

Now what?
Expand pool to include staff WTS machines, ~ 1500 machines (received a 3 page email from IS: who owns them?).
UCL Staff machines at hospitals, ~ ???? machines.
Federate with other pools: hopefully make it more flexible and smooth spikes in demand.
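
Note on drip feeding: the idea on the "Drip feeding and interactive steering" slide is to hold jobs in a relational database and release them to the pool a few at a time, rather than submitting thousands at once (the situation that killed the central manager). The sketch below is only an illustration of that idea, not the scheme Lewis, Coates and French built. The table layout, the function names (make_jobs_db, drip_feed, mark_done) and the submit hook are all assumptions, and SQLite stands in for whatever database was actually used.

```python
import sqlite3


def make_jobs_db(n_jobs):
    """Create an in-memory jobs table with n_jobs pending entries (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, input TEXT, status TEXT)"
    )
    db.executemany(
        "INSERT INTO jobs (input, status) VALUES (?, 'pending')",
        [(f"config_{i:04d}.in",) for i in range(n_jobs)],
    )
    db.commit()
    return db


def drip_feed(db, max_in_flight, submit):
    """Release pending jobs until at most max_in_flight are marked 'submitted'.

    `submit` is a hypothetical hook called with each input file name; in a
    real setup it would wrap whatever step hands the job to the pool.
    Returns the ids of the jobs released in this call.
    """
    (in_flight,) = db.execute(
        "SELECT COUNT(*) FROM jobs WHERE status = 'submitted'"
    ).fetchone()
    free_slots = max(0, max_in_flight - in_flight)
    rows = db.execute(
        "SELECT id, input FROM jobs WHERE status = 'pending' "
        "ORDER BY id LIMIT ?",
        (free_slots,),
    ).fetchall()
    for job_id, input_name in rows:
        submit(input_name)
        db.execute(
            "UPDATE jobs SET status = 'submitted' WHERE id = ?", (job_id,)
        )
    db.commit()
    return [job_id for job_id, _ in rows]


def mark_done(db, job_ids):
    """Record finished jobs so their slots can be reused by drip_feed."""
    db.executemany(
        "UPDATE jobs SET status = 'done' WHERE id = ?",
        [(job_id,) for job_id in job_ids],
    )
    db.commit()
```

Calling drip_feed periodically keeps the number of jobs in the queue bounded. A steering step could also edit or delete rows that are still 'pending' before they are released, which is one way to read "interactive steering" here.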