Parallel Reconstruction of CLEO III Data
Gregory J. Sharp
Christopher D. Jones
Wilson Synchrotron Laboratory
Cornell University
• Overview of CLEO reconstruction environment
• The problems with the old reconstruction system
• The solution - finer-grained parallelism
• The benefits
Reconstruction Environment
• Uses a farm of more than 130 Sun Netras
• Sun Grid Engine ™ manages CPU allocation
• Data is read from and written to Objectivity/DB™
• Events must be written to the database in event-number order
• The reconstruction rate must keep pace with the average DAQ rate
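The keep-pace requirement above is simple steady-state arithmetic: with N CPUs each reconstructing events at rate r, the farm keeps up only if N·r is at least the average DAQ rate. A minimal sketch (all rates below are made-up illustration values, not numbers from the talk):

```python
def farm_keeps_up(n_cpus, events_per_cpu_per_s, daq_rate_per_s):
    """The farm keeps pace with data taking iff its aggregate
    reconstruction rate is at least the average DAQ rate."""
    return n_cpus * events_per_cpu_per_s >= daq_rate_per_s

# 130 CPUs at 1 event/s each comfortably cover a 100 event/s DAQ rate.
print(farm_keeps_up(130, 1.0, 100.0))  # → True
```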
Former Reconstruction System
• Output was written directly to the offline database
• ~130 runs may be processed in parallel on the farm
• Each run is processed in its entirety by a single CPU
• Up to 9 days to reconstruct a single run on a single CPU
• All failures required operator and/or DBA intervention
• Need to maximize CPU utilization
• Load balancing between farms is difficult
• Takes a long time to stop the farm safely, e.g. to check the output of the first few runs or to debug reconstruction code
More Problems
• Low I/O rates to the database
• Many locks held for long periods
• Large window for failures to occur
• Failure leaves database in an invalid state
• No automation of failure detection and recovery
The Solution
• Split each run into roughly equal-sized chunks
• Assign each chunk to a CPU as a sub-job
• Save each sub-job's output in intermediate binary files, in event-number order
• Once all sub-jobs complete, collate the binary files into the database in event-number order
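The chunk-and-collate scheme above can be sketched as follows. This is a toy model with made-up event records; the real system stores binary CLEO events in intermediate files and populates Objectivity/DB:

```python
import heapq

def split_into_chunks(event_numbers, n_chunks):
    """Split a run's events into roughly equal-sized, contiguous chunks."""
    size = -(-len(event_numbers) // n_chunks)  # ceiling division
    return [event_numbers[i:i + size] for i in range(0, len(event_numbers), size)]

def reconstruct(chunk):
    """Stand-in for a reconstruction sub-job: emits (event_number, result)
    pairs in event-number order, as the intermediate binary files do."""
    return [(n, f"reco-{n}") for n in sorted(chunk)]

def collate(sub_job_outputs):
    """Merge the per-chunk outputs into one event-number-ordered stream,
    as the collation sub-job does before populating the database."""
    return list(heapq.merge(*sub_job_outputs))

run = list(range(12))
outputs = [reconstruct(c) for c in split_into_chunks(run, 4)]
assert [n for n, _ in collate(outputs)] == sorted(run)
```

Because each sub-job already writes its file in event-number order, the collation step is an ordered merge rather than a full sort.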
The Job Manager
• The JM submits all the reconstruction sub-jobs and monitors their progress, retrying failures
• Once all reconstruction sub-jobs complete successfully, the JM starts the collation sub-job
• Once collation completes successfully, the JM starts the histogram-merging sub-job
• The JM itself can be restarted at any time if it dies
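The JM's control flow described above can be sketched as below. The function names and retry limit are assumptions for illustration; the real JM is written in Perl and submits through Sun Grid Engine:

```python
def run_job_manager(chunks, submit, max_retries=3):
    """Submit one reconstruction sub-job per chunk, retrying transient
    failures; only once every sub-job has succeeded, run collation and
    then histogram merging. `submit(stage, payload)` is a stand-in for
    SGE submission and returns True on success."""
    pending = {i: 0 for i in range(len(chunks))}      # chunk index -> retry count
    while pending:
        for i in list(pending):
            if submit("reconstruct", chunks[i]):
                del pending[i]                        # sub-job succeeded
            else:
                pending[i] += 1                       # transient failure: retry
                if pending[i] > max_retries:
                    raise RuntimeError(f"chunk {i} failed permanently")
    if not submit("collate", chunks):
        raise RuntimeError("collation failed")
    if not submit("merge_histograms", chunks):
        raise RuntimeError("histogram merge failed")
    return "run complete"
```

The key ordering property is that collation never starts until every reconstruction sub-job has succeeded, so the database population always sees a complete run.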
Structure Diagram
• The JM restarts sub-jobs that suffer transient failures
• Runs may be submitted automatically when the SGE queue is (almost) empty
• A cron job generates status web pages
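The automatic-submission policy can be sketched as follows; the queue-depth threshold, batch size, and function name are assumptions for illustration, not details from the talk:

```python
def runs_to_submit(queued_jobs, waiting_runs, low_water_mark=5, batch=3):
    """When the SGE queue is (almost) empty -- fewer than `low_water_mark`
    jobs queued -- pick the next few waiting runs to submit, oldest first."""
    if queued_jobs >= low_water_mark:
        return []                      # queue still busy: submit nothing
    return waiting_runs[:batch]        # top up with the next runs in line

# With only 2 jobs queued (below the threshold), the next 3 runs are chosen.
print(runs_to_submit(2, [101, 102, 103, 104]))  # → [101, 102, 103]
```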
Implementation Details
• Written in Perl
• Uses Sun Grid Engine to submit and track jobs
• Uses CLEO III software infrastructure for reconstruction and population
• Uses PAW for merging histograms
Benefits
• Less operator intervention and management
• Faster debugging
• Increased CPU utilization, which offsets the extra CPU use
• 20% faster completion of reconstruction
• Just-in-time pre-staging of data from the HSM file system
• The January ice storm
Future Steps
• Automate staging of data to cache disks
• Automate posting of staged runs info to
Conclusions
• Multiple file formats made this possible
• Substantial productivity gains
• Higher utilization of computing resources
• For more details: