Parallel Reconstruction of CLEO III Data
Gregory J. Sharp
Christopher D. Jones
Wilson Synchrotron Laboratory
Cornell University
Outline
• Overview of CLEO reconstruction environment
• The problems with the old reconstruction system
• The solution: finer-grained parallelism
• The benefits
CLEO III Reconstruction Environment
• Uses a farm of more than 130 Sun Netras
• Sun Grid Engine™ manages CPU allocation
• Data is read from and written to Objectivity/DB™
• Events must be written to DB in event-number order
• Reconstruction rate has to equal average DAQ rate
Former Reconstruction System
• Output was written directly to the offline database
• ~130 runs could be processed in parallel on the farm
• Each run was processed in its entirety by a single CPU
• A single run took up to 9 days to reconstruct on a single CPU
• All failures required operator and/or DBA intervention
Problems
• Need to maximize CPU utilization
• Load balancing between farms is difficult
• Takes a long time to stop the farm safely
• Output of the first few runs must be checked
• Debugging reconstruction code
More Problems
• Low I/O rates to the database
• Many locks held for long periods
• Large window for failures to occur
• Failure leaves database in an invalid state
• No automation of failure detection and recovery
The Solution
• Split each run into roughly equal-sized chunks
• Assign each chunk to a CPU
• Save sub-job output in intermediate binary files in
event-number order
• Once all sub-jobs complete, collate binary files into
database in event-number order
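The split-and-collate idea above can be sketched as follows. This is a minimal illustration in Python, not the CLEO III implementation (which the slides say is written in Perl). It assumes chunks are defined by event count and cover contiguous event-number ranges; the names `split_run` and `collate` are hypothetical, and plain lists of event numbers stand in for the intermediate binary files.

```python
from typing import Iterator, List, Tuple


def split_run(first_event: int, last_event: int, n_chunks: int) -> List[Tuple[int, int]]:
    """Split an inclusive event range into contiguous, near-equal chunks.

    Assumption: chunk size is measured in events; the real system may
    balance chunks by data volume instead.
    """
    total = last_event - first_event + 1
    n_chunks = max(1, min(n_chunks, total))  # never create empty chunks
    base, extra = divmod(total, n_chunks)
    chunks = []
    start = first_event
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first chunks
        chunks.append((start, start + size - 1))
        start += size
    return chunks


def collate(chunk_outputs: List[List[int]]) -> Iterator[int]:
    """Yield events from all sub-job outputs in global event-number order.

    Each sub-job output is already sorted and the chunks do not overlap,
    so ordering the outputs by their first event and concatenating them
    is enough; no full sort of the events is needed.
    """
    non_empty = [events for events in chunk_outputs if events]
    for events in sorted(non_empty, key=lambda events: events[0]):
        yield from events
```

Because every chunk is internally ordered and chunks are disjoint, collation reduces to a sequential pass over the intermediate outputs, which is what lets the database be written in event-number order by a single job.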
The Job Manager
• The JM submits all the reconstruction sub-jobs and
monitors their progress, retrying failures
• Once all reconstruction completes successfully the JM
starts the collation sub-job
• Once collation completes successfully the JM starts
the merge histogram sub-job
• Can be restarted at any time if it dies
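The control flow described above can be sketched as a small state machine. This is a hedged Python sketch, not the actual Job Manager: the callables `run_subjob`, `collate`, and `merge_histograms` and the `max_retries` limit are hypothetical placeholders for whatever the real JM submits through Sun Grid Engine, and persistence of state for restarting a dead JM is not modeled.

```python
from typing import Callable, Dict, Iterable, List


def run_job_manager(
    chunk_ids: Iterable[int],
    run_subjob: Callable[[int], bool],      # hypothetical: True when the chunk reconstructs successfully
    collate: Callable[[], bool],            # hypothetical: True when collation into the database succeeds
    merge_histograms: Callable[[], bool],   # hypothetical: True when the histogram merge succeeds
    max_retries: int = 3,                   # assumed retry limit, not taken from the slides
) -> bool:
    """Run all reconstruction sub-jobs, then collation, then the histogram merge."""
    pending: List[int] = list(chunk_ids)
    attempts: Dict[int, int] = {chunk: 0 for chunk in pending}

    # Stage 1: reconstruct every chunk, retrying failed chunks.
    while pending:
        still_pending = []
        for chunk in pending:
            attempts[chunk] += 1
            if run_subjob(chunk):
                continue
            if attempts[chunk] >= max_retries:
                return False  # give up; operator attention is needed
            still_pending.append(chunk)
        pending = still_pending

    # Stage 2: collation starts only after all reconstruction has succeeded.
    if not collate():
        return False

    # Stage 3: the histogram merge starts only after collation has succeeded.
    return merge_histograms()
```

The key property is the strict staging: nothing touches the database until every sub-job has finished, so a failed sub-job can be retried without leaving the database in an invalid state.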
Structure Diagram
Automation
• JM restarts sub-jobs that suffer transient failures
• Runs may be submitted automatically when SGE queue
is (almost) empty
• A cron job generates status web pages
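The automatic submission rule, "submit when the queue is (almost) empty", can be illustrated with a small decision function. This is a hypothetical Python sketch: the function name, the `low_water_mark` threshold, and the `batch_size` are assumptions for illustration, and the count of queued jobs would in practice come from the batch system rather than being passed in by hand.

```python
from typing import List


def select_runs_to_submit(
    queued_jobs: int,
    waiting_runs: List[int],
    low_water_mark: int = 5,   # assumed meaning of "(almost) empty"
    batch_size: int = 2,       # assumed number of runs submitted per check
) -> List[int]:
    """Return the run numbers to submit now, lowest run number first."""
    if queued_jobs > low_water_mark:
        return []  # the farm still has enough queued work
    return sorted(waiting_runs)[:batch_size]
```

Submitting only when the queue is nearly drained keeps the farm busy while leaving few queued jobs to cancel if the farm has to be stopped.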
Implementation Details
• Written in Perl
• Uses Sun Grid Engine to submit and track jobs
• Uses CLEO III software infrastructure for
reconstruction and population
• Uses PAW for merging histograms
Benefits
• Less operator intervention/management
• Faster debugging
• Increased CPU utilization, which offsets extra CPU use
• 20% faster completion of reconstruction
• Just-in-time pre-staging of data from HSM file system
• The January ice storm
Future Steps
• Automate staging of data to cache disks
• Automate posting of staged-run information to
Reconstruction
Conclusions
• Multiple file formats made this possible
• Substantial productivity gains
• Higher utilization of computing resources
• For more details:
http://www.lepp.cornell.edu/~gregor/projects/parallelpass2