Efficiency profiles

–Unfortunately, this problem is not yet fully under control
–Not enough information from monitoring that would allow us to
correlate poorly performing jobs (CPU/Wall time) with possible causes:
• version of AliRoot
• I/O patterns and source/destination
• Job owner
• Job type (analysis, reconstruction, user job)
–We need to tackle these issues and quickly react to problems
–More discussion during next Wednesday (Grid day)
P.Buncic @ALICE Offline board
Job efficiency
• Defined as the CPU/Wall time ratio of the (batch) process,
i.e. if your job runs for 1 hour of elapsed time and uses 30
minutes of CPU, the efficiency is 50% (see the sketch after this list)
• One of the most-watched metrics in the Grid world
Easy to calculate, understandable by all
• Just one of the metrics for judging the
‘usefulness’ of the processing
• We have to make sure the numbers are high
>85% for T0/T1s, >80% for T2s
• Otherwise, we enter into the ‘table of shame’
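To make the definition concrete: a minimal standalone C++ sketch (not part
of any ALICE tooling) that measures its own CPU/Wall efficiency the way a
batch system reports it:

    // Minimal sketch: CPU/Wall efficiency of the current process
    #include <chrono>
    #include <cstdio>
    #include <ctime>

    int main() {
        std::clock_t cpu0 = std::clock();              // CPU time used so far
        auto wall0 = std::chrono::steady_clock::now(); // wall-clock start

        // ... the actual payload (simulation, reconstruction, analysis) ...

        double cpuSec  = double(std::clock() - cpu0) / CLOCKS_PER_SEC;
        double wallSec = std::chrono::duration<double>(
                             std::chrono::steady_clock::now() - wall0).count();

        // e.g. 30 min of CPU over 60 min elapsed -> 50% efficiency
        std::printf("efficiency = %.1f%%\n", 100.0 * cpuSec / wallSec);
        return 0;
    }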
Efficiency profiles
• The following studies were done with information
already available in MonALISA
• Collected through the tools and job instrumentation
provided by Costin and Andrei
• All jobs are equipped with traffic and efficiency
factors; in the case of LEGO trains, this information is
aggregated per train (one plausible aggregation is sketched below)
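The slides do not spell out the aggregation formula; one plausible choice
(an assumption) is total CPU over total wall time of all jobs in the train,
i.e. the wall-time-weighted mean of the per-job efficiencies:

    // Hypothetical per-train aggregation (not the actual MonALISA code)
    #include <cstdio>
    #include <vector>

    struct Job { double cpuSec, wallSec; };

    double trainEfficiency(const std::vector<Job>& jobs) {
        double cpu = 0, wall = 0;
        for (const Job& j : jobs) { cpu += j.cpuSec; wall += j.wallSec; }
        return wall > 0 ? cpu / wall : 0;   // total CPU / total wall
    }

    int main() {
        // three sub-jobs of one hour each, with different CPU usage
        std::vector<Job> train = {{1800, 3600}, {3400, 3600}, {3500, 3600}};
        std::printf("train efficiency = %.1f%%\n",
                    100.0 * trainEfficiency(train));   // 80.6%
        return 0;
    }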
Summary of operations 2012/13
60% simulation
10% organized analysis
10% RAW data reco
20% individual user analysis
465 individual users
One week of site efficiency
User jobs in the system
Clearly visible ‘weekday / working hours’ effect pattern
Average efficiency = 84%
One week aliprod efficiency
• aliprod = MC + AOD filtering + QA
• No-brainer: the efficiency is high
• Contributing to the overall efficiency @60%
• Little to gain here: 100% efficiency is not
possible by definition
One week alidaq efficiency
• alidaq = RAW data processing + AOD filtering + QA
• Not too bad, contributing to the overall efficiency @10%
One week alitrain efficiency
• alitrain = LEGO trains
• Very respectable; efficiency is low (mostly)
when few jobs are running (tails of merging etc.),
which contributes little to the overall number
• Contributes to the
overall efficiency @10%
One week individual user efficiencies
The ‘carpet’, 180 users, average = 26%, contribution to total @20% level
User job profile
Efficiency gains
• The largest efficiency gain can be achieved through
improvements in individual user jobs
• Despite high complexity, the LEGO train efficiency is
already high
• Just moving ½ of the individual user jobs to
LEGO would result in an immediate ~7% increase in
overall efficiency (see the check below)!
• Then there is moving from ESDs to AODs
Always on the books, but uptake is still slow
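A back-of-the-envelope check of the ~7% figure; the 20% user share and the
26% user average are from the slides, while the train efficiency (~93%) is
an assumed value:

    // Sanity check of the "~7% overall gain" estimate
    #include <cstdio>

    int main() {
        double userShare = 0.20;  // user jobs: ~20% of total wall time
        double userEff   = 0.26;  // measured average user efficiency
        double trainEff  = 0.93;  // assumed LEGO train efficiency

        // move half of the user wall time (10% of the total) into trains
        double gain = (userShare / 2.0) * (trainEff - userEff);
        std::printf("overall gain ~ %.1f%%\n", 100.0 * gain);  // ~6.7%
        return 0;
    }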
… looking at the user jobs
Top 10…
Bottom 10… names removed to
protect the innocent
Top user job analysis
• Starting from the job table, IDs, completion rate… (also
quick access through the ‘My jobs’ tab)
Top user job analysis (2)
• At a job level
Top user job analysis (3)
• And the summary
The efficiency is high; nothing obvious at the site level
From the user: this is effectively an MC embedding task,
using ESDs
Another top user summary
Analysis on AOD with very high CPU component (see running time)
Analysis train code with ‘modifications’?
Low efficiency user job analysis
The efficiency does not show a clear site/number of files dependency
The input rate is not high
MC ESD analysis, same tag as the high-efficiency ‘top user’ case
Could this be a ‘kinematics access’ data issue?
This is not a ‘test job’; most of this user’s jobs over the last week are
similar: 649 master jobs, 4782 sub-jobs
Low efficiency user job analysis (2)
Job type same as previous slide, different period, different AN tag,
different user…
Medium efficiency user job analysis
Data intensive job…
PbPb AOD analysis, 100 GB of input is ~200K events
To achieve >95% efficiency, this type of job requires ~3 MB/sec/core
Clear efficiency dependency on the input rate
For an average 24-core node, this translates to 72 MB/sec/node (worked
numbers below)
Not reasonable to expect the centres to support such rates
This job clearly belongs in a train…
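The quoted rates can be reproduced as follows; the per-event CPU time is an
assumed value chosen to match the ~3 MB/sec/core, while the input size,
event count and core count are from the slide:

    // Required input rate for a >95% efficient data-intensive job
    #include <cstdio>

    int main() {
        double inputMB   = 100.0 * 1024;  // 100 GB of AOD input
        double events    = 200e3;         // ~200K events
        double mbPerEvt  = inputMB / events;  // ~0.5 MB/event
        double cpuPerEvt = 0.162;         // assumed CPU sec/event (not in slide)
        double targetEff = 0.95;

        // to keep the CPU >95% busy, data must arrive at least this fast:
        double mbPerSec = mbPerEvt / (cpuPerEvt / targetEff); // ~3 MB/sec/core
        int cores = 24;
        std::printf("per core: %.1f MB/sec, per %d-core node: %.0f MB/sec\n",
                    mbPerSec, cores, mbPerSec * cores);       // ~72 MB/sec
        return 0;
    }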
• TTree cache enabling clearly improved the overall
job efficiency (a usage sketch follows below)
• The LEGO train system is highly efficient on average,
independent of the type of analysis
Study not in this talk
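For reference, enabling the TTree cache in a plain ROOT macro looks roughly
like this (a generic ROOT usage sketch; the file URL, tree name and cache
size are examples, not the AliRoot defaults):

    // cacheExample.C: read a remote tree through the TTree cache
    void cacheExample()
    {
        TFile *f = TFile::Open("root://someserver//data/AliAOD.root");
        if (!f || f->IsZombie()) return;
        TTree *t = (TTree*)f->Get("aodTree");

        t->SetCacheSize(30 * 1024 * 1024);  // 30 MB read-ahead cache
        t->AddBranchToCache("*", kTRUE);    // cache all branches/sub-branches
        TTreeCache::SetLearnEntries(10);    // learn access pattern on 10 entries

        // reads now fetch few large, sorted blocks instead of many small
        // requests, which is what lifts CPU/Wall efficiency on remote storage
        for (Long64_t i = 0; i < t->GetEntries(); ++i) t->GetEntry(i);
    }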
• Users do a variety of things on the Grid; they are
very proficient in using the resources
• The job efficiencies vary a lot, depending on the task
• If we concentrate only on the low-efficiency jobs, the two
main causes appear to be
Analysis of MC ESDs
Tasks with very high I/O component
What can we do
Continue pushing more users toward analysis trains
• Cannot do this without the help of PWGs
Continue pushing for AOD-based analysis
• No effort is spared to have up-to-date AODs for all
Send alerts to low-efficiency users (similar to the ill-fated
‘high memory users’)
• These will be mostly ignored
Preventive measure: lower quotas until the situation improves
• This will be very unpopular
Push individual analysis elsewhere… ideas and capacities
Reduce the size of the containers
This will coincidentally help with our acute disk situation