Fast, but Approximate, Workflow-Runtime Estimation Using the Bell-Curve Calculus
Alan Bundy
Joint work with Lin Yang, Conrad Hughes and Dave Berry
University of Edinburgh
31 May 2016

Overview
The Bell Curve Calculus (BCC).
Application to quality of service.
– Estimating runtimes of e-Science workflows.
Evaluation of accuracy and efficiency.
– Compared to piecewise estimation (Agrajag).
– Faster, but less accurate.

Problem
Scientists need to estimate quality of service properties of workflows.
– E.g. runtime, accuracy, reliability.
Not just a number, but range and likelihood,
– e.g. a probability density function (PDF).
However, propagating PDFs within large workflows is computationally expensive.

Key Idea
BCC extends arithmetic operations to normal distributions (aka bell curves).
– Analogous to Interval Arithmetic.
A bell curve can be fully described with two parameters.
– Mean μ and Standard Deviation σ.
Assume output is also a bell curve.
– Calculate its μ and σ from the input ones.
– How bad an approximation is this?
– How much does it speed up calculations?

Bell Curve
$BC(\mu, \sigma) = \lambda x.\ \dfrac{e^{-(x-\mu)^2 / (2\sigma^2)}}{\sigma\,(2\pi)^{0.5}}$

Service Combinators
Sequential (S1 then S2): runtime S1 + S2.
Parallel All (S1 and S2 together): runtime Max(S1, S2).
Parallel First (S1 and S2 together): runtime Min(S1, S2).
Disjunctional: runtime Cond of S1 and S2 (diagram labels: S1, Succeed, Fail, S2).

Formal Definition of Problem
Input Curves: BC(μ1, σ1) and BC(μ2, σ2).
Perfect Curve: Fc(BC(μ1, σ1), BC(μ2, σ2)).
– where c is Seq, PA, PF or D.
– Not a bell curve in general.
– Use Agrajag to estimate by piecewise approximation.
Best Bell Curve: BC(μ(Fc(…)), σ(Fc(…))).
– Bell curve with the μ and σ of the perfect curve.
BCC Estimate: BC(Mc(μ1, σ1, μ2, σ2), Σc(μ1, σ1, μ2, σ2)).
– For each value of c.
– Is always a bell curve.
– Approximating Best Bell Curve, but without piecewise estimation.

Sum of Two Bell Curves
BCC Estimate = Perfect Curve.

Methodology
Generate perfect curves and, hence, best bell curves.
Inspect best bell curves and guess Mc and Σc.
Plot errors of Mc and Σc and curve fit.
Use error functions to improve Mc and Σc.
Repeat until accuracy acceptable.
Resulting definitions of Mc and Σc are very messy.

Max of Two Bell Curves
Worst case when input curves have similar means.

Accuracy of BCC on Workflow
BCC estimate close to best bell curve, but not perfect curve.

BCC Efficiency on Workflow Family
Both linear, but with very different slopes.

Conclusion
BCC shows both range and likelihood of QoS properties.
– Tested for workflow runtimes.
Extends arithmetic to bell curves.
– Only approximate.
Less accurate but much more efficient than piecewise estimation.
– Good for quick, rough estimate.
Extend to other QoS properties.
Incorporate into workflow construction tool.
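
Illustrative sketch: bell-curve arithmetic
The Sequential, Parallel All and Parallel First combinators can be sketched in a few lines of Python. This is a minimal sketch and not the BCC definitions from the talk: the fitted functions, which the slides describe as very messy, are not reproduced here. Instead, it assumes the two input runtimes are independent and swaps in the textbook closed-form moments of the maximum of two normals (Clark's formulas), which give the mean and standard deviation of the perfect curve directly. The names BellCurve, seq, par_all and par_first are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class BellCurve:
    """A normal distribution described by mean and standard deviation."""
    mu: float
    sigma: float


def _pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def seq(a: BellCurve, b: BellCurve) -> BellCurve:
    """Sequential: runtimes add. Exact when the inputs are independent normals."""
    return BellCurve(a.mu + b.mu, math.hypot(a.sigma, b.sigma))


def par_all(a: BellCurve, b: BellCurve) -> BellCurve:
    """Parallel All: bell curve with the mean and standard deviation of max(a, b).

    Assumes independence. The true distribution of the max is not normal,
    so the result is only an approximation of the perfect curve.
    """
    theta = math.hypot(a.sigma, b.sigma)
    if theta == 0.0:  # both inputs are constants
        return BellCurve(max(a.mu, b.mu), 0.0)
    alpha = (a.mu - b.mu) / theta
    # First and second raw moments of max(a, b).
    m1 = a.mu * _cdf(alpha) + b.mu * _cdf(-alpha) + theta * _pdf(alpha)
    m2 = ((a.mu ** 2 + a.sigma ** 2) * _cdf(alpha)
          + (b.mu ** 2 + b.sigma ** 2) * _cdf(-alpha)
          + (a.mu + b.mu) * theta * _pdf(alpha))
    return BellCurve(m1, math.sqrt(max(m2 - m1 * m1, 0.0)))


def par_first(a: BellCurve, b: BellCurve) -> BellCurve:
    """Parallel First: uses min(a, b) = -max(-a, -b)."""
    m = par_all(BellCurve(-a.mu, a.sigma), BellCurve(-b.mu, b.sigma))
    return BellCurve(-m.mu, m.sigma)


# Hypothetical workflow: two services in parallel, then a third in sequence.
workflow = seq(par_all(BellCurve(10.0, 2.0), BellCurve(10.0, 2.0)),
               BellCurve(5.0, 1.0))
print(workflow)
```

Each combinator takes two (mean, standard deviation) pairs and returns one, so estimating a whole workflow is a single bottom-up pass over its structure. The Disjunctional combinator is left out because it would need a failure probability that the slides do not give.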
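
Illustrative sketch: finding a best bell curve by sampling
The first Methodology step, generating a perfect curve and reading off its best bell curve, can be imitated with plain random sampling. This is only a stand-in: the talk uses Agrajag's piecewise approximation, whose interface is not shown in the slides, so nothing below reflects it. The function name best_bell_curve, the sample count and the seed are all assumptions, and the inputs are assumed independent.

```python
import math
import random
from statistics import fmean


def best_bell_curve(combine, bc1, bc2, n=100_000, seed=0):
    """Estimate the (mean, standard deviation) of combine(X1, X2) by sampling.

    bc1 and bc2 are (mu, sigma) pairs for independent normal inputs;
    combine is the runtime operation, e.g. max, min or addition.
    """
    rng = random.Random(seed)
    samples = [combine(rng.gauss(*bc1), rng.gauss(*bc2)) for _ in range(n)]
    mean = fmean(samples)
    variance = fmean([(s - mean) ** 2 for s in samples])
    return mean, math.sqrt(variance)


# Parallel All on two similar inputs, the case the slides call the worst.
print(best_bell_curve(max, (10.0, 2.0), (10.0, 2.0)))
# Sequential, where the sum of two bell curves is again a bell curve.
print(best_bell_curve(lambda x, y: x + y, (10.0, 2.0), (5.0, 1.0)))
```

Sampling like this costs many evaluations per combinator, which gives a feel for why a calculus that works only on two numbers per curve is attractive for large workflows.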