Modelling composed behaviour

PDF and CDF for a normally distributed variable (mu=0, sigma=1)
[Figure: curve Normal(0, 1). x axis: Value; y axes: Density at value (PDF), Probability that variable takes values less than value (CDF).]

A normally distributed variable and its 11-piece approximation (error 0.004148)
[Figure: curves PUNormal(0, 1) and Normal(0, 1). x axis: Value; y axes: Density at value, Probability that variable takes values less than value.]

PDF and CDF of a bimodal distribution
[Figure: curve Bimodal. x axis: Value; y axes: Density at value, Probability that variable takes values less than value.]

PDF and CDF of a less simple distribution
[Figure: curve Cutoff. x axis: Value; y axes: Density at value, Probability that variable takes values less than value.]

Distribution of Google response times / Distribution of Google ping times
[Figure: two histograms. x axis: Response time (s); y axes: Number of responses, Number of pings.]

Two Normal variables and their sum
[Figure: curves Normal(4, 2.5), Normal(21, 1) and PUNormal(4, 2.5)+PUNormal(21, 1). x axis: Value; y axes: Density at value, Probability that variable takes values less than value.]

Two Normal variables and their maximum
[Figure: curves Normal(8, 2.5), Normal(9, 1) and Maximum of PUNormal(8, 2.5), PUNormal(9, 1). x axis: Value; y axes: Density at value, Probability that variable takes values less than value.]

Definition of time

    my $time = Agrajag::Property->new("time", "seconds",       # Names
        "Agrajag::Values::Distributions::PiecewiseUniform",    # Base type
        series         => "add",   # Numerical behaviour in series
        parallel_all   => "max",   # Numerical behaviour in parallel-all
        parallel_first => "min",   # Numerical behaviour in parallel-first
    );

Definition of cost

    my $cost = Agrajag::Property->new("cost", "pence",         # Names
        "Agrajag::Values::Distributions::PiecewiseUniform",    # Base type
        series         => "add",   # Numerical behaviour in series
        parallel_all   => "add",   # Numerical behaviour in parallel-all
        parallel_first => "add",   # Numerical behaviour in parallel-first
    );

Definition of failure rate

    my $failed = Agrajag::Property->new("failed", "per",
        "Agrajag::Values::Distributions::PiecewiseUniform",
        series         => "complement_multiply",
        parallel_all   => "complement_multiply",
        parallel_first => "multiply",
    );

Effect on various metrics of combining redundant services
[Figure: scatter plot of failed (log scale) against cost. Series: If first fails, try second; Parallel, grab first result; Original population.]

Effect on various metrics of combining redundant services
[Figure: scatter plot of failed (log scale) against time. Series: If first fails, try second; Parallel, grab first result; Original population.]

Effect on various metrics of combining redundant services
[Figure: scatter plot of time against cost. Series: If first fails, try second; Parallel, grab first result; Original population.]

Lowest and highest costs

    Process "merge(0.963420737013613 * series(sample 5, nothing), 0.0365792629863866 * series(sample 5, sample 12))":
        cost: 0.912921336687748+/-0.168953806141398 pence
        failed: 0.00141813147151815+/-0.037631374870547 per
        time: 11.0147572693062+/-6.77101498117215 seconds
    (conditional)

    Process "parallel_first(sample 7, sample 3)":
        cost: 2.3 pence
        failed: 5.82485889377704e-06+/-0.00241346741531679 per
        time: 24.2532671224301+/-13.405833396648 seconds
    (parallel)

Lowest and highest values for cost
[Figure: curves Lowest and Highest. x axis: Value; y axes: Density at value, Probability that variable takes values less than value.]

Lowest and highest times

    Process "parallel_first(sample 22, sample 6)":
        cost: 2.08 pence
        failed: 4.46342962590052e-12+/-2.11268303961588e-06 per
        time: 5.1738590825544+/-1.22048099490178 seconds
    (parallel)

    Process "merge(0.999515637891236 * series(sample 3, nothing), 0.000484362108764458 * series(sample 3, sample 11))":
        cost: 1.19045045676115+/-0.0204626947535409 pence
        failed: 1.33872660588175e-05+/-0.00365883681515382 per
        time: 109.097014736137+/-63.7990881680071 seconds
    (conditional)

Lowest and highest values for time
[Figure: curves Lowest and Highest. x axis: Value; y axes: Density at value, Probability that variable takes values less than value.]

Lowest and highest failure rates

    Process "parallel_first(sample 22, sample 16)":
        cost: 2.07 pence
        failed: 0 per
        time: 5.25660344792234+/-1.27088093492597 seconds
    (parallel)

    Process "parallel_first(sample 13, sample 12)":
        cost: 1.9 pence
        failed: 0.00219570724011997+/-0.0468069023738557 per
        time: 17.4037570609171+/-6.25604265053334 seconds
    (parallel)

Effect on various metrics of combining redundant services (better services cost more)
[Figure: scatter plot of failed (log scale) against cost. Series: If first fails, try second; Parallel, grab first result; Original population.]

Effect on various metrics of combining redundant services (better services cost more)
[Figure: scatter plot of failed (log scale) against time. Series: If first fails, try second; Parallel, grab first result; Original population.]

Effect on various metrics of combining redundant services (better services cost more)
[Figure: scatter plot of time against cost. Series: If first fails, try second; Parallel, grab first result; Original population.]

Lowest and highest costs

    Process "merge(0.972361037709955 * series(sample 0, nothing), 0.027638
        cost: 0.904875066061041+/-0.147542504192504 pence
        failed: 0.000765704113656396+/-0.0276607630203276 per
        time: 98.5800379789004+/-56.9295426092033 seconds
    (conditional)

    Process "parallel_first(sample 24, sample 23)":
        cost: 2.3 pence
        failed: 0 per
        time: 8.34086497044595+/-2.08869234868699 seconds
    (parallel)

Lowest and highest times

    Process "parallel_first(sample 22, sample 19)":
        cost: 2.19 pence
        failed: 4.46342962590052e-12+/-2.11268303961588e-06 per
        time: 5.1738590825544+/-1.22048099490178 seconds
    (parallel)

    Process "merge(0.999515637891236 * series(sample 13, nothing), 0.00048
        cost: 1.06042623865571+/-0.0193625498743187 pence
        failed: 1.33872660588175e-05+/-0.00365883681515382 per
        time: 109.097014736137+/-63.7990881680071 seconds
    (conditional)

Lowest and highest failure rates

    Process "parallel_first(sample 24, sample 23)":
        cost: 2.3 pence
        failed: 0 per
        time: 8.34086497044595+/-2.08869234868699 seconds
    (parallel)

    Process "parallel_first(sample 4, sample 3)":
        cost: 1.97 pence
        failed: 0.00219570724011997+/-0.0468069023738557 per
        time: 17.4037570609171+/-6.25604265053334 seconds
    (parallel)

Our experience with GT3

GT3 “One Point Oh”
• 3α, 3β, 3.0, 3.0.2β(?), 3.0.2, not 3.2 (released end 2003).
• Used service data, notification, dynamic invocation, registry, lifecycle management, service proxying.
• Moved to pure web services in 2004:
  – More stable.
  – More widely used (problem solving easier).
  – Better documented.
  – FUD WRT grid vs WS models (subsequently → WSRF).
• So lots of “one point oh” issues during those twelve months.
• Anachronism, except possibly relevant to new release of GT4?

GT3 as product
• Transparency: what’s really working, what’s getting priority now?
• Stability: interfaces changed substantially across every release, so constant catch-up...
• Documentation and examples: too basic, no help for common problems (assumes everything works).
• Comments and code: obfuscated.
• Testing: test-first philosophy would have improved release quality and provided code examples.

GT3 as development environment
• Open source good: if not for fixing, at least for understanding and work-arounds.
• “Google debugging” depends on a good developer community: tick.
• GT3/Axis borderline hazy: difficult to tell where your problem is.
• Errors usually meaningless.
• Debug logs too difficult to obtain.
• GT3 container a mistake: no dynamic redeployment.

GT3 technologies and choices
• Why GAR? .deb or .rpm solve the same problem with much greater maturity.
• MDS and alternatives: more vapourware than FUD; COF (confidence-optimism-faith)?
• Potentially hugely useful range of interfaces, but with a performance hit.
• GWSDL: necessary but made it difficult to support both Grid and WS in the same application.
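
Looking back at the three property definitions (time, cost, failed) from the modelling slides: the behaviour names they use (add, max, min, multiply, complement_multiply) can be illustrated on plain numbers. The following is only a minimal sketch under stated assumptions. It does not use the Agrajag module or its API, it works on point values rather than PiecewiseUniform distributions, the helper names %combine, %rules and compose are hypothetical, and reading complement_multiply as 1 - (1 - x)(1 - y) is an assumption inferred from the name.

```perl
use strict;
use warnings;
use List::Util qw(max min sum product);

# Hypothetical scalar stand-ins for the named behaviours. The slides apply
# these to PiecewiseUniform distributions; here they act on plain numbers.
my %combine = (
    add      => sub { sum(@_) },
    max      => sub { max(@_) },
    min      => sub { min(@_) },
    multiply => sub { product(@_) },
    # Assumed meaning: probability that at least one component fails.
    complement_multiply => sub { 1 - product(map { 1 - $_ } @_) },
);

# Behaviour table copied from the three property definitions in the slides.
my %rules = (
    time   => { series => "add", parallel_all => "max", parallel_first => "min" },
    cost   => { series => "add", parallel_all => "add", parallel_first => "add" },
    failed => {
        series         => "complement_multiply",
        parallel_all   => "complement_multiply",
        parallel_first => "multiply",
    },
);

# Combine two or more services, each given as a hash of property => value,
# under one composition mode (series, parallel_all or parallel_first).
sub compose {
    my ($mode, @services) = @_;
    my %result;
    for my $prop (sort keys %rules) {
        my $fn = $combine{ $rules{$prop}{$mode} };
        $result{$prop} = $fn->(map { $_->{$prop} } @services);
    }
    return \%result;
}

# Made-up example values, not taken from the slides.
my $svc_x = { time => 10, cost => 1.0, failed => 0.01 };
my $svc_y = { time => 20, cost => 1.3, failed => 0.02 };

my $first = compose("parallel_first", $svc_x, $svc_y);
printf "parallel_first: time=%g cost=%g failed=%g\n",
    @{$first}{qw(time cost failed)};
```

Under these assumptions, parallel_first pays for both services and takes the faster result, and fails only if both fail, which matches the pattern in the process listings: parallel compositions have the highest costs but the lowest times and failure rates.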