Methodology
easy but important

ToC
1. What is performance evaluation about ?
2. Metrics, Load and Goals
3. Hidden Factors
4. The Scientific Method
5. Patterns

What is Performance Evaluation ?

Load
- You need to define the load under which your system operates
- Distinguish between
  - Intensity of the load (e.g. number of jobs per second)
  - Nature of the load
- Statistical details may matter: e.g. whether job sizes are heavy tailed or not
- Benchmarks are artificial load generators; we will play with one of them

Metric
- Define a metric; examples
  - Response time
  - Power consumption
  - Throughput
- Define the operational conditions under which the metric is measured (« Viewpoint », see Chapter 11)

Compare Windows vs Linux
[Figure slide: Syscall Benchmark]
[Figure slide: Memory Access Time]
[Figure slide: Ghostscript]

Metrics are often Multidimensional
- A and D are non dominated

Know your goals
- A1 and A3 are comparisons, A2 is an absolute statement
- E2 is an engineering rule

3. Hidden Factors

Factor: an element that may impact the performance
- desired factors: intensity of load, number of servers
- nuisance factors: time of the day, presence of a denial of service attack

[Figure slide: TCP Throughput Increases with Mobility]
[Figure slide: TCP Throughput Decreases with Mobility]

Why were we fooled ?
- The hidden factor had a more important role than the factor we were interested in
- We interpreted correlation as causality
- Need to be aware of all factors and incorporate them in the analysis
- Or randomize the experiment to reduce the impact of hidden factors

Simpson’s Paradox
- A well known phenomenon
- Special case of the hidden factor paradox when the metric is a success rate and the factors are discrete
- Berkeley sex bias case, 1973 (source: Wikipedia)

Take Home Message
- Pitfall number 1 is the presence of hidden factors
- Any study is susceptible to it
- Easy for opponents to find

4. Be Scientific

Joe measures performance of his Wireless Shop: what would you conclude ?

Scientific Method
Joe buys 2 more Access Points: improvement ?
[Figure panels: Before, After]

Scientific Method
- A conclusion can only be proven wrong
- Do not draw conclusions unless the experiment was designed to test the statement
- Measurement 1 suggested that the wireless network was congested, but the experiment was not designed to test this statement
- Joe should design an experiment to validate H1: “the wireless network is the bottleneck”
  - for example: measure the number of collisions / packet loss
  - result: collision ≤ 1%; conclusion: H1 is not valid
- Hypothesis H2: the server is saturated
  - experiment: measure memory utilization; result ≈ 100%

[Figure slide: Performance After Doubling Server Memory]

Example from Nitin Vaidya, Mobicom 2000 Tutorial, slides 298-299

Use of Scientific Method
- Recognize a fact: TCP throughput may increase with mobility
- Pose a hypothesis
  (1) The duration of the link failure period is impacted by speed
  (2) It has a negative impact on TCP throughput
- Verify the hypothesis on simulations / measurements designed to test it. How ?
  - Do more simulations
  - Measure the distribution of the link failure period
  - Verify (1) and (2)

Is ATM-ABR better than ATM-UBR ?

Take Home Message
- You should not conclude from an experiment without trying to invalidate the conclusion (Popper, 1934)
- You should alternate between the roles of
  - Proponent
  - Adversary

5. Patterns
- These are common traits found in different situations
- Knowing some of them may save a lot of time

Bottlenecks may be your enemy
- Bottlenecks are like uninvited guests at a party: they may impose their agenda
- Previous example: what we are measuring is the bottleneck, not the intended factor

Bottlenecks are Your Friends
- Simplify your life, analyze bottlenecks !
- In many cases, you may ignore the rest

Behind a Bottleneck May Hide Another Bottleneck

Congestion Collapse
- Definition: offered load increases, work done decreases
- Frequent in complex systems
- May be due to
  - Cost per job increasing with load
  - Impatience
  - Rejection of jobs before completion
- Designer must do something to avoid congestion collapse
  - E.g. admission control in Apache servers
  - E.g. TCP congestion control
- Analyst must look for congestion collapse

Sources use TCP (= fair scheduling). Increase capacity of link 5 to 100 kb/s; what happens ?

Competition Side Effect
- System balances resources according to some scheduling
- Apparent paradox: put more resources, some get less

No TCP, users send as much as they can. Increase capacity of link 2 from 10 to 1000 kb/s

Competition Side Effect
- Apparent paradox: put more resources, all get less

Museum Audio Guide
- Low speed USB connections at docking station
- High speed

Latent Congestion Collapse
- System is susceptible to congestion collapse
- Low speed access prevents congestion collapse
- Adding resources reveals congestion collapse

Take Home Message
- Watch for patterns, they are very frequent
  - Bottlenecks
  - Congestion collapse
  - Competition side effects
  - Latent congestion collapse

Now it’s your turn…
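
A possible starting point for experimenting with the congestion collapse pattern (offered load increases, work done decreases) is the toy model sketched below. It is not taken from the slides. It assumes one of the causes listed earlier, a cost per job that increases with load, and models it with a hypothetical linear overhead. The names `goodput`, `base_capacity` and `overhead`, and the parameter values, are illustrative assumptions only.

```python
def goodput(offered_load, base_capacity=100.0, overhead=0.02):
    """Work done per second in a toy model (an assumption, not from the slides).

    The cost of serving one job grows linearly with the offered load, so the
    effective capacity shrinks as the load grows:
        effective_capacity = base_capacity / (1 + overhead * offered_load)
    """
    effective_capacity = base_capacity / (1.0 + overhead * offered_load)
    # The system cannot complete more jobs than are offered,
    # nor more than its effective capacity allows.
    return min(offered_load, effective_capacity)


def sweep(loads):
    """Return (offered load, work done) pairs for a list of offered loads."""
    return [(load, goodput(load)) for load in loads]


if __name__ == "__main__":
    for load, done in sweep([10, 25, 50, 100, 200, 400]):
        print(f"offered load = {load:4d} jobs/s -> work done = {done:6.1f} jobs/s")
```

With these assumed parameters, work done follows the offered load up to 50 jobs/s and decreases beyond that point, which is the signature an analyst should look for. Capping the accepted load at the peak, as a form of admission control, would keep this toy system out of the collapse region.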