System Performance & Scalability
i206 Fall 2010
John Chuang
[Image: Yahoo's Cyber Monday meltdown. Source: http://bits.blogs.nytimes.com/2007/11/26/yahoos-cybermonday-meltdown/index.html]

Computing Trends
- Multi-core CPUs
- Data centers
- Cloud computing
What are the drivers?
- scalability, availability, cost-effectiveness
[Diagram: clients and servers cooperating to deliver a service]

Lecture Outline
- Performance Metrics
- Availability
- Queuing theory
  - M/M/1 queue
- Scalability
  - M/M/m queue

What is Performance?
- Users want fast response time and high availability
- Managers want happy users, and many of them, while minimizing cost
- What are standard measures of system performance?

Performance Metrics
- Response time (seconds)
- Throughput (MIPS, Mbps, TPS, ...)
- Resource utilization (%)
- Availability (%)

Availability
Availability = MTTF / (MTTF + MTTR)
- Mean-time-to-failure (MTTF)
- Mean-time-to-recover (MTTR)

Availability   Down-time per year   One hour of down-time per:
90%            36 days              9 hours
99%            3.7 days             4.1 days
99.9%          9 hours              41.6 days
99.99%         53 minutes           1.14 years
99.999%        5 minutes            11.41 years

Response Time
Components of end-to-end response time:
- Client: formulate request
- Network: message latency
- Server: queuing time, then processing time
- Network: message latency
- Client: interpret response
Adapted from: David Messerschmitt
[Figure: M/M/1 queue (μ = 100), response time (s) vs. utilization]

Queuing Theory
A queuing system is characterized by:
1. Arrival process
2. Service time distribution
3. Number of servers
4. System capacity
5. Customer population
6. Service discipline
Source: Raj Jain

Kendall's Notation (1953)
A/B/c/k/N/D
- A: arrival process
- B: service time distribution
- c: number of servers
- k: system capacity
- N: population size
- D: service discipline
Common arrival/service distributions:
- M: Markov (exponential, memoryless, random, Poisson)
- D: deterministic
- E: Erlang
- H: hyper-exponential
- G: general
Common service disciplines:
- FCFS: first come, first served
- LCFS: last come, first served
- RR: round-robin
- etc.

Example Systems
M/M/1/∞/∞/FCFS (simplified as M/M/1)
- Markovian (Poisson, memoryless) arrival
- Markovian service time
- 1 server
- Infinite system capacity
- Infinite customer population
- First-come-first-served discipline
Other examples:
- M/M/1/k (finite capacity)
- M/M/m (m servers)
- G/D/1 (arbitrary arrival, deterministic service time)

M/M/1 Queue
- Poisson arrival, with average arrival rate of λ jobs/sec
- Exponential (memoryless) service times, with average service rate of μ jobs/sec
- Single server with infinite queue
- System utilization (hopefully < 1): ρ = λ/μ
- Average number of jobs in system: N = Σ n·pn = ρ/(1 − ρ)
- System throughput (if ρ < 1): X = λ
- Average response time (from Little's Law): R = N/X = 1/(μ − λ)

Example: Web Server
- Web server receives 40 requests/second
- Web server can process 100 requests/second
- What is the server utilization?
- At any given time, how many requests are at the server (waiting plus being processed)?
- What is the mean total delay at the server (waiting plus processing)?
- What happens when the traffic rate doubles?
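The M/M/1 formulas above are simple to evaluate directly. The following is a minimal sketch (Python; not part of the original slides, and the helper name mm1_metrics is illustrative) that can be used to check the worked answers on the next slides.

```python
def mm1_metrics(lam, mu):
    """M/M/1 queue: return (utilization, mean jobs in system, mean response time).

    lam = average arrival rate (jobs/sec), mu = average service rate (jobs/sec).
    """
    rho = lam / mu                      # utilization rho = lambda / mu
    if rho >= 1:
        raise ValueError("unstable system: arrival rate >= service rate")
    N = rho / (1 - rho)                 # mean number of jobs in system
    R = 1 / (mu - lam)                  # mean response time; equals N / X with X = lambda
    return rho, N, R

# Web server example: service rate of 100 requests/sec
print(mm1_metrics(40, 100))   # ~(0.40, 0.67, 0.017): about 17 ms response time
print(mm1_metrics(80, 100))   # ~(0.80, 4.00, 0.050): 50 ms when traffic doubles
print(mm1_metrics(99, 100))   # ~(0.99, 99.0, 1.000): 1 second near congestion
```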
Example: Web Server
- λ = 40 requests/second
- μ = 100 requests/second
- Utilization = ρ = λ/μ = 40/100 = 40%
- # of requests = N = ρ/(1 − ρ) = 0.67
- Average time spent at server = R = N/X = 0.67/40 = 17 ms

Example: Traffic Doubled
- λ = 80 requests/second
- μ = 100 requests/second
- Utilization = ρ = λ/μ = 80/100 = 80%
- # of requests = N = ρ/(1 − ρ) = 4
- Average time spent at server = R = N/X = 4/80 = 50 ms (more than doubled!)

Approaching Congestion
- λ = 99 requests/second
- μ = 100 requests/second
- Utilization = ρ = λ/μ = 99/100 = 99%
- # of requests = N = ρ/(1 − ρ) = 99
- Average time spent at server = R = N/X = 99/99 = 1 second!

Utilization Affects Performance
[Figure: M/M/1 queue (μ = 100), response time (s) vs. utilization; response time rises steeply as utilization approaches 1]

M/M/1/k Queue (Finite Capacity)
- ρ = λ/μ
- N = ρ/(1 − ρ) − (k+1)·ρ^(k+1)/(1 − ρ^(k+1))
- R = N/X = N/λeff
  - where λeff = λ(1 − Pk) = effective arrival rate
  - and Pk = ρ^k·(1 − ρ)/(1 − ρ^(k+1)) = probability of a full queue
- Loss rate = λ − λeff

M/M/1/k Response Time
[Figure: M/M/1 and M/M/1/k queues (μ = 100), response time (s) vs. utilization, for M/M/1, M/M/1/1, M/M/1/2, M/M/1/10, and M/M/1/100]

M/M/1/k Throughput
[Figure: throughput (jobs/sec) vs. utilization given service rate μ = 100 jobs/sec, for M/M/1, M/M/1/1, M/M/1/2, M/M/1/10, and M/M/1/100]

Lecture Outline
- Performance Metrics
- Availability
- Queuing theory
  - M/M/1 queue
- Scalability
  - M/M/m queue

Scalability
The capability of a system to increase total throughput under an increased load when resources (typically hardware) are added. Key considerations:
- Cost of the additional resources
- Performance degradation under increased load

Scalability Example
- Original web server: can process μ requests/sec; accepts requests at λ/sec
- Now the request rate increases to 10λ/sec and the web server is swamped (ρ = 10λ/μ)!
- Need to add new hardware!

Which is Better?
- Option 1: One big web server that can process 10μ requests/sec
- Option 2: Ten web servers, each able to process μ requests/sec; each accepts 10% of the requests (λ/sec per server)
- Option 3: Ten web servers, each able to process μ requests/sec; all share a single queue (load balancer) that accepts requests at 10λ/sec

[Diagram of the three options:
- Option 1: one M/M/1 queue with a big server (arrival rate 10λ, service rate 10μ)
- Option 2: ten independent M/M/1 queues (each with arrival rate λ, service rate μ)
- Option 3: one M/M/10 queue (arrival rate 10λ, ten servers each with service rate μ)]

M/M/m Queue (m Servers)
- ρ = λ/(m·μ)
- N = m·ρ + ρ·f/(1 − ρ)
  - where f is the probability that an arriving job finds all m servers busy and must wait (the Erlang C formula):
    f = [(mρ)^m / (m!·(1 − ρ))] · p0
  - and p0 = [ Σ(n = 0 .. m−1) (mρ)^n/n! + (mρ)^m/(m!·(1 − ρ)) ]^(−1)

Which is Better?
m = 10 servers; μ = 100; λ = 50

                        Option 1 (M/M/1 big)   Option 2 (ten M/M/1)   Option 3 (M/M/10)
Utilization (ρ)         0.5                    0.5                    0.5
Number of requests (N)  1                      1 × 10                 5.036
Response time (R)       2 ms                   20 ms                  10.07 ms

Remember: Scalability is not just about performance!
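To close the loop, here is a minimal sketch (Python; not part of the original slides, with illustrative function names, and assuming the Erlang C form of f given above) that reproduces the comparison table: the M/M/1 formulas cover Options 1 and 2, and the M/M/m formulas cover Option 3.

```python
import math

def mm1_metrics(lam, mu):
    """M/M/1: (utilization, mean jobs in system, mean response time)."""
    rho = lam / mu
    return rho, rho / (1 - rho), 1 / (mu - lam)

def mmm_metrics(lam, mu, m):
    """M/M/m: (utilization, mean jobs in system, mean response time)."""
    rho = lam / (m * mu)                  # utilization across the m servers
    a = lam / mu                          # offered load (= m * rho)
    # f = Erlang C: probability that an arriving job finds all m servers busy
    wait_term = a**m / (math.factorial(m) * (1 - rho))
    p0 = 1 / (sum(a**n / math.factorial(n) for n in range(m)) + wait_term)
    f = wait_term * p0
    N = m * rho + rho * f / (1 - rho)     # mean number of jobs in system
    return rho, N, N / lam                # R = N / X by Little's Law

lam, mu, m = 50.0, 100.0, 10              # per-server rates and server count

print("Option 1, one big M/M/1:    ", mm1_metrics(m * lam, m * mu))
print("Option 2, each of ten M/M/1:", mm1_metrics(lam, mu))   # total N is 10x this
print("Option 3, M/M/10:           ", mmm_metrics(m * lam, mu, m))
```

Running this gives ρ = 0.5 for all three options, N ≈ 1, 1 per server (10 in total), and 5.04, and R = 2 ms, 20 ms, and 10.07 ms, matching the table above.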