Lecture 2c: Benchmarks

Benchmarking
• A benchmark is a program that is run on a computer to measure its performance and to compare it with other machines.
• The best benchmark is the users' workload – the mixture of programs and operating-system commands that users actually run on a machine – but measuring it is not practical.
• Standard benchmarks are used instead.

Types of Benchmarks
• Synthetic benchmarks
• Toy benchmarks
• Microbenchmarks
• Program kernels
• Real applications

Synthetic Benchmarks
Artificially created benchmark programs that represent the average frequency of operations (instruction mix) of a large set of programs.
• Whetstone benchmark
• Dhrystone benchmark
• Rhealstone benchmark

Whetstone Benchmark
• First written in Algol 60 in 1972; Fortran, C/C++, and Java versions are available today.
• Represents the workload of numerical applications.
• Measures floating-point arithmetic performance.
• Unit: millions of Whetstone instructions per second (MWIPS).
• Shortcomings:
  • Does not represent constructs of modern languages, such as pointers.
  • Does not consider cache effects.

Dhrystone Benchmark
• First written in Ada in 1984; a C version is available today.
• Represents the workload of system software: its statistics were collected from operating systems, compilers, editors, and a few numerical programs.
• Measures integer and string performance; contains no floating-point operations.
• Unit: number of program iterations completed per second.
• Shortcomings:
  • Does not represent real-life programs.
  • Compiler optimization overstates system performance.
  • The code is small and may fit entirely in the instruction cache.

Rhealstone Benchmark
• Targets multi-tasking real-time systems.
• Factors:
  • Task switching time
  • Pre-emption time
  • Interrupt latency time
  • Semaphore shuffling time
  • Dead-lock breaking time
  • Datagram throughput time
• Metric: Rhealstones per second, a weighted sum of the inverses of the six measured times:

      Rhealstones per second = Σ_{i=1..6} w_i · (1 / t_i)

Toy Benchmarks
10–100 lines of code whose result is known before the toy program is run.
• Quick sort
• Sieve of Eratosthenes – finds prime numbers
  http://upload.wikimedia.org/wikipedia/commons/8/8c/New_Animation_Sieve_of_Eratosthenes.gif

      func sieve( var N )
          var PrimeArray as array of size N
          initialize PrimeArray to all true
          for i from 2 to N
              for each multiple j of i, from 2*i to N
                  set PrimeArray( j ) = false

Microbenchmarks
Small, specially designed programs used to test one specific function of a system (e.g. floating-point execution, the I/O subsystem, the processor–memory interface).
• Provide values for important parameters of a system.
• Characterize the maximum performance when overall performance is limited by that single component.

Kernels
Key pieces of code taken from real applications.
• LINPACK and BLAS
• Livermore Loops
• NAS Parallel Benchmarks

LINPACK and BLAS Libraries
• LINPACK – linear algebra package
  • Measures floating-point computing power.
  • Solves a system of linear equations Ax = b with Gaussian elimination.
  • Metric: MFLOP/s.
  • DAXPY is its most time-consuming routine.
  • Used as the measure for the TOP500 list.
• BLAS – Basic Linear Algebra Subprograms
  • LINPACK makes use of the BLAS library.
• SAXPY – Scalar Alpha X Plus Y
  • Y = aX + Y, where X and Y are vectors and a is a scalar.
  • SAXPY operates in single precision, DAXPY in double precision.
  • Generic implementation:

        for (int i = m; i < n; i++) {
            y[i] = a * x[i] + y[i];
        }

Livermore Loops
• Developed at LLNL.
• Originally in Fortran, now also in C.
• 24 numerical application kernels, such as: hydrodynamics fragment; incomplete Cholesky conjugate gradient; inner product; banded linear systems solution; tridiagonal linear systems solution; general linear recurrence equations; first sum; first difference; 2-D particle in a cell; 1-D particle in a cell; Monte Carlo search; location of a
first array minimum; etc.
• Metrics: arithmetic, geometric, and harmonic means of the CPU rate.

NAS Parallel Benchmarks
• Developed at the NASA Advanced Supercomputing division.
• "Paper-and-pencil" benchmarks: specified on paper rather than as fixed code.
• 11 benchmarks, such as: discrete Poisson equation; conjugate gradient; fast Fourier transform; bucket sort; embarrassingly parallel; nonlinear PDE solution; data traffic; etc.

Real Applications
Programs that are run by many users.
• C compiler
• Text-processing software
• Frequently used user applications
• Modified scripts used to measure particular aspects of system performance, such as interactive behavior or multiuser behavior

Benchmark Suites
• Desktop benchmarks: SPEC benchmark suite
• Server benchmarks: SPEC benchmark suite, TPC
• Embedded benchmarks: EEMBC

SPEC Benchmark Suite
Desktop benchmarks:
• CPU-intensive: SPEC CPU2000
  • 12 integer (CINT2000) and 14 floating-point (CFP2000) benchmarks.
  • Real application programs: C compiler, finite element modeling, fluid dynamics, etc.
• Graphics-intensive:
  • SPECviewperf – measures rendering performance using OpenGL.
  • SPECapc – application benchmarks:
    • Pro/Engineer – 3D rendering with solid models
    • SolidWorks – 3D CAD/CAM design tool; CPU-intensive and I/O-intensive tests
    • Unigraphics – solid modeling of an aircraft design
Server benchmarks:
• SPECweb – for web servers
• SPECSFS – for NFS performance; throughput-oriented

TPC Benchmark Suite
Server benchmarks: transaction-processing (TP) benchmarks built from real applications.
• TPC-C: simulates a complex on-line transaction-processing (order-entry) environment
• TPC-H: ad hoc decision support
• TPC-R: business decision support system in which users run a standard set of queries
• TPC-W: business-oriented transactional web server
• Measures performance in transactions per second; throughput is reported only when the response-time limit is met.
• Allows cost-performance comparisons.

EEMBC
Benchmarks for embedded computing systems: 34 benchmarks from 5 application classes.
• Automotive/industrial
• Consumer
• Networking
• Office automation
• Telecommunications

Benchmarking Strategies
• Fixed-computation benchmarks
• Fixed-time benchmarks
• Variable-computation and variable-time benchmarks

Fixed-Computation Benchmarks
• W: fixed workload (number of instructions, number of floating-point operations, etc.)
• T: measured execution time
• Speed: R = W / T
• Comparing two systems on the same workload gives the speedup:

      Speedup = R1 / R2 = (W / T1) / (W / T2) = T2 / T1

• Amdahl's Law applies: if a fraction f of a fixed workload is sped up by a factor s, the overall speedup is 1 / ((1 − f) + f / s).

Fixed-Time Benchmarks
• On a faster system, a larger workload can be processed in the same amount of time.
• T: fixed execution time
• W: workload
• Speed: R = W / T
• Comparing two systems under the same time budget gives the sizeup:

      Sizeup = R1 / R2 = (W1 / T) / (W2 / T) = W1 / W2

• This is the basis of scaled speedup (Gustafson's Law).

Variable-Computation and Variable-Time Benchmarks
• In this type of benchmark, the quality of the solution is improved.
• Q: quality of the solution
• T: execution time
• Metric: quality improvements per second, Q / T.