Monte Carlo Simulation
Fall 2013
By Yaohang Li, Ph.D.
Department of Computer Science
Old Dominion University
yaohang@cs.odu.edu

Administrivia
• Class Web Page
  – http://www.cs.odu.edu/~yaohang/cs714814
  – Syllabus
• Class Policy
  – Class Notes
    • Posted before class
    • Read class notes before class
  – Assignments
    • Posted after class
    • Pay attention to the due dates
• Blackboard
  – Posting grades
  – Sending out emails to class

Administrivia
• Instructional E-Mail Addresses
  – yaohang@cs.odu.edu
• Instructor: Yaohang Li
  – Office phone: 757-683-6001x5085
  – Office location: 3212 E&CS
  – Office hours:
    • T, T: 1:00PM-3:00PM
    • by appointment

Administrivia
• Grading Policy
  – (6~8) Assignments 80%
    • Programming Assignment
    • Or Theoretical Proof
    • Late Assignment Policy
      – 0~24 hrs: -5%
      – 24~48 hrs: -10%
      – >48 hrs: grade = 0
  – (1) Presentation/Term Paper: 20%

Administrivia
• Textbook
  – J. M. Hammersley and D. C. Handscomb. Monte Carlo Methods. Methuen, London, 1964.
  – Handouts

Honor Code
• All assignments, unless explicitly specified, are to be completed on your own
• ODU Honor Council
  – http://orgs.odu.edu/hc/
• Evidence of cheating, plagiarism, or unauthorized collaboration will result in a 0 grade for quiz/assignment/exam
  – May have further consequences

How to get help?
• Ask questions in class (or after class)
• Attend office hours
• Email me
  – Make sure that you put “CS714 or CS814” in your subject line
  – Send it from your .odu account
    • That way it will not end up in my spam folder
  – State clearly what you need in your email

How to Get an A in this Class
• Attendance
  – Attend class regularly and on time
  – Ask questions
• Notes
  – Read over class notes before class
  – Review class notes after class
• Homework
  – Get started as early as possible
  – Contact me if you encounter problems

Topics
• Introduction (today)
  – Computational Science
  – High Performance Computing
  – Monte Carlo Methods
• High Performance Computing
  – Parallel and Distributed Systems
  – MPI and OpenMP
  – PBS
• Computer Simulation Methods
  – Continuous Simulation
  – Discrete Simulation
• Nature of Monte Carlo Methods
• Review of Relevant Elementary Probability and Statistics

Topics (Cont.)
• Direct Simulation
• Error Estimation
  – Truncation Error and Statistical Error
• Monte Carlo Integration
  – Crude Monte Carlo Integration
  – Variance Reduction Techniques
• Random Number Generation
  – Tests of Randomness
  – Sampling from Non-uniform Distributions
• Quasi-Monte Carlo
  – Difference from Monte Carlo
  – Quasirandom Number Generation

Topics (Cont.)
• Markov Chain Monte Carlo
  – Optimization Problem
  – Metropolis Method
  – Simulated Annealing
  – Simulated Tempering
• Monte Carlo and Linear Algebra
• Green’s Function Monte Carlo
  – Random Walk
• Quantum Monte Carlo

Topics (Cont.)
• Monte Carlo Applications in Science and Engineering
  – Computational Physics
  – Computational Materials Science
  – Nuclear Simulation
  – Computational Biology
  – Computational Finance
  – Visualization

What you will learn
• Basic Concepts in Computational Science
• Usage of High Performance Systems
• Simulation Methods
• Principles of Monte Carlo Methods
• Monte Carlo Applications to Science and Engineering
• Interdisciplinary Research Applications

About Me
• I got my Master’s and Ph.D. degrees from Florida State University.
• I did my postdoc at Oak Ridge National Laboratory under the University of Tennessee
• I taught for 7 years at North Carolina A&T State University
• My research
  – Computational Biology
  – High Performance Computing
• How about you?
  – Name/Year/Major
  – Something interesting about yourself
  – Expectations for this class

What is Computational Science?
• Definition: A common characteristic of the field is that problems…
  – Have a precise mathematical model.
  – Are intractable by traditional methods.
  – Are highly visible.
  – Require in-depth knowledge of some field in science, engineering, or the arts.
• Computational science is neither computer science, mathematics, some traditional field of science, engineering, a social science, nor a humanities field. It is a blend.

Computational Science Community
• Computational science projects are always multidisciplinary.
  – Applied math, computer science, and…
  – One or more science or engineering fields are involved.
• Computer science’s role tends to be
  – A means of getting the low level work done efficiently.
  – Development and implementation of algorithms.
  – Similar to mathematics in solving problems in engineering.
  – Oh, yuck… a service role if the computer science contributors are not careful.
  – Provides tools for data manipulation, visualization, and networking.
• Mathematics’ role is in providing analysis of (new?) numerical algorithms to solve the problems, even if it is done by computer scientists.

Computational Scientist Requirements
• Good understanding of an applied discipline.
• High Performance Computing Architecture
• Development of Numerical Algorithms
• Analysis of the Numerical Algorithms
• Parallelization
• Visualization

How Significant are Algorithm and Computer Improvements?
• There is a race to see if computers can be sped up through new technologies faster than new algorithms can be developed.
  – Computers have doubled in speed every 18 months over many decades.
  – Some algorithms cause quantum leaps in productivity:
    • FFT reduced solve time from O(N^2) to O(N log N).
    • Multigrid reduced solve times from O(N^(3/2)) to O(N), which is optimal.
• Monte Carlo is used when no known reasonable algorithm is available.

High-Performance Computing Architectures
• Parallel supercomputers
  – Multiple processors per node with shared memory on the node (a node is a motherboard with memory and processors on it).
  – Very fast electrical network between nodes with direct memory access and communications processors just for moving data.
  – SGI Origin 3000, IBM SP4, SUN Enterprise, HP Superdome.
• Cluster of PCs
  – Take many of your favorite computers and connect them with a fast Ethernet running at 100-1000 Mbps.
  – Usually runs Linux, IRIX, Tru64 UNIX, HP-UX, AIX-L, Solaris, or Windows NT with MPI and/or PVM.
  – Intel, Alpha, or SPARC processors. Intel is most common in clusters of cheap micros.
• Desktop HPC
  – GPGPU
    • Large number of streaming processors
    • Memory hierarchy

Peak Speeds of Selected Computers
Machine                       Mflops (per processor)
CDC 6600                      3
Cray 1                        160
Cray C90                      1000
Compaq Alpha 21264            1000
Cray T90                      2200
Intel Pentium 4 (2200MHz)     2200
AMD Athlon XP 1900+           3200
HP PA 8700                    3200
Intel Itanium                 3400
Sony PlayStation 2            6300

Top Supercomputers
www.top500.org

Network Speeds
Transmission time is in seconds unless another unit is given.
Name          Speed (bps)   24 bit screen   Bible    Encyclopedia Britannica
T3            45M           0.5             1.2      60
Cable modem   30M/10M/2M    3               6        270
T1            1.544M        15              36       30 min
56 kbps       53,000        7 min           1 hour   2 days

Computer Simulation
• Simulation
  – A computer simulation or a computer model is a computer program which attempts to simulate an abstract model of a particular system.
  – A useful part of modeling many natural systems in physics, chemistry, and biology; human systems in economics and social science; and the process of engineering new technology, to gain insight into the operation of those systems.
  – “Simulation is a lie to tell the truth”

Theoretical Math and Experimental Math
• Theoretical Math
  – Deduce conclusions from postulates
• Experimental Math
  – Infer conclusions from observations
• Induction and deduction make the difference

Earliest Experimental Math
• Old Testament, 2 Chronicles 4:2: “Also he made a molten sea of ten cubits from brim to brim, round in compass, and five cubits the height thereof; and a line of thirty cubits did compass it round about.”
  – A circumference of 30 cubits around a diameter of 10 cubits implies PI ≈ 3.

Introduction to Monte Carlo Method
• A Monte Carlo method
  – Provides approximate solutions to a variety of mathematical problems by performing statistical sampling experiments on a computer.
  – In brief, methods that consume random numbers
  – Can be applied to
    • Stochastic problems
    • Deterministic problems
• Characteristics of Monte Carlo methods
  – A natural fit for high-performance computing systems
  – Monte Carlo methods are stochastic techniques.
  – The Monte Carlo method is very general.
  – We can find MC methods used in many areas, from economics to nuclear physics to regulating the flow of traffic.

Top 10 Algorithms of the Last Century
http://www.siam.org/pdf/news/637.pdf

Numerical Errors
• Round-off error
  – Error in representing a number in computer operations
• Truncation error
  – Error from approximating an infinite sum with a finite sum
• Statistical error (unique to statistics-based simulations)
  – Amount by which an observation deviates from its expected value

Monte Carlo Example: Estimating π
If you are a very poor dart player, it is easy to imagine throwing darts randomly at the above figure (a square containing a shaded circle quadrant), and it should be apparent that of the total number of darts that hit within the square, the number of darts that hit the shaded part (circle quadrant) is proportional to the area of that part.
In other words,
  (# darts hitting shaded area) / (# darts hitting inside square) = (area of shaded area) / (area of square)
If you remember your geometry, it's easy to show that, for a circle quadrant of radius r inside a square of side r,
  (# darts hitting shaded area) / (# darts hitting inside square) = (π r^2 / 4) / r^2 = π / 4
so π is approximately 4 × hits / (total number of darts). For each dart (x, y):
  x = (random#)
  y = (random#)
  distance = sqrt(x^2 + y^2)
  if distance <= 1.0 then let hits = hits + 1

Monte Carlo Applications
• Nuclear reactor design
• Quantum chromodynamics
• Radiation cancer therapy
• Traffic flow
• Stellar evolution
• Econometrics
• Dow-Jones forecasting
• Oil well exploration
• VLSI design
• Protein structure prediction
• Materials Science

Monte Carlo Methods in Grand Challenges
• Grand Challenges
  – Fundamental problems in science and engineering with potentially broad social, political, economic, and scientific impact that can be advanced by applying high performance computing resources.
• Example of Monte Carlo Methods in Grand Challenges
  – CHAMMP: http://www.epm.ornl.gov/chammp/chammpions.html
    • Oak Ridge and Argonne National Labs and NCAR collaborated to improve NCAR’s Community Climate Model (CCM2).
    • A sample visualization of a computer run (figure not reproduced)

How did Monte Carlo simulation get its name?
• Named for Monte Carlo, Monaco
  – where the primary attractions are casinos containing games of chance. Games of chance such as roulette wheels, dice, and slot machines exhibit random behavior.
• The random behavior in games of chance is similar to how Monte Carlo simulation selects variable values at random to simulate a model.
• The name and the systematic development of Monte Carlo methods date from about the 1940s.
• There are, however, a number of isolated and undeveloped instances on much earlier occasions.

History of Monte Carlo Method
• In the second half of the nineteenth century a number of people performed experiments in which they estimated the value of PI.
• In 1899 Lord Rayleigh showed that a one-dimensional random walk without absorbing barriers could provide an approximate solution to a parabolic differential equation.
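The nineteenth-century experiments for estimating PI are easy to repeat on a computer. As a minimal sketch (not the course's reference code), the dart-throwing pseudocode from the earlier example might be written in Python as follows. The function name `estimate_pi_darts` and the parameters `n_darts` and `seed` are hypothetical names chosen for illustration, and the sketch assumes a unit square with a quarter circle of radius 1.

```python
import math
import random


def estimate_pi_darts(n_darts, seed=None):
    """Estimate pi by throwing n_darts random darts at the unit square.

    Hypothetical helper for illustration: a dart (x, y) is a hit when it
    lands inside the quarter circle of radius 1 centered at the origin.
    """
    rng = random.Random(seed)  # private generator, reproducible when seeded
    hits = 0
    for _ in range(n_darts):
        x = rng.random()  # uniform in [0, 1)
        y = rng.random()  # uniform in [0, 1)
        # Comparing squared distance avoids an unnecessary sqrt call.
        if x * x + y * y <= 1.0:
            hits += 1
    # hits / n_darts approximates pi / 4, the area of the quarter circle.
    return 4.0 * hits / n_darts


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        estimate = estimate_pi_darts(n, seed=2013)
        print(f"N = {n:>7}: estimate = {estimate:.5f}, "
              f"abs. error = {abs(estimate - math.pi):.5f}")
```

Running the script with increasing N gives a feel for the statistical error: the estimate fluctuates around PI, and the fluctuations shrink slowly (roughly like 1/sqrt(N)) as more darts are thrown.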
History of Monte Carlo method
• In the early part of the twentieth century, British statistical schools indulged in a fair amount of unsophisticated Monte Carlo work.
• In 1908 Student (W.S. Gosset) used experimental sampling to help him towards his discovery of the distribution of the correlation coefficient.
• In the same year Student also used sampling to bolster his faith in his so-called t-distribution, which he had derived by a somewhat shaky and incomplete theoretical analysis.

Student - William Sealy Gosset (13.6.1876 - 16.10.1937)
“This birth-and-death process is suffering from labor pains; it will be the death of me yet.” (Student Sayings)

A. N. Kolmogorov (12.4.1903-20.10.1987)
In 1931 Kolmogorov showed the relationship between Markov stochastic processes and certain integro-differential equations.

History (cont.)
• The real use of Monte Carlo methods as a research tool stems from work on the atomic bomb during the Second World War.
• This work involved a direct simulation of the probabilistic problems concerned with random neutron diffusion in fissile material; but even at an early stage of these investigations, von Neumann and Ulam refined this direct simulation with variance-reducing techniques, in particular the "Russian roulette" and "splitting" methods. However, the systematic development of these ideas had to await the work of Harris and Herman Kahn in 1948.
• About 1948 Fermi, Metropolis, and Ulam obtained Monte Carlo estimates for the eigenvalues of the Schrödinger equation.

John von Neumann (28.12.1903-8.2.1957)

History (cont.)
• In about 1970, the newly developing theory of computational complexity began to provide a more precise and persuasive rationale for employing the Monte Carlo method.
• Karp (1985) shows this property for estimating reliability in a planar multiterminal network with randomly failing edges.
• Dyer (1989) establishes it for estimating the volume of a convex body in M-dimensional Euclidean space.
• Broder (1986) and Jerrum and Sinclair (1988) establish the property for estimating the permanent of a matrix or, equivalently, the number of perfect matchings in a bipartite graph.

Buffon’s Needle
• The earliest analytical Monte Carlo method for estimating the value of PI
• Experiment Environment
  – A tabletop with a number of infinite parallel lines drawn on it
  – Equally spaced (spacing is 1 inch)
  – A needle (also 1 inch)
• Experiment
  – Drop the needle on the table; either
    • the needle crosses one of the lines, or
    • the needle crosses no lines
  – Total number of drops (N)
  – Total number of drops in which the needle crosses a line (C)
• Result
  – 2N/C approaches PI as N grows

Summary
• Introduction to Monte Carlo Methods
  – Definition
  – Examples
  – History
  – Monte Carlo Applications
• Introduction to Computational Science
• High Performance Computing
• Computer Simulation

Learning the Monte Carlo method won’t help you win $$$ in a casino.

What I want you to do
• Enjoy your new semester
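
Buffon’s Needle: a simulation sketch
The Buffon’s needle experiment from the slide above can also be tried on a computer. The following is a minimal sketch under stated assumptions, not a definitive implementation: the needle length and the line spacing are both 1, as on the slide; a drop is described by the distance from the needle's center to the nearest line and by the needle's angle to the lines; and the names `buffon_needle_pi`, `n_drops`, and `seed` are hypothetical. To avoid using the value of PI while estimating it, the random angle is drawn by rejection sampling inside a quarter disk, which is a design choice of this sketch rather than part of the original experiment.

```python
import math
import random


def buffon_needle_pi(n_drops, seed=None):
    """Estimate pi with Buffon's needle (needle length 1, line spacing 1).

    Hypothetical helper for illustration. Returns 2 * N / C, where N is the
    number of drops and C is the number of drops that cross a line.
    """
    rng = random.Random(seed)
    crossings = 0
    for _ in range(n_drops):
        # Distance from the needle's center to the nearest line: U(0, 1/2).
        center = 0.5 * rng.random()
        # Draw a uniformly distributed direction without using pi:
        # a uniform point in the quarter disk has a uniform angle.
        while True:
            u, v = rng.random(), rng.random()
            r2 = u * u + v * v
            if 0.0 < r2 <= 1.0:
                break
        sin_theta = v / math.sqrt(r2)  # sine of the angle to the lines
        # The needle crosses a line when its half-length, projected
        # perpendicular to the lines, reaches the nearest line.
        if center <= 0.5 * sin_theta:
            crossings += 1
    if crossings == 0:
        raise ValueError("no crossings observed; increase n_drops")
    return 2.0 * n_drops / crossings


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(f"N = {n:>7}: 2N/C = {buffon_needle_pi(n, seed=2013):.5f}")
```

With these assumptions the crossing probability is 2/PI, which is why 2N/C is the estimator. Like the physical experiment, the estimate converges slowly, so many drops are needed for even a few correct digits.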