Monte Carlo Simulation Fall 2013 By Yaohang Li, Ph.D. Department of Computer Science

Monte Carlo Simulation
Fall 2013
By Yaohang Li, Ph.D.
Department of Computer Science
Old Dominion University
yaohang@cs.odu.edu
Administrivia
• Class Web Page
– http://www.cs.odu.edu/~yaohang/cs714814
– Syllabus
• Class Policy
– Class Notes
• Posted before class
• Read class notes before class
– Assignments
• Posted after class
• Pay attention to the due dates
• Blackboard
– Posting grades
– Sending out emails to class
Administrivia
• Instructional E-Mail Addresses
– yaohang@cs.odu.edu
• Instructor: Yaohang Li
– Office phone: 757-683-6001x5085
– Office location: 3212 E&CS
– Office hours:
• T, T: 1:00PM-3:00PM
• by appointment
Administrivia
• Grading Policy
– (6~8) Assignments 80%
• Programming Assignment
• Or Theoretical Proof
• Late Assignment Policy
– 0~24 hrs: -5%
– 24~48 hrs: -10%
– >48 hrs: grade = 0
– (1) Presentation/Term Paper: 20%
Administrivia
• Textbook
– J. M. Hammersley and D. C. Handscomb. Monte Carlo
Methods. Methuen, London, 1964.
– Handouts
Honor Code
• All assignments, unless explicitly specified, are to be
completed on your own
• ODU Honor Council
– http://orgs.odu.edu/hc/
• Evidence of cheating, plagiarism, or unauthorized
collaboration will result in a 0 grade for
quiz/assignment/exam
– May have further consequences
How to get help?
• Ask questions in class (or after class)
• Attend office hours
• Email me
– Make sure that you put “CS714” or “CS814” in your subject line
– Send it from your .odu account
• That way it won’t end up in my spam folder
– State clearly what you need in your email
How to Get an A in this Class
• Attendance
– Attend class regularly and on time
– Ask questions
• Notes
– Read over class notes before class
– Review class notes after class
• Homework
– Get started as early as possible
– Contact me if you encounter problems
Topics
• Introduction (today)
– Computational Science
– High Performance Computing
– Monte Carlo Methods
• High Performance Computing
– Parallel and Distributed Systems
– MPI and OpenMP
– PBS
• Computer Simulation Methods
– Continuous Simulation
– Discrete Simulation
• Nature of Monte Carlo Methods
• Review of Relevant Elementary Probability and Statistics
Topics (Cont.)
• Direct Simulation
• Error Estimation
– Truncation Error and Statistical Error
• Monte Carlo Integration
– Crude Monte Carlo Integration
– Variance Reduction Techniques
• Random Number Generation
– Test of Randomness
– Sampling for Non-uniform Distribution
• Quasi-Monte Carlo
– Difference from Monte Carlo
– Quasirandom Number Generation
Topics (Cont.)
• Markov Chain Monte Carlo
– Optimization Problem
– Metropolis Method
– Simulated Annealing
– Simulated Tempering
• Monte Carlo and Linear Algebra
• Green’s Function Monte Carlo
– Random Walk
• Quantum Monte Carlo
Topics (Cont.)
• Monte Carlo Applications in Science and Engineering
– Computational Physics
– Computational Material Science
– Nuclear Simulation
– Computational Biology
– Computational Finance
– Visualization
What you will learn
• Basic Concepts in Computational Science
• Usage of High Performance Systems
• Simulation Methods
• Principles of Monte Carlo Methods
• Monte Carlo Applications to Science and Engineering
• Interdisciplinary Research Applications
About Me
• I got my Master’s and Ph.D. degrees from Florida
State University.
• I did my postdoc at Oak Ridge National Laboratory
under the University of Tennessee.
• I taught for 7 years at North Carolina A&T State
University.
• My research
– Computational Biology
– High Performance Computing
• How about you?
– Name/Year/Major
– Something interesting about yourself
– Expectation in this class
What is Computational Science?
• Definition: A common characteristic of the field is that
problems…
– Have a precise mathematical model.
– Are intractable by traditional methods.
– Are highly visible.
– Require in-depth knowledge of some field in science, engineering, or the
arts.
• Computational science is not computer science, mathematics, a
traditional field of science or engineering, a social science, or
a humanities field. It is a blend of them.
Computational Science Community
• Computational science projects are always multidisciplinary.
– Applied math, computer science, and…
– One or more science or engineering fields are involved.
• Computer science’s role tends to be
– A means of getting the low level work done efficiently.
– Development and implementation of algorithms
– Similar to mathematics in solving problems in engineering.
– Oh, yuck… a service role if the computer science contributors are not careful.
– Provides tools for data manipulation, visualization, and networking.
• Mathematics’ role is in providing analysis of (new?) numerical algorithms to
solve the problems, even if it is done by computer scientists.
Computational Scientist Requirements
• Good understanding of an applied discipline.
• High Performance Computing Architecture
• Development of Numerical Algorithms
• Analysis of the Numerical Algorithms
• Parallelization
• Visualization
How Significant are Algorithm and Computer
Improvements?
• There is a race to see if computers can be sped up
through new technologies faster than new algorithms can
be developed.
– Computers have doubled in speed every 18 months over many
decades.
– Some algorithms cause quantum leaps in productivity:
• FFT reduced solve time from O(N^2) to O(N log N).
• Multigrid reduced solve times from O(N^(3/2)) to O(N), which is
optimal.
• Monte Carlo is used when no known reasonable algorithm is
available.
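To get a feel for what these growth rates mean, the following is a minimal sketch that tabulates rough operation counts for a problem of size N. It ignores constant factors and lower-order terms, so the numbers are only orders of magnitude; the function name and dictionary keys are illustrative and not from the course material.

```python
import math


def operation_counts(n):
    """Return rough operation counts for the growth rates named on this slide.

    Constant factors are ignored, so these are orders of magnitude only.
    """
    return {
        "direct_dft": n ** 2,          # O(N^2), before the FFT
        "fft": n * math.log2(n),       # O(N log N)
        "pre_multigrid": n ** 1.5,     # O(N^(3/2)), before multigrid
        "multigrid": n,                # O(N)
    }


if __name__ == "__main__":
    n = 2 ** 20
    for name, count in operation_counts(n).items():
        print(f"{name:15s} {count:.3e}")
```

For N = 2^20, the ratio N^2 / (N log2 N) is 2^20 / 20, so under these assumptions the FFT does roughly 50,000 times less work than the direct method.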
High-Performance Computing Architectures
• Parallel supercomputers
– Multiple processors per node with shared memory on the node (a node is a
motherboard with memory and processors on it).
– Very fast electrical network between nodes with direct memory access and
communications processors just for moving data.
– SGI Origin 3000, IBM SP4, SUN Enterprise, HP Superdome.
• Cluster of PCs
– Take many of your favorite computers and connect them with a fast
Ethernet running at 100-1000 Mbps.
– Usually runs Linux, IRIX, Tru64 UNIX, HP-UX, AIX-L, Solaris, or
Windows NT with MPI and/or PVM.
– Intel, Alpha, or SPARC processors. Intel is most common in clusters of
cheap micros.
• Desktop HPC
– GPGPU
• Large number of streaming processors
• Memory hierarchy
Peak Speeds of Selected Computers
Machine                        Mflops (per processor)
CDC 6600                       3
Cray 1                         160
Cray C90                       1000
Compaq Alpha 21264             1000
Cray T90                       2200
Intel Pentium 4 (2200MHz)      2200
AMD Athlon XP 1900+            3200
HP PA 8700                     3200
Intel Itanium                  3400
Sony PlayStation 2             6300
Top Supercomputers
www.top500.org
Network Speeds
Transmission time is in seconds unless another unit is given.
Name           Speed (bps)    24 bit screen    Bible     Encyclopedia Britannica
T3             45M            0.5              1.2       60
Cable modem    30M/10M/2M     3                6         270
T1             1.544M         15               36        30 min
56 kbps        53,000         7 min            1 hour    2 days
Computer Simulation
• Simulation
– A computer simulation or a computer model is a computer
program which attempts to simulate an abstract model of a
particular system.
– A useful part of modeling many systems, to gain insight into how
they operate:
• natural systems in physics, chemistry, and biology
• human systems in economics and social science
• the process of engineering new technology
– “Simulation is a lie to tell the truth”
Theoretical Math and Experimental Math
• Theoretical Math
– Deduce conclusions from postulates
• Experimental Math
– Infer conclusions from observations
• Induction and Deduction make the difference
Earliest Experimental Math
• Old Testament, 2 Chronicles 4:2: “Also he made a
molten sea of ten cubits from brim to brim, round in
compass, and five cubits the height thereof; and a line of
thirty cubits did compass it round about.”
– A circumference of thirty cubits around a diameter of ten cubits
implies PI ≈ 3.
Introduction to Monte Carlo Method
• A Monte Carlo method
– provides approximate solutions to a variety of mathematical problems by
performing statistical sampling experiments on a computer.
– In brief, methods consuming random numbers
– Can be applied to
• Stochastic problems
• Deterministic problems
• Characteristics of Monte Carlo methods
– A natural fit for high-performance computing systems
– Monte Carlo methods are stochastic techniques.
– Monte Carlo method is very general.
– We can find MC methods used in many areas from economics
to nuclear physics to regulating the flow of traffic.
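As an illustration of applying a Monte Carlo method to a deterministic problem, here is a minimal sketch of crude Monte Carlo integration: the integral of f over [0, 1] is estimated by the average of f at uniformly random points. The integrand x^2, the sample count, and the function name are assumptions chosen for illustration; the exact answer is 1/3.

```python
import random


def crude_mc_integral(f, n_samples, rng):
    """Estimate the integral of f over [0, 1] as the mean of f at uniform random points."""
    total = 0.0
    for _ in range(n_samples):
        total += f(rng.random())
    return total / n_samples


if __name__ == "__main__":
    rng = random.Random(2013)  # fixed seed so the run is repeatable
    estimate = crude_mc_integral(lambda x: x * x, 100_000, rng)
    print(f"estimate = {estimate:.4f}, exact = {1 / 3:.4f}")
```

Each sample is independent of the others, which is one reason such methods tend to map well onto parallel machines: the samples can be split across processors and the partial sums combined at the end.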
Top 10 Algorithms in Last Century
http://www.siam.org/pdf/news/637.pdf
Numerical Errors
• Round off error
– Error in representing a number in computer operations
• Truncation error
– Approximating an infinite sum with finite sum
• Statistical error (unique to statistics-based simulations)
– Amount by which an observation deviates from its expected value
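The three kinds of error can each be seen in a few lines of Python. This is only a sketch with examples chosen for illustration (summing 0.1 ten times, a truncated series for e, and the sample mean of uniform random numbers); none of them come from the course notes.

```python
import math
import random


def round_off_example():
    """Round-off: 0.1 has no exact binary representation, so ten additions miss 1.0 slightly."""
    return abs(sum([0.1] * 10) - 1.0)


def truncation_error(n_terms):
    """Truncation: approximate e = sum over k >= 0 of 1/k! by its first n_terms terms."""
    partial = sum(1.0 / math.factorial(k) for k in range(n_terms))
    return abs(math.e - partial)


def statistical_error(n_samples, rng):
    """Statistical: deviation of a sample mean of uniform(0, 1) draws from its expected value 0.5."""
    mean = sum(rng.random() for _ in range(n_samples)) / n_samples
    return abs(mean - 0.5)


if __name__ == "__main__":
    rng = random.Random(2013)
    print("round-off error:  ", round_off_example())
    print("truncation error: ", truncation_error(5), truncation_error(10))
    print("statistical error:", statistical_error(100, rng), statistical_error(10_000, rng))
```

The truncation error shrinks quickly as terms are added, while the statistical error typically shrinks only like 1/sqrt(n) and varies from run to run with the random seed.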
Monte Carlo Example:
Estimating π
If you are a very poor dart player, it is easy to imagine throwing
darts randomly at a unit square with a quarter circle of radius 1
shaded inside it. Of the total number of darts that hit within the
square, the number that hit the shaded part (circle quadrant) is
proportional to the area of that part. In other words,
(darts hitting shaded part) / (darts hitting inside square) ≈ (area of shaded part) / (area of square)
If you remember your geometry, it's easy to show that
(area of shaded part) / (area of square) = (π r^2 / 4) / r^2 = π / 4, so π ≈ 4 × hits / darts
For each dart thrown at (x, y):
x = (random number in [0, 1))
y = (random number in [0, 1))
distance = sqrt (x^2 + y^2)
if distance <= 1.0
let hits = hits + 1
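The dart-throwing experiment can be sketched in Python as follows. The quarter circle covers π/4 of the unit square, so four times the fraction of hits estimates π. The function name, seed, and number of darts are illustrative choices, not part of the original slide.

```python
import math
import random


def estimate_pi(n_darts, rng):
    """Estimate pi by throwing n_darts uniformly at the unit square.

    A dart at (x, y) is a hit when it lands inside the quarter circle
    of radius 1 centred at the origin.
    """
    hits = 0
    for _ in range(n_darts):
        x = rng.random()
        y = rng.random()
        distance = math.sqrt(x * x + y * y)
        if distance <= 1.0:
            hits += 1
    return 4.0 * hits / n_darts


if __name__ == "__main__":
    rng = random.Random(2013)  # fixed seed so the run is repeatable
    for n in (100, 10_000, 1_000_000):
        print(n, estimate_pi(n, rng))
```

The estimate improves slowly: the statistical error falls roughly like 1/sqrt(n), so each extra correct digit needs about 100 times more darts.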
Monte Carlo Applications
• Nuclear reactor design
• Quantum chromodynamics
• Radiation cancer therapy
• Traffic flow
• Stellar evolution
• Econometrics
• Dow-Jones forecasting
• Oil well exploration
• VLSI design
• Protein structure prediction
• Material Science
Monte Carlo Methods in Grand Challenges
• Grand Challenges
– fundamental problems in science and engineering with potentially broad social,
political, economic, and scientific impact that can be advanced by applying high
performance computing resources.
• Example of Monte Carlo Methods in Grand Challenges
– CHAMMP: http://www.epm.ornl.gov/chammp/chammpions.html
• Oak Ridge and Argonne National Labs and NCAR collaborated to improve NCAR’s
Community Climate Model (CCM2).
• A sample visualization of a computer run (figure omitted)
• How did Monte Carlo simulation get its name?
– Named for Monte Carlo, Monaco
• where the primary attractions are casinos
containing games of chance. Games of
chance such as roulette wheels, dice, and
slot machines, exhibit random behavior.
• The random behavior in games of chance is
similar to how Monte Carlo simulation
selects variable values at random to
simulate a model
• The name and the systematic development of
Monte Carlo methods date from about the 1940s.
• There are however a number of isolated and
undeveloped instances on much earlier
occasions.
History of Monte Carlo Method
• In the second half of the nineteenth century a number
of people performed experiments in which they
estimated the value of PI
• In 1899 Lord Rayleigh showed that a one-dimensional
random walk without absorbing barriers could provide
an approximate solution to a parabolic differential
equation.
History of Monte Carlo Method
• In the early part of the twentieth century, British statistical
schools indulged in a fair amount of unsophisticated
Monte Carlo work.
• In 1908 Student (W.S. Gosset) used experimental
sampling to help him towards his discovery of the
distribution of the correlation coefficient.
• In the same year Student also used sampling to bolster
his faith in his so-called t-distribution, which he had
derived by a somewhat shaky and incomplete
theoretical analysis.
Student - William Sealy Gosset (13.6.1876 - 16.10.1937)
This birth-and-death process is suffering from labor pains; it
will be the death of me yet. (Student Sayings)
A. N. Kolmogorov (12.4.1903-20.10.1987)
In 1931 Kolmogorov showed the relationship between
Markov stochastic processes and certain integro-differential
equations.
History (cont.)
• The real use of Monte Carlo methods as a research tool stems
from work on the atomic bomb during the second world war.
• This work involved a direct simulation of the probabilistic
problems concerned with random neutron diffusion in fissile
material; but even at an early stage of these investigations, von
Neumann and Ulam refined this direct simulation with
variance-reducing techniques, in particular the "Russian roulette"
and "splitting" methods. However, the systematic development
of these ideas had to await the work of Harris and Herman
Kahn in 1948.
• About 1948 Fermi, Metropolis, and Ulam obtained Monte
Carlo estimates for the eigenvalues of the Schrödinger equation.
John von Neumann (28.12.1903-8.2.1957)
History (cont.)
• In about 1970, the newly developing theory of computational
complexity began to provide a more precise and persuasive
rationale for employing the Monte Carlo method.
• Karp (1985) shows this property for estimating reliability in a
planar multiterminal network with randomly failing edges.
• Dyer (1989) establishes it for estimating the volume of a convex
body in M-dimensional Euclidean space.
• Broder (1986) and Jerrum and Sinclair (1988) establish the
property for estimating the permanent of a matrix or,
equivalently, the number of perfect matchings in a bipartite
graph.
Buffon’s Needle
• The earliest analytical Monte Carlo method for estimation of the
value of PI
• Experiment Environment
– A tabletop with a number of infinite parallel lines drawn on it
– Equally spaced (spacing is 1 inch)
– A needle (also 1 inch)
• Experiment
– Drop the needle on the table
• the needle crosses one of the lines
• the needle crosses no lines
– Total number of drops (N)
– Total number of drops in which the needle crosses a line (C)
• Result
– 2N/C approaches PI as N grows
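The experiment can be simulated with a short Python sketch. It assumes the needle's centre lands uniformly between lines and its direction is uniform, with needle length and line spacing both equal to 1 as on the slide; the needle then crosses a line with probability 2/PI. To avoid using the value of PI while estimating it, the direction is drawn by rejection sampling inside a quarter disk. All function names are illustrative.

```python
import math
import random


def random_sin_theta(rng):
    """Sine of a uniformly random angle in [0, pi/2], drawn without using pi.

    A point chosen uniformly inside the quarter unit disk has a uniformly
    distributed angle, so v / sqrt(u^2 + v^2) is the sine of that angle.
    """
    while True:
        u, v = rng.random(), rng.random()
        r2 = u * u + v * v
        if 0.0 < r2 <= 1.0:
            return v / math.sqrt(r2)


def buffon_pi(n_drops, rng):
    """Estimate pi as 2N/C from n_drops simulated needle drops."""
    crossings = 0
    for _ in range(n_drops):
        # Distance from the needle's centre to the nearest line, in [0, 0.5).
        d = 0.5 * rng.random()
        # The needle (half-length 0.5) crosses when its reach toward the line covers d.
        if d <= 0.5 * random_sin_theta(rng):
            crossings += 1
    if crossings == 0:
        raise ValueError("no crossings observed; use more drops")
    return 2.0 * n_drops / crossings


if __name__ == "__main__":
    rng = random.Random(2013)  # fixed seed so the run is repeatable
    print(buffon_pi(100_000, rng))
```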
Summary
• Introduction to Monte Carlo Methods
– Definition
– Examples
– History
– Monte Carlo Applications
• Introduction to Computational Science
• High Performance Computing
• Computer Simulation
Learning Monte Carlo methods won’t help you win $$$ at the
casino.
What I want you to do
• Enjoy your new semester