Stochastic Optimization: Review of Probability Theory
Water Resources Planning and Management: M6L1
D Nagesh Kumar, IISc

Objectives
• To introduce the concept of probability
• To define random variables and their statistical properties
• To introduce commonly used probability distributions

Introduction
• Most water resources decision problems involve uncertainty.
• Uncertainty arises from the randomness of the variables.
• Hydrologic random variables: rainfall in a command area, inflow to a reservoir, evapotranspiration of crops, etc.
• Optimization models developed for water resources management should give optimal decisions with an indication of the associated hydrologic uncertainty.
• Two classical approaches to deal with hydrologic uncertainty in optimization models are:
  • Implicit Stochastic Optimization (ISO)
  • Explicit Stochastic Optimization (ESO)

Introduction…
Implicit Stochastic Optimization (ISO)
• Hydrologic uncertainty is incorporated implicitly.
• The optimization model is a deterministic model.
• Hydrologic inputs are varied through a number of equiprobable sequences.
• The deterministic optimization model is run once with each of the input sequences.
• The output set is then statistically analyzed to generate a set of optimal decisions.
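The ISO procedure described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not a model from the lecture: the "deterministic optimization model" is taken to be the largest constant release a single reservoir can sustain (found by bisection), and the equiprobable input sequences are drawn from a hypothetical lognormal inflow generator. The names `feasible`, `firm_yield` and `run_iso`, and all parameter values, are made up for the example.

```python
import random
import statistics


def feasible(release, inflows, capacity, initial_storage):
    """Return True if a constant release can be supplied in every period."""
    storage = initial_storage
    for inflow in inflows:
        storage = storage + inflow - release
        if storage < 0.0:                 # demand cannot be met in this period
            return False
        storage = min(storage, capacity)  # water above capacity spills
    return True


def firm_yield(inflows, capacity, initial_storage=0.0, tol=1e-6):
    """Toy deterministic model: largest feasible constant release (bisection)."""
    lo = 0.0
    hi = initial_storage + max(inflows)   # no feasible release can exceed this
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid, inflows, capacity, initial_storage):
            lo = mid
        else:
            hi = mid
    return lo


def run_iso(n_sequences, n_periods, capacity=300.0, seed=0):
    """Run the deterministic model once for each synthetic inflow sequence."""
    rng = random.Random(seed)
    yields = []
    for _ in range(n_sequences):
        # Assumed inflow generator: independent lognormal inflows (hypothetical).
        inflows = [rng.lognormvariate(4.5, 0.5) for _ in range(n_periods)]
        yields.append(firm_yield(inflows, capacity))
    return yields


# Statistical analysis of the set of outputs.
yields = run_iso(n_sequences=200, n_periods=24)
deciles = statistics.quantiles(yields, n=10)
print(f"mean firm yield: {statistics.mean(yields):.1f}")
print(f"yield reached or exceeded in about 90% of sequences: {deciles[0]:.1f}")
```

The uncertainty never enters the optimization itself. It appears only in the spread of the results across sequences, which is why the approach is called implicit.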
Introduction…
Explicit Stochastic Optimization (ESO)
• The stochastic nature of the inputs is included explicitly through their probability distributions.
• The optimization model is a stochastic model.
• A single run of the model specifies the optimal decisions.
• Two commonly used ESO techniques are Chance Constrained Linear Programming (CCLP) and Stochastic Dynamic Programming (SDP), which will be discussed in the following lectures.
• A background in probability theory is essential for ESO.

Concept of Probability
• Sample space S: the set of all possible outcomes of an experiment.
• An event is a subset of these outcomes.
• Probability is a measure of the likelihood of occurrence of an event.
• Probability can be assessed in two ways:
  1. Objective or posterior probability, which is based on the observation of events, and
  2. Subjective or prior probability, which is based on experience or judgement.
• The three basic axioms of probability are:
  (i) Totality: $P(S) = 1$, where S is the sample space
  (ii) Nonnegativity: $P(A) \ge 0$, where A is an event
  (iii) Mutually exclusive events: if A and B are two mutually exclusive events, then $P(A \cup B) = P(A) + P(B)$
• For mutually exclusive events, $P(A \cap B) = 0$. After relaxing mutual exclusiveness, the third axiom becomes $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

Random Variable
• Random variable (r.v.): a variable whose value is not known or cannot be measured with certainty (i.e., it is nondeterministic).
• Examples of random variables in water resources: rainfall, streamflow, time between hydrologic events (e.g. floods of a given magnitude), evaporation from a reservoir, groundwater levels, re-aeration and de-oxygenation rates, etc.
• Any function of a random variable is also a random variable.
• A random variable is denoted by an upper case letter, and the corresponding lower case letter denotes the value that it takes.
• For example, daily rainfall may be denoted as X, and the value it takes on a particular day as x. We then associate probabilities with events such as $X \ge x$ or $0 \le X \le x$.

Random Variable…
• Random variables can be classified into two categories: discrete and continuous.
• Discrete r.v.: X can take on only discrete values $x_1, x_2, x_3, \ldots$
  • Example: the number of rainy days in a year, which may take values such as 10, 20, etc.
  • It can assume a finite (or at most countably infinite) number of values.
• Continuous r.v.: X can take on all real values in a range.
  • Most variables in hydrology are continuous random variables.
  • The number of values that a continuous random variable can assume is uncountably infinite.

Probability Distributions
• For discrete random variables, the probability distribution is called a probability mass function (PMF).
• For continuous random variables, the probability distribution is called a probability density function (PDF).
• The cumulative distribution function (CDF), F(x), represents the probability that X is less than or equal to x, i.e. $F(x) = P(X \le x)$.
• The probability mass function of X is defined as $p(x) = P(X = x)$.

Probability Distributions…
[Figure: (a) PMF p(x) and (b) CDF F(x) of a discrete random variable taking the values $x_1, x_2, \ldots, x_N$; in panel (b), $F(x_2) = p(x_1) + p(x_2)$.]
• For a discrete random variable, there are spikes of probability associated with the values that the random variable assumes.
• The CDF appears as a staircase.

Probability Distributions…
• For a continuous random variable, the probability density function (PDF) is
  $$f(x) = \frac{dF(x)}{dx}$$
  where F(x) is the CDF of X.
[Figure: (a) PDF and (b) CDF of a continuous random variable; $F(x_2) = \int_{-\infty}^{x_2} f(x)\,dx$ is the area under f(x) to the left of $x_2$.]

Probability Distributions…
• Probability distributions of continuous random variables are smooth curves.
• The CDF of a continuous random variable, denoted by F(x), is a non-decreasing function with a maximum value of 1.
• The CDF represents the probability that X is less than or equal to x, i.e. $F(x) = P(X \le x)$.
• A function f(x) defined on the real line is a valid probability density function if and only if
  i. $f(x) \ge 0$ for all x, and
  ii. $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
• Given the PMF or PDF, the CDF can be obtained as
  $$F(x_n) = \sum_{i=1}^{n} p(x_i) \quad \text{for discrete random variables}$$
  $$F(x) = \int_{-\infty}^{x} f(x)\,dx \quad \text{for continuous random variables}$$

Probability Distributions…
[Figure: probability density function f(x), with the points a and b marked on the x axis.]
• The area under the curve to the left of x = a is $P(X \le a)$.
• The area under the curve to the left of x = b is $P(X \le b)$.
• The area between x = a and x = b is $P[a \le X \le b]$.
• For a continuous random variable, the probability of the random variable taking a value exactly equal to a given value d is zero, because
  $$P(X = d) = P(d \le X \le d) = \int_{d}^{d} f(x)\,dx = 0$$

Probability Distributions…
[Figure: cumulative distribution function F(x), with $F^{-1}(0.5) = 10$ and $F^{-1}(0.8) = 15$ marked.]
• For any given probability α, $0 \le \alpha \le 1$, the value x of the random variable can be determined from the CDF as $x = F^{-1}(\alpha)$.
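When the CDF has no convenient closed-form inverse, $x = F^{-1}(\alpha)$ can be found numerically. The sketch below uses bisection and relies only on F being non-decreasing. It is an illustration, not part of the lecture: the helper names `exp_cdf` and `inverse_cdf` are made up, and the exponential CDF $F(x) = 1 - e^{-\lambda x}$ is used as the test case only because its exact inverse, $-\ln(1 - \alpha)/\lambda$, is available for comparison.

```python
import math


def exp_cdf(x, lam):
    """CDF of an exponential distribution with parameter lam > 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0.0 else 0.0


def inverse_cdf(cdf, alpha, lo, hi, tol=1e-10):
    """Solve cdf(x) = alpha by bisection.

    Assumes cdf is non-decreasing and cdf(lo) <= alpha <= cdf(hi).
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < alpha:
            lo = mid      # the solution lies to the right of mid
        else:
            hi = mid      # the solution lies at or to the left of mid
    return 0.5 * (lo + hi)


lam = 0.1      # assumed parameter value, chosen only for illustration
alpha = 0.5
x_numeric = inverse_cdf(lambda x: exp_cdf(x, lam), alpha, lo=0.0, hi=200.0)
x_exact = -math.log(1.0 - alpha) / lam
print(f"numerical inverse: {x_numeric:.6f}, closed form: {x_exact:.6f}")
```

Bisection is chosen here because it needs no derivative and cannot diverge for a monotone function, at the cost of being slower than Newton-type methods.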
Statistical properties of random variables
• Population: the set of all the values taken by a random process.
• Sample: a subset of the population.
• The expected value of $(X - x_0)^r$ is the rth moment of a random variable X about any reference point $X = x_0$. Mathematically,
  $$E[(X - x_0)^r] = \int_{-\infty}^{\infty} (x - x_0)^r f(x)\,dx \quad \text{for the continuous case}$$
  $$E[(X - x_0)^r] = \sum_{i=1}^{N} (x_i - x_0)^r\, p(x_i) \quad \text{for the discrete case}$$
  where E[ ] is the statistical expectation operator.
• The first three moments describe the central tendency, variability and asymmetry of the distribution of a random variable.

Statistical properties of random variables…
Expected Value or Mean
• The central tendency is expressed as an expectation:
  $$E[X] = \int_{-\infty}^{\infty} x\, f(x)\,dx \quad \text{for the continuous case}$$
  $$E[X] = \sum_{i=1}^{N} x_i\, p(x_i) \quad \text{for the discrete case}$$
• The mean of a r.v., denoted by μ, is equal to the expected value, i.e., μ = E[X].

Statistical properties of random variables…
Variance
• The variance is the second-order central moment.
• The variance of a continuous r.v. is defined as
  $$\mathrm{Var}(X) = \sigma^2 = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$$
• The positive square root of the variance is called the standard deviation, σ.
• Coefficient of variation: $C_v = \sigma / \mu$
Skewness
• The asymmetry of the PDF of a r.v. is measured by the skew coefficient, defined as
  $$\frac{E[(X - \mu)^3]}{\sigma^3}$$

Example
The probability density function (PDF) of a random variable X is
  $$f(x) = \begin{cases} 3x^2 & 0 \le x \le 1 \\ 0 & \text{elsewhere} \end{cases}$$
Determine (1) the cumulative distribution function (CDF); (2) the expected value, E(X); (3) the variance, Var(X); (4) $P[X \ge 0.6]$; and (5) $P[0.4 \le X \le 0.7]$.
Solution:
1. Cumulative distribution function:
  $$F(x) = \int_{-\infty}^{x} f(x)\,dx = \int_{0}^{x} 3x^2\,dx = x^3, \quad 0 \le x \le 1$$
2. Expected value, E(X):
  $$E[X] = \int_{-\infty}^{\infty} x\, f(x)\,dx = \int_{0}^{1} x \cdot 3x^2\,dx = 3/4$$
3. Variance, Var(X):
  $$\mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = \int_{0}^{1} (x - 3/4)^2 \cdot 3x^2\,dx = 0.0375$$
4. $P[X \ge 0.6]$:
  $$P[X \ge 0.6] = 1 - P[X \le 0.6] = 1 - F(0.6) = 1 - 0.6^3 = 0.784$$
5. $P[0.4 \le X \le 0.7]$:
  $$P[0.4 \le X \le 0.7] = P[X \le 0.7] - P[X \le 0.4] = F(0.7) - F(0.4) = 0.343 - 0.064 = 0.279$$

Commonly used Probability Distributions
• Three commonly used distributions in water resources are the normal, lognormal and exponential distributions.

Normal distribution
• Also called the Gaussian distribution.
• Two parameters are involved in this distribution: the mean and the variance.
• A normal random variable with mean μ and variance σ² is denoted as $X \sim N(\mu, \sigma^2)$.
• The PDF of the normal distribution is
  $$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right] \quad \text{for } -\infty < x < \infty$$

Normal distribution…
[Figure: normal PDF f(x), centered at x = μ.]
• PDF: bell-shaped and symmetric about x = μ.
• The CDF of a normal distribution is
  $$F(x) = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right] dx \quad \text{for } -\infty < x < \infty$$
• Normal random variables are usually transformed to the standardized variate Z with zero mean and unit variance, i.e., $Z = (X - \mu)/\sigma$. The PDF of Z can then be expressed as
  $$\phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right) \quad \text{for } -\infty < z < \infty$$
• Values of $\Phi(z)$, the CDF of Z, obtained by numerical integration are used in the computations.

Example
The monthly streamflow at a reservoir site is represented by a random variable X which follows a normal distribution with a mean of 100 units and a standard deviation of 50 units. Find (1) $P[X > 150]$; (2) $P[X \le 40]$; and (3) the flow value which will be exceeded with a probability of 0.8.
Solution:
(1) $P[X > 150]$:
  $$P[X > 150] = P\left[\frac{X - \mu}{\sigma} > \frac{150 - 100}{50}\right] = P[Z > 1] = 1 - P[Z \le 1] = 1 - 0.8413 = 0.1587$$
(2) $P[X \le 40]$:
  $$P[X \le 40] = P\left[\frac{X - \mu}{\sigma} \le \frac{40 - 100}{50}\right] = P[Z \le -1.2] = 0.1151$$
(3) To find x such that $P[X \ge x] = 0.8$:
  $$P\left[Z \ge \frac{x - \mu}{\sigma}\right] = 0.8$$
  $$1 - P[Z \le z] = 0.8$$
  $$P[Z \le z] = 0.2$$
  $$z = \frac{x - 100}{50} = -0.84$$
  $$x = 58 \text{ units}$$

Lognormal distribution
• Used when the random variable cannot be negative.
• A r.v. X is lognormally distributed if its logarithmic transform Y = ln(X) has a normal distribution with mean $\mu_{\ln X}$ and variance $\sigma^2_{\ln X}$.
• The PDF of a lognormal r.v. is
  $$f(x) = \frac{1}{x\,\sigma_{\ln X}\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{\ln x - \mu_{\ln X}}{\sigma_{\ln X}}\right)^2\right] \quad \text{for } x > 0$$

Exponential distribution
• The PDF of an exponential distribution with parameter λ is
  $$f(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases}$$
• λ > 0 is the parameter of the distribution.
• Mean: $E[X] = 1/\lambda$
• Variance: $\mathrm{Var}(X) = E[X^2] - (E[X])^2 = 1/\lambda^2$
• The CDF is given by
  $$F(x) = \int_{-\infty}^{x} f(x)\,dx = \begin{cases} 1 - e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases}$$
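The normal-distribution streamflow example (mean 100 units, standard deviation 50 units) can be checked without a table of the standard normal CDF. The sketch below is one possible way to do so, using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are chosen only for this illustration.

```python
from statistics import NormalDist

# Monthly streamflow X ~ N(mu = 100, sigma = 50), as stated in the example.
flow = NormalDist(mu=100.0, sigma=50.0)

# (1) P[X > 150] = 1 - F(150)
p_exceed_150 = 1.0 - flow.cdf(150.0)

# (2) P[X <= 40] = F(40)
p_below_40 = flow.cdf(40.0)

# (3) Flow exceeded with probability 0.8, i.e. F(x) = 1 - 0.8 = 0.2
x_80 = flow.inv_cdf(1.0 - 0.8)
z_80 = (x_80 - flow.mean) / flow.stdev   # corresponding standardized variate

print(f"P[X > 150]  = {p_exceed_150:.4f}")
print(f"P[X <= 40]  = {p_below_40:.4f}")
print(f"z = {z_80:.4f}, x = {x_80:.2f} units")
```

Working through the standardized variate Z by hand and calling `cdf` on X directly are equivalent, because `NormalDist` performs the standardization internally.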