Maximum Entropy
RESM 575 Spring 2011 Lecture 7

From Pearson, 2007

Maximum entropy (Phillips et al. 2008)
History
E. T. Jaynes 1957
• Thermodynamics
• Inference and information theory

The Maximum Entropy Method
Origins: Jaynes 1957, statistical mechanics
Recent use: machine learning, e.g. automatic language translation
To estimate an unknown distribution:
1. Determine what you know (constraints)
2. Among distributions satisfying the constraints, output the one with maximum entropy

What is it?
Maxent is a general-purpose method for making predictions or inferences from incomplete information. Its origins lie in statistical mechanics (Jaynes, 1957), and it has been applied in diverse areas such as astronomy, portfolio optimization, image reconstruction, statistical physics, and signal processing.

Like other Bayesian models…
• Uses prior information
• Maxent is an alternative to methods of inference of classical statistics

Maximum Entropy Principle
The fact that a certain probability distribution maximizes entropy subject to certain constraints representing our incomplete information, is the fundamental property which justifies the use of that distribution for inference; it agrees with everything that is known but carefully avoids assuming anything that is not known (Jaynes, 1990).

Why?
Maxent was introduced as a general approach for presence-only modeling of species distributions, suitable for all existing applications involving presence-only datasets.
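Worked example (not from the lecture)
The two-step recipe on the Maximum Entropy Method slide can be illustrated with a small die example in the spirit of Jaynes. The setup (six faces, a known mean) and all function names are assumptions made for illustration. With a single mean constraint, the maximum entropy distribution has the exponential form p_i ∝ exp(λ·v_i), so this sketch only has to search for the λ whose mean matches the constraint.

```python
import math


def maxent_with_mean(values, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Maximum entropy distribution on `values` whose mean is `target_mean`.

    Hypothetical helper: uses the exponential form p_i ∝ exp(lam * v_i)
    and finds lam by bisection (the mean is increasing in lam).
    """
    def dist(lam):
        top = max(lam * v for v in values)  # shift exponents for stability
        weights = [math.exp(lam * v - top) for v in values]
        z = sum(weights)
        return [w / z for w in weights]

    def mean(p):
        return sum(pi * v for pi, v in zip(p, values))

    for _ in range(iters):
        mid = (lo + hi) / 2
        if mean(dist(mid)) < target_mean:
            lo = mid
        else:
            hi = mid
    return dist((lo + hi) / 2)


def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


faces = [1, 2, 3, 4, 5, 6]
p_fair = maxent_with_mean(faces, 3.5)    # step 1: we only know the mean is 3.5
p_loaded = maxent_with_mean(faces, 4.5)  # step 1: we only know the mean is 4.5
```

When the only constraint is a mean of 3.5, the result is the uniform distribution. Tightening the constraint to a mean of 4.5 tilts the probabilities toward the high faces and lowers the entropy, while still assuming nothing beyond the constraint.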

Modeling species distributions
[Figure: Yellow-throated Vireo occurrence points and environmental variables, leading to the predicted potential distribution]

Estimating a probability distribution
Given:
• Map divided into cells
• Environmental variables, with values in each cell
• Occurrence points: samples from an unknown distribution
Our task is to estimate the unknown probability distribution.
Note:
• The distribution sums to 1 over the whole map
• Most probability values will be very small
• Different from estimating probability of presence

Entropy
More entropy: more spread out, closer to uniform distribution
2nd law of thermodynamics:
- Without external influences, a system moves to increase entropy
Maximum entropy method:
- Apply constraints to remove external influences
- Species spreads out to fill areas with suitable conditions

Using Maxent for Species Distributions
• “Features”
• “Constraints”
• “Regularization”

Features impose constraints
Feature = environmental variable, or function thereof
[Figure: occurrence points plotted by temperature and precipitation, with the sample average marked]
Find the distribution p of maximum entropy such that, for all features f:
mean(f) = sample average of f

Features
Environmental variables or functions thereof. Maxent has these classes of features (others are possible):
1. Linear … variable itself
2. Quadratic … square of variable
3. Product … product of two variables
4. Binary (indicator) … membership in a category
5. Threshold … [plot: feature value from 0 to 1 against the environmental variable]
6. Hinge … [plot: feature value from 0 to 1 against the environmental variable]

Constraints
Each feature type imposes constraints on the output distribution:
• Linear features … mean
• Quadratic features … variance
• Product features … covariance
• Threshold features … proportion above threshold
• Hinge features … mean above threshold
• Binary features (categorical) … proportion in each category

Regularization
[Figure: temperature and precipitation axes showing the true mean, the sample average, and a confidence region around the sample average]
Find the distribution p of maximum entropy such that
mean(f) is in the confidence region of the sample average of f

The Maxent distribution
… is always a Gibbs distribution:
$q_\lambda(x) = \exp\left(\sum_j \lambda_j f_j(x)\right) / Z$
• $Z$ is a scaling factor so the distribution sums to 1
• $f_j$ is the j’th feature
• $\lambda_j$ is a coefficient, calculated by the program

Maxent is penalized maximum likelihood
Log likelihood:
$\mathrm{LogLikelihood}(q_\lambda) = \frac{1}{m} \sum_{i=1}^{m} \ln q_\lambda(x_i)$
where $x_1, \dots, x_m$ are the occurrence points.
Maxent maximizes regularized likelihood:
$\mathrm{LogLikelihood}(q_\lambda) - \sum_j \beta_j |\lambda_j|$
where $\beta_j$ is the width of the confidence interval for $f_j$.
Similar to Akaike Information Criterion (AIC).

Output
When Maxent is applied to presence-only species distribution modeling, the pixels of the study area make up the space on which the Maxent probability distribution is defined, pixels with known species occurrence records constitute the sample points, and the features are climatic variables, elevation, soil category, vegetation type or other environmental variables, and functions thereof.

To note
Sometimes both presence and absence occurrence data are available for the development of models, in which case general-purpose statistical methods can be used (for an overview of the variety of techniques currently in use, see Corsi et al., 2000; Elith, 2002; Guisan and Zimmermann, 2000; Scott et al., 2002).

Opportunity
However, while vast stores of presence-only data exist (records etc.), absence data are rarely available.
• Poorly sampled areas, remote, difficult…
• Absence data may be of questionable value in many situations

Background
• 16 modeling methods
• 226 well-surveyed species in 6 regions of the world

The authors used three statistics, the area under the Receiver Operating Characteristic curve (AUC), correlation (COR) and Kappa, to assess the agreement between the presence-absence records and the predictions.

Maximum Entropy
Only useful when applied to testable information, that is, information for which one can determine whether a given distribution is consistent with it.
Given testable information, the maximum entropy procedure consists of seeking the probability distribution which maximizes information entropy, subject to the constraints of the information.
This constrained optimization problem is typically solved using the method of Lagrange multipliers.

Michael Dougherty – GIS Project Manager WVDNR
Develop statewide conservation prioritization map based on the distribution of:
1. Species of Greatest Conservation Need (SGCN)
2. Habitats of Concern
3. Existing public land
The Challenge:
• Develop distribution models for 500 state-tracked species
• Species include: plants, herps, birds, bats, mammals, aquatics
• Modeling process must be defensible, transparent, and repeatable

Occurrence data:
1. State Natural Heritage Program “Biotics” database:
• Biologists collect “Source Features”
• “Source Features” are grouped into “Element Occurrences” (EOs)
• EOs represent known populations
• Species identification is accurate and spatial accuracy documented
• Use of EOs seems to greatly reduce spatial autocorrelation
2. Community Ecologists’ Vegetation Plots Database

Predictor Variables:
Developed a broad range of predictor variables:
• Climate
• Landcover
• Terrain
• Ecoregions
• Geology
• Soils
• Disturbances

Workflow Overview:
• Build an array of workstations to run models
• Develop R scripts to automate running the maxent models by iterating through all 500 species
• Develop web-based map viewer to assist biologists in reviewing maxent model results
• Perform patch and connectivity analysis using FunConn
• (TBD) Assign weights to patches and connectors
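
Appendix sketch (not from the lecture)
As a companion to the Gibbs distribution and penalized likelihood slides, here is a minimal sketch of fitting q_λ on a toy 5×5 map. Everything in it is an assumption for illustration: the grid, the two linear features (temperature and precipitation), the occurrence cells, the β values, and the function names. It uses plain proximal gradient ascent with soft-thresholding, which is one standard way to handle the β_j|λ_j| penalty and is not claimed to be the algorithm the Maxent program itself uses.

```python
import math


def gibbs(lam, features):
    """q_lambda(x) = exp(sum_j lam_j * f_j(x)) / Z over all cells x."""
    scores = [sum(l * f for l, f in zip(lam, fx)) for fx in features]
    top = max(scores)  # subtract the max before exponentiating, for stability
    weights = [math.exp(s - top) for s in scores]
    z = sum(weights)   # Z: scaling factor so the distribution sums to 1
    return [w / z for w in weights]


def expectation(q, features, j):
    """Mean of feature j under distribution q."""
    return sum(qx * fx[j] for qx, fx in zip(q, features))


def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal step for the |lambda| penalty)."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_maxent(features, occurrences, beta, step=1.0, iters=2000):
    """Maximize LogLikelihood(q_lambda) - sum_j beta_j * |lambda_j|."""
    n_feat = len(features[0])
    m = len(occurrences)
    sample_avg = [sum(features[i][j] for i in occurrences) / m
                  for j in range(n_feat)]
    lam = [0.0] * n_feat
    for _ in range(iters):
        q = gibbs(lam, features)
        # Gradient of the log likelihood: sample average minus model mean.
        grad = [sample_avg[j] - expectation(q, features, j)
                for j in range(n_feat)]
        lam = [soft_threshold(lam[j] + step * grad[j], step * beta[j])
               for j in range(n_feat)]
    return lam, gibbs(lam, features), sample_avg


# Toy map (hypothetical): temperature rises with column, precipitation with row.
SIDE = 5
cells = [(r, c) for r in range(SIDE) for c in range(SIDE)]
features = [(c / (SIDE - 1), r / (SIDE - 1)) for r, c in cells]
occ_cells = [(2, 4), (2, 3), (1, 4), (3, 3), (2, 4), (2, 2)]  # made-up records
occurrences = [r * SIDE + c for r, c in occ_cells]
beta = [0.05, 0.05]  # assumed regularization widths

lam, q, sample_avg = fit_maxent(features, occurrences, beta)
```

On this toy data the precipitation coefficient should come out as zero, because the sample average of precipitation already lies within β of its mean under the uniform distribution, while the temperature coefficient is positive. The fitted feature means end up within β_j of the sample averages, which is the relaxed constraint described on the Regularization slide.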