WE PROBABLY COULD HAVE MORE FUN TALKING ABOUT THESE TRAFFIC STOPPERS
KSU Monitoring Designs # 1

WHO CLEARLY HAVE THE RIGHT OF WAY! BUT…

DESIGNING MONITORING SURVEYS OVER TIME (PANEL SURVEYS)
POWER, VARIANCE and RELATED TOPICS
  N. Scott Urquhart
  Senior Research Scientist
  Department of Statistics
  Colorado State University
  Fort Collins, CO 80527-1877

OUTLINE
  Anatomy Of Sampling Studies Of Ecological Responses Through Time
    Collaborator = Tony Olsen, EPA, WED
    http://www.oregonstate.edu/instruct/st571/urquhart/anatomy/index.htm
    Urquhart, N.S. (1981). Anatomy of a study. HortScience 16:621-627.
  Elaboration on Survey Designs – GRTS – Work of Don Stevens
  Temporal Designs
    Power to detect trend – joint with Tom Kincaid
    Uses components of variance
  Current work = estimating variance
    Work of Sarah Williams, finishing MS this month

A CONTEXT: “EMAP-TYPE SITUATIONS”
  EMAP = US EPA’S Environmental Monitoring and Assessment Program
    Estimate status, changes, and trends in selected indicators of our nation’s ecological resources on a regional scale with known confidence.
    Estimate status, changes, and trends in the extent and geographic coverage of our nation’s ecological resources on a regional scale with known confidence.
    Describe associations between indicators of anthropogenic stress and indicators of condition.

WHO MUST COMMUNICATE
  Ecologists & Other Biologists
  Statisticians
  Geographers
  Geographic Information Specialists
  Information Managers
  Quality Assurance Personnel
  Managers, at Various Levels

“SAMPLING”: A WORD OF MANY MEANINGS
  A statistician often associates it with survey sampling
  An ecologist may associate it with the selection of local sites or material
  A laboratory scientist may associate it with the selection of material to be analyzed from the material supplied
  Common general meaning, varied specific meanings

THE SPECIAL NEED
  Communication Demands a Distinction Between
    The local process of evaluating a response, and
    The statistical selection of a sampling unit, for example,
      A lake
      A point on a stream
      A point in vegetation
  The terms
    Response design
    Sampling design or survey design
  Can be used to make this distinction

BASIC ROLES
  Survey Design Tells Us Where To Go to Collect Sample Information or Material
  Response Design Tells Us What To Do Once We Get There
  But These Two Components Exist in a Broader Context

AN IMPORTANT DISTINCTION
  Monitoring Strategy
    Conceptual
    Impacted by objectives
    Addressable without regard to the inference strategy
  Inference Strategy
    Places to evaluate the response
    Relation between points evaluated and the population
    I.e., the basis for inference

SAMPLING STUDIES OF ECOLOGICAL RESPONSES THROUGH TIME HAVE
  Monitoring Strategy
    Universe model
    Statistical population
    Domain design
    Response design
    These components exist regardless of the inference strategy
  Inference Strategy
    Survey design
    Temporal design
    Quality assurance design
    These components exist for any monitoring strategy

The UNIVERSE MODEL
  Reality (Universe): Ecological Entity Within a Defined Geographic Area to be Monitored
  Model of the Universe: Development of a monitoring approach requires construction of a model for the universe
  Elements Of The Universe Model: Set of Entities Composing the Entire Universe of Concern

The UNIVERSE MODEL
  Population Description And Its Sampling Require Definition Of the “Units” in the Population
    Discrete units:
      Lakes may be viewed this way
      Individual trees can be viewed this way, too
    Continuous structure in space of some dimension:
      2-SPACE: Forests or Agroecosystems
      1-SPACE: Streams
      3-SPACE: Groundwater

A CONTINUOUS MODEL FOR STREAMS
  [Figure: stream network with segments labeled by Strahler order (first order, second order)]

The STATISTICAL POPULATION
  The Collection of Units (as modeled) Over Some Region of Definition
    Spatial
    Temporal
    Spatial and Temporal
  Population Definition Could Include Features Which Depend on Response Values
    EX: acid sensitive streams at upper elevations

The DOMAIN DESIGN
  Specifies Subpopulations or “Domains” of Special Interest
  May Specify Meaningful Comparisons Between Domains
    Similar to “planned comparisons” in experimental design situations
  Domain design may depend on response values
    EX: Warm Versus Cold Water Lakes

The RESPONSE DESIGN
  The Response Design Specifies
    The process of obtaining a response
      At an individual element (site)
      Of the resource
      During a single monitoring period
  Response: What Will Be Determined on an Element
    Needs to be responsive to the objectives of the monitoring activity

The INFERENCE STRATEGY
  Is The Basis For Scientific Inference
  Provides The Connection Between Objectives and the Monitoring Strategy
  Monitoring Strategy Usually Must Rely On Obtaining Information on a Subset Of All Possible Elements in the Universe
  Specifies Which Elements of the Universe Will Have Responses Determined on Them
  Can Be Based on Either
    Judgment selection of units
      Inferential validity rests on knowledge of the relation between the universe and the units evaluated – Why do a study if you know this much about the population?
    Probability selection of units
      The focus here

The SURVEY DESIGN
  Probability Based Survey Designs are Considered Here
  May Be Somewhat Limited To Sedentary Resources
  Positive Features -- As An Observational Study
    Permit clear statistical inference to well-defined populations
    Measurements often can be made in natural settings, giving greater realism to results

The SURVEY DESIGN - CONTINUED
  Disadvantages
    Limited control over predictor variables
      Restricts causative inference
    Usually will produce inaccessible sampling points
      Good - for inference
      Bad - for logistics

The TEMPORAL DESIGN
  The TEMPORAL DESIGN specifies the pattern of revisits to sites selected by the Survey Design
  Sampled population units are partitioned into one (degenerate case) or more PANELS.
    Each population unit in the same panel has the same temporal pattern of revisits.
    Panel definition could be probabilistic or systematic
  Several temporal designs follow after a brief discussion of the rest of the Anatomy, and a bit on site selection.

QUALITY ASSURANCE DESIGN
  Defines Those Activities Intended to Provide Data of Known Quality:
    Blind duplicates
    Accepted chemical standards, etc.
  Can Provide Valid Estimates of the Variance Of Pure Measurement Error

ON SITE SELECTION
  Systematically Selected Sites
    Good for means & totals, but do not support a design-based estimate of variance
    Probably OK for large areas like national forests
  Systematic designs can systematically miss things that have a natural layout.
    EX: Triangular grid (deliberately skewed) in early EMAP got fouled up with
      – Coastline in the Northeast
      – The canal network in Florida
      – Lakes east of the Cascade Mountain Range in Oregon
  How to select spatially balanced, but random sites?

GENERALIZED RANDOM TESSELLATION STRATIFIED (GRTS) DESIGN
  Due to Don Stevens – see references
  Allows
    A continuous population model
    Variable density sampling by defined areas
  Accommodates
    An “imperfect frame” = reality
    Sequential addition of points while maintaining spatial balance
    Differing measurements
      Lots of points for inexpensive measures
      A subset for more expensive measures
      A further subset for very expensive measures
  Implemented in Southern California Bight

GENERALIZED RANDOM TESSELLATION STRATIFIED (GRTS) DESIGN
  Two GIS-based implementations
    EMAP
      R code operates on ARC “Shape” files, and returns points there
      Begin at http://www.epa.gov/nheerl/arm/
      http://www.epa.gov/nheerl/arm/designpages/monitdesign/monitoring_design_info.htm
      http://www.epa.gov/nheerl/arm/documents/design_doc/psurvey.design_2.2.1.zip
    STARMAP – Dave Theobald
      RRQRR operates completely in ArcGIS
      http://www.nrel.colostate.edu/projects/starmap/rrqrr_index.htm
  Both Allow Variable (spatial) Sampling Rates
    Generally much better than stratification (We can talk about this more if you want)

THE FOLLOWING MATERIAL WAS ADAPTED FROM
  Urquhart, N.S. and T.M. Kincaid (1999). Designs for detecting trend from repeated surveys of ecological resources. Journal of Agricultural, Biological and Environmental Statistics 4: 404-414.
  Initially presented at the invited conference Environmental Monitoring Surveys Over Time, held at the University of Washington, Seattle, in 1998

MOTIVATING SITUATION
  In 1986 Oregon Department of Fish and Wildlife Sought a “One Time” Probability Sampling Design To Survey Coastal Salmon.
  They Used It In 1990.
    It showed earlier estimates of salmon returns to spawn to have been grossly overstated.
    Consequence: continue to repeat an available design.
  How Good Is The Repeated Use Of Such a Design For Estimating Trend?

CONCLUSIONS
  General: Power for Trend Detection
    Planned revisits are far superior to obtaining revisits from random “hits”
  Year Variance: Power Deteriorates Fast as $\sigma^2_{YEAR}$ Increases
  Site Variance:
    No problem with revisit designs.
    Without revisits it increases residual variance.
  Sampling Rate: Power Increases with Sampling Rate (No surprise!)

EVALUATION CONTEXT
  General Perspective
    Finite population sampling
      But model assisted
    A generalization of the “error analysis” perspective of samplers
      But recognizing realities of natural resource sampling
  Specific Perspective
    Finite population, such as one of stream segments.
    Response exists continuously in time, or at least for recurring blocks of time.
    Take independent samples at different points in time (during an “index window”)

EVALUATION CONTEXT (CONTINUED)
  Model:
    Sites (or stream segments) = a random effect
    Years = a random effect, but may contain trend
    Residual = a random effect
      Specific evaluation time
      Variation introduced by collection protocol
      Crew effect, if present – (often present for large surveys)
      “Measurement error” - broadly interpreted

PANEL PLANS = “TEMPORAL DESIGNS”
  Sampled Population Units are Partitioned into One (Degenerate Case) or More Panels
    Each population unit in the same panel has the same temporal pattern of revisits.
    Panel definition could be probabilistic or systematic
  Specific Plans
    Always revisit
    Never revisit repeated surveys
    Random revisits and other plans

TEMPORAL DESIGN #1: ALWAYS REVISIT = ONE PANEL
(This is Wayne Fuller’s “PURE PANEL”)
  TIME PERIOD (ex: YEARS)
  PANEL   1  2  3  4  5  6  7  8  9 10 11 12 13 ...
    1     X  X  X  X  X  X  X  X  X  X  X  X  X

TEMPORAL DESIGN #2: NEVER REVISIT = NEW PANEL EACH YEAR
(INDEPENDENT SURVEYS IN A LARGE POPULATION)
  TIME PERIOD (ex: YEARS)
  PANEL   1  2  3  4  5  6  7  8  9 10 11 12 13 ...
    1     X
    2        X
    3           X
    4              X
    5                 X
    6                    X
    7                       X
    8                          X
    9                             X

TEMPORAL DESIGN #3: ROTATING PANEL like NASS
  TIME PERIOD (ex: YEARS)
  PANEL   1  2  3  4  5  6  7  8  9 10 11 12 13 ...
    1     X  X  X  X  X
    2        X  X  X  X  X
    3           X  X  X  X  X
    4              X  X  X  X  X
    5                 X  X  X  X  X
    6                    X  X  X  X  X
    7                       X  X  X  X  X
    8                          X  X  X  X  X
    9                             X  X  X  X  X

TEMPORAL DESIGN #3: ROTATING PANEL
  A Rotating Panel Design Is The Temporal Design Used By The National Agricultural Statistics Service (US - “NASS”)
  This Temporal Design Is “Connected” In The Experimental Design Sense
  It is fairly well suited for estimating “status,” but not particularly powerful for detecting trend over intermediate time spans

TEMPORAL DESIGN #4: SERIALLY ALTERNATING
(ORIGINAL EMAP)
  TIME PERIOD (ex: YEARS)
  PANEL   1  2  3  4  5  6  7  8  9 10 11 12 13 ...
    1     X           X           X           X
    2        X           X           X
    3           X           X           X
    4              X           X           X
  This Temporal Design Is “Unconnected” in the Experimental Design Sense.

TEMPORAL DESIGN #5: AUGMENTED SERIALLY ALTERNATING
(CURRENTLY USED BY EMAP FOR SURFACE WATERS)
  [Table: revisit pattern for panels 1, 2, 3, 4, 1A, 1B, …, 2A, … over time periods (ex: years) 1 to 13 ...; the placement of the X marks is not recoverable from this transcript]
  This Temporal Design Is “Connected” in the Experimental Design Sense.
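
The revisit patterns of designs #1 through #4 can be written down as panel-by-year indicator matrices, which also makes the “connected” versus “unconnected” remark checkable. What follows is only a minimal sketch, not code from the talk: the function names, the five-year run length used for the rotating panel, and the four-year cycle are illustrative assumptions, and “connected” is taken here to mean that the panels and years form a single linked block when each visit is treated as a link. Design #5 is left out because its layout is not recoverable from the slide text.

```python
import numpy as np

def always_revisit(n_years):
    """Design #1: a single panel visited in every year."""
    return np.ones((1, n_years), dtype=bool)

def never_revisit(n_years):
    """Design #2: a new panel each year, each visited exactly once."""
    return np.eye(n_years, dtype=bool)

def rotating_panel(n_years, n_panels, run_length):
    """Design #3 (assumed layout): panel p enters in year p and is
    visited for run_length consecutive years."""
    visits = np.zeros((n_panels, n_years), dtype=bool)
    for p in range(n_panels):
        visits[p, p:p + run_length] = True
    return visits

def serially_alternating(n_years, cycle):
    """Design #4: panel p is visited in years p, p + cycle, p + 2*cycle, ..."""
    visits = np.zeros((cycle, n_years), dtype=bool)
    for p in range(cycle):
        visits[p, p::cycle] = True
    return visits

def is_connected(visits):
    """True if every panel and every year can be reached from panel 0
    by stepping along visited (panel, year) cells."""
    n_panels, n_years = visits.shape
    seen_panels, seen_years = {0}, set()
    frontier = [("panel", 0)]
    while frontier:
        kind, k = frontier.pop()
        if kind == "panel":
            for j in np.flatnonzero(visits[k]):
                if int(j) not in seen_years:
                    seen_years.add(int(j))
                    frontier.append(("year", int(j)))
        else:
            for p in np.flatnonzero(visits[:, k]):
                if int(p) not in seen_panels:
                    seen_panels.add(int(p))
                    frontier.append(("panel", int(p)))
    return len(seen_panels) == n_panels and len(seen_years) == n_years

def show(visits):
    """Render a visit matrix as rows of X (visited) and . (not visited)."""
    return "\n".join(
        f"{p:>2} " + " ".join("X" if v else "." for v in row)
        for p, row in enumerate(visits, start=1)
    )

if __name__ == "__main__":
    for name, design in [
        ("always revisit", always_revisit(13)),
        ("never revisit", never_revisit(13)),
        ("rotating panel", rotating_panel(13, 9, 5)),
        ("serially alternating", serially_alternating(13, 4)),
    ]:
        print(name, "- connected:", is_connected(design))
        print(show(design))
```

Under these assumptions the rotating panel comes out connected and the serially alternating design unconnected, in line with the statements on the slides.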

TEMPORAL DESIGN #6: RANDOM PANELS
                           NUMBERS OF OCCURRENCES
               YEAR           N = 240               N = 600
  PANEL     1  2  3 …   SAMPLE 1  SAMPLE 2    SAMPLE 1  SAMPLE 2
    1       X               37        36          46        48
    2          X            38        35          46        48
    3             X         35        34          46        49
    4       X  X             9         9           6         6
    5       X     X         12        10           6         5
    6          X  X         11        11           6         5
    7       X  X  X          2         5           2         1
  NO VISIT                  96       100         442       438

STATISTICAL MODEL
  Consider A Finite Population Of Sites $\{S_1, S_2, \ldots, S_N\}$
  and a Time Series Of Response Values At Each Site: $\{Y_1(t), Y_2(t), \ldots, Y_N(t)\}$ and their average: $\bar{Y}(t)$
    A finite population of time series
  Time is continuous, but suppose
    Only a sample can be observed in any year, and
    Only during an index window of, say, 10% of a year

STATISTICAL MODEL -- II
  AGAIN CONSIDER THE UNDERLYING TIME SERIES DURING AN INDEX WINDOW
    $\{Y_1(t), Y_2(t), \ldots, Y_N(t)\}$ and their averages: $\bar{Y}_i(\cdot)$, $\bar{Y}(t)$, and $\bar{Y}(\cdot)$.
    $\sigma^2_{SITE} = \mathrm{var}\{\bar{Y}_i(\cdot)\}$
    $\sigma^2_{YEAR} = \mathrm{var}\{\bar{Y}(t)\}$
    $\sigma^2_{RESIDUAL} = \mathrm{var}\{Y_i(t) - \bar{Y}_i(\cdot) - \bar{Y}(t) + \bar{Y}(\cdot)\}$

PART OF A TIME SERIES DURING AN INDEX WINDOW
  [Figure: response values (vertical axis) against years 3.4 to 3.7 (horizontal axis), with the spread of the values labeled $\sigma^2_{RESIDUAL}$]

STATISTICAL MODEL -- III
  $\{Y_i(t)\} \rightarrow \{Y_{ij}\}$, where $i$ indexes sites and $j$ indexes “years”
  $Y_{ij} = \bar{Y}_{\cdot\cdot} + (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}) + (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}) + (Y_{ij} - \bar{Y}_{i\cdot} - \bar{Y}_{\cdot j} + \bar{Y}_{\cdot\cdot}) = \bar{Y}_{\cdot\cdot} + S_i + T_j + E_{ij}$
  and $S_i \sim (0, \sigma^2_{SITE})$, $T_j \sim (0, \sigma^2_{YEAR})$, and $E_{ij} \sim (0, \sigma^2_{RESIDUAL})$, with these random variables otherwise uncorrelated.

STATISTICAL MODEL -- IV
  If $p$ Indexes Panels, Then
    Sites are nested in panels: $p(i)$, and
    Years of visit are indicated by panel with $n_{pj} > 0$ or $n_{pj} = 0$ for panels visited or not visited in year $j$
  The vector of cell means (of “visited” cells) has a covariance matrix $\Sigma$:
    $\mathrm{cov}(\bar{Y}_{pj}) = \Sigma(\sigma^2_{SITE}, \sigma^2_{YEAR}, \sigma^2_{RESIDUAL}, n_{pj})$

STATISTICAL MODEL -- V
  Now Let X Denote a Regressor Matrix Containing a Column Of 1’s and a Column of the Numbers of the Time Periods Corresponding to the Filled Cells.
  The Second Elements of
    $\hat{\beta} = (X'\Sigma^{-1}X)^{-1} X'\Sigma^{-1}\bar{Y}$ and $\mathrm{cov}(\hat{\beta}) = (X'\Sigma^{-1}X)^{-1}$
  Contain an Estimate Of Trend and (as the square root of the second diagonal element) its Standard Error.

TOWARD POWER
  Ability of a Panel Plan to Detect Trend Can Be Expressed As Power.
  We Will Evaluate Power in Terms of Ratios of Variance Components:
    $\sigma^2_{SITE} / \sigma^2_{RESIDUAL}$ and $\sigma^2_{YEAR} / \sigma^2_{RESIDUAL}$
  and of the trend slope relative to $\sigma_{RESIDUAL}$, so approximately, $\hat{\beta} \sim N(\beta, \sigma^2)$

A SIMULATION STUDY TO MAKE POWER COMPARISONS
  $\sigma^2_{SITES} / \sigma^2_{RESIDUAL}$ = 0, 1.875, 2.5
  $\sigma^2_{YEARS} / \sigma^2_{RESIDUAL}$ = 0, 0.075, 0.15, 0.3
  n = 60
  N = 60, 240, 600, 1200, 10,000
    ==> Sampling rates of 100%, 25%, 10%, 5%, ~ 0%

POWER FOR DETECTING TREND: SAMPLING A FINITE POPULATION OF SIZE N
  [Figure series: power for trend (0 to 1) against time in years (0 to 20), all with $\sigma^2_{SITES}$ = 1.875]
    Build 1: $\sigma^2_{YEARS}$ = 0.000; labels: ALWAYS REVISIT, or EMAP-LIKE; N = 60, n = 60
    Build 2: $\sigma^2_{YEARS}$ = 0.000, 0.075, 0.15, 0.30; labels: ALWAYS REVISIT, or EMAP-LIKE; N = 60, n = 60
    Build 3: $\sigma^2_{YEARS}$ = 0.000; labels: ALWAYS REVISIT, or EMAP-LIKE; NEVER REVISIT; N = 60, n = 60; N = 10,000, n = 60
    Build 4: $\sigma^2_{YEARS}$ = 0.000; labels: ALWAYS REVISIT, or EMAP-LIKE; RANDOM REVISIT; NEVER REVISIT; N = 60, n = 60; N = 600, n = 60; N = 10,000, n = 60

POWER FOR DETECTING TREND: AS A FUNCTION OF TEMPORAL DESIGN
  [Figure: power for trend (0 to 1) against time in years (0 to 20), with curves labeled by temporal design number: 4 & 5, 1, 2, and 3 (ROTATING PANEL)]

CONCLUSIONS
  General: Power for Trend Detection
    Planned revisits are far superior to obtaining revisits from random “hits”
  Year Variance: Power Deteriorates Fast as $\sigma^2_{YEAR}$ Increases
  Site Variance:
    No problem with revisit designs.
    Without revisits it increases residual variance.
  Sampling Rate: Power Increases with Sampling Rate (No surprise!)

CURRENT WORK
  Stevens D.L. Jr and A.R. Olsen (2003). Variance estimation for spatially balanced samples of environmental resources. Environmetrics 14: 593-610.
    Proposed a local estimator for variance.
  I have been using some variance component estimators.
    How do these two approaches relate?
    Should one be used rather than the other?
  MS Student – Sarah Williams
    Use local estimator for things like status measures
      Because it includes some site variance
    Use components of variance for trend studies
      Revisits to sites remove most of the effect of that component
  Currently investigating variance component of trend
    And its impact on trend detection

FUNDING ACKNOWLEDGEMENT
  The work reported here today was developed under the STAR Research Assistance Agreement CR-829095 awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University.
  This presentation has not been formally reviewed by EPA. The views expressed here are solely those of the presenter and STARMAP, the Program he represented. EPA does not endorse any products or commercial services mentioned in this presentation.
  This research is funded by U.S. EPA – Science To Achieve Results (STAR) Program Cooperative Agreement # CR - 829095

DISTRIBUTION OF SIMULATED POWER: 10 YEARS
  SITE VARIANCE = 1.875; YEAR VARIANCE: 0.30, 0.10, 0.075
  [Figure: histograms of simulated power (horizontal axis: Power; vertical axis: Percent) for sampling rates 25% (N = 240), 10% (N = 600), and 5% (N = 1,200)]

DISTRIBUTION OF SIMULATED POWER: 20 YEARS
  SITE VARIANCE = 1.875; YEAR VARIANCE: 0.30, 0.10, 0.075
  [Figure: histograms of simulated power (horizontal axis: Power; vertical axis: Percent) for sampling rates 25% (N = 240), 10% (N = 600), and 5% (N = 1,200)]

DISTRIBUTION OF SIMULATED POWER: 10 YEARS
  SITE VARIANCE = 2.50; YEAR VARIANCE: 0.30, 0.10, 0.075
  [Figure: histograms of simulated power (horizontal axis: Power; vertical axis: Percent) for sampling rates 25% (N = 240), 10% (N = 600), and 5% (N = 1,200)]

DISTRIBUTION OF SIMULATED POWER: 20 YEARS
  SITE VARIANCE = 2.50; YEAR VARIANCE: 0.30, 0.10, 0.075
  [Figure: histograms of simulated power (horizontal axis: Power; vertical axis: Percent) for sampling rates 25% (N = 240), 10% (N = 600), and 5% (N = 1,200)]
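
As a companion to the STATISTICAL MODEL and TOWARD POWER slides, the generalized least squares calculation can be sketched numerically: build the covariance matrix of the visited panel-by-year cell means from the three variance components, form the regressor matrix of 1's and time periods, and read the trend standard error off the second diagonal element of the inverse of X' Σ⁻¹ X. This is a rough sketch under assumptions that are not in the talk: an effectively infinite population (no finite population correction), the same number of sites in every panel, known variance components, a two-sided normal-theory test at α = 0.05, and a hypothetical slope of 0.03. All function names are illustrative, and the sketch does not attempt to reproduce the simulated power values reported in the slides.

```python
import numpy as np
from statistics import NormalDist

def cell_mean_covariance(visits, sites_per_panel, var_site, var_year, var_resid):
    """Covariance matrix of the visited panel-by-year cell means.

    visits: boolean array (panels x years); True marks a visited cell.
    Assumes the same sites are measured on every visit to a panel and
    ignores finite population corrections (sketch-level assumptions).
    Returns the covariance matrix and the time period of each cell.
    """
    cells = [(p, j) for p in range(visits.shape[0])
             for j in range(visits.shape[1]) if visits[p, j]]
    sigma = np.zeros((len(cells), len(cells)))
    for a, (p, j) in enumerate(cells):
        for b, (q, k) in enumerate(cells):
            if p == q:                      # same panel: shared site effects
                sigma[a, b] += var_site / sites_per_panel
            if j == k:                      # same year: shared year effect
                sigma[a, b] += var_year
            if p == q and j == k:           # same cell: residual variation
                sigma[a, b] += var_resid / sites_per_panel
    years = np.array([j + 1 for _, j in cells], dtype=float)
    return sigma, years

def trend_standard_error(visits, sites_per_panel, var_site, var_year, var_resid):
    """Square root of the second diagonal element of (X' Sigma^-1 X)^-1."""
    sigma, years = cell_mean_covariance(
        visits, sites_per_panel, var_site, var_year, var_resid)
    X = np.column_stack([np.ones_like(years), years])
    cov_beta = np.linalg.inv(X.T @ np.linalg.solve(sigma, X))
    return float(np.sqrt(cov_beta[1, 1]))

def power_for_trend(slope, se, alpha=0.05):
    """Power of a two-sided normal-theory test of zero trend."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(-z + slope / se) + nd.cdf(-z - slope / se)

if __name__ == "__main__":
    n_years, n_sites = 10, 60
    always = np.ones((1, n_years), dtype=bool)   # one panel, visited every year
    never = np.eye(n_years, dtype=bool)          # a new panel each year
    slope = 0.03                                 # hypothetical trend per year
    for name, design in [("always revisit", always), ("never revisit", never)]:
        se = trend_standard_error(design, n_sites, 1.875, 0.0, 1.0)
        print(name, "se =", round(se, 4),
              "power =", round(power_for_trend(slope, se), 3))
```

In this simplified setting the site component drops out of the trend standard error when the same sites are always revisited, while with new sites each year it is added to the residual variance, which matches the direction of the stated conclusions about site variance. Year variance enters both designs directly, so raising it lowers power in either case.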