Wageningen 2004

USES OF POWER IN DESIGNING LONG-TERM ENVIRONMENTAL SURVEYS

N. Scott Urquhart
Department of Statistics
Colorado State University
Fort Collins, CO 80523-1877

OUTLINE FOR TONIGHT
- Long-Term Environmental Surveys
  - Agencies involved
  - Sorts of Summaries of Interest
- Sources of Variation: Major ones
- A Statistical Model
  - Superimposed on an Adapted Classical Sampling Model
- Calculation of Power Using this Model
- Illustrations
  - General
  - Specific
- Generalizations, as Time Allows

LONG-TERM ENVIRONMENTAL SURVEYS
Objective: To
- Establish the Current Status
- Detect Long-Term Trends
- Evaluate "Extent" of Various Classes
Of the Resource(s) of Interest
- Usually Ecological or Living Resources
Agencies = Who
- US Environmental Protection Agency (EPA)*
- States and Tribes, and Local Jurisdictions
  - Response to Legislation Like the Clean Water Act
- Forest Service: "Forest Health"
- National Park Service*
- Soil Conservation Service (not the current name)
- National Marine Fisheries Service (also not the current name)
- National Wetlands Inventory

RESPONSES of INTEREST
EPA
- Variety of Chemical Measures of Water Quality
  - Nitrogen to Heavy Metals to Pesticides
  - Acid Neutralizing Capacity (ANC)
    - Important in Evaluating the Effect of "Acid Rain"
- Composition of "Bugs" in the Aquatic Community
  - Thought to Contain Better Info on Total Effects than Individual Chemicals
- Fish Populations: Composition, not size
- Clean Water Act Includes Reporting on Temperature Pollution

RESPONSES of INTEREST (continued)
National Park Service (e.g., Olympic NP in WA)
- Vegetation
- Bird Populations
  - Composition
  - Size of Various Species
- Streams/Rivers
  - Fish Populations
  - Macroinvertebrate Communities
  - Extent of Intermittent Streams
- Health of Glaciers
  - Extent: Shrinking with Global Warming?
  - Composition

RESPONSES of INTEREST (continued II)
Grand Canyon National Park
- Erosion Around Archeological Resources
- Near-river Terrestrial Environment (GCMRC)

SPATIAL EXTENT
- Generally Large Areas
  - This is the Way Congress Writes Laws
- Regions can be very large
  - 12 Western States: ND, SD, MT, WY, CO, ID, UT, NV, AZ, WA, OR, CA
  - Mid-Atlantic Highlands: parts of PA, VA, WV, DE, MD
- Individual States
- Lands of Several Related Tribes, or Even Only One
- Groups of National Parks
- Groups of Sanitation Districts, or even Individual Sanitation Districts*

SUMMARIES of INTEREST
- Extent by Classes
  - Track Changes Between Classes
  - National Wetlands Inventory
    - Major focus
    - Has Very Good Graphic Depiction of Class Changes
- "Status"
  - Often is summarized as an Estimated Cumulative Distribution Function (cdf)
  - Poses some Interesting Statistical Inference Problems Due to
    - Variable Probability Sampling: Almost Always Needed
    - Spatially Continuous Resources: No List Can Exist

EXAMPLE OF STATUS, SUMMARIZED BY A cdf
[Figure: example cdf]

ESTIMATED CUMULATIVE DISTRIBUTION FUNCTION OF SECCHI DEPTH, EMAP AND "DIP-IN"
[Figure: estimated cdfs of Secchi depth]

SUMMARIES of INTEREST (continued)
Trends
- Directional Changes in Responses
- Reality:
  - Detection of Short-Term Cycles is Beyond the Resources for the Foreseeable Future
  - Great Big Changes Don't Require Surveys
- So Interest Lies in Modest-Sized Long-Term Changes in One Direction
  - This means Changes on the Scale of 1% to 2% Per Year
- Usually a Trend for a Region
  - Regional Summaries of Individual Site Trends
  - Sometimes how trend varies in relation to other things

IMPORTANT COMPONENTS OF VARIANCE
- POPULATION VARIANCE: $\sigma^2_{LAKE}$
- YEAR VARIANCE: $\sigma^2_{YEAR}$
- RESIDUAL VARIANCE: $\sigma^2_{RESIDUAL}$

IMPORTANT COMPONENTS OF VARIANCE (CONTINUED)
POPULATION VARIANCE ($\sigma^2_{LAKE}$):
- VARIATION AMONG VALUES OF AN INDICATOR (RESPONSE) ACROSS ALL LAKES IN A REGIONAL POPULATION OR SUBPOPULATION

IMPORTANT COMPONENTS OF VARIANCE (CONTINUED II)
YEAR VARIANCE ($\sigma^2_{YEAR}$):
- CONCORDANT VARIATION AMONG VALUES OF AN INDICATOR (RESPONSE) ACROSS YEARS FOR ALL LAKES IN A REGIONAL POPULATION OR SUBPOPULATION
- NOT VARIATION IN AN INDICATOR ACROSS YEARS AT A LAKE
- DETRENDED REMAINDER, IF TREND IS PRESENT
- EFFECTIVELY THE DEVIATION AWAY FROM THE TREND LINE (OR OTHER CURVE)

IMPORTANT COMPONENTS OF VARIANCE (CONTINUED III)
RESIDUAL COMPONENT OF VARIANCE ($\sigma^2_{RESIDUAL}$) HAS SEVERAL SUBCOMPONENTS
- YEAR*LAKE INTERACTION
  - THIS CONTAINS MOST OF WHAT MOST ECOLOGISTS WOULD CALL YEAR-TO-YEAR VARIATION, I.E., THE LAKE-SPECIFIC PART
- INDEX VARIATION
  - MEASUREMENT ERROR
  - CREW-TO-CREW VARIATION
  - LOCAL SPATIAL = PROTOCOL
  - SHORT TERM TEMPORAL

BIOLOGICAL INDICATORS HAVE SOMEWHAT MORE VARIABILITY THAN PHYSICAL INDICATORS, BUT THIS VARIES, TOO
Subsequent slides show the relative amount of variability
- Ordered by the amount of residual variability: least to most (aquatic responses)
  - Acid Neutralizing Capacity
  - Ln(Conductance)
  - Ln(Chloride)
  - pH (Closed system)
  - Secchi Depth
  - Ln(Total Nitrogen)
  - Ln(Total Phosphorus)
  - Ln(Chlorophyll A)
  - Ln(# zooplankton taxa)
  - Ln(# rotifer taxa)
  - Maximum Temperature
- And others, both aquatic and terrestrial

COMPOSITION OF TOTAL VARIANCE
[Figure: proportion of variance (0.00 to 1.00) split into the LAKE, YEAR, and RESIDUAL components for each indicator: Acid Neutralizing Capacity, Ln(Conductance), Ln(Chloride), pH (Closed system), Secchi Depth, Ln(Total Nitrogen), Ln(Total Phosphorus), Ln(Chlorophyll A), Ln(# zooplankton taxa), Ln(# rotifer taxa), Maximum Temperature]

SOURCE OF COMPONENTS OF VARIANCE FROM GRAND CANYON
- Grand Canyon Monitoring and Research Center
  - Effects of Glen Canyon Dam on the Near-River Habitat in the Grand Canyon
  - At Various Heights Above the River
    - Height Is Measured as the Height of the River's Water at Various Flow Rates
    - E.g.: 15K cfs, 25K cfs, 35K cfs, 45K cfs & 60K cfs
- Using First Two Years' Data
  - Mike Kearsley, UNA
  - Design = Spatially Balanced
    - With about 1/3 revisited

COMPOSITION OF TOTAL VARIANCE: GRAND CANYON, NEAR RIVER VEGETATION
[Figure: proportion of variance (0.00 to 1.00) split into the SITE, YEAR, and RESIDUAL components for Richness at 60K, 45K, 35K, 25K, and 15K cfs and for Veg at 60K, 45K, 35K, and 25K cfs]

ALL VARIABILITY IS OF INTEREST
- The Site Component of Variance is One of the Major Descriptors of the Regional Population
- The Year Component of Variance Often is Small, too Small to Estimate. If Present, it is a Major Enemy for Detecting Trend Over Time.
  - If it has even a moderate size, "sample size" reverts to the number of years.
  - In this case, the number of visits and/or number of sites has no practical effect.

ALL VARIABILITY IS OF INTEREST (CONTINUED)
- Residual Variance Characterizes the Inherent Variation in the Response or Indicator.
- But Some of its Subcomponents May Contain Useful Management Information
  - CREW EFFECTS ===> training
  - VISIT EFFECTS ===> need to reexamine definition of index (time) window or evaluation protocol
  - MEASUREMENT ERROR ===> work on laboratory/measurement problems

DESIGN TRADE-OFFS: TREND vs STATUS
- How do we Detect Trend in Spite of All of This Variation?
- Recall Two Old Statistical "Friends":
  - Variance of a mean, and
  - Blocking

DESIGN TRADE-OFFS: TREND vs STATUS (CONTINUED)
VARIANCE OF A MEAN: $\operatorname{var}(\text{mean}) = \sigma^2 / m$,
where m members of the associated population have been randomly selected and their response values averaged.
Here the "mean" is a regional average slope, so $\sigma^2$ refers to the variance of an estimated slope.

DESIGN TRADE-OFFS: TREND vs STATUS (CONTINUED II)
Consequently, $\operatorname{var}(\text{mean}) = \sigma^2 / m$ becomes
$$\operatorname{var}(\text{regional mean slope}) = \frac{1}{m} \cdot \frac{\sigma^2}{\sum_i (t_i - \bar{t})^2}$$
Note that the regional averaging of slopes has the same effect as continuing to monitor at one site for a much longer time period.

DESIGN TRADE-OFFS: TREND vs STATUS (CONTINUED III)
- Now, $\sigma^2$, in total, is large.
  - If we take one regional sample of sites at one time, and another at a subsequent time, the site component of variance is included in $\sigma^2$.
- Enter the concept of blocking, familiar from experimental design.
  - Regard a site like a block
  - Periodically revisit a site
  - The site component of variance vanishes from the variance of a slope.

STATISTICAL MODEL
- CONSIDER A FINITE POPULATION OF SITES $\{S_1, S_2, \ldots, S_N\}$ and
- A TIME SERIES OF RESPONSE VALUES AT EACH SITE: $\{Y_1(t), Y_2(t), \ldots, Y_N(t)\}$ and their average: $\bar{Y}(t)$
  - A FINITE POPULATION OF TIME SERIES
- TIME IS CONTINUOUS, BUT SUPPOSE
  - ONLY A SAMPLE CAN BE OBSERVED IN ANY YEAR, and
  - ONLY DURING AN INDEX WINDOW OF, SAY, 10% OF A YEAR

STATISTICAL MODEL II
AGAIN CONSIDER THE UNDERLYING TIME SERIES DURING AN INDEX WINDOW
$\{Y_1(t), Y_2(t), \ldots, Y_N(t)\}$ and their averages: $\bar{Y}_i(\cdot)$, $\bar{Y}(t)$, and $\bar{Y}(\cdot)$.
$$\sigma^2_{SITE} = \operatorname{var}\{\bar{Y}_i(\cdot)\}, \qquad \sigma^2_{YEAR} = \operatorname{var}\{\bar{Y}(t)\}$$
$$\sigma^2_{RESIDUAL} = \operatorname{var}\{Y_i(t) - \bar{Y}_i(\cdot) - \bar{Y}(t) + \bar{Y}(\cdot)\}$$

STATISTICAL MODEL III
$\{Y_i(t)\} \rightarrow \{Y_{ij}\}$, where i indexes sites and j indexes "years"
$$Y_{ij} = \bar{Y} + (\bar{Y}_{i\cdot} - \bar{Y}) + (\bar{Y}_{\cdot j} - \bar{Y}) + (Y_{ij} - \bar{Y}_{i\cdot} - \bar{Y}_{\cdot j} + \bar{Y}) = \bar{Y} + S_i + T_j + E_{ij}$$
and $S_i \sim (0, \sigma^2_{SITE})$, $T_j \sim (0, \sigma^2_{YEAR})$, and $E_{ij} \sim (0, \sigma^2_{RESIDUAL})$, with these random variables otherwise uncorrelated.

STATISTICAL MODEL IV
IF p INDEXES PANELS, THEN
- Sites are nested in panels: p(i), and
- Years of visit are indicated by panel, with $n_{pj} > 0$ for panels visited in year j and $n_{pj} = 0$ otherwise.
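The $n_{pj}$ bookkeeping can be pictured as a panels-by-years matrix of site counts. The sketch below is a minimal Python illustration, not code from this presentation: the function name `visit_counts` and the example plan (one panel revisited every year, plus a fresh panel drawn each year) are assumptions made only to show the layout.

```python
import numpy as np

def visit_counts(n_years, revisit_sites, new_sites):
    """Hypothetical helper: build n[p, j], the number of sites of panel p visited in year j.

    Row 0 (if present) is a single panel revisited every year; the remaining
    rows are panels visited in exactly one year each.
    """
    rows = []
    if revisit_sites > 0:
        # One panel whose sites are visited in every year.
        rows.append(np.full(n_years, revisit_sites))
    if new_sites > 0:
        # One new panel per year: nonzero only in the year it is drawn.
        rows.extend(new_sites * np.eye(n_years, dtype=int))
    return np.array(rows, dtype=int)

# Example plan: 30 site visits per year, 20 revisited and 10 new (illustrative numbers).
n = visit_counts(n_years=4, revisit_sites=20, new_sites=10)
print(n)               # rows = panels, columns = years
print(n.sum(axis=0))   # total site visits in each year
```

Cells with a zero count are the unvisited cells; only the nonzero cells contribute cell means.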
The vector of cell means (of visited cells) has a covariance matrix $\Sigma$:
$$\operatorname{cov}(\bar{Y}_{pj}) = \Sigma(\sigma^2_{SITE}, \sigma^2_{YEAR}, \sigma^2_{RESIDUAL}, n_{pj})$$

STATISTICAL MODEL V
Now let X denote a regressor matrix containing a column of 1s and a column of the numbers of the time periods corresponding to the filled cells. The second element of
$$\hat{\beta} = (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} \bar{Y}$$
and the second diagonal element of
$$\operatorname{cov}(\hat{\beta}) = (X' \Sigma^{-1} X)^{-1}$$
contain an estimate of the regional trend and its variance.

TOWARD POWER
- Ability of a panel plan to detect trend can be expressed as power.
- We will evaluate power in terms of these ratios of variance components: $\sigma^2_{SITE} / \sigma^2_{RESIDUAL}$ and $\sigma^2_{YEAR} / \sigma^2_{RESIDUAL}$
- Power depends on the ratios of variance components, the panel plan, and on the trend relative to the residual standard deviation, $\beta / \sigma_{RESIDUAL}$; approximately, $\hat{\beta} \sim N(\beta, \sigma^2_{\hat{\beta}})$

NOW PUT IT ALL TOGETHER
- Question: "What kind of temporal design should you use for Northwest National Parks?"
- We'll investigate two (families) of recommended designs.
- All illustrations will be based on 30 site visits per year, a reasonable number given resources.
  - General relations are uninfluenced by number of sites visited per year, but specific performance is.
- We'll use the panel notation Trent McDonald published.
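One way to turn the estimator and its variance into a power calculation is sketched below in Python. It is an illustration under stated assumptions, not the code behind these slides: the function names are hypothetical, every visited cell of a panel is assumed to hold the same number of sites, power uses a two-sided normal approximation at the 5% level, and the variance components and trend size are made-up numbers. The two plans compared are one panel of 30 sites visited every year and a fresh panel of 30 sites each year.

```python
import math
import numpy as np

def cell_mean_cov(n, var_site, var_year, var_resid):
    """Covariance matrix of visited-cell means; n[p, j] = sites of panel p visited in year j."""
    cells = [(p, j) for p in range(n.shape[0]) for j in range(n.shape[1]) if n[p, j] > 0]
    sigma = np.zeros((len(cells), len(cells)))
    for a, (p, j) in enumerate(cells):
        for b, (q, k) in enumerate(cells):
            if p == q:   # same panel means the same sites, so the mean site effect is shared
                sigma[a, b] += var_site / n[p, j]
            if j == k:   # same year: the year effect is shared by every panel
                sigma[a, b] += var_year
            if a == b:   # residual variation averages over the sites in a single cell
                sigma[a, b] += var_resid / n[p, j]
    return cells, sigma

def trend_se(n, var_site, var_year, var_resid):
    """Standard error of the trend: square root of the second diagonal element of (X' S^-1 X)^-1."""
    cells, sigma = cell_mean_cov(n, var_site, var_year, var_resid)
    years = np.array([j for _, j in cells], dtype=float)
    X = np.column_stack([np.ones_like(years), years])   # column of 1s, column of time periods
    cov_beta = np.linalg.inv(X.T @ np.linalg.solve(sigma, X))
    return math.sqrt(cov_beta[1, 1])

def power(beta, se, z_crit=1.959964):
    """Two-sided normal-approximation power; z_crit = 1.959964 corresponds to alpha = 0.05."""
    def phi(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return phi(beta / se - z_crit) + phi(-beta / se - z_crit)

years = 10
always = np.full((1, years), 30)         # one panel of 30 sites, visited every year
never = 30 * np.eye(years, dtype=int)    # a fresh panel of 30 sites each year

# Illustrative (assumed) variance components and trend, not values from the slides.
for label, n in [("revisit every year", always), ("new sites every year", never)]:
    se = trend_se(n, var_site=2.0, var_year=0.0, var_resid=1.0)
    print(label, "se =", round(se, 4), "power =", round(power(0.05, se), 3))
```

Under these assumptions and with no year effect, revisiting the same sites removes the site component from the trend variance, whereas a year effect enters the variance undivided by the number of sites, which is why a sizable year component pushes the effective sample size back to the number of years.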

RECOMMENDATION OF FULLER and BREIDT
- Based on the Natural Resources Inventory (NRI)
  - Iowa State & US Department of Agriculture
  - Oriented toward soil erosion & Changes in land use
- Their recommendation

                                            MATH    RECOMMENDED
  Pure panel  = [1-0] = "Always Revisit"    100%    50%
  Independent = [1-n] = "Never Revisit"       0%    50%

- Evaluation context
  - No trampling effect: remotely sensed data
  - No year effects
  - Administrative reality of potential variation in funding from year to year

TEMPORAL LAYOUT OF [(1-0), (1-n)]

  YEAR    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
  [1-0]   X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X
  [1-n]   X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X

FIRST TEMPORAL DESIGN FAMILY
30 site visits per year

  [1-0]   30  20  10   0    ALWAYS REVISIT
  [1-n]    0  10  20  30    NEVER REVISIT

POWER TO DETECT TREND: FIRST TEMPORAL DESIGN FAMILY, NO YEAR EFFECT
[Figure: power (0 to 1) versus years (0 to 20) for the allocations 30:0, 20:10, 10:20, and 0:30, with curves labeled "Always Revisit" and "Never Revisit"]

POWER TO DETECT TREND: FIRST TEMPORAL DESIGN FAMILY, MODEST (= SOME) YEAR EFFECT
[Figure: power (0 to 1) versus years (0 to 20) for the allocations 30:0, 20:10, 10:20, and 0:30]

POWER TO DETECT TREND: FIRST TEMPORAL DESIGN FAMILY, BIG (= LOTS) YEAR EFFECT
[Figure: power (0 to 1) versus years (0 to 20) for the allocations 30:0, 20:10, 10:20, and 0:30]

SERIALLY ALTERNATING TEMPORAL DESIGN [(1-3)^4], SOMETIMES USED BY EMAP
[Table: visit schedule (X = visited) by year, 1 to 21, for FIA and for the [(1-3)^4] panels]

SERIALLY ALTERNATING TEMPORAL DESIGN [(1-3)^4], SOMETIMES USED BY EMAP
[Table: visit schedule (X = visited) by year, 1 to 11 and onward, for FIA and for the [(1-3)^4] panels]
- Unconnected in an experimental design sense
- Very weak design for estimating year effects, if present

SPLIT PANEL [(1-4)^5, ---]
[Table: visit schedule (X = visited) by year, 1 to 21, for FIA and for the [(1-4)^5] panels]
- AGAIN, Unconnected in an experimental design sense
- Matches better with FIA
- Still a very weak design for estimating year effects, if present

SPLIT PANEL [(1-4)^5, (2-3)^5]
[Table: visit schedule (X = visited) by year, 1 to 21, for FIA, the [(1-4)^5] panels, and the [(2-3)^5] panels]
- This Temporal Design IS connected
- Has three panels which match up with FIA

SECOND TEMPORAL DESIGN FAMILY
30 site visits per year

  [1-4]   30  20  10   0
  [2-3]    0   5  10  15

POWER TO DETECT TREND: SECOND TEMPORAL DESIGN FAMILY, NO YEAR EFFECT
[Figure: power (0 to 1) versus years (0 to 20) for the allocations 30:0, 20:5, 10:10, and 0:15]

POWER TO DETECT TREND: SECOND TEMPORAL DESIGN FAMILY, SOME YEAR EFFECT
[Figure: power (0 to 1) versus years (0 to 20) for the allocations 30:0, 20:5, 10:10, and 0:15]

POWER TO DETECT TREND: SECOND TEMPORAL DESIGN FAMILY, LOTS OF YEAR EFFECT
[Figure: power (0 to 1) versus years (0 to 20) for the allocations 30:0, 20:5, 10:10, and 0:15]

COMPARISON OF POWER TO DETECT TREND
[Figure: grid of power (0 to 1) versus years (0 to 20) plots; rows are temporal designs 1 and 2, columns are year effect NONE, SOME, and LOTS]

POWER TO DETECT TREND: VARYING YEAR EFFECT AND TEMPORAL DESIGN
[Figure: power (0 to 1) versus years (0 to 20) for temporal designs 1 and 2 under year effect NONE, SOME, and LOTS]

STANDARD ERROR OF STATUS: TEMPORAL DESIGN 1, NO YEAR EFFECT
[Figure: standard error of status (0 to 0.5) versus years (0 to 20) for the allocations 30:0, 20:10, 10:20, and 0:30; annotations: "TOTAL OF 30 SITES", "110 SITES VISITED BY YEAR 5", "410 SITES VISITED BY YEAR 20"]

STANDARD ERROR OF STATUS: TEMPORAL DESIGN 2, NO YEAR EFFECT
[Figure: standard error of status (0 to 0.5) versus years (0 to 20) for the allocations 30:0, 20:5, 10:10, and 0:15; annotations: "TOTAL OF 75 SITES", "TOTAL OF 150 SITES"]

GENERALIZATIONS
- Each site can have its own trend
  - These very likely differ
  - How should we approach this reality?
- There is a cdf of trends across the region
- Variation in trends can be partitioned
  - Components are very similar to those used for responses:
    - Years
    - Rivers
    - Sites within rivers

ILLUSTRATION
Stoddard, J.L., Kahl, J.S., Deviney, F.A., DeWalle, D.R., Driscoll, C.T., Herlihy, A.T., Kellogg, J.H., Murdoch, J.R., Webb, J.R., and Webster, K.E. (2003). Response of Surface Water Chemistry to the Clean Air Act Amendments of 1990. EPA/620/R-02/004. US Environmental Protection Agency, Washington, DC.

FUNDING ACKNOWLEDGEMENT
The work reported here today was developed under the STAR Research Assistance Agreement CR-829095 awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This presentation has not been formally reviewed by EPA. The views expressed here are solely those of the presenter and STARMAP, the Program he represents. EPA does not endorse any products or commercial services mentioned in this presentation.

This research is funded by U.S. EPA Science To Achieve Results (STAR) Program, Cooperative Agreement # CR-829095