Some Statistical Issues in Climate Science Research

PSW Climate Sciences Workshop,
Oct. 15-16, 2003, Albany
by
Haiganoush K. Preisler
Environmental Statistics Unit
Common Data Types in Climate Research
1) Time Series: Y(t)
(a) unequispaced series
(most time series methods are for equispaced data)
[Figure: temperature plotted against time, with a row of tick marks along the time axis]
Data from deep sea cores measured at various depths: Y(τ(d))
(b) How to estimate τ, the mapping from depth to time?
(c) Finding relationships between two time series
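For item (b), one simple possibility is sketched below: piecewise-linear interpolation of τ(d) between dated control points. This is only an illustration, not the method used in the talk. The function name, the control depths and the control ages are all hypothetical.

```python
from bisect import bisect_right

def depth_to_time(depth, control_depths, control_ages):
    """Piecewise-linear estimate of tau(d) from dated control points (assumed known)."""
    if not (control_depths[0] <= depth <= control_depths[-1]):
        raise ValueError("depth outside the dated range")
    # Index of the right-hand end of the control interval containing this depth.
    i = min(bisect_right(control_depths, depth), len(control_depths) - 1)
    d0, d1 = control_depths[i - 1], control_depths[i]
    t0, t1 = control_ages[i - 1], control_ages[i]
    return t0 + (t1 - t0) * (depth - d0) / (d1 - d0)

# Hypothetical control points: depth in cm, age in thousands of years.
control_depths = [0.0, 100.0, 250.0]
control_ages = [0.0, 10.0, 40.0]
sample_ages = [depth_to_time(d, control_depths, control_ages)
               for d in (50.0, 100.0, 175.0)]
```

Because the slope of τ changes between control intervals, samples taken at equal depth steps generally end up unequally spaced in time, which ties back to item (a).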
2) Extreme events
Highest Recorded Temperatures

Place                                                  Date            Degrees      Degrees
                                                                       Fahrenheit   Celsius
World (Africa): El Azizia, Libya                       Sept. 13, 1922  136          58
North America (U.S.): Death Valley, Calif.             July 10, 1913   134          57
Asia: Tirat Tsvi, Israel; Dead Sea, Palestine          June 21, 1942   129          54
Australia: Cloncurry, Queensland                       Jan. 16, 1889   128          53
Europe: Seville, Spain                                 Aug. 4, 1881    122          50
South America: Rivadavia, Argentina                    Dec. 11, 1905   120          49
Canada: Midale and Yellow Grass, Saskatchewan, Canada  July 5, 1937    113          45
Oceania: Tuguegarao, Philippines                       April 29, 1912  108          45.6
Persian Gulf (sea-surface)                             Aug. 5, 1924    96           36
Antarctica: Vanda Station, Scott Coast                 Jan. 5, 1974    59           15
South Pole                                             Dec. 27, 1978   7.5          –14

• Most statistical methods are for means.
• Poor prediction of extremes.
• Statistics of extremes
(Generalized extreme value distribution, Pareto, truncated Pareto, etc.)
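As a hedged illustration of the statistics-of-extremes idea, the sketch below fits a Gumbel distribution by the method of moments and computes return levels. The Gumbel is the zero-shape special case of the generalized extreme value family, chosen here only for simplicity. The data are synthetic and all names and parameter values are assumptions.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329

def fit_gumbel_moments(maxima):
    """Method-of-moments estimates (location mu, scale beta) for a Gumbel law."""
    beta = statistics.stdev(maxima) * math.sqrt(6.0) / math.pi
    mu = statistics.fmean(maxima) - EULER_GAMMA * beta
    return mu, beta

def return_level(mu, beta, period):
    """Level exceeded on average once every `period` blocks (e.g. years)."""
    return mu - beta * math.log(-math.log(1.0 - 1.0 / period))

# Synthetic "annual maxima", simulated by inverting the Gumbel CDF.
rng = random.Random(0)
true_mu, true_beta = 100.0, 5.0  # hypothetical values, not from the talk
maxima = [true_mu - true_beta * math.log(-math.log(rng.random()))
          for _ in range(2000)]

mu_hat, beta_hat = fit_gumbel_moments(maxima)
level_10 = return_level(mu_hat, beta_hat, 10)
level_100 = return_level(mu_hat, beta_hat, 100)
```

The return levels depend on the fitted tail rather than on the sample mean, which is why methods built for means tend to predict extremes poorly.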
3) Spatial-temporal data
• Statistical downscaling and interpolation problems
• Spatial-temporal processes - examples
a) Fire occurrence data: N(x,y,t), a point process (= infinite series of random variables)
b) Size or cost of fire: A(x,y,t), a marked point process
c) Height of tree: H(x,y,t), a marked point process
An example from fire prediction/risk
Task:
Characterize relationships (if any) between weather/fuel
variables and probability of fire occurrence
(and large fires)
Goal:
Short and long-term predictions of fire season
severity
• Fire occurrence is modeled as a spatial-temporal point process
Assume a conditional intensity function
λ(x, y, t | θ) = Prob{dN(x, y, t) = 1 | H_t} / (dx dy dt)
dN(x, y, t) = number of fires in (x, x+dx] × (y, y+dy] × (t, t+dt]
H_t = history up to time t
• Discrete Case
N = number of fires in (x, x+∆x] × (y, y+∆y] × (t, t+∆t]
(∆x, ∆y, ∆t) = 1 km × 1 km × 1 day voxel (volume-pixel)
λ = probability of a fire in the 1 km² area at (x, y) on day t
Data
• number of fires in Zone W2
• weather, fuel, topography at ???
• values at nearest weather station?
(elevation, slope, fuel type, temperature, …)
• Work at km x km x day voxel level
• Add estimated probabilities at voxel level to arrive
at estimates for number of fires per zone per week
(per month etc.)
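The aggregation step can be sketched as follows. The expected number of fires in a zone and week is the sum of the voxel-level probabilities, since an expected count is a sum of expected indicators. The record layout, the function name and the values are hypothetical.

```python
from collections import defaultdict

def expected_fires_by_zone_week(voxel_probs):
    """Sum voxel-level fire probabilities into expected counts per (zone, week)."""
    totals = defaultdict(float)
    for zone, day, prob in voxel_probs:
        week = day // 7  # day is counted from 0 at the start of the record
        totals[(zone, week)] += prob
    return dict(totals)

# Hypothetical records: (zone label, day index, estimated fire probability).
# Real voxel probabilities would be far smaller; these are for illustration.
voxel_probs = [
    ("W2", 0, 0.25),
    ("W2", 3, 0.5),
    ("W2", 8, 0.125),
    ("W1", 8, 0.25),
]
expected = expected_fires_by_zone_week(voxel_probs)
```

Grouping by month instead of week only changes how the day index maps to the grouping key.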
Explanatory variables:
1. At all voxels with an observed fire occurrence.
2. At all voxels with no observed fire.
[Figure: Historic weekly number of fires in Oregon Federal lands; axes: Number of fires versus Year]
Oregon:
federal lands, 8 years data = 578,192,400 voxels
15,786 = 0.0027% with fire
Sample (S)
Include all voxels with fire and a sample (π%) of the voxels with no fire.
S: for Oregon 8 years
15,786 voxels with fire
58,094 voxels with no fires (π = .012%)
λk = Prob{N_k = 1 | H_t}
γk = Prob{N_k = 1 | H_t, k ∈ S}
How can γk (a parameter estimable from the sampled data) be related to λk (the parameter of interest)?
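One standard way to relate the two holds under a specific assumption: every fire voxel is kept, and each no-fire voxel is kept independently with probability π, regardless of its covariates. Bayes' rule then gives γk = λk / (λk + π(1 − λk)), that is, logit(γk) = logit(λk) − log(π). The sketch below checks this numerically. The function names and the value of λ are hypothetical, and the slides do not say whether the talk used this correction.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def gamma_from_lambda(lam, pi):
    """P(fire | voxel in sample) when no-fire voxels are kept with probability pi."""
    return lam / (lam + pi * (1.0 - lam))

def lambda_from_gamma(gam, pi):
    """Undo the sampling: logit(lambda) = logit(gamma) + log(pi)."""
    return expit(logit(gam) + math.log(pi))

pi = 0.00012   # sampling fraction quoted for the Oregon sample (.012%)
lam = 0.00005  # hypothetical voxel-level fire probability
gam = gamma_from_lambda(lam, pi)
recovered = lambda_from_gamma(gam, pi)
```

In a logistic regression fitted to the sample S, this amounts to a known offset of −log(π) in the linear predictor.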
Some results from the fire/risk problem
[Figure: Probability of fire occurrence; map of estimated odds ratios relative to 8 year average, axes in Km]
[Figure: Conditional probability of large fire (> 100 acres); map of estimated odds ratios relative to 8 year average, axes in Km]
[Figure: Probability of fire occurrence; Estimated Mutual Information (axis ticks 0.2 to 0.7) by Model: Fuel, DBT, RH, BI, ER, SC, FP, TH, ST, F]
ST: spatial-temporal (i.e., lat, long, elevation, and day in year)
F: full model (lat, long, elevation, day in year, TH, RH, DBT)
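The slides do not say how the mutual information was estimated. As a minimal sketch of the quantity, the code below computes a plug-in estimate of I(X; Y) for two discrete variables, such as a binned covariate and a fire indicator. The binning, the data and the function name are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) * log(p(x,y) / (p(x) p(y))), written with raw counts
        mi += (count / n) * math.log(count * n / (px[x] * py[y]))
    return mi

# Hypothetical data: a binned covariate and a fire indicator per voxel.
rh_bin = ["low", "low", "low", "low", "high", "high", "high", "high"]
fire = [1, 1, 1, 0, 0, 0, 0, 1]
unrelated = [0, 1, 0, 1, 0, 1, 0, 1]

mi_rh = mutual_information(rh_bin, fire)
mi_unrelated = mutual_information(unrelated, fire)
```

A larger value means the variable carries more information about fire occurrence. The estimate is zero when the empirical joint distribution factorizes.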
Summary
• Many ‘interesting’ statistical issues
• Identify areas where further statistical
research is needed
• Collaboration between climate research
scientists and statisticians