Time series applied to volcanic data: A review and application to Fuego volcano.
Rüdiger Escobar Wolf
MTU, September 2010.

Outline:
1. What are time series and where do they come from?
2. In what way are time series useful and what can they tell us?
3. How can we analyze time series and what methods are out there?
4. The Fuego case: different domains and different time scales.
5. The larger context of Fuego

What are time series and where do they come from?
• A variable “varying” in time.
• The domain of the variable: physical continuous magnitudes
[Figure panels: Seismic… velocity; Seismic… RSAM; GOES… Thermal]
• But also discrete or even categorical data

Where do time series come from?
• Measuring and recording the time-varying “variable”.
• Discretization.
• Sampling resolution.
• Reliability & uncertainty.
• How well do we know the variable? E.g. the size of an eruption?

Degradation with time and size.
• Older and smaller events are more difficult to record.
• Dataset or “catalogue” completeness. How far back do I think my dataset is reliable for a given size of event?
• Implications for the statistical properties of the dataset.
[Figure: VEI (1 to 4) versus year, 1500 to 2000]

Measuring and recording the time variable
• Sampling resolution.
• Reliability & uncertainty.
• Aggregating data.

Dating…
• Some geologists are so boring that they end up dating rocks…
• Dating prehistoric events: 40Ar/39Ar, 14C, K-Ar, U series, etc…
• Point vs. interval data, and how to combine them.
[Figure labels: 14C, 14C, 40Ar/39Ar]

The time of historical events
• Chronicles and their interpretation
• Facts, myths and everything in between.

The year of [fifteen hundred] eighty one, on December twenty six, the volcano started to throw fire more than usual, and it was so much what it threw, and with such a fury, the next day of December twenty seven, through a mouth that it has in the highest part, that from the abundant ash that came out, the air became black and thick, such that people couldn’t see each other [in Antigua Guatemala?]…

…that ash reached many leagues from Guatemala, in the province of Xoconusco, where the trees were found to be covered by it…

The next month of January, at the beginning of the year [fifteen hundred] eighty two, on the fourteenth of that month, the same volcano started to throw so much fire, that a great mishap was feared, because in the twenty four hours that the fury lasted, one couldn’t see anything from the volcano but rivers of fire and very large rocks made embers, which came out of the volcano’s mouth and came down with enormous fury and impetus…

That fire caused much damage from the coast to the southeast, where it ruined a pueblo de indios named San Pedro, two leagues from [Antigua] Guatemala, although there were no deaths, because it happened during the day, and prevented by fear, all the indios escaped in time, abandoning their homes…

Ciudad-Real, Antonio de, 1873, Relacion breve y verdadera de algunas cosas de las muchas que sucedieron al Padre Fray Alonso Ponce en las provincias de la Nueva España, siendo comisario general de aquellas partes. Tomo I. Imprenta de la Viuda de Calero. Madrid.

The problem of defining discrete events for continuous variables
• Different approaches:
• Values over threshold
• Local peaks

In what way are time series useful and what can they tell us?
• From very straightforward, e.g. simple trends…
• To very complex, e.g. cyclic behavior, etc.
[Figure panels: Seismic… RSAM; Thermal GOES]

Cyclic behavior at Fuego 2002 – 2007?
[Figure: time axis, 2002 to 2007]

Correlation between time series
• If they share some phenomenological (causal?)
relationship, chances are that they may vary together (co-vary)
• Volcanologists (and other geo-scientists) try this a lot!
[Figure: Seismic and thermal]

Stochastic nature and statistical structure of some time series data
• Some element of randomness but not completely unpredictable
• The question is “what can be predicted or forecasted”?
• Parametric vs. Non-parametric

Probability distributions and fitting of data
• What is the “distribution of the population from which the sampled set comes”?
• How to fit?
• Maximum likelihood estimation (MLE).
[Figure: The recent (2002 – 2007) Fuego dataset]

Assumptions and elegant lies…
• Can we just assume certain things about the distribution of the population? E.g. normality, independence, Poisson process?
• Is this assumption valid/justified? How do we know?
• Stationary vs. Non-stationary time series.

A note on time series and early warning…
• The “ideal” model of increasing risk, acceptable risk threshold and warning/response to that threshold.
• Some “real world details”.

Crisis time path (development) possibilities
[Figure: vertical axis is perceived risk (probability?), with 1 at the top. Upper level: “A lethal eruption WILL happen soon.” Lower level: “A lethal eruption will NOT happen soon.” Curve labels: II (“Tends to…” III), III, IV. Curve descriptions:]
• Peak and decrease curve. This case reflects an initial increase in the perceived risk, followed by a rapid decrease, either due to a decrease of the observed activity, or due to a process of desensitizing.
• Highly concave curve. This case is unlikely to be maintained for a prolonged period of time because of habituation and desensitizing. It usually tends to become case III over time.
• Linear increase curve. This case represents a steady increase in the perceived risk, pointing towards the actual occurrence of the eruption. This is arguably the best case.
• Highly convex curve. This case represents the “sudden” or “surprise” scenario, in which warning and evacuation actions can be severely limited (even impossible) due to the short time available to carry them out. This is arguably the worst case.
[Figure, continued. Curve label: I. Horizontal axis: Time, starting at 0 and ending where “The lethal event HAPPENS”.]
Progressively narrowing time window for evacuation. As the window narrows, the options for action decrease and the evacuation becomes more difficult.

Crisis time path (development) possibilities
[Figure: perceived risk (probability?) from 0 to 1 versus time, between “A lethal eruption will NOT happen soon” and “A lethal eruption WILL happen soon”, ending where “The lethal event HAPPENS”, with the narrowing evacuation window marked.]
In real life, the changes in perceived risk don’t happen as a continuously varying function of time. They tend to happen as jumps or drops (discontinuities) associated with the occurrence of key events and findings (e.g. the initiation of the eruption or the issuing of a warning).

Crisis time path (development) possibilities
[Figure: the same axes and annotations, showing curves I, II, III and IV together.]

How can we analyze time series and what methods are out there?
• Organizing dataset for analysis.
• Database for large, multidimensional datasets… or just a simple table (the simplest database) for small, low dimensionality datasets.

Tools and platforms…
• From the very basic (and limited!): Excel…
• ...to the more complex (and powerful!): Matlab, R, etc…
• Level of automation and available (built-in) tools…

Depending on how big and good your dataset is you can apply different tools.
• Borrowing tools from the signal processing community (electronic and communications engineering, seismologists, etc.):
• Time vs. frequency domain.
• Auto and cross-correlation.
• Fourier and Laplace transformations.

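The signal processing tools listed above can be sketched in a few lines. The following is only a minimal illustration, not the analysis used in this talk: it uses synthetic series (not Fuego data), plain Python instead of Matlab or R, and hypothetical function names (`cross_correlation`, `autocorrelation`, `periodogram`). The series called `seismic` and `thermal` are stand-ins, with `thermal` assumed to be a copy of `seismic` delayed by 3 samples.

```python
import cmath
import math


def demean(x):
    """Subtract the mean so correlations measure co-variation only."""
    m = sum(x) / len(x)
    return [v - m for v in x]


def cross_correlation(x, y, lag):
    """Normalized correlation between x[t] and y[t + lag], for lag >= 0."""
    xd, yd = demean(x), demean(y)
    n = len(xd)
    num = sum(xd[t] * yd[t + lag] for t in range(n - lag))
    den = math.sqrt(sum(v * v for v in xd) * sum(v * v for v in yd))
    return num / den


def autocorrelation(x, lag):
    """Correlation of a series with a lagged copy of itself (time domain)."""
    return cross_correlation(x, x, lag)


def periodogram(x):
    """Squared magnitude of the discrete Fourier transform (frequency domain).

    Returns {k: power} for k = 1 .. n // 2, where k is cycles per record.
    """
    xd = demean(x)
    n = len(xd)
    power = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(xd[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power[k] = abs(coeff) ** 2
    return power


def base(t):
    """Synthetic signal (assumption): a strong 12-sample cycle plus a weak 5-sample one."""
    return math.sin(2 * math.pi * t / 12) + 0.2 * math.cos(2 * math.pi * t / 5)


n = 120
seismic = [base(t) for t in range(n)]      # synthetic stand-in series
thermal = [base(t - 3) for t in range(n)]  # same signal, delayed by 3 samples

# Frequency domain: the strongest periodogram peak gives the dominant period.
power = periodogram(seismic)
dominant_k = max(power, key=power.get)
dominant_period = n / dominant_k

# Time domain: autocorrelation is high at one full period, negative at half a period.
acf_full = autocorrelation(seismic, 12)
acf_half = autocorrelation(seismic, 6)

# Cross-correlation: the lag with the highest correlation estimates the delay.
best_lag = max(range(12), key=lambda lag: cross_correlation(seismic, thermal, lag))

print("dominant period (samples):", dominant_period)
print("autocorrelation at lag 12 and lag 6:", acf_full, acf_half)
print("lag of best seismic/thermal match:", best_lag)
```

With real records, gaps, trends and noise would need to be dealt with before any of these numbers could be interpreted, and a library FFT would normally replace the slow direct sum used here for clarity.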
Focus on cyclic behavior and correlation

Stochastic structure and statistical methods
• Parametric: Choosing a distribution and searching for a “best fit”.
• Hypothesizing on why they fit the distribution or how the fit can be interpreted, for instance in physical terms.
[Figure: The recent (2002 – 2007) Fuego dataset. Annotation: Parameter k = 1.55 > 0. Aging process?]

The Fuego case: Different domains and different time scales.

Summarizing…
• Prehistoric data: very sparse and uncertain…
• goes back to 230 ka, but most relevant for the last 3.5 ka.
• A “fairly good” (detailed and time precise) historical record.
• Some issues with interpretation for assessing the size, explosivity, and other relevant characteristics of the events.
• A much more detailed record of recent (since the ~1960s) activity.

Combined data sources:
• Pre-historic: dating and deposit assessment.
• Historic – older: Accounts from witnesses and scientific reports.
• Historic – recent: INSIVUMEH bulletins, OVFUEGO / INSIVUMEH records of lava flow lengths and number of daily eruptions, GOES and MODIS / MODVOLC thermal data, RSAM from INSIVUMEH and J. Lyons, SE-CONRED bulletins and personal notes.

Some caveats…
• Non-stationary: turning the volcano “on” and “off”.
• Trends and non-homogeneous processes.
• How can we account for that in the time series analysis?

Cyclic behavior at Fuego 2002 – 2007?
[Figure: time axis, 2002 to 2007]

Clustering of eruptions since 1524?
[Figure: VEI (1 to 4) versus year, 1500 to 2000]

Thanks! Questions? Discussion?…