HASSET
A probability event tree tool to evaluate future eruptive scenarios using Bayesian Inference. Presented as a plugin for QGIS.

USER MANUAL

STEFANIA BARTOLINI (1), ROSA SOBRADELO (1,2), JOAN MARTÍ (1)
(1) Group of Volcanology (SIMGEO-UB), CSIC, Institute of Earth Sciences “Jaume Almera”, Barcelona, Spain
(2) Aon Benfield UCL Hazard Centre, Department of Earth Sciences, University College London, London, UK

HASSET 1.0

1. INTRODUCTION
2. INSTALLATION
2.1 PREVIOUS REQUIREMENTS
3. START HASSET
3.1 INPUT DATA
3.2 LOCATION AND SIZE
3.3 DATA SET TOTAL TIME AND PROBABILITY ESTIMATE TIME INTERVAL
3.4 RUN BUTTON
4. PROBABILITY AT EACH NODE
5. PROBABILITY FOR DIFFERENT SCENARIOS
6. MOST LIKELY SCENARIOS
REFERENCES
CONTACTS

1. INTRODUCTION

Event tree structures are among the most useful and necessary tools of modern volcanology for assessing the volcanic hazard of future eruptive scenarios. They evaluate the most relevant sources of uncertainty in estimating the probability of occurrence of a future volcanic event.

An event tree is a tree graph representation of events in the form of nodes and branches. Each node represents a step and contains a set of possible branches (the outcomes for that particular category). The nodes are alternative steps from a general prior event, state, or condition, through increasingly specific subsequent events, to final outcomes. The nodes are independent, and the branches are mutually exclusive and exhaustive; that is, they cannot happen simultaneously and their probabilities sum to one.

The objective is to outline all relevant possible outcomes of volcanic unrest at progressively higher degrees of detail and to assess the hazard of each scenario by estimating its probability of occurrence within a future time interval. HASSET (Hazard Assessment Event Tree) uses this event tree structure (Fig. 1) to make these estimates, based on a statistical methodology called Bayesian Inference.
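
The node and branch structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not HASSET's implementation: the three nodes, the branch names, and all probabilities below are hypothetical placeholders (HASSET itself works with eight nodes), and the scenario probability is assumed to be the product of the probabilities of the branches selected at each node.

```python
import math

# Hypothetical three-node tree; HASSET itself uses eight nodes.
# Node names, branch names, and probabilities are made-up placeholders,
# not HASSET output.
event_tree = {
    "unrest": {"yes": 0.25, "no": 0.75},
    "origin": {"magmatic": 0.5, "geothermal": 0.25,
               "seismic": 0.125, "other": 0.125},
    "outcome": {"eruption": 0.5, "no eruption": 0.5},
}


def check_exhaustive(tree, tol=1e-9):
    """Raise ValueError if the branches of any node do not sum to one."""
    for node, branches in tree.items():
        total = sum(branches.values())
        if not math.isclose(total, 1.0, abs_tol=tol):
            raise ValueError(
                f"branches of node {node!r} sum to {total}, not 1")


def scenario_probability(tree, path):
    """Multiply the probabilities of the branch chosen at each node.

    `path` is a list of (node, branch) pairs. Taking the product is an
    assumption of this sketch about how branch estimates are combined.
    """
    prob = 1.0
    for node, branch in path:
        prob *= tree[node][branch]
    return prob


check_exhaustive(event_tree)
scenario = [("unrest", "yes"), ("origin", "magmatic"),
            ("outcome", "eruption")]
p = scenario_probability(event_tree, scenario)
print(f"P(scenario) = {p:.4f}")
```

Keeping the tree as a plain dictionary makes the "mutually exclusive and exhaustive" condition easy to verify before any scenario is evaluated.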
HASSET accounts for the possibility of flank eruptions, as opposed to only central eruptions; geothermal or seismic unrest, as opposed to only magmatic unrest; felsic or mafic composition, as opposed to no composition; the relevant volcanic hazards as possible outputs of an eruption; and the distance reached by each hazard. HASSET introduces the Delta method to approximate the accuracy of the probability estimates, by constructing a one-standard-deviation variability interval around the expected probability value for each scenario.

Figure 1. HASSET Event Tree Structure

HASSET is presented as a free software package in the form of a plugin for the open source geographic information system Quantum GIS (QGIS). It is a friendly and dynamic graphical user interface (GUI) plugin which, once properly installed following a few easy steps, creates a new option in the QGIS menu bar called “Volcano”, where the HASSET model is installed. Currently, the software has been developed for Mac OS (tested on version 10.7.4 and above) and Linux (tested on Ubuntu 10.10). The version for Windows OS is under development.

HASSET implements the Bayesian Event Tree for a forecasting time interval defined beforehand by the user. The user provides all the volcanological data for the analysis, and HASSET then merges this information using the Bayesian Event Tree approach. To do so, a user-friendly interface guides the user through all the steps:

- Entering all the data for the analysis.
- Computing the probability estimates for each branch in the event tree, and the corresponding variability.
- Computing the total probability estimate for different scenarios.

2. INSTALLATION

The HASSET package is available upon request from the authors, or it can be downloaded from the website of the CSIC Group of Volcanology of Barcelona (http://www.gvb-csic.es), on the “Software & Databases” tab.
2.1 PREVIOUS REQUIREMENTS

- Linux (tested on Ubuntu 10.10 and Ubuntu 12.04) or Mac OS X (tested on Lion and Mavericks) operating systems

- Matplotlib http://matplotlib.org/downloads.html
  Ubuntu 12.04 users: Open Terminal. Then type:
      sudo apt-get install python-matplotlib
  Mac OS X users: If you get the error “Import Error: matplotlib requires pyparsing”, please install pyparsing (https://pypi.python.org/pypi/pyparsing/1.5.7). Open Terminal, go to the folder where pyparsing has been downloaded, and type:
      sudo python setup.py install

- R http://cran.r-project.org
  Ubuntu 12.04 users: Open Terminal. Then type:
      sudo apt-get install r-base-core

- QGIS (http://www.qgis.org/en/site/forusers/download.html). We recommend stable versions 1.7.0, 1.8.0, or 2.0.1.

2.2 WHERE DO I COPY THE “volcano_plugin” FOLDER?

1.a Linux users:
- unrar volcano_plugin.rar in /user/.qgis/python/plugins/*
To show hidden files (any file whose name begins with a “.”, in our case “.qgis” on Ubuntu 10.10 or “.qgis2” on Ubuntu 12.04), open the Home Folder and select Show Hidden Files in the View menu (CTRL+H).

1.b Mac users:
- unrar volcano_plugin.rar in /user/.qgis/python/plugins/*
To show hidden files (any file whose name begins with a “.”, in our case “.qgis”):
a) Open the terminal (found in /Applications/Utilities/)
b) Type the following (without quotation marks) to show hidden files: “defaults write com.apple.finder AppleShowAllFiles -bool true”
c) Hit enter
d) Type the following (without quotation marks) to restart the Finder: “killall Finder”
e) Hit enter
(You can hide the hidden files again by doing the same thing, but switching “true” to “false” in step b. In Mac OS X Leopard, replace “-bool true” with “YES”.)

* If you cannot see the “plugins” folder in /user/.qgis/python/plugins/ (/user/.qgis2/python/plugins/):
- Open QGIS
- In the menu, select Plugins > Manage plugins
- In the search box, type “Zoom to Point” and activate the plugin. The “plugins” folder is then created automatically.

2.
Open QGIS

3. In the menu, select Plugins > Manage plugins > Simulation Models

4. Now you can see Volcano in the menu (Fig. 2) and run HASSET

Figure 2. Volcano in QGIS menu

3. START HASSET

This is the main GUI (Graphical User Interface) of HASSET (Fig. 3).

Figure 3. HASSET main form

The main form is composed of:
- The selection of input parameters, either through the “IMPORT DATA (.csv file)” button or manually, and their display in a table representing the event tree structure
- The entry of the “Dataset Time Window” and “Probability estimate Time Windows”
- The option to save the probability results as a .csv file
- The “RUN” button
- “INFO HASSET”, for an overview of HASSET and the reference paper
- “UPDATE RESULTS”, a button that is activated after the first simulation and updates the results if the user changes any values of past data, prior weight, or data weight

3.1 INPUT DATA

First of all, the user must select the .csv (comma-separated values) file from which the input data for Past Data, Priori Weight, and Data Weight are read; the selection is made from a drop-down menu (Fig. 4). The “BROWSER .csv file” button opens a file browser showing the .csv files on your computer.

Figure 4. Import .csv file

The format of the input data for the model is the following (Fig. 5).

Figure 5. The format of input data

3.2 LOCATION AND SIZE

The user has to define the possible locations of the imminent eruption as five different areas. In the example, we have named them Central, North, South, East, and West (Fig. 6). The area covered by each location will vary for each volcanic system according to topography, surroundings, and/or important topographic barriers, which may impose a different level of hazard and risk depending on which side of the volcano the eruption occurs.

Figure 6. Location node

As with the location node, the user must express the size of the eruption in terms of either the Volcanic Explosivity Index (VEI) or simply the magnitude of the eruption.
This node is grouped into four mutually exclusive and exhaustive categories. In the example, we consider VEI 5 or greater, VEI 4, VEI 3, and VEI 2 or less (Fig. 7).

Figure 7. Size node

3.3 DATASET AND PROBABILITY ESTIMATE TIME WINDOWS

The Dataset Time Window is the time span of the volcanic activity analyzed. In our example, we consider eruptions in the last eight thousand years. The Probability estimate Time Window is the time interval over which the probability of at least one eruption is evaluated. For example, if our dataset spans eight thousand years and we want to estimate the probability of at least one eruption in the next one hundred years, we have eighty time intervals of one hundred years each for the study. For each branch we count the number of intervals in which at least one event of that type has occurred. For example, out of 80 time intervals, 18 observed an episode of unrest and 62 did not (Fig. 8). The number of time intervals is therefore the ratio between the dataset time window and the probability estimate time window. HASSET computes this value automatically and checks that it equals the sum of the entered numbers of intervals with unrest and with no unrest.

Figure 8. Number of Time Intervals

3.4 RUN BUTTON

Once all the data have been entered, the user can click the RUN button (Fig. 9) to evaluate the probability and the standard deviation for each branch of all eight nodes. HASSET displays them in table and graphical format for simplicity.

Figure 9. RUN button

If the user wants to save the probabilities at each node, the plugin can save them as .csv files; the user needs to provide the output name and the path.

4. PROBABILITY AT EACH NODE

Once all the data are entered, HASSET computes a probability estimate and the corresponding variability for each branch of all eight nodes, and displays them in table (Fig. 10) and graphical format for simplicity.

Figure 10.
Unrest node

The pie chart displays these probabilities graphically, and the user can export the pie chart (Fig. 11).

Figure 11. Unrest pie chart

On the event tree graph, the unrest node is highlighted in green to show the user which point of the event tree is being displayed (Fig. 12). The same applies to the remaining seven nodes.

Figure 12. Unrest node: info point of the event tree

Furthermore, information about each node is available (Fig. 13).

Figure 13. Info Unrest node

Figure 12 shows an example of how the Unrest tab displays the output in HASSET. We see the initial beliefs entered for this node in the Priori and Data weight columns, and we can see the 80 time windows, of which 18 had an episode of unrest and 62 did not. With these data, the probability estimate of having at least one episode of unrest in the next time window of 100 years is 23.17%, versus the complementary 76.83% of no unrest.

Figures 14 and 15 show the results for the Hazards and Distance nodes. After observing the data, Fallout and Lava flows account for nearly 80% of the total probability estimate of the occurrence of these particular hazards in the next 100 years, while the probability of any of these hazards affecting a medium/large area is nearly 50%.

Figure 14. Hazards node

Figure 15. Distance node

5. PROBABILITY FOR DIFFERENT SCENARIOS

HASSET computes the total probability estimate for different scenarios. From the “Scenarios” window (Fig. 16), the user can select all the scenarios of interest to be evaluated by clicking on the desired branch (Fig. 17).

Figure 16. HASSET menu: Scenarios

Figure 17. Scenarios

As we can see in Figure 18, there are five steps to compute the probability:

1. Select a scenario of interest to be evaluated by clicking on the desired branch.

2. Push the “EVALUATE TOT PROBABILITY OF SELECTED SCENARIO” button. The relative probability and standard deviation will be evaluated.

3.
Refresh the scenario already evaluated; to calculate further scenarios, start again from step 1. The user can compare up to five scenarios.

4. To evaluate the total sum of the probabilities of all the selected scenarios, push the “TOTAL SUM” button.

5. To analyze other scenarios, the user can delete the previous scenarios with the “Delete Scenarios” button and start again from step 1.

Figure 18. Scenarios steps

Note that some eruptive scenarios are formed of different combinations, as in Figure 19, where different types of unrest can trigger a magmatic eruption; HASSET allows all these cases to be computed and summed.

Figure 19. Scenarios

Also, the Hazards node allows more than one option to be selected (Figures 20, 21), since the same eruption could produce different hazards.

Figure 20. Scenario

Figure 21. Scenarios

6. MOST LIKELY SCENARIOS

“Most likely scenarios” (Fig. 22) is a HASSET option that computes the five most likely scenarios up to a particular node.

Figure 22. Most likely scenarios

The interpretation of these probabilities in terms of significance is down to the personal judgment of the user, unless there are similar volcanoes to compare with. However, we can identify the relative importance of several scenarios by comparing their probabilities of occurrence, which gives the decision maker an important tool to redirect resources and prioritize emergency plans based on what is most likely to occur. The user can display the most likely scenarios for each node simply by pushing the button associated with that node (Fig. 23).

Figure 23. Most likely scenarios buttons

In Figure 24 we see that the most likely scenarios are magmatic eruptions, mainly of VEI 2 or less, on the North or East sides of the volcano, producing lava flows and fallout over short distances.

Figure 24. Extent node: most likely scenarios

7.
DATA WEIGHT SENSITIVITY ANALYSIS

The DATA WEIGHT input value accounts for the epistemic uncertainty, that is, the uncertainty related to how well we know the system. The more data and information we have, the better we know the system and the more confident we feel about assigning the PRIOR WEIGHTS to our prior probabilities, so more new evidence is needed to modify those PRIOR WEIGHTS. The DATA WEIGHT controls this level of confidence.

Figure 25. Sensitivity analysis for DATA WEIGHT

As we see in Figure 25, the impact on the posterior probability varies significantly depending on the value of the data weight. The more confident we are, the larger the value of the DATA WEIGHT, meaning the lower the epistemic uncertainty, and so the less a new data point will modify our prior weight of 0.5. The impact on the posterior probabilities levels out once the DATA WEIGHT is large enough, meaning that beyond a particular value the arrival of new evidence will have the same impact on the posterior probabilities. The results will vary depending on how much new data arrives, e.g. whether we receive 1 or 18 new data points, which is why the DATA WEIGHT has no upper bound. The minimum value is 1, which corresponds to the state of total uncertainty.

REFERENCES

Sobradelo R., Bartolini S., Martí J. (2013) HASSET: A probability event tree tool to evaluate future eruptive scenarios based on Bayesian Inference. Presented as a plugin for QGIS. Bull. of Volcanology, in press.

Sobradelo R., Martí J. (2010) Bayesian event tree for long-term volcanic hazard assessment: application to Teide-Pico Viejo stratovolcanoes, Tenerife, Canary Islands. J Geophysical Res 115.
doi:10.1029/2009JB006566

CONTACTS

STEFANIA BARTOLINI: sbartolini@ictja.csic.es
ROSA SOBRADELO: rsobradelo@ictja.csic.es
JOAN MARTÍ MOLIST: jmarti@ictja.csic.es

Web GVB: http://www.gvb-csic.es/index_ENG.htm

The Institute of the Earth Sciences Jaume Almera (ICTJA)
Lluís Solé i Sabarís s/n
08028 Barcelona (Spain)
Phone: +34 93 409 54 10