Investigating Space Weather: Improvements in Accuracy When a Consensus Is Used

Sahana Rao
Centreville HS
Period 5
2012-2013

Table of Contents

Title……………………………………………………….1
Table of Contents…………………………………………2
Introduction……………………………………………..3-4
Materials and Methods………………………………….5-6
Data Reporting…………………………………………7-53
Data Analysis and Explanation…………………………..54
Discussion…………………………………………….55-59
Conclusion……………………………………………60-62
Bibliography……………………………………….....63-64
Acknowledgements………………………………………65

Introduction

This experiment uses space weather forecasts made by participants, with various levels of experience, in a space weather forecasting contest [15] to compare the accuracy of consensus and individual forecasters. The primary driver of space weather is the solar wind emitted from the sun. The solar wind is composed of nearly fully ionized particles released from the sun into interplanetary space. Major space weather events driven by the solar wind include coronal mass ejections, geomagnetic storms, solar radiation storms, and radio blackouts. Space weather indicators are predicted daily by NOAA's Space Weather Prediction Center [11], but can be estimated by anyone using solar wind measurements and previous forecasts.

The forecasts used in this experiment consist of three values. The electron flux measures the amount of electrons coming from the sun in the solar wind. The planetary K index, or Kp, represents the activity of the Earth's magnetic field on a scale with nine values. The solar wind velocity measures the speed of the solar wind coming from the sun.

The purpose of this experiment is to compare the forecasts made by individual users to the consensus forecast for accuracy. Consensus collaboration for greater accuracy was noted by Francis Galton in 1907 in the journal Nature [6]. In the article, Galton examined guesses made by many people of the weight of an ox, some of whom had knowledge of cattle and some of whom did not.
His results showed that the consensus, or "vox populi," had a lower probable error than the individual estimates. The main motivation for comparing accuracy using a consensus in this experiment is to see whether similar results arise when forecasting space weather.

It is important that space weather is accurately forecasted, as it can have many dangerous effects on the Earth. For example, space weather events such as solar radiation storms and geomagnetic storms expose astronauts and even passengers in commercial airplanes to large amounts of radiation. Spacecraft and satellites can be damaged by the charged particles and plasma in the solar wind. Space weather storms can also cause power outages and disruption of radio signals here on Earth. Accurate space weather forecasts are important because many of these effects can be mitigated if we know about them ahead of time.

In this experiment, 30 different forecasters around the world with various levels of knowledge of space weather forecasted the three space weather indicators for 101 days. Using the programming software MATLAB, their data is compared to the verification, the actual measurements for each day. The consensus forecast was compiled by averaging all of the forecasters' forecasts. The consensus forecast is then also compared to the individual forecasters to see which had greater accuracy. The number of days forecasted, the method used to compare forecasts, the programming platform used, the space weather values, and the source of the data files are kept constant in this experiment.

The hypothesis is that if all forecasts are combined, then the resulting forecast will be more accurate. This is because the consensus forecast is an average of all the individual estimates, which reduces the effect of outliers in the individual users' data. The consensus should therefore be closer to the verification than any of the other forecasts.

Materials

Computer
Forecaster data files
MATLAB software
Verification data file

Procedures

1. Write a program in MATLAB for the required analysis and visualization routines.
   a. Load the files one by one into a matrix, then transfer them into a 3-D matrix.
   b. Separate the columns of the matrix for the date, electron flux, planetary K index, and solar wind velocity.
   c. Give each forecaster an individual page in the 3-D matrix, with the verification as the first page.
   d. Remove the persistence in the data to eliminate the possibility of a forecaster receiving a perfect score when he or she didn't forecast.
   e. Create a matrix of the consensus forecast by taking the average of all the forecasters for each day, one for each variable.
   f. Comment specific sections in order to keep track of the code.
2. Compare the data to fill in the data charts.
   a. Compare each individual forecaster to the verification to find their accuracy, and store the errors in a matrix.
   b. Compare the consensus forecasts to the verification and save the errors in a matrix.
3. Make the plots of the forecasters' errors.
   a. Make a line plot comparing each forecaster's error and the consensus error, with a line for each in two different colors.
   b. Repeat step 3a for each forecaster and each variable.
4. Create histograms of the data.
   a. Create a histogram of all of the forecasters' errors.
   b. Create a histogram of the consensus errors, in a different color than the previous histogram.
   c. Add a red line for the mean on each histogram.
   d. Repeat steps 4a-4c for each indicator.
5. Save all histograms and plots, and label the axes and titles correctly for easy future reference.
6. Analyze the data and plots and record observations.
7. Form a general statement about how people tend to forecast individually versus as a group.

Data Analysis and Explanation of Data

The data tables in the Data Reporting section display the errors for each forecaster and for the consensus forecast for each variable: electron flux, Kp, and solar wind.
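The array layout and consensus calculation from the procedure can be sketched as follows. The original analysis was written in MATLAB; this is a Python/NumPy sketch of the same idea, with random stand-in data in place of the contest files (all names and shapes here are illustrative, not the actual program).

```python
import numpy as np

# Sketch of the procedure's 3-D array layout: page 0 is the verification,
# pages 1..30 are the forecasters; 101 days; 4 columns per day
# (date, electron flux, Kp, solar wind velocity).
n_forecasters, n_days, n_cols = 30, 101, 4

rng = np.random.default_rng(0)
data = rng.random((n_forecasters + 1, n_days, n_cols))  # stand-in for loaded files

consensus = data[1:].mean(axis=0)        # average of all forecasters, per day
errors = data[1:] - data[0]              # each forecaster vs. the verification
consensus_error = consensus - data[0]    # consensus vs. the verification

print(float(errors.std()), float(consensus_error.std()))
```

Because the consensus is the mean of the forecasts, the consensus error is exactly the mean of the individual errors, and its spread is smaller than the pooled spread of the individual errors, which is the pattern the histograms in this report show.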
The plots on pages 43-53 show consensus versus forecaster for each forecaster and each variable, where the forecaster line is blue and the consensus line is green. The red and green line plots on page 54 compare the consensus forecast to the verification. The blue histograms show the forecasters' error distributions, and the green histograms show the consensus error distributions. The red line in each histogram indicates the mean.

By examining the consensus versus forecaster plots, it is noticeable that the forecasters' peaks in error are often much higher than the consensus peaks. This implies a greater error made by individual forecasters. For example, forecaster 71 for electron flux had an error of around 1.75 on day 45, whereas the consensus only had an error of 0.25. Similarly for solar wind, on day 45, the forecaster had an error of 234, whereas the consensus only had an error of 57. This and similar data from other forecasters show higher accuracy for the consensus forecast. The consensus histogram has a smaller standard deviation than the forecaster histogram, implying that the consensus forecasted slightly more accurately. The consensus histograms also place more of their data in the low-error bins, while the forecaster histograms place more in the high-error bins.

Discussion

Space weather has been forecasted daily by NOAA since 1995. This experiment looks for an accurate approach to predicting space weather by comparing the results of forecasters who used different methods. Space weather was first discovered in 1958, when Explorer I passed through the lower Van Allen radiation belt. Space weather includes major events caused by the solar wind, or charged particles emitted from the Sun.
"Solar wind is the continuous out flow of protons, electrons, and ions as a significant amount of material removed from the sun and into interplanetary space" (Alexander, 2009). A CME, or coronal mass ejection, is a massive burst of solar wind and other light isotope plasma towards the Earth, while a solar flare is an intense amount of radiation emitted from the sun in the solar wind.

When the solar wind reaches the Earth, it distorts the Earth's dipole-shaped magnetosphere, blowing away some of the planetary atmosphere in the process. The magnetosphere's job is to shelter the planet from the solar wind, but the force of the wind tends to distort it instead, allowing charged particles to enter and affect Earth's atmosphere. Changes in the wind speed cause the magnetosphere to fluctuate, changing its position and possibly leaving satellites exposed directly to the solar wind. "Where the solar wind and the magnetosphere actually come into contact is called the magnetopause. The magnetopause is in constant flux. It shrinks or expands as the electromagnetic and particle characteristics of the solar wind change. The fluctuations can be pronounced. When matter thrown out by a coronal mass ejection reaches the magnetopause, for example, in the balloon analogy given above, it's effect is like a fist punching deep into the balloon, its skin—the magnetopause—"stretching" inward to absorb the shock." (CISM, Weigel, 2007)

The three values in the space weather forecasts of my experiment are electron flux, planetary K index, and solar wind velocity. Electron flux values measure the amount of electrons coming from the sun in a specific amount of solar wind. The planetary K index, or Kp, represents the activity of the Earth's magnetic field on a nine-point scale. The solar wind velocity measures the speed of the solar wind coming from the sun.
There are three major types of space weather events: radio blackouts, solar radiation storms, and geomagnetic storms, most resulting from a CME or solar flare. Magnetic storms can occur on Earth 1-4 days after a CME has taken place. Radio blackouts are caused by a disturbance of the ionosphere from the x-ray emissions of a solar flare; they affect communications at middle to low latitudes, but only on the dayside of Earth. Solar radiation storms are caused by energetic particles from CMEs and solar flares that elevate Earth's levels of radiation. These storms release high amounts of radiation that can be potentially dangerous to astronauts and to passengers in commercial airplanes. Geomagnetic storms are caused by a gust in the solar wind, such as a CME, that energizes Earth's magnetic field. Space weather events vary in intensity, and can be potentially harmful, causing power outages or releasing high amounts of radiation. The intensity of these storms varies based on the polarity of the solar wind.

Space weather has major effects on the Earth. Spacecraft and aircraft are vulnerable to the high-speed ionized plasma of the space environment. The solar wind can damage spacecraft and sensor systems, and its effects can even reach commercial airplanes. Pipelines suffer from corrosion caused by geomagnetically induced currents flowing from the pipe into the soil. "The ionosphere is electrically conducting, so it interacts strongly with the earth's magnetosphere that surrounds it, reacting quickly to changes there and in the solar wind. One visible manifestation of this interaction is the aurora. Additionally, electromagnetic "storms" can transfer great amounts of energy into the ionosphere, thereby heating and thus expanding the atmosphere—which in turn increases atmospheric drag on satellites. At the same time, intense electric currents continually flow from the magnetosphere through the ionosphere.
These currents can also induce large currents and other effects on the earth below—which in turn can affect people and human technology on the ground." (CISM, Weigel, 2007)

Intense space weather events can cause power outages, mobile phone disruptions, and radio signal disruptions. The radiation released from the Sun in a solar radiation storm can be very harmful to astronauts and even to passengers in an airplane. CMEs can also produce this intense radiation. Even inside a spacecraft, astronauts can absorb dangerous doses of this radiation. Forecasting space weather in advance allows precautions to be taken, and can even prevent power outages and radiation exposure.

In 2011, NOAA, after years of research and experimenting with various methods similar to my experiment, began using a more sophisticated forecast model that could produce more accurate results. Their years of research included tracking explosions in the sun's outer atmosphere, solar radiation storms, and geomagnetic storms. "This advanced model has strengthened forecasters' understanding of what happens in the 93 million miles between Earth and the sun following a solar disturbance," said Tom Bogdan, director of NOAA's Space Weather Prediction Center in Boulder, Colo. "It will help power grid and communications technology managers know what to expect so they can protect infrastructure and the public." (redOrbit, 2011)

Before the development of this model, NOAA could predict the timing of space weather storms within a 30-hour window; now it can do so within 12 hours. This improvement gives airline operators more reliable information to reroute flights and avoid communication blackouts. Oil drilling, mining, and other operations can also avoid conditions that might place operators at risk. The new model simulates physical conditions from the base of the sun's corona out into interplanetary space, towards Earth and beyond.
Scientists can then insert solar events into the model to fully comprehend how a space weather storm might unfold. NOAA began using this new model officially on September 30th, 2011.

This experiment uses space weather to compare the accuracy of a consensus and of individuals. A consensus forecast is made up of the average of the predictions in a group of forecasters. If the consensus is close to the accurate value, it implies that all the values were somewhere close to it, with a probable error of around 3.1%. "It appears then, in this particular instance, that the vox populi is correct to within 1 percent of the real value, and that the individual estimates are abnormally distributed in such a way that it is an equal chance whether one of them, selected at random, falls within or without the limits of -3.7 percent and 2.4 percent of their middlemost value." (Galton, 1907) This statistical experiment, published by Galton in Nature in 1907, showed that the consensus is more accurate because it combines many values and has a lower probable error. Another experiment, published in The Quarterly Journal of Experimental Psychology, compared consensus collaboration to individual recall accuracy. This research showed that consensus collaboration provided more accurate results: "However, consensus groups, and not turn-taking groups, demonstrated clear benefits in terms of recall accuracy, both during and after collaboration. Consensus groups engaged in beneficial group source-monitoring processes" (Harris, Sutton, & Barnier, 2012).

Research has improved the accuracy of space weather forecasting over time, but a more accurate model could still be developed, one that can track a storm within a window of fewer than 12 hours. In many recent experiments, consensus groups have overall performed more accurately than individuals, supporting the hypothesis of higher accuracy for the consensus forecast.
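Galton's observation can be reproduced in a small numerical experiment. This is an illustrative Python simulation, not Galton's actual data: the true weight matches his account, but the crowd size and the 5% scatter of the guesses are assumed values, and the probable error is computed as half the interquartile range of the guesses, as Galton used.

```python
import numpy as np

rng = np.random.default_rng(7)
true_weight = 1198.0  # pounds; the ox's actual dressed weight in Galton's account
# ~800 fair-goers guessing with roughly 5% scatter (assumed, for illustration)
guesses = true_weight * (1 + rng.normal(0.0, 0.05, size=800))

vox_populi = np.median(guesses)           # the "middlemost" estimate
q1, q3 = np.percentile(guesses, [25, 75])
probable_error = (q3 - q1) / 2            # Galton's probable error of the guesses

print(vox_populi, probable_error)
```

With a large crowd, the median lands far inside the probable error of the individual guesses, which is the pattern Galton reported and the same pattern the consensus forecast shows in this experiment.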
Conclusion

The hypothesis of the experiment stated that if all the forecasts were combined, then the forecast would be more accurate. In all the plots, it is clearly visible that the consensus forecast performed better than the individual forecasts. The consensus forecast (the green line in the plots) has smaller errors, and its errors are closer to the verification than the forecasters'. For example, for forecaster 66 in electron flux, the user had an error of 1.36 on day 54, whereas the consensus only had an error of 0.7 on that same day. Similarly for Kp, on day 45, the forecaster had an error of 3, whereas the consensus had a lower error of 1.6. The difference in error for this particular forecaster is even greater in solar wind: on day 16, the user had an error of -253, whereas the consensus had an error of 143. This implies a greater accuracy for the consensus forecast over the individual forecasters. The consensus is an average of all the values, and therefore relies on more data for its accuracy. The probable error for the consensus is lower than for each individual forecaster. By taking the average of the individual forecasters and compiling the consensus forecast, the outliers of individual users are smoothed out, making the consensus closer to the verification.

Generally, for electron flux, forecasters tend to over-predict, and the individual forecasters over-predict more than the consensus. This can be seen in the histograms. In Kp and solar wind velocity, forecasters seem to under-predict more than the consensus.

When writing and debugging code, it is very possible for logical errors to occur, and these could affect the results of the data. It would take many experienced programmers looking through the code to clear all of the logical errors. If any errors are left behind after the process of debugging, they could cause changes in the results. Another possible source of error comes from the forecasts themselves.
For every forecaster that neglects to forecast on a certain day, the program sets that day's forecasted values to the previous day's forecast. This is called persistence. It could mistakenly make it appear that a forecaster made a perfect forecast when they didn't actually forecast at all, so a perceived accurate forecaster may actually not be so accurate.

Another possible error is using Kp as a main data source. Kp is measured on a nine-point scale, so there is a 1/9 chance of a forecaster forecasting accurately even when their forecast was simply a guess. This is larger than the probability of accidental accuracy in the other values, making it more likely to see lower errors with Kp. Using Kp as a variable in the data is therefore a possible source of error in the experiment.

All the forecasters in the experiment have various levels of education and different levels of knowledge of space weather. From the files, the number of expert and amateur forecasters could not be determined; the files could be comprised of many knowledgeable forecasters or many amateurs. This value is uncontrollable and could also potentially affect the results of my project.

Space weather is also often difficult to predict. The verification is uncontrollable, and can shift unpredictably from what seems to be a pattern, producing storms and CMEs at unpredictable intervals. This makes it more challenging for forecasters to predict accurately when a storm will arrive, potentially making their errors higher and altering their results. These errors are uncontrollable, and couldn't be prevented if this experiment were performed again.

If this experiment were repeated, the education level of the forecasters would also be taken into consideration.
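The persistence behavior described above can be sketched as follows. This is an illustrative Python version (the original program was in MATLAB), with NaN standing in for a missing daily forecast and made-up solar wind values.

```python
import numpy as np

def fill_persistence(forecasts):
    """Forward-fill missing (NaN) daily forecasts with the last submitted one."""
    filled = forecasts.copy()
    for day in range(1, len(filled)):
        if np.isnan(filled[day]):
            filled[day] = filled[day - 1]
    return filled

# Illustrative solar wind forecasts (km/s); NaN marks days with no forecast.
daily = np.array([420.0, np.nan, np.nan, 510.0, np.nan])
print(fill_persistence(daily))  # [420. 420. 420. 510. 510.]
```

Note how days 2-3 silently repeat the day-1 forecast: if the verification happened to stay near 420 km/s, the forecaster would be scored as accurate on days when no forecast was actually made, which is the source of error discussed above.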
Using this knowledge of the data, the problem of a follow-up experiment would be to compare the accuracy of forecasts made by forecasters of different levels of education against a consensus forecast. This experiment could test whether the consensus would still be more accurate among forecasters of a certain level of education, and it could be split into several research projects comparing the consensus to a group of knowledgeable forecasters and to a group of amateur forecasters. It could therefore also indicate whether a higher level of education alone leads to more accurate forecasts. Another future enhancement of this experiment is examining the algorithms of the various forecasters to determine the most accurate method of forecasting space weather, based on the accuracy of each user's forecasts. Lastly, another idea would be to take the consensus compiled from the participants of the space weather contest and compare it to the official forecasts from NOAA, to see if the consensus is still more accurate.

In conclusion, the hypothesis was supported by the research and data collected in this experiment. The consensus forecasts performed more accurately than the individual forecasters.

Bibliography

1. Alexander, David. The Sun. Santa Barbara, CA: Greenwood Press, 2009. Print.
2. Clarke, Tom. "Space Weather Forecast Step Closer." Nature (2001): n. pag. Web. 10 Dec. 2012.
3. Daly, E. "Space Weather." Space Sciences. Ed. Pat Dasch. New York: Macmillan Reference/Gale, Cengage Learning, 2002. N. pag. Gale Science in Context. Web. 29 Oct. 2012.
4. ESA's Space Weather Site. etamax space GmbH, n.d. Web. 29 Oct. 2012. <http://www.esa-spaceweather.net/index.html>.
5. Freedman, David, et al. "The Histogram." Statistics. 2nd ed. New York: W. W. Norton and Company, n.d. 29-53. Print.
6. Galton, Francis. "Vox Populi." Nature 7 Mar. 1907: 450-51. Print.
7. Harris, Celia B., John Sutton, and Amanda J. Barnier. "Consensus Collaboration Enhances Group and Individual Recall Accuracy." The Quarterly Journal of Experimental Psychology 65.1 (2012): 179-94. Macquarie University. Web. 10 Dec. 2012.
8. Kiessling, Dolores. Space Weather Basics. MetEd. University Corporation for Atmospheric Research, 2012. Web. 29 Oct. 2012. <http://www.meted.ucar.edu/spaceweather/basic/index.htm>.
9. Lee Lerner, K., and Brenda Wilmoth Lerner, eds. "Coronal Ejections and Magnetic Storms." The Gale Encyclopedia of Science. 4th ed. Vol. 2. Detroit: Gale, 2008. N. pag. Gale Virtual Reference Library. Web. 29 Oct. 2012.
10. ---. "Solar Wind." The Gale Encyclopedia of Science. 4th ed. Vol. 5. Detroit: Gale, 2008. 4006-07. Gale Virtual Reference Library. Web. 29 Oct. 2012.
11. NOAA Space Weather Prediction Center. "Space Weather Prediction Model Improves NOAA's Forecast Skill." redOrbit 21 Oct. 2011: n. pag. Web. 10 Dec. 2012.
12. "NOAA Space Weather Scales." NOAA/NWS Space Weather Prediction Center. National Oceanic and Atmospheric Administration, 5 Nov. 2007. Web. 29 Oct. 2012. <http://www.swpc.noaa.gov/index.html>.
13. Weigel, Bob, Mike Wiltberger, and Erik Wilson. "What Is Space Weather?" Center for Integrated Space Weather and Modeling. Boston University, n.d. Web. 10 Dec. 2012.
14. Weigel, Robert, and Brian Curtis. Personal interview. 19 Oct. 2012.
15. Curtis, Brian, Robert Weigel, and Victoir Veibell, eds. Contest Home | SWxContest. George Mason University, n.d. Web. 10 Jan. 2013.
16. "Aurorae: What Causes Them?" NASA SDO. YouTube, n.d. Web. 10 Jan. 2013.

Acknowledgements

A special thank-you goes to those who contributed to my project and provided me with the guidance and mentoring I needed to perform this experiment. Professor Robert Weigel, for introducing me to the idea of the project, mentoring me through it, and giving me the chance to fully understand the concept before I began. Brian Curtis, for meeting with me once a week to assist me in writing my code.
Victoir Veibell and Brian Curtis, for creating the space weather contest and providing me with the data files. The Space Weather Lab, located in the Research 1 building of George Mason University, for hosting my experimental procedures. Professor Joseph Marr, for allowing me to audit his college-level class, where I could learn the basics of MATLAB. MathWorks, whose MATLAB software provided the data tables, graphs, and programming console for my experiment.