Meteorology 415
The Making of MOS
FALL 2012

Objective: To better understand the development of MOS from its genesis as a Perfect Prog (Survey Path) to its formation as a multiple linear regression (MLR) equation (Career Path).

Goal: To generate a Perfect Prog and MLR MOS for one of six cities for the period November 1-15.

Plan: In teams of 3, select one of the six cities (PHL, ATL, IAH, TPA, PIT or ORD) and mine the resource data for the Perfect Prog and MLR. Teams need to be in place by September 20. Project results are due December 6 at class time.

__________________________________________________________________________________

PERFECT PROG PROCEDURES (Survey Path and Career Path)

First, access 30 years (1981-2010) of hourly FAA temperature data for the city, then sift the data and preserve only the 0z, 3z, 6z, 9z, 12z, 15z, 18z and 21z observations for the period from Nov 1-15.

The NCDC URL for acquiring METAR data: http://cdo.ncdc.noaa.gov/pls/plclimprod/poemain.accessrouter
Sample: http://hurricane.ncdc.noaa.gov/cdo/3505dat.txt
The data are decoded and in a table form that can be parsed.

Second, save the FAA data as a CSV file, then convert the data to the prescribed convention (see the sample KPHL_tmp2m_FAA.csv).

Third, run the Perfect Prog program, mos_lab_perfect_prog.py, to set up the input for each city. The program can be found at /coda/s0/classes/m415. Run it on the Linux side of the PC using the following command line:

    python mos_lab_perfect_prog.py YYYYMMDDHH Actual Observation Temperature -i Input_File_Name.csv -f Forecast_File_Name.csv

(The --help option can aid you if you have questions.)
(Even if you divide up the data acquisition and formatting, the complete [two week] data set must be submitted for the Perfect Prog to work.)

Sample command line:

    python mos_lab_perfect_prog.py 2011110318 50 -i KPHL_tmp2m_FAA.csv -f KPHL_forecast.csv

Options:
  -h, --help   show this help message and exit
  -i INPUT     Name of input file.
  -f FORECAST  Name of the output forecast file.
  -d DELTA     Temperature delta value for year matching.
  -n NUMBER    Number of 3 hour forecasts.
  -y YEARS     The number of years to use from the input file. Use 999 for all

A few notes:

The Date/Time (2011110318 in the sample line) following mos_lab_perfect_prog.py is the date and time of the current observation, which initializes the Perfect Prog forecast.

The Temperature (50 in the sample line) is the current temperature observation in Fahrenheit from the FAA site of interest.

-i in the command line refers to the input file. Following the -i is the input file name (KPHL_tmp2m_FAA.csv in the sample line). Make sure this input CSV file is in the same format as the example provided to you.

-f in the command line refers to the forecast file. Following the -f is the forecast file name (KPHL_forecast.csv in the sample line).

The output of the program after you run it should look like the following:

================================================================================
MOS Forecast Python Program
Year: Month: Day: Hour:
2011 11 3 18
Temp: 16
If the above values are incorrect, do not trust the output and rerun the program.
Note the temperatures must be converted from Celsius to Fahrenheit
================================================================================
================================================================================
Number of years in forecast: 11
Years used in forecast:
['1981', '1983', '1985', '1987', '1994', '1997', '2000', '2001', '2004', '2005', '2008']
Datetime/Forecast/Climo:
20111103 21:00:00, 15, 14
20111104 00:00:00, 12, 11
20111104 03:00:00, 11, 9
20111104 06:00:00, 10, 8
20111104 09:00:00, 9, 8
20111104 12:00:00, 9, 7
20111104 15:00:00, 13, 11
20111104 18:00:00, 15, 13
20111104 21:00:00, 15, 13
20111105 00:00:00, 12, 11
20111105 03:00:00, 11, 10
20111105 06:00:00, 10, 9
20111105 09:00:00, 9, 8
20111105 12:00:00, 8, 8
20111105 15:00:00, 12, 11
20111105 18:00:00, 14, 13
Comma Delimited CSV file created (note: the sample values are in Celsius).
END PROGRAM

Prediction: Starting on November 1 and continuing each day until November 15, you will make a daily 42-hour forecast based on the 00Z observation at the city, using the FAA database as a predictor. The first 6 hours are not comparable, since NAM and GFS MOS do not output predictions for those hours from 00Z. You will also acquire the NAM and GFS MOS for those times (00Z).

Analysis: There will be four prediction system outputs (FAA PP, CLIMO, NAM MOS and GFS MOS) for 18 time steps per day for 15 days. You will also need to acquire the verification data and then run several verification statistics (MAE and BIAS) to answer the following questions:

- At what hour did the PP's skill fall behind climatology?
- What skill could be gained if we used a regional reanalysis?
- In what instances did the PP usually do better than climatology? Than MOS?
- What is the significance of the delta value?
- What is a major shortcoming of the PP technique?
- How could PP be improved?
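The two verification statistics named in the Analysis step can be sketched in a few lines of Python. This is a minimal illustration, not part of the lab software. The function names and the sample numbers are made up, and bias is assumed here to mean forecast minus observation, so a positive value is a warm bias.

```python
def mean_absolute_error(forecasts, observations):
    """Average size of the forecast error, ignoring its sign."""
    pairs = list(zip(forecasts, observations))
    return sum(abs(f - o) for f, o in pairs) / len(pairs)


def bias(forecasts, observations):
    """Average signed error (forecast minus observation, an assumed convention)."""
    pairs = list(zip(forecasts, observations))
    return sum(f - o for f, o in pairs) / len(pairs)


# Hypothetical values for four verification times (degrees F), for illustration only.
obs = [50, 47, 45, 44]
pp = [52, 48, 43, 44]
climo = [49, 46, 44, 42]

print("PP    MAE:", mean_absolute_error(pp, obs), "BIAS:", bias(pp, obs))
print("CLIMO MAE:", mean_absolute_error(climo, obs), "BIAS:", bias(climo, obs))
```

Computing both statistics separately for each forecast hour, rather than over all hours at once, is one way to see at which hour one system falls behind another.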
____________________________________________________________________________________

Developing a Multiple Linear Regression Alternate for MOS (Career Path only)

Objective: To better understand the development of MOS from its roots as a multiple linear regression.

Goal: To generate an MLR MOS for one of six cities for the period November 1-15, making a forecast at 00z each day.

Plan: In teams of 3, select one of the six cities (PHL, ATL, IAH, TPA, PIT or ORD) and mine the resource data for the MLR.

MLR PROCEDURES (Career Path Only)

First, from the 30 years (1981-2010) of NARR data which you processed for the period November 1-15 for your city, correlate the surface temperature with the six variables given: 12z 1000-500mb thickness, 12z 850mb temperature, 18z 1000mb U and V wind components, 12z 250mb wind speed, and 12z 700mb specific humidity. These variables are related to the original work on MOS by Glahn and Lowry (Glahn, Harry R., and Dale A. Lowry, 1972: The Use of Model Output Statistics (MOS) in Objective Weather Forecasting. J. Appl. Meteor., 11, 1203–1211).

One of the major tasks is to make sure that the predictors are properly aligned (the 00z forecast relies on 'predicted' 12z and 18z variables) and in the correct format. You will be correlating the 18z surface temperature with the NARR 12z or 18z variables listed above.

Second, begin by finding the values of the 1000-500mb thickness from the height fields for those two levels given in the data file. Remember, for cases when the 1000mb surface is 'below the ground', use the rule of 8 mb = 60 meters. The total magnitude of the 250mb wind will need to be calculated from the U and V components given in the NARR data file (you will need to write a macro in Excel for this). Then calculate the regression for each variable between its 12z or 18z value and the 18z FAA temp on the same day.
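The calculations in this step can also be sketched in Python rather than an Excel macro. The sketch below is only an illustration: the function names are hypothetical, and the helper z1000_from_slp reflects one possible reading of the 8 mb = 60 meters rule (estimating the 1000mb height from sea-level pressure at 7.5 meters per mb), which is an assumption and not something the handout spells out.

```python
import math


def thickness_m(z500_m, z1000_m):
    """1000-500mb thickness: height of the 500mb level minus the 1000mb level."""
    return z500_m - z1000_m


def z1000_from_slp(slp_mb):
    """ASSUMED reading of the '8 mb = 60 meters' rule: 7.5 m per mb.
    Gives a negative height when sea-level pressure is below 1000 mb."""
    return (slp_mb - 1000.0) * 60.0 / 8.0


def wind_speed(u, v):
    """Total wind magnitude from the U and V components."""
    return math.hypot(u, v)


def linear_fit(x, y):
    """Least-squares slope and Y-intercept for y = slope * x + intercept."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return slope, intercept


# Made-up numbers, for illustration only.
print(thickness_m(5520.0, 120.0))           # thickness in meters
print(wind_speed(3.0, 4.0))                 # magnitude of a (3, 4) wind vector
print(linear_fit([1, 2, 3], [3, 5, 7]))     # slope and intercept of a perfect line
```

In practice x would hold one NARR predictor (for example the 12z thickness) and y the 18z FAA temperature for the matching days.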
Remember to use a time lag and to acquire the Y-intercept too. Place the slope and intercept values into the open spaces on the site given below. It is strongly recommended that you use only Firefox (not Chrome) and that this be done from a campus computer (an IP address from Penn State).

http://schubert.atmos.colostate.edu/~cslocum/mos/

Third, your team will need to select at least three variables, and up to all six, for the multiple linear regression components and then determine the weight of each. The weights must add to 1.0. Please use a single decimal place (i.e., 0.1 = 10%). Write a short paragraph about why the team selected those weights and variables.

Fourth, starting at 00Z on November 1st, input the 'forecast' values from the DMO (direct model output) of the GFS (AVN). A value must be entered for each weighted variable included in the MLR. To acquire the DMO, use the weather program on the Linux side of the classroom PC. To gain access to and use the weather program, log onto mp1.met.psu.edu, type weather, and after the prompt has returned, type /EAVN for the GFS DMO. You can then enter the city and time, for example:

    EAVN> TPA 0z

The output will look like this:

[sample EAVN output]

Your team will need to convert the SFC layer wind into its U and V components (convert wind to m/sec): http://www.cactus2000.de/uk/unit/masswin.shtml
The team will also need to convert the 700mb DP/RH to specific humidity: http://www.cactus2000.de/uk/unit/masshum.shtml

Fifth, the input data for the MLR will need to be formatted in this way:

    YYYY-MM-DD, OBS, CLIMO, Var1(xx.x), Var2(xx.x), Var3(xx.x), Var4(xx.x), Var5(xx.x), Var6(xx.x)

Remember that if you use fewer than 6 variables, keep them in the correct order! Submit the data for 18hrs and 42hrs. Save the output files, as they will be needed to calculate verification stats such as MAE and correlation coefficients for the MLR compared with other forecasts.
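As a cross-check on the web converters linked in the fourth step, the two conversions can be sketched in Python. This is a hedged illustration under stated assumptions: the function names are hypothetical, the DMO wind is assumed to be a speed in knots with a meteorological direction (the direction the wind blows from), and the humidity helper assumes you start from the 700mb dew point in Celsius and uses the Bolton (1980) approximation for vapor pressure. If the output gives RH instead, the vapor pressure would first have to be derived from temperature and RH.

```python
import math

KT_TO_MS = 0.514444  # one knot expressed in m/sec


def wind_to_uv(speed_kt, direction_deg):
    """Convert speed (ASSUMED knots) and meteorological direction to U, V in m/sec.
    Direction is where the wind comes from, so a 270-degree wind has positive U."""
    speed = speed_kt * KT_TO_MS
    rad = math.radians(direction_deg)
    u = -speed * math.sin(rad)
    v = -speed * math.cos(rad)
    return u, v


def specific_humidity(dewpoint_c, pressure_mb=700.0):
    """Specific humidity (kg/kg) from dew point, using Bolton's (1980)
    approximation for vapor pressure in mb (an assumed choice of formula)."""
    e = 6.112 * math.exp(17.67 * dewpoint_c / (dewpoint_c + 243.5))
    return 0.622 * e / (pressure_mb - 0.378 * e)


# Made-up inputs, for illustration only.
u, v = wind_to_uv(10.0, 270.0)   # a 10 kt westerly wind
q = specific_humidity(0.0)       # 700mb dew point of 0 C
print(round(u, 2), round(v, 2), round(q * 1000.0, 2), "g/kg")
```

Whatever method is used, compare a few values against the linked converters before entering them, since the units expected by the MLR site are not stated here.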
$$ r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{s_X} \right) \left( \frac{Y_i - \bar{Y}}{s_Y} \right) $$

Here $r$ is the Pearson correlation coefficient, $\bar{X}$ and $\bar{Y}$ are the sample means, and $s_X$ and $s_Y$ are the sample standard deviations. The term $(X_i - \bar{X})/s_X$ is the standard score of $X_i$.

Analysis: There will be four prediction system outputs (MLR, NAM MOS, CLIMO and GFS MOS) for 18 time steps per day for 15 days. You will also need to acquire the appropriate verification data and then run several verification statistics (MAE and Correlation Coefficient) to answer the following questions:

- At what hour did the MLR skill fall behind NAM-MOS?
- Under what atmospheric conditions did the MLR usually do better than climatology? Than MOS?
- What is the significance of the variables chosen and their respective weights?
- What is a major shortcoming of the MLR technique?
- In what ways is this MLR an improvement over MOS?
- How could MLR be improved?
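The Pearson correlation coefficient used in the verification can be computed directly from its definition in terms of standard scores, sample means and sample standard deviations. The following is a minimal sketch with a hypothetical function name and made-up data. It is not the code used by the MLR site.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: sum of products of standard scores, divided by n - 1.
    Raises ZeroDivisionError if either series is constant (zero standard deviation)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    xbar = sum(x) / n
    ybar = sum(y) / n
    # Sample standard deviations (n - 1 in the denominator).
    sx = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))
    sy = math.sqrt(sum((yi - ybar) ** 2 for yi in y) / (n - 1))
    return sum(((xi - xbar) / sx) * ((yi - ybar) / sy)
               for xi, yi in zip(x, y)) / (n - 1)


# Made-up forecast and observation series, for illustration only.
forecast = [48.0, 51.0, 55.0, 53.0, 47.0]
observed = [47.0, 52.0, 56.0, 51.0, 46.0]
print(round(pearson_r(forecast, observed), 3))
```

A value near 1 means the forecasts rise and fall with the observations. It does not by itself show that the forecasts are unbiased, which is why MAE is examined alongside it.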