Atmospheric Environment 42 (2008) 8661–8673 (journal homepage: www.elsevier.com/locate/atmosenv)

Identification of factors affecting air pollution by dust aerosol PM10 in Brno City, Czech Republic

Zuzana Hrdličková a,*, Jaroslav Michálek a, Miroslav Kolář b, Vítězslav Veselý c

a Institute of Mathematics, Faculty of Mechanical Engineering, Brno University of Technology, Technická 2896/2, 616 69 Brno, Czech Republic
b Department of Geography, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
c Department of Applied Mathematics and Computer Science, Faculty of Economics and Administration, Masaryk University, Lipová 41a, 602 00 Brno, Czech Republic

Article history: Received 22 February 2008; received in revised form 8 August 2008; accepted 8 August 2008

Keywords: Dust aerosol PM10; Generalized autoregressive linear model; Gamma distribution; Goodness-of-fit statistic; Anscombe residual

Abstract

A statistical analysis of observations of dust aerosol PM10 from four monitoring stations in the agglomeration of the city of Brno during the period from January 1, 1998 until December 30, 2005 is presented. The main meteorological factors affecting air pollution at each station were identified by means of a generalized autoregressive linear model with gamma distribution of the response variable and log-link function. Along with meteorological factors, the influence of the heating season and weekdays on the air pollution was considered. The suggested model can be used for prediction of the daily mean value of dust aerosol PM10 at a given station using selected factors and their previous values. © 2008 Elsevier Ltd. All rights reserved.

1. Introduction

Quality of the air is one of the basic indicators of the overall quality of the environment.
Air pollution has become a local as well as a regional issue of big cities, industrial centers and the surroundings of transport routes, especially roads and highways. Moreover, the release of primarily harmless substances (such as greenhouse gases or Freon) fundamentally affects the properties of the atmosphere, with global repercussions. The focus of this article is an evaluation of air pollution with dust aerosol in the city of Brno, the second largest city of the Czech Republic, based on data on the occurrence of the pollutants and meteorological data. The purpose is to describe and explain the changes in the air pollution of the Brno agglomeration in time and space, to identify the causes of the current status, and to predict significant exceedances of the hygienic limits. When the hygienic limits are surpassed, this information enters the crisis management system and appropriate measures are taken. This is one of the reasons for the implementation of the long-term research project of the Ministry of Education, Youth and Sports of the Czech Republic no. MSM0021622418 entitled "Dynamic Geovisualization in Crisis Management", in the context of which the present article has been written. Preliminary results of the research were presented in Hrdličková et al. (2006).

* Corresponding author. Tel.: +420 541 142 532; fax: +420 541 142 710. E-mail addresses: hrdlickova.z@fme.vutbr.cz (Z. Hrdličková), michalek@fme.vutbr.cz (J. Michálek), kolar@sci.muni.cz (M. Kolář), vesely@econ.muni.cz (V. Veselý). doi:10.1016/j.atmosenv.2008.08.017

Dust aerosol comprises all particles in the air that are not gaseous, including molecular clusters, ice crystals, various solid particles (metal particles, silicates, fluorides, oxides, nitrates, chlorides, sulphates, etc.), drops of liquids, pollen, small insects, etc.
The decisive quantities of dust aerosol are represented by particles smaller than 1 µm, mostly originating from condensation and coagulation. Particles larger than 1 µm are usually primarily emitted. Most frequently, the dust fractions of particles of size below 10 µm (PM10), 5 µm (PM5) and 2.5 µm (PM2.5) are analyzed. For the purposes of this article, dust aerosol means the PM10 size fraction.

Table 1. Activity of heating plants in the Czech Republic

Month  Jan   Feb   Mar   Apr   May   Jun–Aug  Sep   Oct   Nov   Dec
HS     0.19  0.16  0.14  0.09  0.02  0        0.01  0.08  0.14  0.17

A large number of recent epidemiological studies have observed a negative impact of ambient particle concentrations on human health, including increases in respiratory symptoms and diseases (Franchini and Mannucci, 2007), hospital and emergency department admissions (Chen et al., 2007) and mortality (Qian et al., 2007). In the European Union, the Council Directive (1996/62/EC) on ambient air quality assessment and management establishes the basic principles of a common strategy to define and set objectives for ambient air quality in order to avoid, prevent or reduce harmful effects on human health and the environment, assess ambient air quality in the Member States, inform the public, notably by means of alert thresholds, and improve air quality where it is unsatisfactory. According to the follow-up Council Directive (1999/30/EC), the limit value for the daily PM10 average is 50 µg m⁻³, which should not be exceeded more than 35 times per calendar year, and the limit value for the annual PM10 average has been 40 µg m⁻³ since January 1, 2005. However, PM10 limit values are currently being exceeded in more than 370 zones (see Press Release MEMO/07/571). At the studied stations in Brno (see Section 2), the daily PM10 limit was exceeded on 8% (2004) to 85% (2002) of the yearly measurements.
The annual average of the available PM10 values varied from 30 µg m⁻³ (2004) to 72 µg m⁻³ (2002). The summer PM10 averages were mostly lower than those for the winter months, with the largest difference of 23 µg m⁻³. Correspondingly, exceedances of the limit were observed more frequently in the winter months (an average of 59%).

Anthropogenic sources of dust aerosol include power generation, metallurgy (especially foundries), ore mining, manufacture of construction materials and civil industry. Another currently relevant secondary source of dust aerosol is transport, which mostly affects urban areas and cities. Transport multiplies pollution from other sources (dispersion of transported materials, industrial transport, etc.). This source of dust aerosol has a number of specifics that worsen the impact of the pollution. Transport not only produces new dust aerosol, but also stirs up previously settled dust, emits pollutants close to humans and biota, and produces trace quantities of toxic substances that bind to the dust particles.

Fig. 1. Values of meteorological elements and concentrations of dust aerosol in the period 1998–2005 at Arboretum station (panels: dust aerosol, temperature, relative humidity, wind velocity, wind direction). Extreme values are marked with an asterisk on the x axis.

The spread of emissions to the surroundings of their source, and therefore the emission conditions, depends on four groups of factors: parameters of the source (capacity, variability in time, height above ground, temperature, output speed of emissions), properties of the emissions, the effect of the earth surface, and meteorological factors. The latter generally exercise a decisive effect on the spread of pollutants.
Meteorological factors can be divided into factors with direct effect (velocity and direction of air flow, thermal stratification of the atmosphere close to the earth surface, atmospheric precipitation) and factors with indirect effect (temperature, sunshine, air humidity, cloud formation, air pressure). The indirect effect factors affect the nature of the direct effect ones.

2. Data

The performed analysis is based on data from four monitoring stations in the agglomeration of the city of Brno, namely Arboretum, Bohunice, Židenice and Zvonařka. The time period of monitoring was from January 1, 1998 until December 30, 2005. Together with dust aerosol Ap_t [µg m⁻³], the factors wind velocity V_t [m s⁻¹], wind direction D_t [°], air temperature T_t [°C] and relative air humidity H_t [%] were measured to evaluate the effects of meteorological conditions on the emission situation at each monitoring station. The data used in the analysis are daily means of half-hourly measurements of the monitored factors, and the subscript t stands for a day.

It is well known that the level of dust aerosol is significantly higher in the heating season and that its values on weekdays differ from weekend values (see the model in Hörmann et al., 2005). To account for the heating plant activity, an additional variable, heating season HS_t, was introduced. The values of the variable HS_t vary across months in compliance with the Czech Government Decree no. 372/2001 Coll., as given in Table 1. Another explanatory binary variable is weekend F_t, with the value of 1 on Saturdays and Sundays and 0 for the rest of the week. Note that a model with two separate binary variables for Saturdays and Sundays was considered in the first step. However, the procedure for choosing the best submodel described in Section 4 rarely showed such a model to be irreducible to a model with the single variable F_t, in contrast to the model described in Chaloulakou et al. (2003).
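The construction of the derived covariates described above can be illustrated by a short sketch. It is not the authors' code: the function name `build_covariates` and the dictionary layout are hypothetical, and the assignment of the Table 1 values to individual months is an assumption that should be checked against the decree. The wind projections V_t sin D_t and V_t cos D_t and the temperature gradient T_t − T_{t−1} anticipate the covariates used in Section 3.

```python
import math
from datetime import date

# Monthly heating-season weights HS (values from Table 1).
# ASSUMPTION: the month-to-value assignment below is a reading of Table 1
# and should be verified against Decree no. 372/2001 Coll.
HEATING_SEASON = {1: 0.19, 2: 0.16, 3: 0.14, 4: 0.09, 5: 0.02, 6: 0.0,
                  7: 0.0, 8: 0.0, 9: 0.01, 10: 0.08, 11: 0.14, 12: 0.17}


def build_covariates(day, wind_speed, wind_dir_deg, temp, temp_prev):
    """Return the derived covariates for one day (hypothetical helper).

    day          -- datetime.date of the observation
    wind_speed   -- daily mean wind velocity V_t [m/s]
    wind_dir_deg -- daily mean wind direction D_t [degrees]
    temp         -- air temperature T_t [deg C]
    temp_prev    -- air temperature of the previous day T_{t-1} [deg C]
    """
    d = math.radians(wind_dir_deg)
    return {
        "HS": HEATING_SEASON[day.month],          # heating season HS_t
        "F": 1 if day.weekday() >= 5 else 0,      # weekend indicator F_t
        "V_sin_D": wind_speed * math.sin(d),      # wind projection V_t sin D_t
        "V_cos_D": wind_speed * math.cos(d),      # wind projection V_t cos D_t
        "dT": temp - temp_prev,                   # gradient T_t - T_{t-1}
    }


saturday = build_covariates(date(2005, 1, 1), 2.0, 90.0, -1.0, -3.0)
monday = build_covariates(date(2005, 1, 3), 2.0, 90.0, -1.0, -3.0)
```

Using the two projections instead of the raw angle avoids treating the circular variable D_t as if 359° and 1° were far apart.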
Time series plots of the measured variables over the period in question are shown in Figs. 1–4. As the plots for the individual stations show, the Ap_t series is strongly non-stationary over the monitored period, and its course changes considerably. The data contain sequences of missing values of either Ap_t or the measured covariates. The time series also include disproportionately low or high values of dust formation evaluated by experts as incorrect measurements (Ap_t < 2 µg m⁻³, Ap_t > 400 µg m⁻³). The times of occurrence of these values can be seen in Figs. 1–4. These values were excluded from further analysis.

Fig. 2. Values of meteorological elements and concentrations of dust aerosol in the period 1998–2005 at Židenice station (panels: dust aerosol, temperature, relative humidity, wind velocity, wind direction). Extreme values are marked with an asterisk on the x axis.

Fig. 3. Values of meteorological elements and concentrations of dust aerosol in the period 1998–2005 at Bohunice station (panels: dust aerosol, temperature, relative humidity, wind velocity, wind direction). Extreme values are marked with an asterisk on the x axis.

2.1. Description of localities of pollution measurement

Satellite maps used to describe the location of the studied monitoring stations were downloaded from http://www.mapy.cz on April 3, 2006.
Arboretum is a station located in the botanical garden of Mendel University of Agriculture and Forestry (see Fig. 5). Close by there is a heavy-traffic intersection. The station is situated on the top of a hill. North of the station there are military barracks, south of the station there is the campus of Mendel University of Agriculture and Forestry, east of the station there is residential housing and west of the station there is the botanical garden (arboretum). On the western slope of the hill there is another heavy-traffic intersection, industrial plants (heating plant) and also unused land without greenery. The station is surrounded by areas of reduced humidity.

Židenice station is situated by the boundary of the barracks premises near the heavy-traffic Svatoplukova street, from which it is shielded by a wall about 3 m high (see Fig. 6). North of the station there are warehouses, small manufacturing plants and a railway line. East of the station there are military barracks and west of the station there is residential housing. South of the station there is an intersection with Rokytova street and more residential houses. The station is situated in the Svitava river valley and is surrounded by areas of reduced humidity.

Bohunice is a station located near Lány street on the southern edge of the Bohunice housing estate (see Fig. 7). The station is protected against the effects of the traffic in Lány street by two rows of houses and mature vegetation. North of the station there is the Bohunice housing estate. South of the station there are gardens and the campus of the Secondary Gardening School, and behind it there is a railway line and the D1 highway (350 m from the station). Characteristic features of this area include relatively large stretches of unused fields. The station is located on a moderate south-oriented slope. The locality is surrounded by areas of reduced humidity. There is an increased chance of dynamic turbulence around the station.
Zvonařka station is installed by the heavy-traffic Opuštěná street in an area heavily loaded with traffic and manufacturing plants (see Fig. 8). North of the station there is the Vaňkovka center, the repair plant of the bus terminal. West of the station there is a stretch of land so far unused. South and east of the station there is a parking lot and the Zvonařka bus station. Farther away there is the freight train station and more heavy-traffic roads. The station is situated in the Svratka river valley and is surrounded by areas of reduced humidity.

Fig. 4. Values of meteorological elements and concentrations of dust aerosol in the period 1998–2005 at Zvonařka station (panels: dust aerosol, temperature, relative humidity, wind velocity, wind direction). Extreme values are marked with an asterisk on the x axis.

As can be seen from the plots of the time series for the individual stations in Figs. 1–4, the level of dust formation changed considerably within the inspected period. The main cause of the change in June 2003, clearly seen at Arboretum and Židenice stations, was a reduction of emissions coming from the peak sources (the heating plants Brno-North and Červený Mlýn) and a traffic relief of the locality by the Arboretum station caused by the opening of the Husovice tunnel. For that reason the time series of the Arboretum and the Židenice stations were divided into two sections, from January 1, 1998 until May 31, 2003 and from June 1, 2003 until December 31, 2005. For the sake of comparison the same division was applied to the time series of the Bohunice and the Zvonařka stations.

Fig. 5. Position of Arboretum station. Two long white arrows indicate wind directions of maximum pollutant concentration oriented to the station and estimated by the presented model. The solid and dashed lines represent the estimated wind direction for the first (76°) and the second (88°) time section, respectively.

Fig. 6. Position of Židenice station. Two long white arrows indicate wind directions of maximum pollutant concentration oriented to the station and estimated by the presented model. The solid and dashed lines represent the estimated wind direction for the first (101°) and the second (72°) time section, respectively.

3. Model

The following analysis of pollution with dust aerosol Ap_t [µg m⁻³] is based on a generalized autoregressive linear model, GALM (Fahrmeir and Tutz, 1994). The conditional density of the response variable Ap_t was supposed to be the density of a gamma distribution. The choice of the gamma distribution was justified by histograms, Q–Q plots and χ² goodness-of-fit tests of the response variable Ap_t. The values of the response Ap_t were divided into clusters formed by similar values of the covariates. For both sections of every station, the k-means cluster analysis (Johnson and Wichern, 1992) with 12 clusters was performed on the covariates T_t, H_t, V_t sin D_t, V_t cos D_t, HS_t, F_t. The results of the goodness-of-fit tests of the gamma distribution are given in Table 2. For illustration, the histograms and Q–Q plots of the response Ap_t in three of the 12 clusters at Arboretum station, the first section, are shown in Fig. 9. As can be seen in Table 2, in most clusters the χ² goodness-of-fit test does not reject the null hypothesis that the response is gamma distributed at the 5% significance level. Nevertheless, there are some clusters for which the null hypothesis has been rejected.
However, recall that the covariate values within a cluster are not equal and that the test is only approximate. Deviations from the gamma distribution, which are visible in the histograms and even more in the Q–Q plots, can again be explained by the sensitivity of the tail values of Ap_t to the values of the covariates. In Hörmann et al. (2005) a linear regression model for √Ap_t with normal distribution of the error term was used, which also supports the hypothesis that the measurements of Ap_t could be gamma distributed. The slight discrepancies from the gamma distribution were one of the reasons to consider a non-canonical link in the GALM, as described in the next paragraph.

Fig. 7. Position of Bohunice station. Two long white arrows indicate wind directions of maximum pollutant concentration oriented to the station and estimated by the presented model. The solid and dashed lines represent the estimated wind direction for the first (78°) and the second (41°) time section, respectively.

Fig. 8. Position of Zvonařka station. Two long white arrows indicate wind directions of maximum pollutant concentration oriented to the station and estimated by the presented model. The solid and dashed lines represent the estimated wind direction for the first (101°) and the second (92°) time section, respectively.

The gamma distribution is a member of the exponential family and thus the GALM can be considered. The canonical link function for the gamma distribution is the reciprocal function. Another important link function for the gamma distribution is the log-link function (see Fahrmeir and Tutz, 1994, p. 23). The log-link can be used to improve the fit when the distribution of the response variable shows discrepancies from the gamma distribution in the tails, as was seen in Fig. 9. Note that in Li et al.
(1999) the logarithm transformation of the hourly PM10 was chosen to make the frequency distribution of the response variable in the spatial-temporal model of PM10 in Vancouver approximately normal. The log-transformed daily average PM10 concentration values were also considered as responses in the regression models in Chaloulakou et al. (2003). The canonical link and the log-link have been used in the further analysis.

Table 2. Results of χ² goodness-of-fit tests of the gamma distribution of Ap_t in clusters C1, …, C12 identified by k-means cluster analysis performed on the covariates T_t, H_t, V_t sin D_t, V_t cos D_t, HS_t, F_t

         Arboretum                        Židenice                     Bohunice                         Zvonařka
Cluster  Section 1        Section 2       Section 1     Section 2      Section 1      Section 2         Section 1       Section 2
C1       4.585 (75)       1.925 (70)      4.069 (96)    2.668 (30)     10.447 (155)   2.994 (91)        5.989 (177)     10.034* (63)
C2       0.519 (144)      8.955 (115)     3.749 (92)    2.179 (37)     10.623 (140)   3.361 (59)        1.316 (104)     3.540 (53)
C3       6.639 (168)      16.532* (97)    0.230 (106)   4.062 (65)     6.230 (136)    5.049 (41)        3.447 (95)      2.420 (39)
C4       3.488 (174)      19.157** (66)   5.757 (103)   5.259 (79)     8.062 (114)    9.471 (84)        7.933 (138)     4.142 (59)
C5       7.655 (145)      6.609 (86)      9.581 (130)   6.372 (103)    5.549 (79)     4.837 (40)        2.202 (139)     2.745 (52)
C6       13.200* (233)    3.209 (101)     1.091 (70)    9.616* (53)    5.644 (60)     5.764 (87)        5.967 (118)     1.244 (96)
C7       10.653 (125)     6.778 (87)      0.013 (28)    6.365 (83)     10.208 (188)   19.367** (115)    1.121 (167)     4.854 (79)
C8       2.736 (183)      3.315 (44)      4.450 (140)   2.779 (111)    4.408 (138)    3.465 (76)        1.830 (99)      0.110 (14)
C9       2.176 (162)      1.663 (45)      2.260 (42)    3.941 (41)     3.303 (113)    3.902 (68)        20.071** (148)  1.229 (53)
C10      7.145 (94)       2.980 (42)      6.457 (128)   3.979 (74)     7.446 (149)    5.580 (55)        8.584 (92)      6.872 (117)
C11      8.924 (126)      2.724 (57)      5.801 (132)   4.685 (74)     2.810 (64)     30.006*** (126)   3.421 (168)     9.570 (107)
C12      26.719*** (197)  0.431 (92)      1.915 (76)    2.443 (53)     3.339 (137)    13.452*** (56)    6.392 (221)     9.170 (96)

Each cell consists of the observed test statistic and the size of the cluster (in parentheses). Asterisks indicate the p-value of the test (*p < 0.05, **p < 0.01, ***p < 0.001).

Fig. 9. Histograms with fitted probability density functions of the gamma distribution (number of observations against pollution with dust aerosol [µg m⁻³]) and Q–Q plots (empirical against theoretical quantiles) for the response Ap_t in three of the 12 clusters identified by k-means cluster analysis performed on the covariates T_t, H_t, V_t sin D_t, V_t cos D_t, HS_t, F_t for Arboretum station, the first section. Clusters were chosen to illustrate different results: good fit in Cluster 3 (χ² = 6.639, df = 5), medium fit in Cluster 10 (χ² = 7.145, df = 6) and worse fit in Cluster 12 (χ² = 26.719***, df = 4).

Before a model description, it is necessary to choose covariates for a linear predictor. In the first place the wind direction D_t, measured as an oriented angle between the vector pointing north of the station and the observed wind direction vector pointing to the station, has a circular distribution and thus is not suitable to be used directly as a covariate in the linear predictor. For the sake of interpretation the wind velocity V_t should not be considered separately from the wind direction D_t. Therefore the projections V_t sin D_t and V_t cos D_t are included in the model. Similarly to Somerville et al.
(1996), reparametrization of the linear predictor with V_t sin D_t and V_t cos D_t provides an estimate of the angle S [°] between the vector pointing north of the station and the vector pointing from the direction of maximum pollutant concentration to the station. Details on this reparametrization are given later in this section. If significant, one or more relevant angles S can also be identified by the method of overcomplete frames (Chen et al., 1998; Veselý and Tonner, 2006). Such an approach to the data studied in this paper was applied in Veselý et al. (2006).

The covariates chosen for our GALM model were then H_t, V_t sin D_t, V_t cos D_t together with the categorical covariates HS_t, F_t. Furthermore, the autoregressive variables with lag one, Ap_{t−1} and H_{t−1}, were included. For the model with log-link function the variable ln(Ap_{t−1}) was used instead of Ap_{t−1}. The variable HS_t clearly reflects the mean trend of the air temperature T_t. Therefore, to eliminate collinearity in the model, the air temperature gradient (T_t − T_{t−1}) was chosen for the model instead of the two separate covariates T_t and T_{t−1}. The considered GALM with log-link function is then expressed by the equation

\[
\ln(\mu_t) = \beta_0 + \beta_1 \ln(Ap_{t-1}) + \beta_2 (T_t - T_{t-1}) + \beta_3 H_t/100 + \beta_4 V_t \sin D_t + \beta_5 V_t \cos D_t + \beta_6 H_{t-1}/100 + \beta_7 HS_t + \beta_8 F_t, \tag{1}
\]

where β_0, …, β_8 are unknown parameters which have to be estimated and μ_t = E(Ap_t | S_t) stands for the conditional expectation of the response variable; here S_t is the set of variables used on the right-hand side of the model Eq. (1). Thus S_t consists of the variables T_t − T_{t−1}, H_t, V_t sin D_t, V_t cos D_t, Ap_{t−1}, H_{t−1}, HS_t, F_t. The covariates H_t/100 and H_{t−1}/100 are considered instead of H_t and H_{t−1}, respectively, to obtain estimates of the corresponding regression parameters of an order similar to the order of the other regression parameters.

Table 3. Comparison of GALM with canonical link (M1) and GALM with log-link (M2) by deviance D and Pearson χ² statistic for Anscombe residuals χ²ans

              Arboretum            Židenice             Bohunice             Zvonařka
              Section 1  Section 2  Section 1  Section 2  Section 1  Section 2  Section 1  Section 2
D in M1       199.008    61.558     75.205     69.088     114.908    77.812     148.128    46.653
D in M2       192.365    61.828     72.036     65.865     109.490    75.525     137.099    45.484
χ²ans in M1   197.213    61.165     74.784     67.957     114.190    77.254     147.211    46.489
χ²ans in M2   190.626    61.444     71.645     64.918     108.802    74.848     136.267    45.323

Table 4. Parameter estimates together with their standard deviations (in parentheses) for the model (1)

                 Arboretum                      Židenice                       Bohunice                       Zvonařka
                 Section 1      Section 2       Section 1      Section 2       Section 1      Section 2       Section 1      Section 2
const.           2.079 (0.091)  1.772 (0.115)   2.086 (0.117)  2.715 (0.196)   1.799 (0.097)  1.535 (0.153)   2.181 (0.092)  2.359 (0.133)
ln(Ap_{t−1})     0.603 (0.018)  0.579 (0.027)   0.579 (0.023)  0.457 (0.036)   0.592 (0.021)  0.718 (0.027)   0.545 (0.020)  0.477 (0.030)
T_t − T_{t−1}    0.014 (0.004)                  0.008 (0.004)  0.020 (0.006)   0.007 (0.003)  0.024 (0.006)   0.010 (0.004)  0.010 (0.004)
H_t/100          0.585 (0.070)  0.436 (0.084)   0.459 (0.070)  0.905 (0.129)                                  0.443 (0.062)  0.465 (0.076)
V_t sin D_t      0.090 (0.009)  0.077 (0.013)   0.049 (0.020)  0.089 (0.052)   0.058 (0.005)  0.038 (0.016)   0.051 (0.007)  0.056 (0.018)
V_t cos D_t      0.023 (0.009)  0.002 (0.021)   0.010 (0.012)  0.029 (0.018)   0.012 (0.006)  0.045 (0.030)   0.010 (0.008)  0.001 (0.029)
H_{t−1}/100                                                                    0.304 (0.069)  0.523 (0.125)
HS_t             0.938 (0.122)  1.045 (0.155)   0.617 (0.122)  2.021 (0.219)   1.107 (0.124)  0.945 (0.202)   1.306 (0.120)  1.373 (0.151)
F_t              0.125 (0.018)  0.128 (0.021)   0.147 (0.017)  0.225 (0.028)   0.038 (0.017)  0.102 (0.028)   0.163 (0.016)  0.125 (0.019)
β_45             0.093          0.077           0.050          0.093           0.060          0.059           0.052          0.056
S [°]            76             88              101            72              78             41              101            92
D                192.365        61.828          72.036         65.865          109.490        85.565          137.099        45.484
χ²ans            190.626        61.444          71.645         64.918          108.802        84.418          136.267        45.323
W                0.006          6.245           1.658          0.003           2.973          0.694           3.630          0.000
f                1805           892             1127           787             1459           885             1651           813
n                1813           899             1135           795             1467           893             1659           821
χ²0.95           3.841          5.991           3.841          3.841           3.841          3.841           3.841          3.841

The table is completed by the estimated angle S, which identifies the direction of maximum pollutant concentration, and the recalculated regression parameter β_45 at V_t cos(D_t − S). The goodness-of-fit statistics D and χ²ans and the Wald statistic (W) follow, together with their degrees of freedom (f), the length of the modeled time series (n) and the corresponding χ²0.95 quantile for each measured series.

The component of the linear predictor

\[
\beta_4 V_t \sin D_t + \beta_5 V_t \cos D_t \tag{2}
\]

can be reparametrized by the sum formula to β_45 V_t cos(D_t − S), where β_45 and S are unknown parameters and their estimates can be obtained from β_4 and β_5. Expanding the cosine gives β_4 = β_45 sin S and β_5 = β_45 cos S, so that β_45 = (β_4² + β_5²)^{1/2} and tan S = β_4/β_5. As pointed out in Somerville et al. (1996), S identifies the angle of the direction of maximum pollutant concentration. Because the GALM with log-link function performed better than the GALM with canonical link on the given data, the equation for the GALM with canonical link is omitted. Hereinafter the analysis concentrates on the model with log-link.

4. Parameter identification and model verification

The parameters β_0, …, β_8 of model (1) were estimated by the maximum likelihood method (Fahrmeir and Tutz, 1994) adapted for GALM with gamma conditionally distributed response.
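As a rough illustration of what such a maximum likelihood fit involves, the sketch below fits a gamma model with log-link by iteratively reweighted least squares (Fisher scoring) in plain Python. It is an assumption-laden stand-in, not the authors' implementation: the function names `solve_linear` and `fit_gamma_log` are hypothetical, the dispersion parameter is not estimated (the coefficient estimates do not depend on it), and the lagged terms ln(Ap_{t−1}) and H_{t−1}/100 would simply enter as further columns of the design matrix.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_gamma_log(design, response, iterations=25):
    """Fisher scoring for a gamma GLM with log link (hypothetical helper).

    design   -- list of rows; each row starts with 1.0 for the intercept
    response -- list of strictly positive observations
    """
    p = len(design[0])
    eta = [math.log(y) for y in response]  # start from mu = y
    beta = [0.0] * p
    for _ in range(iterations):
        mu = [math.exp(e) for e in eta]
        # Working response z = eta + (y - mu) / mu.
        z = [e + (y - m) / m for e, y, m in zip(eta, response, mu)]
        # With variance proportional to mu^2 and a log link the working
        # weights are constant, so each step is an ordinary least squares
        # regression of z on the design matrix.
        xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)]
               for i in range(p)]
        xtz = [sum(r[i] * zi for r, zi in zip(design, z)) for i in range(p)]
        beta = solve_linear(xtx, xtz)
        eta = [sum(b * x for b, x in zip(beta, r)) for r in design]
    return beta


# Noise-free toy data generated from ln(mu) = 1.0 + 0.5 x.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
toy_design = [[1.0, x] for x in xs]
toy_response = [math.exp(1.0 + 0.5 * x) for x in xs]
toy_beta = fit_gamma_log(toy_design, toy_response)
```

On the noise-free toy data the sketch recovers the generating coefficients; on real data a convergence check on the change in the coefficients would be a sensible addition.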
Numerical calculations were implemented in the MATLAB software package, and the procedure glmfit from the Statistics Toolbox was used for fitting the model. The choice of the best submodel of the studied model was performed by stepwise backward selection using the Wald statistic (see Fahrmeir and Tutz, 1994, pp. 122–123). This way the variables in the submodel of the model (1) with the best prediction ability were identified. Then the final best submodel was tested against the maximal model using the Wald statistic to verify whether the best selected submodel is acceptable. The 5% significance level has been used throughout the analysis.

Fig. 10. Observed and predicted values of dust aerosol Ap_t and corresponding plot of Anscombe residuals at Arboretum station, the first section.

Fig. 11. Observed and predicted values of dust aerosol Ap_t and corresponding plot of Anscombe residuals at Arboretum station, the second section.

The model verification was based on the analysis of residuals and goodness-of-fit tests. The values μ_t were estimated by μ̂_t using the best submodel corresponding to the model (1) with estimated parameters. Then the observed values of Ap_t together with the estimated trend μ̂_t were plotted. Later the Anscombe residuals (see McCullagh and Nelder, 1989, Section 2.4.2) were calculated and plotted. Further, the Pearson χ² statistic for Anscombe residuals given by

\[
\chi^2_{ans} = \sum_{i=1}^{n} \frac{9 \left( Y_i^{1/3} - \hat{\mu}_i^{1/3} \right)^2}{\hat{\mu}_i^{2/3}} \tag{3}
\]

and the deviance D given by

\[
D = 2 \sum_{i=1}^{n} \left\{ -\ln\!\left( Y_i / \hat{\mu}_i \right) + \frac{Y_i - \hat{\mu}_i}{\hat{\mu}_i} \right\} \tag{4}
\]

were calculated to compare different models, where Y_i denotes the i-th observed value of Ap_t and μ̂_i its estimated mean. The values of these two goodness-of-fit statistics enabled us to choose the best model and led us to prefer the GALM with log-link (1) to the GALM with canonical link.
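The two goodness-of-fit statistics (3) and (4) are simple to evaluate once observed and fitted values are available. The following sketch, with hypothetical function names, computes both for the gamma case in their unscaled form, that is, without dividing by a dispersion estimate.

```python
import math


def anscombe_chi2(observed, fitted):
    """Pearson chi-square statistic for Anscombe residuals, Eq. (3)."""
    return sum(
        9.0 * (y ** (1.0 / 3.0) - m ** (1.0 / 3.0)) ** 2 / m ** (2.0 / 3.0)
        for y, m in zip(observed, fitted)
    )


def gamma_deviance(observed, fitted):
    """Unscaled deviance of a gamma model, Eq. (4)."""
    return 2.0 * sum(
        -math.log(y / m) + (y - m) / m for y, m in zip(observed, fitted)
    )


# Illustrative values only (not data from the paper).
obs = [42.0, 55.0, 31.0]
fit = [40.0, 60.0, 30.0]
chi2_value = anscombe_chi2(obs, fit)
deviance_value = gamma_deviance(obs, fit)
```

Both statistics are zero when every fitted value equals the observation and grow as the fit deteriorates, which is why smaller values in Table 3 indicate the preferred link.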
5. Results

First, the GALMs with canonical link and with log-link were compared. Observed values of the goodness-of-fit statistics $\chi^2_{ans}$ (3) and $D$ (4) for both models are given in Table 3. For the GALM with log-link, smaller values of both statistics were achieved in all cases except for the second section at Arboretum station. Note that this conclusion corresponds with the results in Veselý et al. (2007), where forecasting ability was the main criterion. Therefore the GALM with log-link was examined further. Table 4 includes parameter estimates and their standard deviations for the selected best submodels. Finally, to assess the model suitability, measured values were displayed in graphs together with the values predicted from the selected submodel. The plot of Anscombe residuals was also inspected. The results are included in Figs. 10–17.

Fig. 12. Observed and predicted values of dust aerosol $Ap_t$ and corresponding plot of Anscombe residuals at Židenice station, the first section.

Fig. 13. Observed and predicted values of dust aerosol $Ap_t$ and corresponding plot of Anscombe residuals at Židenice station, the second section. Extreme observed values ($Ap_t > 300$ µg m⁻³) and extreme residuals ($Ap_t - \hat{\mu}_t > 2$ µg m⁻³) are marked with plus ("+") at the bottom of the graph and asterisk ("*") at the top of the graph, respectively.

Fig. 14. Observed and predicted values of dust aerosol $Ap_t$ and corresponding plot of Anscombe residuals at Bohunice station, the first section.

Fig. 15. Observed and predicted values of dust aerosol $Ap_t$ and corresponding plot of Anscombe residuals at Bohunice station, the second section. Extreme residuals ($Ap_t - \hat{\mu}_t > 2$ µg m⁻³) are marked with asterisks ("*") at the top of the graph.

Fig. 16. Observed and predicted values of dust aerosol $Ap_t$ and corresponding plot of Anscombe residuals at Zvonařka station, the first section.

Fig. 17. Observed and predicted values of dust aerosol $Ap_t$ and corresponding plot of Anscombe residuals at Zvonařka station, the second section.

The models show common correlation features. First, the results show that the dust formation of the previous day is a significant predictor and contributes to the dust formation of the current day. Further, a positive effect of the increase of air temperature from the previous day was found at all stations except for the second section at Arboretum station. According to Li et al. (1999), high temperature increases the activity of particles. This effect may also be related to the fact that precipitation, which subsequently reduces the dust formation, is often associated with a decrease of the air temperature. With the exception of the Bohunice station, the current relative humidity proved to reduce the dust formation value. The delayed influence of relative humidity at the Bohunice station might be explained by the location of the station: unlike the other stations, it is not beside a busy road. A negative correlation between PM10 and the relative humidity was also observed in summer and springtime of
the period from 1999 until 2003 at some monitoring sites in Egypt (Elminir, 2007). A positive effect of the heating period manifested itself at all stations. Similarly to the results of an analysis of PM10 in Vancouver conducted in Li et al. (1999), a negative effect of the weekend variable was consistently indicated at all stations.

As noted in Li et al. (1999), air movement may transport and redistribute PM10. In accordance with this fact, the influence of the wind vector was statistically significant at all stations. The angles S of the direction of maximum pollutant concentration, estimated by the reparametrization of the linear predictor (2), are given in Table 4. In Figs. 5–8 the identified angles are displayed with solid and dashed lines for the first and the second time section, respectively. The identified wind directions of maximum pollutant concentration correspond very well with both local and more remote sources of emissions at all stations.

Table 5
Relative frequencies summarizing performance of model (1) in prediction of days with PM10 greater than 50 µg m⁻³

Station     Section      n   o_high   c_high   c_low   c_overall
Arboretum         1   1813    0.524    0.856   0.647       0.756
Arboretum         2    899    0.091    0.366   0.979       0.923
Židenice          1   1135    0.650    0.928   0.529       0.789
Židenice          2    795    0.546    0.899   0.582       0.755
Bohunice          1   1467    0.396    0.781   0.822       0.806
Bohunice          2    893    0.380    0.808   0.773       0.786
Zvonařka          1   1659    0.700    0.930   0.470       0.792
Zvonařka          2    821    0.516    0.816   0.647       0.734

Value o_high is the relative frequency of $Ap_t \ge 50$ in the modeled time series of length n. Values c_high and c_low stand for the relative frequencies of correct prediction ($\hat{\mu}_t \ge 50$ when $Ap_t \ge 50$, and $\hat{\mu}_t < 50$ when $Ap_t < 50$, respectively). Finally, c_overall = o_high · c_high + (1 − o_high) · c_low is the overall frequency of correct prediction.
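The definitions given with Table 5 can be checked with a few lines of Python. The sketch below is only an illustration: the function names are hypothetical, the tabulated values are copied from Table 5, and the four-day example series is invented. It assumes that c_high and c_low are conditional on the observed class, which is what the formula for c_overall implies.

```python
LIMIT = 50.0  # PM10 limit value used in Table 5

# (station, section, o_high, c_high, c_low, reported c_overall), from Table 5.
TABLE5 = [
    ("Arboretum", 1, 0.524, 0.856, 0.647, 0.756),
    ("Arboretum", 2, 0.091, 0.366, 0.979, 0.923),
    ("Židenice", 1, 0.650, 0.928, 0.529, 0.789),
    ("Židenice", 2, 0.546, 0.899, 0.582, 0.755),
    ("Bohunice", 1, 0.396, 0.781, 0.822, 0.806),
    ("Bohunice", 2, 0.380, 0.808, 0.773, 0.786),
    ("Zvonařka", 1, 0.700, 0.930, 0.470, 0.792),
    ("Zvonařka", 2, 0.516, 0.816, 0.647, 0.734),
]


def overall_correct(o_high, c_high, c_low):
    """c_overall = o_high * c_high + (1 - o_high) * c_low."""
    return o_high * c_high + (1.0 - o_high) * c_low


def hit_rates(observed, predicted, limit=LIMIT):
    """Return (o_high, c_high, c_low); assumes both classes are non-empty."""
    high = [p for o, p in zip(observed, predicted) if o >= limit]
    low = [p for o, p in zip(observed, predicted) if o < limit]
    o_high = len(high) / len(observed)
    c_high = sum(p >= limit for p in high) / len(high)
    c_low = sum(p < limit for p in low) / len(low)
    return o_high, c_high, c_low


# Recompute the last column of Table 5 from the other three.
for station, section, o_high, c_high, c_low, reported in TABLE5:
    print(station, section, round(overall_correct(o_high, c_high, c_low), 3), reported)

# Invented four-day series: two days above the limit, two below.
example_rates = hit_rates([60.0, 40.0, 55.0, 30.0], [52.0, 45.0, 48.0, 51.0])
example_overall = overall_correct(*example_rates)
```

The recomputed values agree with the reported c_overall column up to rounding of the three-decimal inputs. In the invented example, c_overall equals the plain fraction of days classified on the correct side of the limit.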
At the Arboretum station, the main sources of emissions are an intersection with heavy traffic, relatively large roof surfaces and insufficiently maintained hard surfaces in the military facility area. In terms of PM10 immissions, the Židenice station is clearly influenced by a major road connecting the town center and the Židenice district. Dust particles from the direction identified in the first section come from an adjacent military area. The direction estimated for the second section corresponds with a more remote main marshalling yard. At the Bohunice station, the identified directions of flow bring dust particles from agricultural land and a remote residential area. The Zvonařka station is situated immediately next to a busy road, and the estimated wind directions correspond well with this fact. Thus, as noted in Li et al. (1999), pollution generated by traffic seems to be the main source of ambient PM10 concentrations, although local point sources may contribute at some stations.

Finally, we believe that the model can be used to identify high levels of dust formation in the considered areas. According to the currently valid legislation in the Czech Republic and the directives of the European Union, the limiting value of PM10 is 50 µg m⁻³. Table 5 summarizes the ability of model (1) to predict days with a critical value of PM10 in terms of measures similar to those considered in Stadlober et al. (2008). It is necessary to keep in mind that only a mean value (the estimated trend $\hat{\mu}_t$), not an extreme value of PM10, is being predicted. From this point of view the prediction ability of the model is fairly satisfactory. Note that the predictions for a day $t$ are based on the factors measured on days $t-1$ and $t$. Prediction of $Ap_t$ based on the measurements on day $t-1$ and forecasts of the factors for day $t$, as presented in Stadlober et al.
(2008), are beyond the scope of this paper and will be considered in future research.

Acknowledgement

The article was written in the context of implementation of the long-term research projects no. MSM0021622418 and no. 1M06047. The paper was completed during the first author's postdoctoral appointment at the University of British Columbia Okanagan supported by the Pacific Institute for the Mathematical Sciences.

References

Chaloulakou, A., Kassomenos, P., Spyrellis, N., Demokritou, P., Koutrakis, P., 2003. Measurements of PM10 and PM2.5 particle concentrations in Athens, Greece. Atmospheric Environment 37 (5), 649–660.
Chen, S.S., Donoho, D.L., Saunders, M.A., 1998. Atomic decomposition by basis pursuit. SIAM Journal on Scientific Computing 20 (1), 33–61.
Chen, L., Mengersen, K.L., Tong, S., 2007. Spatiotemporal relationship between particle air pollution and respiratory emergency hospital admissions in Brisbane, Australia. Science of the Total Environment 373 (1), 57–67.
Council Directive 1996/62/EC of 27 September 1996 on ambient air quality assessment and management. Official Journal L296, 21/11/1996, 55–63.
Council Directive 1999/30/EC of 22 April 1999 relating to limit values for sulphur dioxide, nitrogen dioxide and oxides of nitrogen, particulate matter and lead in ambient air. Official Journal of the European Communities L163, 41–60.
Elminir, H.K., 2007. Relative influence of air pollutants and weather conditions on solar radiation – Part 1: relationship of air pollutants with weather conditions. Meteorology and Atmospheric Physics 96 (3–4), 245–256.
Fahrmeir, L., Tutz, G., 1994. Multivariate Statistical Modelling Based on Generalized Linear Models. Springer-Verlag, New York.
Franchini, M., Mannucci, P.M., 2007. Short-term effects of air pollution on cardiovascular diseases: outcomes and mechanisms. Journal of Thrombosis and Haemostasis 5 (11), 2169–2174.
Hörmann, S., Pfeiler, B., Stadlober, E., 2005.
Analysis and Prediction of Particulate Matter PM10 for the Winter Season in Graz. Austrian Journal of Statistics 34 (4), 307–326.
Hrdličková, Z., Kolář, M., Michálek, J., Veselý, V., 2006. The Statistical Analysis of Air Pollution by Suspended Particulate Matter in Brno. Program and Abstracts, The Seventeenth International Conference on Qualitative Methods for the Environmental Sciences, TIES 2006, Kalmar, Sweden, 18.–22.6.2006.
Johnson, R.A., Wichern, D.W., 1992. Applied Multivariate Statistical Analysis, third ed. Prentice-Hall, New Jersey.
Li, K.H., Le, N.D., Sun, L., Zidek, J.V., 1999. Spatial-temporal models for ambient hourly PM10 in Vancouver. Environmetrics 10 (3), 321–338.
McCullagh, P., Nelder, J.A., 1989. Generalized Linear Models, second ed. Chapman and Hall, New York.
Press Release MEMO/07/571. Questions and Answers on the new directive on ambient air quality and cleaner air for Europe. 12/12/2007. http://europa.eu/rapid/pressReleasesAction.do?reference=MEMO/07/571.
Qian, Z., He, Q., Lin, H., Kong, L., Liao, D., Dan, J., Bentley, C.M., Wang, B., 2007. Association of daily cause-specific mortality with ambient particle air pollution in Wuhan, China. Environmental Research 105 (3), 380–389.
Somerville, M.C., Mukerjee, S., Fox, D.L., 1996. Estimating the wind direction of maximum air pollutant concentration. Environmetrics 7 (2), 231–243.
Stadlober, E., Hörmann, S., Pfeiler, B., 2008. Quality and performance of a PM10 daily forecasting model. Atmospheric Environment 42 (6), 1098–1109.
Veselý, V., Tonner, J., 2006. Sparse parameter estimation in overcomplete time series models. Austrian Journal of Statistics 35 (2, 3), 371–378.
Veselý, V., Tonner, J., Michálek, J., Kolář, M., 2006. Air Pollution Analysis Based on Sparse Estimates from an Overcomplete Model. Program and Abstracts, The Seventeenth International Conference on Qualitative Methods for the Environmental Sciences, TIES 2006, Kalmar, Sweden, 18.–22.6.2006.
Veselý, V., Tonner, J., Hrdličková, Z., Michálek, J., Kolář, M., 2007. Analysis of PM10 air pollution in Brno based on generalized linear model with strongly rank-deficient design matrix. In: Book of Abstracts TIES 2007, August 16–20, 2007, 18th Annual Meeting of the International Environmetrics Society. TIES, Brno, Ceská republika, ISBN 978-80-2104333-6, 2007, p. 118. 16.8.2007, Mikulov, Czech Republic.