Air Quality in Ontario:
Data visualization for variables in air quality

ABSTRACT
Air pollution is a central topic in public health science because air pollutants have a direct impact on the human body. In contrast to traditional, purely informative data visualization graphs, we propose user-friendly graphical visualizations that present (1) a comprehensive scatter-density plot of Ontario based on fine particulate matter and nitrogen dioxide, in which pollution measures for major cities are plotted as point estimates, (2) a line curve that shows the trend of air pollution compared with past years, using a GAMM (Generalized Additive Mixed Model) on the Hamilton area, (3) a new interval plot that shows the distribution of air pollution in past years for the top 7 ranked cities, with an innovative impact formula that accounts for relative population differences between cities, and (4) a monthly comparison of geospatial maps of individual regions in Ontario for the varying air pollutants.

Keywords: air quality, Ontario, regions, PM2.5, NO2, NO, data visualization

INTRODUCTION

Urban air pollution is a major public health concern in Ontario. Atmospheric pollutants such as nitrogen dioxide (NO2), fine particulate matter (PM2.5) and nitrogen monoxide (NO) have a considerable impact on respiratory diseases (e.g. asthma) and are known to cause lung cancer when one is exposed for substantially long periods of time (Tsung-Te Lai et al., 2011). From a public environmental perspective, air pollutants are responsible for ecological problems such as acid rain and ozone layer depletion, which in turn have a hazardous influence on human conditions. Given the importance of monitoring air pollution in Ontario, the categorized hazardous air pollutants NO2, PM2.5 and NO are monitored by the Ministry of Environment, Conservation and Parks, which archives the data with highly sensitive and accurate measurements (Burke et al., 2006).
In recent years, publicly collected data has been increasingly used to build static graphs for data visualization, supporting comprehensive understanding with the help of statistical methods (Vardoulakis et al., 2003). However, in many instances the methods are not fully persuasive, and the changing trends and visual representations are difficult for the public to follow (Matějíček et al., 2006). This leads to some natural questions: how does air quality affect the regions, and what is a natural way for people to learn the overall air quality of regions in Ontario in comparison with some severe pollutants? To answer these questions with relatively intuitive understanding and a simple visual representation, we propose a scatter-density plot with point estimates, in which the measures of one pollutant define a color range that distinguishes the highly polluted groups from the others, accompanied by several small subplots (Li et al., 2016).

For graphical representation of complex data, the scatter plot is used to analyze the relationship between two or more variables and to visualize the correlation between them effectively (Andrienko & Andrienko, 2010). Note that this approach uses one atmospheric pollutant as the criterion variable for evaluating air quality through its degree of impact on the other variables (Pope et al., 2002). The scatter plot is also simple and has long been widely used, so it is familiar to the public, and colors and subplots add attractiveness in portraying the theme of air pollution (Wilke, 2019).

Many statistical analyses rely on a regression approach to show the trend and prediction of the regressed variable against the regressors, emphasizing parametric approaches that are interpretable for public use.
Hence, we apply the generalized additive mixed model, which allows random effects on time series data, to portray the trends of air pollution in the current year compared to past years for the Hamilton region. In this way, the reader can grasp the overall trend from a few line graphs in a single plot. Numerous researchers focus on systematic theories for measuring the impact of air pollution, using various techniques and variables to arrive at accurate predictions of the public health impact (Kraak, 2003). However, such complicated modeling, with its numerical analysis and mathematical complexity, does not give the public interpretability or an overall understanding. Moreover, the methodological implementation of air composition analysis requires complex computations, which are time consuming for many readers. Hence, using a simple equation that takes population numbers into account, this paper proposes a new calculation method that emphasizes how people are affected by air pollutants, and presents the result as a point plot on a geospatial map (Li et al., 2016).

Comparing the distribution of air pollutants across regions in a single graph is intended to provide an understanding of the overall trend, in the spirit of visual exploration of massive movement data (Andrienko & Andrienko, 2010). Although the overall distribution of air pollutants shows how much air pollution differs between regions, it does not answer the question of which periods of time people should be aware of, and warned about, air pollution in particular regions of Ontario. Hence, another framework is proposed that presents time series data of air pollution for various regions of Ontario as geospatial graphs with monthly changes (Li et al., 2016).
These frameworks are expected to let readers explore spatiotemporal patterns and correlations between regions, and so identify differences in air pollution across Ontario (Kveladze et al., 2013). Air pollution analysis and data modeling require complex computations and are time consuming. Moreover, current graphical approaches for air pollution analysis lack comprehensive and sharable multi-perspective visualizations.

The public urgently needs rapid and reliable knowledge on air pollution to make daily decisions. Similarly, most government staff are not specialists and need an intuitive understanding of the conditions before they can take action against increasingly serious air pollutants. Thus, a visual methodology is needed for efficient and reliable exploration, particularly in the case of air pollution data, to improve the depth, readability, and accuracy of data analysis. We were motivated by the idea of visual thinking and the geovisualization concept, which may be well suited for the exploration of air pollution data (Kraak, 2003). By creating its own models and methods, this paper presents a visual exploration method that, to our knowledge, has not been presented in other papers.

METHODS
Plot 1: Scatter plot on 28 Ontario regions
The intention of the visual was to convey aspects of our motivation, so that the audience can sense the same difficulties that we encountered, leaving the viewer with general impressions and helping them keep an open mind when learning about air quality. The plot mainly attends to the appearance of information, although it still has some suggestive features. The positioning, color scale, and labels create effective discrimination, ranking and ratioing.

These elements are framed within the idea of compound plots.
A viewer first interacts with the density and the labels, then recognizes that the color scheme of the points relates to the subplots, which are visibly simple to understand. Notice that the majority of the graphic is light, but the labels are dark. The innovative part of the plot is the color legend, which is defined by a cut at the NO median taken from the box plot of NO emission across all Ontario regions. The cities are divided into two categorical groups by this median, and each group is given its own color on the scatter plot so that the difference is apparent.

Since interest is in Hamilton, a subplot for it is displayed slightly apart from the others. The visual was designed to begin developing the story and to convey the variation in air quality across regions, a view best kept general. The innovative legend (top left subplot) is a distribution interval of the values, grouped by the two variables.

We attempted to leverage the flexibility of ggplot before considering other packages. Annotating all 28 regions risked clutter, so the labels had to be placed effectively (Wilke, 2019). Hence, ggrepel was used to position labels and arrows and to keep the labels from overlapping the points. Across Ontario, three air quality variables were available. To begin giving the audience an understanding that air quality differs, 2 variables suffice. To make the differences in distribution more visible in the scatter plot, the scales are raised to the power of 1/3, which shows the differences in air quality between cities more clearly (Wilke, 2019). The density subplot shows both the ratio of the distributions of the two variables, NO2 and PM2.5, and the regions themselves.

The axis labels receive little emphasis in this graphic, although they are given for reference and to show the authenticity of the information.
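The cube-root scaling and the median cut used to color the points in Plot 1 can be illustrated with a minimal Python sketch. The plots themselves were built with ggplot, so this is not the authors' code; the function names and the region values below are hypothetical and serve only to show the two operations.

```python
from statistics import median


def cube_root_scale(values):
    """Raise each non-negative value to the power 1/3, as done for the plot axes."""
    return [v ** (1.0 / 3.0) for v in values]


def median_cut_groups(no_by_region):
    """Split regions into two color groups at the median of their NO values.

    Regions strictly above the median are labelled "high", the rest "low".
    """
    cut = median(no_by_region.values())
    groups = {
        region: ("high" if value > cut else "low")
        for region, value in no_by_region.items()
    }
    return cut, groups


# Hypothetical NO values for three made-up regions (not the paper's data).
no_means = {"Region A": 0.2, "Region B": 0.5, "Region C": 0.9}
cut, groups = median_cut_groups(no_means)

# Cube-root scaling compresses large values more than small ones.
scaled = cube_root_scale([8.0, 27.0])
```

Because the power 1/3 is monotone, the ranking of regions is unchanged; only the spacing between points is altered, which is what spreads out the crowded low-emission regions.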
To further engage with the ratio, it would be compelling to view what exists when that information is separated. Distribution intervals help to visualize the uncertainty of point estimates. According to Wilke (2019), such a plot can be difficult to interpret when the strips fade to white. As such, intervals were chosen to alleviate that challenge. While this is the manner in which it can be used, we also displayed it in relation to the transparency of points. Half-eye plots provide a different perspective on the distribution of the respective air quality variables. The viewer can freely choose whichever they are most comfortable with.

The graph communicates a story that we can pick up on. However, we have had much more experience with the data and grasp its cues quickly, so a bias toward judging the visual effective is plausible. An argument can be made against this graph that it is visually busy (filled with noise). To better organize the plot, the information could have been split using the traditional and possibly cleaner approach of compound plots. The plot aims to capture the relationship between 2 variables with respect to all available regions.

An alternative view would be a ranked bar graph with flipped coordinates showing the proportional mean values of each city for both variables. This would be quick for viewers to follow but may not be retained, as it would not be memorable. Such a distinction may be best suited for the last graph: since it is usually the most complex, memorability is naturally introduced to capture the information.

Plot 2: Comparison of weekend and weekdays estimates using GAMM (Generalized Additive Mixed Model)
The plot acts as a supplementary visual for Figure 1. The key components are the points with their line and a blurred line. The glow effect on the points draws the viewers' attention to the main trend lines of 2020 to be evaluated.
The blurred line in the background acts as a reference in a non-distracting form. Such a display is suitable when various lines are presented on the same graph. A comparison of 4 lines is best understood by the audience.

Although it is not shown here, more lines, say 6, can be made visible, with the 4 most recent years at the forefront (given points and shadow) and the remaining 2 as references. The reference (blurred) lines are not meant to be interacted with as much as the others. Their importance is minimal, but they can be looked at closely to strengthen one's understanding.

Plot 3: Interval plot on Top 7 regions of Ontario (2020)
We considered extracting the minimal but suggestive properties of the previous graph to build a cohesive story. The aim was still to give the audience a general understanding, but with more interesting and direct implications than the previous graphic. The two colors from the previous plot are used again for consistency. At first it was of interest to find a way to display the distribution of all regions, but we realized this would produce a visual that is difficult for the audience to read and retain information from. So, a ranking of the top 7 cities with respect to the ratios is displayed. Ranks are traditionally displayed via bar graphs.

Taking the average for each region would oversimplify the data in a way that implies concrete information can be perceived by the user. Thus, the focus was on highlighting variation in the data at different levels (variable, across region, within region). Each rank is shown with a combination of box plot and density, with information first given by the gradient distribution interval.
Plot 4: Monthly Univariate geospatial graph on Ontario regions
This graph combines geographical regions (shapes) with a sequential color scale for increasing and decreasing trends of air quality. The color intervals were chosen by quarters (quartiles) so as to reflect a balanced number of visual differences between regions. Information absorbed in this way leads to direct assumptions about the data; thus, the graph is more suggestive than informative. To emphasize that Hamilton is also of interest, a facet zoom was used to showcase Hamilton trends. Viewers may then assess the regions surrounding Hamilton. Note that the Hamilton label is given only once, as repeating it would add no information. Monthly trends can be useful for understanding the impact of changes related to an environment. For instance, when companies release climate action plans, it is reasonable to review whether such goals are being met. Clearly, conditions (variables) will vary slightly, and regions may be narrowed down to smaller communities or buildings.

RESULTS
Plot 1: Scatter plot on 28 Ontario regions
The air quality variables can be categorized as quantitative and qualitative. Station information also exists, giving the name, longitude and latitude of each station. Because of limited access and missing station data for some cities, the main variables considered in the visualization are nitrogen oxides (NO/NO2) and fine particulate matter (PM2.5). To complement the primary data source, we looked at Ontario geospatial data, population data, and traffic volumes (Howell, 2020).

There are many locations in Ontario to observe. The most concerning regions are ones that can be further evaluated. The main question the plot intends to answer is where the regions of Ontario stand in terms of air quality.
The observations of this graphic indicate that Ontario regions tend to be heterogeneous: there is variation across the regions. The density gives reason to believe that some regions can be analyzed together. The area of Ontario is 1,076,395 km2, which is more than ten times that of a nation such as South Korea (100,210 km2). This indicates qualitative differences to account for when making inferences about the 28 regions of Ontario, as different perspectives of air quality coexist across 28 different atmospheres.

However, Hamilton appears to be much different from the others. In addition, there is variation in the air quality variables. Another perspective on what is happening in Ontario, and specifically in Hamilton, would be feasible. Given that Ontario comprises various environments, the visual suggests that northern cities have a lower ratio. In other views, there are some developing rural cities among the low-to-moderate ratios. The implication may be that these represent the standard, which may come from a decrease in either of the 2 variables shown. For the analysis and the graph, the air quality features that characterize the atmosphere of each city are shown with the scatter plot, so that the comparison shows quantitatively how locations differ in air quality.

There are 5 major aspects shown in the plot:
• Mean values of NO2^(1/3) and PM2.5^(1/3) emission for each city: The quantitative points are shown in the scatter plot with text labels for the cities at their positions, together with a qualitative measurement of yearly gas pollution in a 2D topological plot. In this way, the non-parametric rankings of the 28 cities can be compared, and viewers can make measured comparisons between the cities they are interested in.
• Groups within density contours of variation for each city: The 28 cities are plotted by their mean values of yearly NO2^(1/3) and PM2.5^(1/3), and the points mostly fall within 3 different contours. Each city varies across the density contours over its 12-month period, which is plotted by the 2D density graph for the scatter points.
• Categorical clusters of ratios between NO2^(1/3) and PM2.5^(1/3), with the NO median as a guideline for the colored groups: Using the median value of monthly NO emission (0.5) as a cut point, there are two obvious clusters of cities, which have different median points when the ratio PM2.5/NO2 is shown as a distribution in the subplot for comparison.
• Scatter plot of Hamilton gas emission compared to the density plot: Using weekday and weekend means over 12 months, Hamilton's 24 individual scatter points are shown against the general density contour of NO2 and PM2.5 emission for Ontario regions, which shows the variation of the most gas-polluted city, Hamilton.
• Combined legend and NO boxplot: Using weekday and weekend means over 12 months, the boxplot of NO emission in Ontario serves as the legend for the color of the scatter plot's points, giving an obvious categorization into two groups. The median, 0.5, is taken as the cut-off between the two colors and is used to color the 28 cities in the scatter plot.

The data analysis is as follows. From the boxplot analysis, the data are cleaned by omitting outliers and placed on the scatter plot, where points represent locations. The side plot serves as the legend and shows how the color is chosen, using the boxplot median as the cut-off value. In addition, the ratio of the two variables is shown as a density curve, and another scatter plot covers Hamilton alone over the 12-month period.

Plot 2: Comparison of weekend and weekdays estimates using GAMM (Generalized Additive Mixed Model)
The correlation between NO2 and PM2.5 is visualized. Both follow an increasing pattern, at marginally different times.
There appears to be a slight lag for PM2.5. The data also suggest that emissions are of most concern each day when the work (school) day begins. Interestingly, around 2:00 PM to 3:00 PM there is a decrease in both variables. At this time of day, many students are released for the day. Beforehand, many would have presumed that parents were picking up their kids, as they experienced traffic during this time frame. This graph suggests that presumption is potentially biased, and that students are using other, more natural means of commuting home. There has also been an average decrease in both variables from 2018-2019 to 2020, which points to the relevance of the pandemic.

Visually, hourly trends are significant in various settings. First, given the data, the level of each variable can indicate what a person can expect their air quality to be throughout the day. Where adults and children are found to be at more risk from poor air quality, advice can be issued to urge caution. This is how air quality is currently accessed by the general public. Second, an hourly decomposition is useful in relation to the air cleaning devices used in buildings due to the pandemic. As these devices were developed, the developers were interested in evaluating how long it may take to clear hazardous particles from the air. Such periods would presumably be assessed by the minute or by the hour.

Plot 3: Interval plot on Top 7 regions of Ontario (2020)
The distribution of the two variables, NO2 and PM2.5, is now shown with interval plots for the top 7 cities by quantity of NO2 emission in Ontario, as found from the previous scatter plot: Hamilton, Windsor, Toronto, Burlington, Milton, and Hamilton Mountain. Even though the median represents the air quality of each city, the variation is made clear by the distribution shown in the interval plot.
In this plot, the main question is whether the difference in population between rural and urban areas matters for air quality. To address it, we combine the variables into an impact measure, a numerical value that combines NO2, PM2.5 and the population of each city.

These cities also have very high PM2.5 and can be considered dangerous places for citizens to live. Hence, the goal is to obtain an overall impact on the population of each city from two variables that share the same range, using the formula

\[
\text{Impact of each city} = \left( \frac{\text{population}}{500{,}000} \right) \times \left\{ \text{mean}(\mathrm{NO_2}) + \text{mean}(\mathrm{PM_{2.5}}) \right\}
\]

since air quality naturally affects every individual in the city. Note that NO2 and PM2.5 are on the same scale of measurement, as compared in the previous line plot. The denominator of 500,000 is taken as the population cut-off for a metropolitan city, so that impact values are scaled relative to a large city. With this cut-off in the denominator, we can better classify the difference in air pollution impact between urban and rural areas, as seen in the geospatial point plot.

There are 2 major aspects shown in the plot:
• Distribution of the top 7 most air-polluted regions compared within an interval chart for the two variables, NO2 and PM2.5: An interval containing the median, 45%-55%, is highlighted at the center, with the variation shown in separate colors to highlight the different distribution of each city, using the intervals 42.5%-57.5%, 37.5%-62.5%, 25%-75% and 15%-85%.
• Impact measure of the top 7 most air-polluted regions on population, for categorization: By adding the two gas pollution measures for each city and multiplying by the city's population, the influence of the two pollutants is categorized as a measure, and the geospatial point map at the bottom shows the impact of gas pollutants for each city.
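The impact measure described above (population scaled by the 500,000 cut-off, multiplied by the sum of the mean NO2 and mean PM2.5) can be written as a short Python sketch. The function names and the sample numbers are hypothetical and chosen only to make the arithmetic easy to follow; they are not values from the paper's data.

```python
METRO_POPULATION_CUTOFF = 500_000  # metropolitan cut-off stated in the text


def mean(values):
    """Arithmetic mean of a non-empty sequence of measurements."""
    values = list(values)
    if not values:
        raise ValueError("at least one measurement is required")
    return sum(values) / len(values)


def impact(population, no2_values, pm25_values):
    """Population-scaled impact: (population / cut-off) * (mean NO2 + mean PM2.5)."""
    scale = population / METRO_POPULATION_CUTOFF
    return scale * (mean(no2_values) + mean(pm25_values))


# Hypothetical cities with identical pollutant readings but different populations.
large_city = impact(1_000_000, [10.0, 14.0], [6.0, 8.0])  # 2.0 * (12.0 + 7.0)
small_city = impact(250_000, [10.0, 14.0], [6.0, 8.0])    # 0.5 * (12.0 + 7.0)
```

With identical pollutant means, the city with four times the population receives four times the impact value, which is the behavior the formula is designed to produce: a city above the cut-off is scaled up and a city below it is scaled down.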
Regions identified in the scatter plot as having a higher average of PM2.5 and NO2 than other cities tend to have non-unimodal distributions in their distribution plots. PM2.5 is a countable measurement with a peak at each point, whereas NO2 is a continuous measurement with a negatively skewed distribution in all 7 cities. In the categorical classification of the top 7 cities by impact, there is an obvious discrimination in the impact measure due to the multiplier, the population of the city.

While Hamilton has the highest NO2 and PM2.5 air pollution in the scatter plot, by both its yearly mean and its monthly average points, Toronto has the highest impact on its population according to the computed impact value. This clarifies that the impact differs between rural and urban areas. Not only does the air quality around us change with season and month as the air flows, but the population-based impact also makes clear the distinction between urban and rural areas.

Here it is environmentally reasonable to introduce a classification of regions: urban, suburban, and rural. To some extent, therefore, the insights drawn from the Hamilton data can be extended to other rural regions. The geospatial point plot also shows how the impact of hazardous air quality grows with the population living in the city, as 3 variables work together to create the impact values. There is evidence that air quality variables differ across regions, some marginally and others much more significantly.
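The nested intervals listed for the Plot 3 interval chart (45%-55% at the center, widening to 15%-85%) can be computed along the following lines. This is a sketch under assumptions: the percentile rule used here is plain linear interpolation, which may differ from whatever the plotting package applied, and the readings are made-up numbers.

```python
# Central intervals named in the text, from narrowest to widest (in percent).
INTERVALS = [(45.0, 55.0), (42.5, 57.5), (37.5, 62.5), (25.0, 75.0), (15.0, 85.0)]


def percentile(sorted_values, p):
    """Percentile p (0 to 100) of pre-sorted data, by linear interpolation."""
    if not sorted_values:
        raise ValueError("at least one value is required")
    rank = (len(sorted_values) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def nested_intervals(values):
    """Map each (low %, high %) pair to the corresponding pair of data values."""
    data = sorted(values)
    return {
        (low, high): (percentile(data, low), percentile(data, high))
        for low, high in INTERVALS
    }


# Hypothetical readings 0, 1, ..., 100, so each percentile equals its own rank.
readings = [float(v) for v in range(101)]
bands = nested_intervals(readings)
```

Each wider band contains the narrower ones, so drawing them from widest to narrowest in progressively stronger colors gives the layered strip that the interval chart uses to show spread around the median.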
Plot 4: Monthly Univariate geospatial graph on Ontario regions
There are 2 major aspects shown in the plot:
• Monthly variation of NO2 emission for Ontario regions compared on a geospatial map: The Hamilton area is shown in the plot separately from other regions for monthly comparisons. Other regions are also observed to vary across the percentile divisions of NO2 emission, which are 0-25%, 25-50%, 50-75% and 75-100%.
• Monthly variation of NO2 emission for the Hamilton region in a magnified square: Previously, the Hamilton area was found to be the most gas-polluted area. A monthly comparison of mean values clarifies the variation of NO2 air pollutants.

In Ontario, both variables tend to decrease in the summer. This is expected, since plants can absorb emissions. The geographical representation demonstrates that rural regions adjust to that seasonality much more effectively than urban regions, which is again expected since rural regions tend to have more trees and less population. As for Hamilton, it is clear from both NO2 and PM2.5 that not much insight can be gained over the months; rather, the two variables are relatively constant. Given that, other dimensions of time may be more useful.

The central part of Ontario, including the Toronto and Hamilton areas, has the highest NO2 air pollution over the 12-month period. Although other areas have low emission, a constant pattern of highly polluted areas with increases is observed in January, February, March, November, and December. These areas include Burlington, Barrie, Hamilton, and Toronto in winter and spring. This clarifies that air quality changes with the seasons as a natural pattern.
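The four percentile classes used to color the choropleth map (0-25%, 25-50%, 50-75% and 75-100%) can be assigned as in the Python sketch below. It assumes the quartile cut points are computed from the regional values of a single month using the inclusive method; the paper does not state which rule or which pooling was used, and the region names and values are hypothetical.

```python
from bisect import bisect_left
from statistics import quantiles

CLASS_LABELS = ["0-25%", "25-50%", "50-75%", "75-100%"]


def quartile_classes(no2_by_region):
    """Assign each region to one of four percentile classes by its NO2 value.

    A value equal to a cut point falls into the lower of the two classes.
    """
    cuts = quantiles(no2_by_region.values(), n=4, method="inclusive")
    return {
        region: CLASS_LABELS[bisect_left(cuts, value)]
        for region, value in no2_by_region.items()
    }


# Hypothetical NO2 values for one month in five made-up regions.
january = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0, "E": 5.0}
classes = quartile_classes(january)
```

Each class then maps to one step of the sequential color scale, so a region that changes class from month to month shows up as a visible color change between the monthly panels.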
Although NO2 and PM2.5 show a similar pattern of increase and seasonality in the previous plots, most areas with high PM2.5 emission maintain their levels throughout the year. This shows the constant pollution that human activity can create in populated areas, which remain in a high PM2.5 range all year. Most areas identified in the high percentile range (75-100%) also stay in that range for the whole 12-month period. Even though there is variation and a difference compared to the trend of NO2, seasonal change in air quality is inevitable for most regions and can be detected from the pattern of color changes.

Similar to the geospatial graph of NO2 emission, the geospatial graph for Ontario regions shows the same trend for NO air quality over the 12 monthly plots.

DISCUSSION

Air pollutant data visualization is built on very large data, requiring creative and advanced analytical skills to extract useful patterns. Despite the advantage of their large size, the limitations of such datasets include the possibility of missing values, the fact that each concentration may show both high and low values, and the fact that the number of records may differ between stations (Al-Janabi et al., 2020). A bivariate density-scatter plot with the means of the air pollutant variables by region is effective for ranking and comparing regional differences as qualitatively as possible. In addition, by extending the plot with unique subplots and a legend built from another plot showing different intervals, the limitation of the bivariate scatter-density plot is overcome with several perspectives.
While the density-scatter plot offers no statistical approach, the line plot based on the GAMM analysis provides, from a methodological perspective, an hourly trend curve that shows the level of air pollutants as a supplement. This helps to compare yearly differences and presents hourly changes on a line graph, portraying a comprehensive outlook on air pollution in the Hamilton region. Then, the interval chart compares the top 7 ranked air-polluted regions. This helps to identify distributions of pollutants in areas that could not be seen precisely in the bivariate density-scatter plot, and to determine comparative differences among the top 7 polluted areas as quantitatively as possible. Alongside the interval chart, the geospatial point plot of impact, based on the formula we created, provides a thorough understanding of the influence of air pollutants on the population, which the distribution alone could not provide. Since air pollutants matter where people live, the point estimates are meaningful for identifying how much influence air pollutants have in each Ontario region.

Lastly, a geospatial choropleth map presents the 12-month comparison of regions in Ontario, which could not be seen in any of the other plots. This supports visual comparison of seasonal changes, a public health topic that many people wonder about and seek information on. The geospatial plot is powerful and meaningful for the public interest, as color differences and region identification make it an efficient tool for data visualization.

CONCLUSION
In search of appealing visualizations that can be communicated effectively, we find that the feasible graphics are ones that highlight feature(s) of the data.
Air quality in Hamilton, most importantly does suggest 302 distinct assumptions given the period. The most suggestive feature of the data is that there was a decrease of air 303 quality for the year 2020, air quality distributions have multiple modes - implying that there exists sub regions or 304 climates that have an effect, emissions in the winter are of concern since flowers are not as active in absorbing 305 emissions. It is meaningful to the audience to be aware of such factors when defining problems and proposing 306 solutions. To process further development of plots, a closer look at uncomplicated plots would be of interest. 307 Appeal can be added by first be formally evaluated to assess whether the data is communicating more than what is 308 given. If so, then a natural transition to more complicated plots, will benefit from appeal to engage the viewer. 309 310 ACKNOWLEDGMENTS 311 The copyright of plot 1: scatter plot on Ontario region belongs to the authorization of Kyuson.Lim as created by 312 own ideas and combination on density contour, scatter points and text labels. Also, the copyright of plot 2: impact 313 rankings on top 7 Ontario region belongs to the authorization of Kyuson.Lim as created by own ideas and 314 combination on point colors and mathematical formula to be used with. REFERENCES 315 1. 316 317 movement data. The Cartographic Journal, 47(1), 22-40. 2. Al-Janabi, S., Mohammad, M., & Al-Sultan, A. (2020). A new method for prediction of air pollution based 318 319 Andrienko, G., & Andrienko, N. (2010). A general framework for using aggregation in visual exploration of on intelligent computation. Soft Computing, 24(1), 661-680. 3. Burke, J., Estrin, D., Hansen, M., Parker, A., Ramanathan, N., Reddy, S., & Srivastava, M. B. (2006, 320 November). Participatory Sensing. Paper given at 4th ACM Conf. on Embedded Networked Sensor 321 Systems (ACM SenSys 2006). In Workshop on World-Sensor-Web (WSW'2006), Boulder, CO. 322 4. Howell, E. T. (2020). 
Ontario Ministry of Environment, Conservation and Parks, 125 Resources Rd, Toronto, ON M9P 3V6, Canada. Personal Communication.
5. Kraak, M. J. (2003). Geovisualization illustrated. ISPRS Journal of Photogrammetry and Remote Sensing, 57(5-6), 390-399.
6. Kveladze, I., Kraak, M. J., & van Elzakker, C. P. (2013). A methodological framework for researching the usability of the space-time cube. The Cartographic Journal, 50(3), 201-210.
7. Li, H., Fan, H., & Mao, F. (2016). A visualization approach to air pollution data exploration—a case study of air quality index (PM2.5) in Beijing, China. Atmosphere, 7(3), 35.
8. Matějíček, L., Engst, P., & Jaňour, Z. (2006). A GIS-based approach to spatio-temporal analysis of environmental pollution in urban areas: A case study of Prague's environment extended by LIDAR data. Ecological Modelling, 199(3), 261-277.
9. Pope III, C. A., Burnett, R. T., Thun, M. J., Calle, E. E., Krewski, D., Ito, K., & Thurston, G. D. (2002). Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. JAMA, 287(9), 1132-1141.
10. Tsung-Te Lai, T., Lin, C. Y., Su, Y. Y., & Chu, H. H. (2011, November). BikeTrack: Tracking stolen bikes through everyday mobile phones and participatory sensing. In Proceedings of the 2nd International Workshop on Sensing Applications on Mobile Phones (PhoneSense), Seattle, WA, USA (pp. 1-4).
11. Vardoulakis, S., Fisher, B. E., Pericleous, K., & Gonzalez-Flesca, N. (2003). Modelling air quality in street canyons: a review. Atmospheric Environment, 37(2), 155-182.
12. Wilke, C. O. (2019). Fundamentals of data visualization: a primer on making informative and compelling figures. O’Reilly Media.

FIGURES AND TABLES

Figure 1: (a) A bivariate density-scatter plot of Ontario NO2 and PM2.5, with the mean of each region plotted as a point.
(b) The interval chart of NO serves as the legend for the scatter plot, with color differences distinguishing highly polluted regions from less polluted ones. (c) The distribution of the ratio of the two variables, NO2 and PM2.5, for the Hamilton region, showing in the bottom-right plot how the points of the top-right plot are distributed. (d) The top-right plot shows the Hamilton area for comparison with the scatter-density plot, indicating the severity of air pollution over the 12-month period.

Figure 2: The 24-hour time series for the Hamilton region, estimated using GAMM and shown for both NO2 and PM2.5. The reference lines, drawn in light colors in the background of the graph, are estimated from the 2018-2019 data.

Figure 3: (a) An interval distribution plot of the top 7 ranked NO2 regions, with the interval plot of Ontario's average NO2 and PM2.5 pollution shown at the top for overall comparison. (b) The top 7 ranked regions for air pollution influence according to the impact formula, shown at the bottom in 3 categorical ranges with the impact plotted as a geospatial point.

Figure 4: A univariate choropleth map of the air pollutant NO2 across Ontario regions over a 12-month period, with colors assigned by box-plot intervals. The highlighted region on the side is the Hamilton region, magnified for a comparative view of air pollution.

Figure 5: A univariate choropleth map of the air pollutant PM2.5 across Ontario regions over a 12-month period, with colors assigned by box-plot intervals. The highlighted region on the side is the Hamilton region, magnified for a comparative view of air pollution.

Figure 6: A univariate choropleth map of the air pollutant NO across Ontario regions over a 12-month period, with colors assigned by box-plot intervals. The highlighted region on the side is the Hamilton region, magnified for a comparative view of air pollution.
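
The coloring by box-plot intervals mentioned in the captions of Figures 4 to 6 could be implemented along the following lines. This is only a minimal sketch in Python, not the code used to produce the figures: it assumes the interval boundaries are the three quartiles (Q1, median, Q3), and the region names, values, and function names (`boxplot_breaks`, `color_class`) are hypothetical.

```python
from bisect import bisect_right
from statistics import quantiles


def boxplot_breaks(values):
    """Return the quartile cut points (Q1, median, Q3) of the values.

    Assumption: the box-plot intervals are delimited by the quartiles.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return [q1, q2, q3]


def color_class(value, breaks):
    """Map a value to a class index from 0 (lowest) to len(breaks) (highest)."""
    return bisect_right(breaks, value)


# Hypothetical monthly mean concentrations for five illustrative regions.
monthly_means = {
    "Region A": 4.0,
    "Region B": 8.0,
    "Region C": 12.0,
    "Region D": 16.0,
    "Region E": 20.0,
}

breaks = boxplot_breaks(list(monthly_means.values()))
classes = {region: color_class(v, breaks) for region, v in monthly_means.items()}

for region, cls in classes.items():
    print(f"{region}: class {cls}")
```

Each class index would then be mapped to one shade of a sequential palette, so that regions falling in the same box-plot interval share a color on the map.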