Chemosphere 299 (2022) 134250
Contents lists available at ScienceDirect
Chemosphere
journal homepage: www.elsevier.com/locate/chemosphere
Modelling and investigating the impacts of climatic variables on ozone
concentration in Malaysia using correlation analysis with random forest,
decision tree regression, linear regression, and support vector regression
Abdul-Lateef Balogun a, c, Abdulwaheed Tella b, c, *
a Professional Services Department (Resources), Esri Australia, 613 King Street, West Melbourne, VIC, 3003, Australia
b Earth, Environment and Space Division, Foresight Institute of Research and Translation, Ibadan, Nigeria
c Geospatial Analysis and Modelling (GAM) Research Laboratory, Department of Civil and Environmental Engineering, Universiti Teknologi PETRONAS (UTP), 32610, Seri Iskandar, Perak, Malaysia
HIGHLIGHTS
• Climate change has the potential to influence air pollution.
• Temperature is an essential factor for ozone pollution.
• Wind serves as a medium for transboundary pollution.
• Random forest outperformed the other machine learning algorithms for surface ozone prediction.
• Industrialized and urbanized areas are hotspots for bad air quality.
ARTICLE INFO
Handling Editor: Kyusik Yun
ABSTRACT
Climate change is generally known to impact ozone concentration globally. However, the intensity varies across regions and countries. Therefore, local studies are essential to accurately assess the correlation between climate change and ozone concentration in different countries. This study investigates the effects of climatic variables on ozone concentration in Malaysia in order to understand the nexus between climate change and ozone concentration. The data were obtained from ten (10) air monitoring stations strategically mounted in urban-industrial and residential areas with significant emissions of pollutants. Correlation analysis and four machine learning algorithms (random forest, decision tree regression, linear regression, and support vector regression) were used to analyze the ozone and meteorological datasets in the study area. The analysis was carried out for the southwest monsoon because ozone rises in the dry season. The results show a very strong correlation between temperature and ozone. Wind speed also exhibits a moderate to strong correlation with ozone, while relative humidity is negatively correlated. The highest correlation values were obtained at Bukit Rambai, Nilai, Jaya II Perai, Ipoh, Klang and Petaling Jaya. These locations are heavily industrialized and well urbanized. The four machine learning algorithms exhibit high predictive performance, generally confirming the predictive power of the climatic variables. The random forest outperformed the other algorithms with a very high R2 of 0.970, a low RMSE of 2.737 and an MAE of 1.824, followed by linear regression, support vector regression and decision tree regression, respectively. This study's outcome indicates a linkage of temperature and wind speed with ozone concentration in the study area. An increase in these variables will likely increase the ozone concentration, posing threats to lives and the environment. Therefore, this study provides data-driven insights for decision-makers and other stakeholders in ensuring good air quality for sustainable cities and communities. It also serves as a guide for the government on the climate actions needed to reduce the effect of climate change on air pollution and to enable sustainable cities, in accordance with the UN's SDGs 13 and 11, respectively.
Keywords:
Air quality
Machine learning
Ozone
Sustainable cities
Climate change
* Corresponding author. Earth, Environment and Space Division, Foresight Institute of Research and Translation, Ibadan, Nigeria.
E-mail address: tellaabdulwaheed01@gmail.com (A. Tella).
https://doi.org/10.1016/j.chemosphere.2022.134250
Received 21 March 2021; Received in revised form 1 December 2021; Accepted 5 March 2022
Available online 19 March 2022
0045-6535/© 2022 Elsevier Ltd. All rights reserved.
Author contributions
Abdul-Lateef Balogun: conceptualization, data curation, investigation, methodology, project administration, resources, supervision, validation, writing – review & editing. Abdulwaheed Tella: conceptualization, data curation, formal analysis, investigation, methodology, resources, software, validation, visualization, writing – original draft, writing – review & editing.
1. Introduction
Air pollution is one of the most hazardous environmental problems locally, regionally, and globally (Bayat et al., 2019; Lee et al., 2019). Its effects transcend the ecosystem (De Marco et al., 2019; Bayat et al., 2019), affecting human health (Wu et al., 2019; Rovira et al., 2020), the economy and environmental sustainability (Bayat et al., 2019; Tella et al., 2021b). Exposure to indoor and ambient air pollutants such as ozone has negatively impacted human health, causing complications linked to pulmonary disease, lung cancer, heart disease, health failure, asthmatic conditions, and infertility (Rovira et al., 2020; Wang et al., 2019; An and Yu, 2018).
Over 85% of the world's population live in polluted environments (World Health Organization, 2016), which causes approximately 3 million deaths per annum. The Southeast Asian region experiences significant air pollution due to rapid industrialization, forest degradation, burning of fossil fuels, and an increase in the use of automobiles (Hamid and Long, 2017). Trans-boundary pollution is a common phenomenon in the region, with Malaysia being the most impacted (Latif et al., 2018) due to the emission of air pollutants from neighbouring countries. This has a great negative impact on people's well-being (Wong et al., 2017) and the economy (Khan et al., 2016).
A study by Azid et al. (2015) shows that ozone (O3), a major pollutant in Malaysia, greatly affects air quality, thereby impacting public health. It is noteworthy that tropospheric ozone is potentially dangerous if its concentration rises beyond the threshold (Tang et al., 2020). Ozone is not emitted directly into the air; rather, a high concentration of tropospheric ozone originates as a secondary pollutant from the photochemical reactions of sunlight with two precursor chemicals, that is, volatile organic compounds (VOCs) and nitrogen oxides (NOx ≡ NO + NO2) (Tang et al., 2020; Nassikas et al., 2020; Wang et al., 2020). Tropospheric ozone, which helps to absorb ultraviolet radiation and protects the earth from harmful radiation, doubles as a greenhouse gas and urban air pollutant. It is a triggering factor for asthma (Nassikas et al., 2020), affects environmental sustainability (Yang et al., 2020), contributes to climate change (Wang et al., 2020) and causes respiratory problems even after short exposure (World Health Organization, 2013; Dimakopoulou et al., 2020). High ozone concentration also affects vegetation and exacerbates asthmatic conditions, particularly in children due to the tenderness of their respiratory tract (Yusoff et al., 2019). According to Mabahwi et al. (2015), 10.36% of respiratory diseases in Malaysia could be linked to air pollutants' effect, and 19.48% of mortality is due to respiratory problems caused by air pollutants, including ozone. Thus, predicting and understanding its formation and emission rate is essential for alerting the public and enabling appropriate intervention.
1.1. Causes and enablers of air pollution
Atmospheric pollution is caused mainly by emission from natural sources (e.g. volcanic eruption and methane emission) or anthropogenic sources (e.g. gases emitted from the burning of fossil fuels, smoking, industrial activities, vehicular movement, and open burning) (How and Ling, 2016). However, studies have ascertained the impact of climate change on the formation, dispersion, and transportation of ozone and other air pollutants (Tong et al., 2018; Kalisa et al., 2018; Orru et al., 2017). According to Nguyen et al. (2019) and Tang et al. (2020), there is a correlation between climatic factors and air quality, suggesting that weather conditions influence airborne pollution. Climatic factors like temperature, wind speed, and relative humidity significantly impact the atmospheric balance (Tong et al., 2018; Tang, 2019; Hanaoka and Masui, 2019), thereby affecting the spread, scale, and magnitude of air pollutants. Variations in the climatic conditions also affect the air pollution emission strength and rate, dispersion, atmospheric reaction, and deposition (Xie et al., 2017; Zhao et al., 2016). Temperature enhances the rate of photochemical reactions in the atmosphere and ozone formation (Hassan et al., 2015). Wind speed influences the dispersion rate of pollutants, while relative humidity impacts the pollutants' life cycle in the atmosphere and their deposition (He, 2017; Plocoste et al., 2019).
Understanding air pollution enablers such as changes in climatic factors is crucial to air pollution monitoring and mitigation studies because the sources, composition, and potential risks of pollutants vary across the year (Kim et al., 2017). The study of Althuwaynee et al. (2020) highlighted the significance of assessing the inter-correlation between pollutants and climatic variables in modelling the future concentration of air pollutants. Similarly, Manimaran and Narayana (2018) showed that understanding the nexus between ozone and climatic factors (e.g. temperature, wind speed and relative humidity) will facilitate effective air quality monitoring and control. Thus, this study attempts to model the correlation between ozone and climatic factors in major Malaysian cities. This study has the potential to provide new insights into the formation, emission, accumulation, and behaviour of the pollutant, which will aid mitigation initiatives to improve the deteriorating air quality in Malaysia, thereby supporting climate actions to reduce climate change effects (SDG13) and enabling sustainable cities and communities (SDG11).
1.2. Modelling of ozone concentration
Ozone concentration can be predicted using deterministic models or statistical models (Wen et al., 2019; Wang et al., 2020). Deterministic models such as the Weather Research and Forecasting (WRF) and Community Multiscale Air Quality (CMAQ) models use physical and chemical mechanisms associated with air pollutant emission, transportation, and dispersion (Sharma et al., 2016; Djalalova et al., 2015). This approach has some limitations which affect the predictive performance of the model. For instance, the deterministic approach exhibits very low accuracy for micro-urban settlements (Wang et al., 2020). It also has a high computational cost and depends heavily on scarce air pollutant source and emission data (Tella and Balogun, 2021). Statistical models and emerging artificial intelligence algorithms (e.g. machine learning and deep learning), which do not depend on physical and chemical mechanisms (Bai et al., 2018), are being used to forecast air pollution. So far, these recent models have shown a higher predictive accuracy for air pollution modelling compared to the deterministic models (Choubin et al., 2020).
Machine learning is an effective technique for understanding the inter-dependence of climatic data and air pollution since it supports exploratory analysis of data without using an empirical model (Tong, 2020; Dou et al., 2019). Further, machine learning addresses the non-linearity problem, enhancing the model's predictive performance (Ma et al., 2020; Li et al., 2019).
Although some studies have been undertaken to model ozone (Yusoff et al., 2019; Ahamad et al., 2014) and other air pollutants in Malaysia, most of the adopted models are Multiple Linear Regression (MLR) models (Nazif et al., 2018; Tan et al., 2016; Abdullah et al., 2019). According to the review conducted by Nur Shaziayani et al. (2020), 72% of studies used a linear regression model rather than other non-linear statistical models. The MLR model, which has shown better predictive performance than the deterministic approach, is constrained in performance (Ma et al., 2020) when compared to machine learning models. It also suffers from multicollinearity problems between predictors (Li et al., 2019) and requires the predictors and the response to be linearly related, which is at odds with real-life circumstances that are often not linear.
Attempts to leverage the high predictive capabilities of machine learning algorithms for modelling air pollution are limited in Malaysia. A comparative assessment of different machine learning algorithms for this purpose is also lacking. Comparing the efficacy of multiple algorithms is vital to identify the optimal algorithm for ozone prediction. Further, there is little consideration of the impacts of atmospheric conditions on ozone in the country. Consideration of climatic parameters is pertinent to air pollution management because the pollutants' behaviour can be better understood by taking the impacts of the climatic variables into account. Moreover, this study will demonstrate the potential of climate change to increase air pollution in the future, thereby creating insights for decision-makers and stakeholders.
Based on the foregoing, it is hypothesized that ML algorithms will
produce good results in predicting ozone concentration in Malaysia,
although model performance will likely vary based on the strengths and
weaknesses of the ML algorithms. Also, it is hypothesized that climatic
variables will correlate with ozone, particularly in locations with a large
concentration of industries.
Thus, this study investigates the impact of climatic variables on ozone while considering the variations in seasons. In order to determine the predictive capability of the climatic variables for ozone concentration in the study area, four machine learning algorithms (random forest, linear regression, support vector regression, and decision tree regression) are used. Leveraging the effectiveness and efficiency of open-source software and programming languages in overcoming the limitation of uncertainty in statistical models (Althuwaynee et al., 2020), the following objectives are pursued in this study:
(i) To model the correlation between climatic variables and ozone at ten air pollution monitoring stations;
(ii) To determine the climatic variables' potential to accurately predict the surface ozone trend using decision tree regression, support vector regression, random forest, and linear regression algorithms;
(iii) To comparatively assess the accuracy of the four machine learning models.
2. Materials and methods
This study's methodology is classified into four sections. The first section discusses the data acquisition, preparation, and analysis. We subsequently classified and analyzed the hourly concentration of the atmospheric pollution data and the climatic data in Malaysia's southwest monsoon season. The southwest monsoon is one of the monsoons in Malaysia; it extends from May to September and denotes the temperate weather (Tella et al., 2021b). The southwest monsoon is characterized by a rise in temperature and a more prolonged warm climate (Andaya and Andaya, 2016), which makes it suitable for ozone studies, since ozone concentration is higher during the warm and temperate season (APIMS, 2021).
The study was done using ten monitoring stations in Peninsular Malaysia from 2012 to 2016. The second phase of the research implemented Pearson's correlation to examine the relationship between the climatic criteria and ozone concentration in each monitoring station, following the approach of Tella et al. (2021b) and Jumin et al. (2020). Compared to Spearman's correlation, which shows the monotonic association between variables, Pearson's correlation depicts the linear interrelationship between variables. Thus, Pearson's correlation is more widely used, especially for air pollution studies. GIS was used to produce ozone source density maps to visualize the land use around the monitoring stations with the highest correlation indices. This is important in order to identify sources of ozone in the most vulnerable areas. Machine learning algorithms are used to model the predictive capabilities of the predictors (climatic variables). The accuracy of the models in predicting surface ozone was compared using statistical indices, including Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (R2).
2.1. Study area
Malaysia (Fig. 1) is a Southeast Asian nation with latitude and longitude 4.2105° N and 101.9758° E, respectively. It has thirteen states and three federal territories. The country is split into Peninsular Malaysia and Malaysian Borneo by the South China Sea, with a population of over 30 million and a land area of over 320,000 km2 (Ab Rahman et al., 2013). The country has a tropical climate with high humidity throughout the year. The surrounding seas moderate warming. It has a mean yearly rainfall of 250 cm. The annual seasonal variation is based on the northeast monsoon and southwest monsoon. The northeast monsoon experiences intense precipitation, while the southwest monsoon is defined by drier weather (Tang, 2019; Kwan et al., 2013).
Air pollution in the country is driven by urbanization and industrialization (Chin et al., 2019). Malaysia's economic advancement has contributed to the degradation of its air quality, especially in agglomerated regions. The most common air pollution source is emissions from vehicles and trucks, which contribute over 70% to urban air pollution (Afroz et al., 2003; Chin et al., 2019). The country is also exposed to frequent haze pollution caused by open fires in neighbouring Indonesia (Gaveau et al., 2014; Althuwaynee et al., 2020).
2.2. Sampling stations
Ten air pollution monitoring stations spread across five states in Peninsular Malaysia were used. Two monitoring stations were selected from each state for even distribution. The selection comprises two states in the northern region (Perak and Penang), two states in the central region (Selangor and Negeri Sembilan), and a state in the southern region (Melaka), as shown in Fig. 2.
Most of these stations are strategically mounted in urban-industrialized areas such as Petaling Jaya in Selangor, Bukit Rambai in Melaka, and Nilai in Negeri Sembilan (Ahmat et al., 2015; Ahamad et al., 2014). Some monitoring stations are in Penang, the most populous state in Malaysia after Selangor (Ahamad et al., 2014). Selangor's monitoring station (Petaling Jaya) is characterized as an urban area due to its proximity to the central business district of Kuala Lumpur.
The hourly concentration of ground-level ozone (O3) and climatic variables was acquired from the Malaysian Department of Environment (DoE). After that, an analysis of the hourly O3 data was carried out.
2.3. Correlation analysis
The correlation coefficient measures the extent of the linear dependency between two variables, for example temperature and ozone. The coefficient ranges from −1 to +1. A coefficient closer to −1 indicates a stronger negative correlation, while a coefficient closer to +1 indicates a stronger positive correlation. Values of −1 or +1 depict a completely negative or positive relationship, respectively, while 0 indicates no linear relationship. The most widely used correlation coefficient, Pearson's correlation coefficient denoted by r (Tang et al., 2020), is adopted for this study as shown in Eq. (1).
$$P(ab) = \frac{\mathrm{Cov}(a, b)}{\delta_a \delta_b} \tag{1}$$
where P(ab) is the Pearson correlation coefficient, Cov(a, b) is the covariance of the two parameters and δaδb is the product of the standard deviations of the two parameters. A classification index was adopted (Mukaka, 2012; Hinkle et al., 2003) for the correlation coefficient interpretation as shown in Table 1.
Fig. 1. The study area map.
The correlation coefficient value indicates the strength of the association between the independent variable and the dependent variable. The pandas DataFrame.corr() method in Python was used for Pearson's correlation analysis. The analysis covered five years (2012–2016) to investigate the correlation trend.
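The correlation workflow described in this section can be sketched as follows. This is a minimal illustration, not the authors' code: the column names (`datetime`, `station`, `o3`, `temperature`, `wind_speed`, `humidity`), the synthetic data, the station label and the helper names are assumptions made for the example. Only the May to September season, the 2012–2016 window, `DataFrame.corr()` and the Table 1 classes come from the text, and the handling of values that fall exactly on a class boundary is also an assumption.

```python
import numpy as np
import pandas as pd


def southwest_monsoon(df: pd.DataFrame) -> pd.DataFrame:
    """Keep hourly records from May to September of 2012-2016."""
    ts = df["datetime"]
    in_season = ts.dt.month.between(5, 9)
    in_years = ts.dt.year.between(2012, 2016)
    return df[in_season & in_years]


def correlation_with_ozone(df: pd.DataFrame) -> pd.DataFrame:
    """Pearson r between ozone and each climatic variable, per station."""
    variables = ["temperature", "wind_speed", "humidity"]
    rows = {}
    for station, group in df.groupby("station"):
        corr = group[["o3"] + variables].corr(method="pearson")
        rows[station] = corr.loc["o3", variables]
    return pd.DataFrame(rows).T


def describe_correlation(r: float) -> str:
    """Map a coefficient to its Table 1 description (boundaries assumed)."""
    strength = abs(r)
    if strength < 0.30:
        return "Negligible correlation"
    sign = "+ve" if r > 0 else "-ve"
    if strength < 0.50:
        label = "Weak"
    elif strength < 0.70:
        label = "Moderate"
    elif strength < 0.90:
        label = "Strong"
    else:
        label = "Very strong"
    return f"{label} {sign} correlation"


# Synthetic hourly data for one hypothetical station (illustration only).
rng = np.random.default_rng(0)
hours = pd.date_range("2012-01-01", "2016-12-31 23:00", freq=pd.Timedelta(hours=1))
temperature = 27 + 4 * rng.standard_normal(len(hours))
data = pd.DataFrame({
    "datetime": hours,
    "station": "STATION_A",
    "temperature": temperature,
    "wind_speed": rng.gamma(2.0, 1.0, len(hours)),
    "humidity": 80 - 2 * (temperature - 27) + rng.standard_normal(len(hours)),
    "o3": 20 + 3 * (temperature - 27) + 5 * rng.standard_normal(len(hours)),
})

season = southwest_monsoon(data)
table = correlation_with_ozone(season)
print(table.round(3))
print(table.loc["STATION_A"].map(describe_correlation))
```

Grouping by station before calling `corr()` keeps one row of coefficients per monitoring station, which mirrors the per-station layout of the reported results.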
2.4. Machine learning modelling
Machine learning (ML) is an algorithm-based computational approach for deriving knowledge from data without explicit instructions (Ma and Cheng, 2016). ML trains algorithms to accept data and predict new data using statistical analysis (Sayad et al., 2019). For this study, the data from the fixed monitoring stations were divided into two for training and testing. Four machine learning models, random forest, linear regression, support vector regression, and decision tree regression, were used to predict the hourly ozone concentration. The models are used to ascertain the potential of the independent variables (climatic variables) to predict the dependent variable (O3). The models were developed using scikit-learn (Buitinck et al., 2013) within the Python programming environment. 80% of the dataset was used for model training and the rest of the dataset was used to test the models. A similar training-testing ratio was adopted by Bhalgat et al. (2019) and Zeinalnezhad et al. (2020). Model validation was done using the coefficient of determination (R2), which tests the models' goodness of fit using values between 0 and 1. Values nearer to 1 depict a closer fit, while values closer to 0 indicate a weaker association. The mean absolute error (MAE), which measures the mean absolute distance between predicted and true values, and the root mean square error (RMSE), which is sensitive to large mispredictions, were also adopted for model validation. These indicators are commonly used in validating ML algorithms (Iskandaryan et al., 2020). Eqs. (1)–(3) show the formulae for calculating the R2, RMSE, and MAE, respectively.
$$R^2 = \frac{\sum_{i=1}^{n} (X_i - X_m)(Y_i - Y_m)}{\sqrt{\left(\sum_{i=1}^{n} (X_i - X_m)^2\right)\left(\sum_{i=1}^{n} (Y_i - Y_m)^2\right)}} \tag{1}$$
$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - X_i)^2}{n}} \tag{2}$$
$$MAE = \frac{1}{n} \sum_{i=1}^{n} |Y_i - X_i| \tag{3}$$
where n is the total number of data points or instances, Xi and Yi are the actual and predicted values, respectively, and Xm and Ym are the means of the actual and predicted values, respectively.
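The validation metrics can be computed directly from arrays of actual and predicted values. The sketch below is an illustration under stated assumptions: the synthetic data, the function names, and the use of `train_test_split` with `test_size=0.2` and `random_state=0` are choices made for the example (the text only states an 80/20 split). scikit-learn's `r2_score` computes 1 − SS_res/SS_tot, which is not the same quantity as a correlation-style ratio of covariance to spreads, so both are printed side by side.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def rmse(actual, predicted) -> float:
    """Root mean square error: sqrt(sum((Y_i - X_i)^2) / n)."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))


def mae(actual, predicted) -> float:
    """Mean absolute error: sum(|Y_i - X_i|) / n."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(predicted - actual)))


def correlation_ratio(actual, predicted) -> float:
    """Covariance of actual and predicted over the product of their spreads."""
    x = np.asarray(actual, float) - np.mean(actual)
    y = np.asarray(predicted, float) - np.mean(predicted)
    return float(np.sum(x * y) / np.sqrt(np.sum(x ** 2) * np.sum(y ** 2)))


# Synthetic stand-in for (temperature, wind speed, humidity) -> ozone.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 3))
y = 30 + 8 * X[:, 0] + 3 * X[:, 1] - 4 * X[:, 2] + rng.normal(scale=2.0, size=1000)

# 80% of the rows for training, the remaining 20% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

print("R2 (r2_score):", r2_score(y_test, y_pred))
print("correlation ratio:", correlation_ratio(y_test, y_pred))
print("RMSE:", rmse(y_test, y_pred), "vs sklearn:", np.sqrt(mean_squared_error(y_test, y_pred)))
print("MAE:", mae(y_test, y_pred), "vs sklearn:", mean_absolute_error(y_test, y_pred))
```

RMSE penalises large errors more heavily than MAE because the errors are squared before averaging, which is why the two are usually reported together.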
Fig. 2. Location of monitoring stations.
2.5. Model selection
Linear regression is a statistics-based machine learning model used for quantitative analysis and prediction of numerical variables based on correlation, and it is widely applied in air pollution studies (Ezimand and Kakroodi, 2019). It is used to ascertain how well an explanatory variable can linearly predict the response variable. In multiple linear regression, more than one predictor is used to predict a single dependent variable. That is, it is used in examining the association between multiple predictors and an observed variable (Tella et al., 2021b). Equation (4) shows the multiple linear regression model.
$$P = b_0 + b_1 R_1 + b_2 R_2 + \dots + b_n R_n + \epsilon \tag{4}$$
where P is the predicted or observed variable, while R is the regressor or explanatory variable. b0 is the y-intercept (constant), b1, b2, …, bn are the slope coefficients for the regressors, and ϵ is the residual. For this study, P represents the predicted ozone concentration, while R represents the climatic variables.
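Eq. (4) can be illustrated with scikit-learn's `LinearRegression`, whose `intercept_` and `coef_` attributes correspond to b0 and b1 … bn. The sketch below uses synthetic data; the coefficient values, the feature order (temperature, wind speed, relative humidity) and the variable names are assumptions for illustration only, not results from the study.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic regressors R1..R3 standing in for the climatic variables.
rng = np.random.default_rng(1)
n = 500
temperature = rng.normal(28.0, 3.0, n)
wind_speed = rng.gamma(2.0, 1.0, n)
humidity = rng.normal(80.0, 8.0, n)
R = np.column_stack([temperature, wind_speed, humidity])

# Hypothetical "true" intercept and slopes used to generate P.
true_b0 = 5.0
true_b = np.array([1.5, 2.0, -0.3])
P = true_b0 + R @ true_b + rng.normal(0.0, 1.0, n)

# Ordinary least squares fit of P = b0 + b1*R1 + b2*R2 + b3*R3 + residual.
model = LinearRegression().fit(R, P)
print("b0:", model.intercept_)
print("b1..bn:", model.coef_)

# Residuals are the part of P that the linear terms do not explain.
residuals = P - model.predict(R)
print("mean residual:", residuals.mean())
```

The sign of each fitted slope indicates the direction in which that regressor moves the prediction while the others are held fixed.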
Support vector regression (SVR) is a supervised algorithm that applies support vectors (Bai et al., 2018) to regression problems (Bishop, 2006; Steinwart and Christmann, 2008). The SVR model relies on a subset of the training data, ignoring any data that is close to the model's prediction (within a threshold ε) (Suárez Sánchez et al., 2011). SVR depends on the choice of kernel and relevant parameters to solve the regression problem. The kernel used for this study is the radial basis function (RBF). One of the strengths of SVR is its high dimensional space, which does not rely on the input space dimensionality (Bai et al., 2018). SVR maps the input data non-linearly into a higher-dimensional space, where a linear function, also called the SVR equation, is fitted. The SVR equation is presented in Equation 5 (Bai et al., 2018).
$$P(x) = (\omega \times \varphi(x)) + t \tag{5}$$
where P(x) is the predicted value, ω is the weight vector of the feature space dimension, φ(x) is the non-linear mapping of the input, and t is the threshold.
Decision Trees (DT) is a non-parametric supervised learning model used for both classification and regression analysis. It is based on a binary tree that splits one or more nodes to make up a decision tree (Kadavi et al., 2019). The decision tree algorithm splits the dataset into smaller subsets and represents the result in a leaf node. Basically, the decision tree fits the dataset in the form of a tree structure for prediction. That is why it is sometimes called tree structure regression. A DT has three different types of nodes, namely, root nodes, interior nodes, and leaf nodes. The root node is the first node that gets split up into more nodes, called the interior nodes. The interior nodes represent the model's data features and decision rules, while the leaf nodes stand for the final result from the decision. The DecisionTreeRegressor class from sklearn was used for training the model. Fig. 3 shows the decision tree structure.
Fig. 3. Architecture of decision tree model.
Table 1
Description of Correlation index.
Correlation index             Description
0.90–1.00 (−0.90 to −1.00)    Very strong +ve (−ve) correlation
0.70–0.90 (−0.70 to −0.90)    Strong +ve (−ve) correlation
0.50–0.70 (−0.50 to −0.70)    Moderate +ve (−ve) correlation
0.30–0.50 (−0.30 to −0.50)    Weak +ve (−ve) correlation
0.00–0.30 (0.00 to −0.30)     Negligible correlation
Fig. 4. Random forest model architecture.
Random forest, introduced by Breiman (2001), is an ensemble learning model that can perform classification, regression, clustering, interaction detection, and variable selection (Rahmati et al., 2017; Belgiu and Drăguţ, 2016). The random forest learning method is based on a combination of decision trees that split the input data based on the parameters like a tree structure (Ma and Cheng, 2016; Breiman, 2001) (Fig. 4). Each tree is constructed using a bootstrapped sample of the data, and each node in the tree is split according to the best of a subset of predictors chosen randomly at that node (Araki et al., 2018; Rahmati et al., 2017). The final output is resolved from the decision trees' votes (Micheletti et al., 2014; Rahmati et al., 2017); for regression, this is the average of the trees' predictions. Random forest is resistant to overfitting and outliers and has been established to have high performance (Balogun et al., 2021; Tella et al., 2021a). The RandomForestRegressor class of sklearn was used for this model in Python, with the number of estimators (that is, trees grown) set to 500.
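The support vector, decision tree and random forest regressors described in this section can be sketched with scikit-learn as follows. Only the RBF kernel for SVR, the use of `DecisionTreeRegressor` and `RandomForestRegressor`, and the 500 trees are taken from the text. The synthetic data, the feature scaling step in front of SVR, the `random_state` values and all other hyperparameters (left at library defaults) are assumptions for illustration and do not reproduce the study's configuration or results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic climatic predictors and a mildly non-linear ozone-like response.
rng = np.random.default_rng(7)
n = 600
X = np.column_stack([
    rng.normal(28.0, 3.0, n),  # temperature
    rng.gamma(2.0, 1.0, n),    # wind speed
    rng.normal(80.0, 8.0, n),  # relative humidity
])
y = 2.0 * X[:, 0] + 3.0 * np.sqrt(X[:, 1]) - 0.4 * X[:, 2] + rng.normal(0.0, 1.0, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    # RBF-kernel SVR; scaling is added because the kernel is distance-based.
    "SVR (RBF)": make_pipeline(StandardScaler(), SVR(kernel="rbf")),
    # A single regression tree grown with default settings.
    "Decision tree": DecisionTreeRegressor(random_state=0),
    # An ensemble of 500 bootstrapped trees whose predictions are averaged.
    "Random forest": RandomForestRegressor(n_estimators=500, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: R2 = {scores[name]:.3f}")
```

The ranking of the models on this synthetic data says nothing about their ranking on the monitoring data; the sketch only shows how the three estimators are set up and scored on a held-out split.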
3.1. Correlation analysis of climatic variables and O3
The ozone concentration with climatic variables such as tempera­
ture, wind speed and relative humidity were analyzed. This was done for
all the monitoring stations. The ozone concentration is evaluated with
temperature, wind speed and relative humidity to get the correlation
index, presented in Table 2.
From the correlation analysis, we observed that the correlation index
between temperature and ozone for all monitoring stations is > 0.7 and
<0.9. According to Table 2, there exist a strong positive correlation
between temperature and ozone. This study’s outcome aligns with some
previous studies that established a connection between ozone and
temperature. Studies carried out by Ueno and Tsunematsu (2019) in
Japan, Melkonyan and Wagner (2013) in Germany, Tang et al. (2020)
and Pu et al. (2017) in China all indicated a significant effect and cor­
relation of temperature to Ozone, concluding that a rise in warming
3. Results
This section presents the analysis of the correlation study of the air
pollutants (O3) and climatic variables based on seasonal variation
(southwest monsoon and northeast monsoon) from 2012 to 2016. The
outcome of the predictions and evaluation of the machine learning
models’ performance was also presented. The influence of temperature,
wind speed and relative humidity on O3 vis-a-vis seasonal variations is
discussed, and recommended strategies for mitigating atmospheric
pollution in the context of a changing climate are offered.
6
A.-L. Balogun and A. Tella
Chemosphere 299 (2022) 134250
Fig. 5. Air monitoring stations and industries density map.
Kompalli et al. (2014), and Xu et al. (2015), wind speed influences the
air masses, pollutants’ concentration, and dilution due to transboundary
pollution from neighbouring countries. That is, the inflow of pollutants
from a distant place can contribute to an increased concentration in a
location where they were lower pollutants (Oleniacz et al., 2016). In the
presence of nitrogen oxides in the tropospheric air, wind speed rise can
affect the inflow of precursors that aid the formation of ozone in the
atmosphere (Seinfeld and Pandis, 2016; Gorai et al., 2015; Oleniacz
et al., 2016). From Table 2, there exist moderate to a strong negative
correlation between relative humidity and ozone. The correlation ranges
from – 0.662 to – 0.844. The inverse relationship between ozone and
relative humidity was also discovered by a study by Jasaitis et al. (2016),
which is related to the association of humidity with rainfall (Tella et al.,
2021b). Thus, ozone reduces during the rainy period (Jasaitis et al.,
2016). Chen et al. (2019) examined the influence of climatic variables
on ozone concentration in Beijing discover that low humidity is a suit­
able climatic condition for photochemical reaction in ozone production.
From Table 2, it is observed that the strongest correlation between
climatic variables and ozone concentration occurs in the monitoring
stations such as CA0006 (Bukit Rambai), CA0010 (Nilai), CA0009 (Jaya
II Perai), CA0008 (Ipoh), CA00011 (Klang) and CA0016 (Petaling Jaya).
These areas are thus more vulnerable to air pollution than other areas.
This is because most of these stations are located in residential, indus­
trial, and urban regions (Ahmat et al., 2015; Ahamad et al., 2014).
For instance, Tasek Ipoh, situated in Perak, is recognized for its
historical industrial establishments such as cement factories and stone
quarrying. In late 2019, the monitoring stations exceeded the good and
moderate Air Pollution Index (API) level to an unhealthy level (Aqilah,
2019), putting individuals’ lives in danger and disrupting daily
Table 2
Correlation analysis of climatic variables and ozone concentration.
States
Stations
Wind Speed
(m/s)
Relative
Humidity (%)
Melaka
CA0006
CA0043
CA0047
CA0010
CA0009
CA0003
CA0008
CA0020
CA0011
CA0016
0.736
0.619
0.763
0.677
0.696
0.415
0.537
0.564
0.663
0.611
−
−
−
−
−
−
−
−
−
−
Negeri
Sembilan
Penang
Perak
Selangor
0.822
0.736
0.793
0.662
0.844
0.788
0.832
0.778
0.790
0.795
Temperature
(oC)
0.870
0.759
0.819
0.747
0.851
0.794
0.841
0.811
0.842
0.820
causes a higher concentration of ozone, especially during hot periods.
Also, Fu and Tian (2019) concluded from a systematic review of the
literature that tropospheric ozone is produced as an outcome of solar
radiation.
There is a positive correlation between wind speed and ozone
concentration at all ten monitoring stations, with coefficients ranging
from 0.415 to 0.763. Using Table 2 as a baseline, ozone and wind speed
are moderately to strongly correlated. This result aligns with Jasaitis
et al. (2016), who found that an increase in wind speed raises the ozone
concentration on the Baltic Sea coast in Lithuania. The authors observed
higher ozone levels as the wind speed rose, while the lowest ozone
concentration was recorded in the absence of wind. Awang et al. (2018)
observed a positive association between wind speed and ozone in
Malaysia, which further supports this result. According to
activities. The API classifies air quality into five levels: good,
moderate, unhealthy, very unhealthy, and hazardous. The API of Malaysia
is calculated each day from hourly air pollutant data obtained from the
Air Quality Monitoring Network in the country (APIMS, 2021), and the
pollutant with the highest concentration is used to represent the API
value. According to APIMS (2021), ground-level ozone sometimes determines
the API value in some areas of Malaysia. The ozone concentration reading
is usually high in the afternoon (APIMS, 2021).
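The rule described in this paragraph, where the pollutant with the highest value sets the API, can be sketched in a few lines of Python. This is only an illustration and not the Department of Environment's procedure: the sub-index values are invented, each reading is assumed to be already normalised to a comparable sub-index, and the band cut-offs (50, 100, 200, 300) are assumed from the commonly published Malaysian API bands rather than taken from this paper.

```python
# Hypothetical per-pollutant sub-index values for one station and one hour.
# These numbers are illustrative only; they are not data from this study.
SUB_INDICES = {"PM10": 62.0, "O3": 104.0, "CO": 18.0, "SO2": 9.0, "NO2": 27.0}

# Assumed upper bounds of the first four API bands; anything above is "hazardous".
API_BANDS = [
    (50, "good"),
    (100, "moderate"),
    (200, "unhealthy"),
    (300, "very unhealthy"),
]


def api_from_sub_indices(sub_indices):
    """Return (API value, dominant pollutant): the largest sub-index wins."""
    pollutant = max(sub_indices, key=sub_indices.get)
    return sub_indices[pollutant], pollutant


def api_category(api_value):
    """Map an API value to its descriptive band."""
    for upper, label in API_BANDS:
        if api_value <= upper:
            return label
    return "hazardous"


api_value, dominant = api_from_sub_indices(SUB_INDICES)
print(dominant, api_value, api_category(api_value))
```

With these placeholder values ozone is the dominant pollutant, which mirrors the situation the text describes for some areas of Malaysia.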
Bukit Rambai is likewise located in an industrialized region recognized
for its rapid growth (Ahmat et al., 2015), as shown in Fig. 5, while the
Petaling Jaya and Klang monitoring stations are located in a densely
built environment surrounded by many large-scale industries, in the most
populous state in Malaysia (Ahamad et al., 2014). Moreover, the Petaling
Jaya monitoring station is close to the city centre of Kuala Lumpur and
is surrounded by commercial, residential and industrial features (Azmi
et al., 2010). The dense concentration of industries (Fig. 5) explains
why these regions are vulnerable to poor air quality, which is
exacerbated by variations in climatic factors (e.g. temperature),
particularly during the southwest monsoon.
It is noteworthy that there is a strong correlation between the climatic
variables and ozone, as seen in Table 2. Nevertheless, correlation alone
is not conclusive evidence of a causal relationship between two
variables; that is, it does not necessarily imply causation (Buchanan,
2012; Akoglu, 2018). Thus, correlation analysis alone is insufficient to
explain whether an increase in temperature due to climate change could
drive a rise in ozone concentration. This necessitates adopting four
machine learning algorithms to determine the reliability of these
climatic variables as predictors of the ozone variation trend.
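As a rough illustration of how coefficients like those in Table 2 are obtained, the sketch below computes a correlation coefficient from paired readings. It assumes Pearson's r is the statistic in use, which this section does not state explicitly, and the temperature, humidity and ozone values are made-up placeholders rather than station data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder readings (hypothetical, not from the monitoring network).
temperature_c = [24.1, 25.3, 27.8, 30.2, 31.5, 29.9, 27.0]
humidity_pct = [92.0, 88.0, 76.0, 64.0, 60.0, 67.0, 79.0]
ozone_ppb = [8.0, 12.0, 25.0, 41.0, 48.0, 39.0, 22.0]

# Positive for temperature and negative for humidity with these placeholders.
print(round(pearson_r(temperature_c, ozone_ppb), 3))
print(round(pearson_r(humidity_pct, ozone_ppb), 3))
```

A coefficient near +1 or −1 only describes how closely the two series move together; as the paragraph above notes, it says nothing about which variable drives the other.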
3.2. Assessment of models’ performance
Table 3 presents the model validation outcome. The prediction gives
relatively high accuracy for ozone concentration for all models. The
coefficient of determination (R2) ranges from 0.216 to 0.970, which
indicates a good fit. Notably, over 95% of the coefficients of
determination are above 0.5. The low MAE and RMSE values are generally
acceptable because values closer to 0 indicate a high predictive
performance (Fong et al., 2018). The MAE value ranges from 2.344 to
10.390, while the RMSE value ranges from 2.737 to 10.964. Virtually all
the algorithms exhibit a good fit, which establishes a mutual association
between these variables. The RMSE and MAE of all the algorithms are low,
which reflects the accuracy of the predictions.
In related studies that used linear regression, Suhaimi et al. (2019)
obtained an R2 of 0.68 and an RMSE of 8.67, while Moustris et al. (2012)
obtained an R2 of 0.65 and an RMSE of 25.5. Similar to the predictive
performance of SVR in this study, Chaiyakhan et al. (2017) obtained a
high SVR predictive performance when compared with linear regression in
air quality prediction in Thailand, while the air pollution prediction of
Ishak et al. (2017) in Tunisia showed that random forest outperformed
SVR. This implies that the predictive performance of all the algorithms
is significant for ozone concentration prediction. However, the models’
performances differ.
A comparative assessment of the ML algorithms in this study reveals
that the random forest (RF) has the highest predictive performance,
followed by linear regression, support vector regression and decision
tree regression. A similar outcome was obtained by Watson et al. (2019)
in a study predicting ozone exposure during a California wildfire, in
which ten machine learning algorithms were used and random forest
exhibited the highest predictive performance. Also, Zhan et al. (2018)
used the random forest to accurately predict China’s ozone concentration.
RF is known for its strong prediction accuracy and performance, and it
can model the non-linear relationship between the predictors and the
output, unlike other algorithms such as support vector machines and
neural networks (Zhan et al., 2018; Rahmati et al., 2017; Li et al.,
2019). The SVR algorithm required more processing time in both the
training and testing phases. Although it has good generalization
performance, it tends to be very slow during the testing stage. This may
be due to the bulkiness of the data used for this study, which supports
the findings of Ye et al. (2020). Despite variations in the algorithms’
performance, the results indicate the climatic variables’ capability as
predictors of ozone in the study area, validating the outcome of the
correlation analysis.
4. Discussion

Table 3
Validation outcome of ML models.

Station ID  Metric  MLR (ppm)  DT (ppm)  RF (ppm)  SVR (ppm)
CA0006      R2      0.805      0.739     0.958     0.676
            MAE     5.826      6.623     2.344     7.093
            RMSE    7.374      8.538     3.439     9.058
CA0043      R2      0.668      0.342     0.936     0.510
            MAE     6.048      8.003     2.379     6.631
            RMSE    7.787      10.964    3.423     8.626
CA0047      R2      0.769      0.524     0.952     0.670
            MAE     7.198      10.398    3.136     8.802
            RMSE    10.311     14.790    4.676     11.953
CA0010      R2      0.657      0.566     0.918     0.480
            MAE     4.814      5.333     2.840     5.241
            RMSE    6.743      7.586     3.297     8.641
CA0009      R2      0.763      0.642     0.970     0.732
            MAE     5.837      6.478     1.824     5.676
            RMSE    7.706      9.464     2.737     7.683
CA0003      R2      0.663      0.620     0.934     0.530
            MAE     8.247      7.950     2.861     9.711
            RMSE    10.679     11.341    4.722     13.158
CA0008      R2      0.722      0.647     0.963     0.616
            MAE     5.810      5.898     2.007     6.777
            RMSE    7.872      8.870     2.881     10.823
CA0020      R2      0.656      0.216     0.912     0.231
            MAE     2.380      7.683     2.790     5.905
            RMSE    2.971      11.462    4.284     9.335
CA0011      R2      0.755      0.663     0.959     0.639
            MAE     6.327      6.993     2.097     6.847
            RMSE    8.520      10.001    3.491     9.233
CA0016      R2      0.666      0.630     0.949     0.662
            MAE     8.049      7.225     2.342     6.547
            RMSE    11.282     11.878    4.412     10.978

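For readers who want to work with the quantities in Table 3, the sketch below shows the conventional definitions of R2, MAE and RMSE and then averages the R2 values of Table 3 per model. It is an illustration under stated assumptions, not the authors' evaluation code: the observed and predicted values are hypothetical placeholders, and averaging R2 across stations is one simple way to summarise the table, not a procedure described in the paper.

```python
import math


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Placeholder ozone values (hypothetical, for illustration only).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
print(mae(observed, predicted), rmse(observed, predicted), r2(observed, predicted))

# R2 values copied from Table 3, in station order CA0006 ... CA0016.
R2_BY_MODEL = {
    "MLR": [0.805, 0.668, 0.769, 0.657, 0.763, 0.663, 0.722, 0.656, 0.755, 0.666],
    "DT": [0.739, 0.342, 0.524, 0.566, 0.642, 0.620, 0.647, 0.216, 0.663, 0.630],
    "RF": [0.958, 0.936, 0.952, 0.918, 0.970, 0.934, 0.963, 0.912, 0.959, 0.949],
    "SVR": [0.676, 0.510, 0.670, 0.480, 0.732, 0.530, 0.616, 0.231, 0.639, 0.662],
}

# Average R2 per model, then rank models from best to worst.
mean_r2 = {model: sum(vals) / len(vals) for model, vals in R2_BY_MODEL.items()}
ranking = sorted(mean_r2, key=mean_r2.get, reverse=True)
print(ranking)
```

Averaging the tabulated R2 values this way places random forest first, followed by linear regression, support vector regression and decision tree regression, which is the order reported in the text.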
In line with the study objectives, the impact of climatic variables on
ozone concentration in Malaysia from 2012 to 2016 was examined
extensively. The ozone concentration in Malaysia can be attributed to
human-induced forest fires (APIMS, 2021). Temperature is a significant
factor in the occurrence of forest fires, and wind serves as a transport
medium. Notably, smoke emitted from forest fires and transported to
Malaysia as transboundary pollution (Tella et al., 2021b) can increase
the ozone concentration. For instance, wildfires were found to be one of
the sources of ozone in Canada (Moeini et al., 2020). Also, according to
the United States Environmental Protection Agency (EPA), an increase in
ozone concentration was observed during monitored wildfires in the
United States (EPA, 2021).
Relative humidity acts as a reducing factor on the surge in ozone
concentration because it is linked to rainfall, which helps clear
pollutants from the atmosphere. Of the three climatic factors,
temperature has the strongest link with ozone concentration. Notably,
temperature is one of the climatic parameters that contribute to air
pollution, especially ozone concentration (Dawson et al., 2007;
Hedegaard et al., 2008; Ng and Awang, 2018). According to Christensen
et al. (2007), a warming climate can worsen ozone pollution in densely
populated areas, thereby increasing ozone concentration and lengthening
the ozone season (Wu et al., 2008; Nolte et al., 2008; Bloomer et al.,
2009; Hong et al., 2019).
The Intergovernmental Panel on Climate Change (IPCC) forecast indicates
a 1 °C rise in the global temperature by 2025 and a 3 °C rise before the
end of the 21st century (EEA, 2016). This suggests a potential increase
in the impacts of temperature on air quality in the future, considering
the inter-link between climate variation and airborne pollution (Orru
et al., 2017; Fuzzi et al., 2015; Bond et al., 2013). Moreover,
temperature is considered a crucial factor influencing open burning in
Malaysia, which emits pollutants. According to Ahamad et al. (2014), a
high concentration of ozone exceeding the Recommended Malaysian Air
Quality Guideline (RMAQG) of 100 ppb is caused by biomass combustion.
Ozone greatly impacts air quality during the dry season (Rani et al.,
2018), thereby affecting human health. The results suggest a potential
spike in ozone-induced health challenges, particularly during the warm
southwest monsoon season. This could be mitigated by adopting measures
that reduce ozone concentration, such as refuelling cars when the
weather is cool, conserving electricity, and boosting public transport
while managing private car use (EPA, 2018).
5. Conclusion
This study investigated the influence of variations in climatic
variables on ozone. Correlation analysis and four machine learning
algorithms (random forest, support vector regression, decision tree
regression, and linear regression) were used to investigate the
relationship between the predictors and ozone (O3). The correlation
analysis shows a very strong relationship between temperature and ozone
at all ten air pollution monitoring stations. There is a moderate to
strong correlation between wind speed and ozone, while relative humidity
shows an inverse relationship.
Climatic variables from six stations (Tasek Ipoh, Bukit Rambai, Nilai,
Jaya II Perai, Petaling Jaya, and Klang) exhibit a very high correlation
with ozone, indicating their vulnerability to the air pollutant. The
machine learning algorithms also confirmed the reliability of climatic
variables for ozone concentration prediction. The random forest exhibits
the highest performance, followed by linear regression, support vector
regression, and decision tree regression.
The study concludes that climate change exerts considerable influence
on air quality in urban centres due to variations in climatic factors
such as temperature, wind speed and relative humidity. Residents of the
six most vulnerable locations are at risk of respiratory and
cardiovascular problems. This study’s outcome therefore provides a sound
basis for implementing evidence-based interventions in the most
susceptible areas, considering climatic variations. Future work should
focus on trend and time-series analysis of climate variables and ozone
to better understand whether a future rise or fall of the climatic
variables correlates with a rise or fall of the ozone concentration.
Declaration of competing interest
The authors declare that they have no known competing financial
interests or personal relationships that could have appeared to influence
the work reported in this paper.
Acknowledgement
The authors gratefully acknowledge the Department of Environment
(DoE) Malaysia for providing the air quality data used in this study.
References
Ab Rahman, A.K., Abdullah, R., Balu, N., Shariff, F.M., 2013. The impact of La Niña and
El Niño events on crude palm oil prices: an econometric analysis. Oil Palm Ind. Econ.
J. 13, 38–51.
Abdullah, S., Nasir, N.H.A., Ismail, M., Ahmed, A.N., Jarkoni, M.N.K., 2019.
Development of ozone prediction model in urban area. Int. J. Innovative Technol.
Explor. Eng. 8, 2263–2267.
Afroz, R., Hassan, M.N., Ibrahim, N.A., 2003. Review of air pollution and health impacts
in Malaysia. Environ. Res. 92, 71–77.
Ahamad, F., Latif, M.T., Tang, R., Juneng, L., Dominick, D., Juahir, H., 2014. Variation
of surface ozone exceedance around Klang Valley, Malaysia. Atmos. Res. 139,
116–127.
Ahmat, H., Yahaya, A.S., Ramli, N.A., 2015. PM10 analysis for three industrialized areas
using extreme value. Sains Malays. 44, 175–185.
Akoglu, H., 2018. User’s guide to correlation coefficients. Turk. J. Emerg. Med. 18,
91–93.
Althuwaynee, O.F., Balogun, A.L., Al Madhoun, W., 2020. Air pollution hazard
assessment using decision tree algorithms and bivariate probability cluster polar
function: evaluating inter-correlation clusters of PM10 and other air pollutants.
GIScience Remote Sens. 57, 207–226.
An, R., Yu, H., 2018. Impact of ambient fine particulate matter air pollution on health
behaviors: a longitudinal study of university students in Beijing, China. Publ. Health
159, 107–115.
Andaya, B.W., Andaya, L.Y., 2016. A History of Malaysia. Macmillan International
Higher Education.
APIMS, 2021. Information about API. Department of Environment, Malaysia.
http://apims.doe.gov.my/public_v2/aboutapi.html. (Accessed 28 November 2021).
Aqilah, I., 2019. More schools closed in Perak as air quality deteriorates. Star Media
Group Berhad.
https://www.thestar.com.my/news/nation/2019/09/18/more-schools-closed-in-perak-as-air-quality-deteriorates.
(Accessed 22 July 2020).
Araki, S., Shima, M., Yamamoto, K., 2018. Spatiotemporal land use random forest model
for estimating metropolitan NO2 exposure in Japan. Sci. Total Environ. 634,
1269–1277.
Awang, N.R., Ramli, N.A., Shith, S., Zainordin, N.S., Manogaran, H., 2018.
Transformational characteristics of ground-level ozone during high particulate
events in urban area of Malaysia. Air Qual. Atmos. Health 11, 715–727.
Azid, A., Juahir, H., Toriman, M.E., Endut, A., Kamarudin, M.K.A., Rahman, M.N.A.,
Hasnam, C.N.C., Saudi, A.S.M., Yunus, K., 2015. Source Apportionment of Air
Pollution: a Case Study in Malaysia. Jurnal Teknologi, p. 72.
Azmi, S.Z., Latif, M.T., Ismail, A.S., Juneng, L., Jemain, A.A., 2010. Trend and status of
air quality at three different monitoring stations in the Klang Valley, Malaysia. Air
Qual. Atmos. Health 3, 53–64.
Bai, L., Wang, J., Ma, X., Lu, H., 2018. Air pollution forecasts: an overview. Int. J.
Environ. Res. Publ. Health 15, 780.
Balogun, A.-L., Tella, A., Baloo, L., Adebisi, N., 2021. A review of the inter-correlation of
climate change, air pollution and urban sustainability using novel machine learning
algorithms and spatial information science. Urban Clim. 40, 100989.
Bayat, R., Ashrafi, K., Shafiepour Motlagh, M., Hassanvand, M.S., Daroudi, R., Fink, G.,
Künzli, N., 2019. Health impact and related cost of ambient air pollution in Tehran.
Environ. Res. 176, 108547.
Belgiu, M., Drăguţ, L., 2016. Random forest in remote sensing: a review of applications
and future directions. ISPRS J. Photogrammetry Remote Sens. 114, 24–31.
Bhalgat, P., Pitale, S., Bhoite, S., 2019. Air quality prediction using machine learning
algorithm. Int. J. Comput. Appl. Technol. Res. 8, 367–370.
Bishop, C.M., 2006. Pattern Recognition and Machine Learning. Springer.
Bloomer, B.J., Stehr, J.W., Piety, C.A., Salawitch, R.J., Dickerson, R.R., 2009. Observed
relationships of ozone air pollution with temperature and emissions. Geophys. Res.
Lett. 36.
Bond, T.C., Doherty, S.J., Fahey, D.W., Forster, P.M., Berntsen, T., DeAngelo, B.J.,
Flanner, M.G., Ghan, S., Kärcher, B., Koch, D., 2013. Bounding the role of black
carbon in the climate system: a scientific assessment. J. Geophys. Res. Atmos. 118,
5380–5552.
Breiman, L., 2001. Random forests. Mach. Learn. 45, 5–32.
Buchanan, M., 2012. Cause and correlation. Nat. Phys. 8, 852–852.
Buitinck, L., Louppe, G., Blondel, M., Pedregosa, F., Mueller, A., Grisel, O., Niculae, V.,
Prettenhofer, P., Gramfort, A., Grobler, J., 2013. API Design for Machine Learning
Software: Experiences from the Scikit-Learn Project. arXiv preprint arXiv:1309.0238.
Chaiyakhan, K., Chujai, P., Kerdprasop, N., Kerdprasop, K., 2017. Hourly ground-level
ozone concentration prediction using support vector regression. In: International
MultiConference of Engineers and Computer Scientists.
Chen, Z., Zhuang, Y., Xie, X., Chen, D., Cheng, N., Yang, L., Li, R., 2019. Understanding
long-term variations of meteorological influences on ground ozone concentrations in
Beijing during 2006–2016. Environ. Pollut. 245, 29–37.
Chin, Y.S.J., De Pretto, L., Thuppil, V., Ashfold, M.J., 2019. Public awareness and
support for environmental protection - a focus on air pollution in peninsular
Malaysia. PLoS One 14.
Choubin, B., Abdolshahnejad, M., Moradi, E., Querol, X., Mosavi, A., Shamshirband, S.,
Ghamisi, P., 2020. Spatial hazard assessment of the PM10 using machine learning
models in Barcelona, Spain. Sci. Total Environ. 701, 134474.
Christensen, J.H., Hewitson, B., Busuioc, A., Chen, A., Gao, X., Held, R., Jones, R.,
Kolli, R.K., Kwon, W., Laprise, R., 2007. Regional climate projections. In: Climate
Change, 2007: the Physical Science Basis. Contribution of Working Group I to the
Fourth Assessment Report of the Intergovernmental Panel on Climate Change.
University Press, Cambridge (Chapter 11).
Dawson, J.P., Adams, P.J., Pandis, S.N., 2007. Sensitivity of ozone to summertime
climate in the eastern USA: a modeling case study. Atmos. Environ. 41, 1494–1511.
De Marco, A., Proietti, C., Anav, A., Ciancarella, L., D’Elia, I., Fares, S., Fornasier, M.F.,
Fusaro, L., Gualtieri, M., Manes, F., Marchetto, A., Mircea, M., Paoletti, E.,
Piersanti, A., Rogora, M., Salvati, L., Salvatori, E., Screpanti, A., Vialetto, G.,
Vitale, M., Leonardi, C., 2019. Impacts of air pollution on human and ecosystem
health, and implications for the National Emission Ceilings Directive: insights from
Italy. Environ. Int. 125, 320–333.
Dimakopoulou, K., Douros, J., Samoli, E., Karakatsani, A., Rodopoulou, S., Papakosta, D.,
Grivas, G., Tsilingiridis, G., Mudway, I., Moussiopoulos, N., Katsouyanni, K., 2020.
Long-term exposure to ozone and children’s respiratory health: results from the
RESPOZE study. Environ. Res. 182, 109002.
Djalalova, I., Delle Monache, L., Wilczak, J., 2015. PM2.5 analog forecast and Kalman
filter post-processing for the Community Multiscale Air Quality (CMAQ) model.
Atmos. Environ. 108, 76–87.
Dou, J., Yunus, A.P., Tien Bui, D., Merghadi, A., Sahana, M., Zhu, Z., Chen, C.-W.,
Khosravi, K., Yang, Y., Pham, B.T., 2019. Assessment of advanced random forest and
decision tree algorithms for modeling rainfall-induced landslide susceptibility in the
Izu-Oshima Volcanic Island, Japan. Sci. Total Environ. 662, 332–346.
EEA, 2016. Air Pollutants and Global Effects. European Environment.
https://www.eea.europa.eu/publications/2599XXX/page009.html.
EPA, 2018. Actions You Can Take to Reduce Air Pollution. United States Environmental
Protection Agency.
https://www3.epa.gov/region1/airquality/reducepollution.html#:~:text=On%20Days%20when%20High%20Ozone,Walk%20to%20errands%20when%20possible.
(Accessed 22 June 2020).
EPA, 2021. Study Provides New Insights into Impacts of Wildland Fires on Ozone
Monitoring Equipment.
https://www.epa.gov/sciencematters/study-provides-new-insights-impacts-wildland-fires-ozone-monitoring-equipment#:~:text=Study%20Provides%20New%20Insights%20Into%20Impacts%20of%20Wildland%20Fires%20on%20Ozone%20Monitoring%20Equipment,-EPA%20research%20team&text=States%20have%20observed%20unexplained%20increases,active%20wildfires%20or%20prescribed%20burns.
(Accessed 28 November 2021).
Ezimand, K., Kakroodi, A., 2019. Prediction and spatio-temporal analysis of ozone
concentration in a metropolitan area. Ecol. Indicat. 103, 589–598.
Fong, S., Abdullah, S., Ismail, M., 2018. Forecasting of particulate matter (PM10)
concentration based on gaseous pollutants and meteorological factors for different
monsoon of urban coastal area in Terengganu. J. Sustain. Sci. Manag. 5, 3–17.
Fu, T.-M., Tian, H., 2019. Climate change penalty to ozone air quality: review of current
understandings and knowledge gaps. Curr. Pollut. Rep. 5, 159–171.
Fuzzi, S., Baltensperger, U., Carslaw, K., Decesari, S., Denier van der Gon, H.,
Facchini, M.C., Fowler, D., Koren, I., Langford, B., Lohmann, U., 2015. Particulate
matter, air quality and climate: lessons learned and future needs. Atmos. Chem.
Phys. 15, 8217–8299.
Gaveau, D.L., Salim, M.A., Hergoualc’h, K., Locatelli, B., Sloan, S., Wooster, M.,
Marlier, M.E., Molidena, E., Yaen, H., DeFries, R., 2014. Major atmospheric
emissions from peat fires in Southeast Asia during non-drought years: evidence from
the 2013 Sumatran fires. Sci. Rep. 4, 6112.
Gorai, A., Tuluri, F., Tchounwou, P., Ambinakudige, S., 2015. Influence of local
meteorology and NO2 conditions on ground-level ozone concentrations in the
eastern part of Texas, USA. Air Qual. Atmos. Health 8, 81–96.
Hamid, M.A., Long, K.Q., 2017. An assessment of environmental impacts assessment
(EIA) in Malaysia. In: SHS Web Of Conferences, 00018. EDP Sciences.
Hanaoka, T., Masui, T., 2019. Exploring Effective Short-Lived Climate Pollutant
Mitigation Scenarios by Considering Synergies and Trade-Offs of Combinations of
Air Pollutant Measures and Low Carbon Measures towards the Level of the 2 °C
Target in Asia. Environmental Pollution, p. 113650.
Hassan, N., Hashim, Z., Hashim, J., 2015. Impact of Climate Change on Air Quality and
Public Health in Urban Areas. Asia-Pacific journal of public health/Asia-Pacific
Academic Consortium for Public Health.
He, H.-d., 2017. Multifractal Analysis of Interactive Patterns between Meteorological
Factors and Pollutants in Urban and Rural Areas. Atmospheric Environment.
Hedegaard, G.B., Brandt, J., Christensen, J.H., Frohn, L.M., Geels, C., Hansen, K.M.,
Stendel, M., 2008. Impacts of climate change on air pollution levels in the Northern
Hemisphere with special focus on Europe and the Arctic. In: Air Pollution Modeling
and its Application XIX. Springer.
Hinkle, D.E., Wiersma, W., Jurs, S.G., 2003. Applied Statistics for the Behavioral
Sciences. Houghton Mifflin College Division.
Hong, C., Zhang, Q., Zhang, Y., Davis, S.J., Tong, D., Zheng, Y., Liu, Z., Guan, D., He, K.,
Schellnhuber, H.J., 2019. Impacts of climate change on future air quality and
human health in China. Proc. Natl. Acad. Sci. Unit. States Am. 116, 17193–17200.
How, C.Y., Ling, Y.E., 2016. The influence of PM2.5 and PM10 on air pollution index
(API). Environ. Eng. Hydraul. Hydrol.: Proc. Civil Eng. Univ. Teknol. Malays. Johor,
Malays. 3, 132.
Ishak, A.B., Daoud, M.B., Trabelsi, A., 2017. Ozone concentration forecasting using
statistical learning approaches. J. Mater. Environ. Sci. 8, 4532–4543.
Iskandaryan, D., Ramos, F., Trilles Oliver, S., 2020. Air quality prediction in smart cities
using machine learning technologies based on sensor data: a review. Appl. Sci. 10,
2401.
Jasaitis, D., Vasiliauskienė, V., Chadyšienė, R., Pečiulienė, M., 2016. Surface ozone
concentration and its relationship with UV radiation, meteorological parameters and
radon on the eastern coast of the Baltic sea. Atmosphere 7.
Jumin, E., Zaini, N., Ahmed, A.N., Abdullah, S., Ismail, M., Sherif, M., Sefelnasr, A.,
El-Shafie, A., 2020. Machine learning versus linear regression modelling approach for
accurate ozone concentrations prediction. Eng. Appl. Computat. Fluid Mech. 14,
713–725.
Kadavi, P.R., Lee, C.-W., Lee, S., 2019. Landslide-susceptibility mapping in Gangwon-do,
South Korea, using logistic regression and decision tree models. Environ. Earth Sci.
78, 116.
Kalisa, E., Fadlallah, S., Amani, M., Nahayo, L., Habiyaremye, G., 2018. Temperature
and air pollution relationship during heatwaves in Birmingham, UK. Sustain. Cities
Soc. 43, 111–120.
Khan, M.F., Sulong, N.A., Latif, M.T., Nadzir, M.S.M., Amil, N., Hussain, D.F.M., Lee, V.,
Hosaini, P.N., Shaharom, S., Yusoff, N.A.Y.M., Hoque, H.M.S., Chung, J.X.,
Sahani, M., Mohd Tahir, N., Juneng, L., Maulud, K.N.A., Abdullah, S.M.S., Fujii, Y.,
Tohno, S., Mizohata, A., 2016. Comprehensive assessment of PM2.5
physicochemical properties during the Southeast Asia dry season (southwest
monsoon). J. Geophys. Res. Atmos. 121, 14,589–14,611.
Kim, S.E., Honda, Y., Hashizume, M., Kan, H., Lim, Y.-H., Lee, H., Kim, C.T., Yi, S.-M.,
Kim, H., 2017. Seasonal analysis of the short-term effects of air pollution on daily
mortality in Northeast Asia. Sci. Total Environ. 576, 850–857.
Kompalli, S.K., Babu, S.S., Moorthy, K.K., Manoj, M., Kumar, N.K., Shaeb, K.H.B.,
Joshi, A.K., 2014. Aerosol black carbon characteristics over Central India: temporal
variation and its dependence on mixed layer height. Atmos. Res. 147, 27–37.
Kwan, M.S., Tangang, F.T., Juneng, L., 2013. Projected changes of future climate
extremes in Malaysia. Sains Malays. 42, 1051–1059.
Latif, M.T., Othman, M., Idris, N., Juneng, L., Abdullah, A.M., Hamzah, W.P., Khan, M.F.,
Nik Sulaiman, N.M., Jewaratnam, J., Aghamohammadi, N., Sahani, M., Xiang, C.J.,
Ahamad, F., Amil, N., Darus, M., Varkkey, H., Tangang, F., Jaafar, A.B., 2018. Impact
of regional haze towards air quality in Malaysia: a review. Atmos. Environ. 177,
28–44.
Lee, D., Robertson, C., Ramsay, C., Gillespie, C., Napier, G., 2019. Estimating the health
impact of air pollution in Scotland, and the resulting benefits of reducing
concentrations in city centres. Spatial Spatio-temp. Epidemiol. 29, 85–96.
Li, R., Cui, L., Meng, Y., Zhao, Y., Fu, H., 2019. Satellite-based prediction of daily SO2
exposure across China using a high-quality random forest-spatiotemporal Kriging
(RF-STK) model for health risk assessment. Atmos. Environ. 208, 10–19.
Ma, J., Cheng, J.C.P., 2016. Identifying the influential features on the regional energy
use intensity of residential buildings based on Random Forests. Appl. Energy 183,
193–201.
Ma, J., Ding, Y., Cheng, J.C.P., Jiang, F., Tan, Y., Gan, V.J.L., Wan, Z., 2020.
Identification of high impact factors of air quality on a national scale using big data
and machine learning techniques. J. Clean. Prod. 244, 118955.
Mabahwi, N.A., Leh, O.L.H., Omar, D., 2015. Urban air quality and human health effects
in Selangor, Malaysia. Procedia-Soc. Behav. Sci. 170, 282–291.
Manimaran, P., Narayana, A., 2018. Multifractal detrended cross-correlation analysis on
air pollutants of University of Hyderabad Campus, India. Phys. Stat. Mech. Appl.
502, 228–235.
Melkonyan, A., Wagner, P., 2013. Ozone and its projection in regard to climate change.
Atmos. Environ. 67, 287–295.
Micheletti, N., Foresti, L., Robert, S., Leuenberger, M., Pedrazzini, A., Jaboyedoff, M.,
Kanevski, M., 2014. Machine learning feature selection methods for landslide
susceptibility mapping. Math. Geosci. 46, 33–57.
Moeini, O., Tarasick, D.W., McElroy, C.T., Liu, J., Osman, M.K., Thompson, A.M.,
Parrington, M., Palmer, P.I., Johnson, B., Oltmans, S.J., 2020. Estimating
wildfire-generated ozone over North America using ozonesonde profiles and a
differential back trajectory technique. Atmos. Environ. X 7, 100078.
Moustris, K., Nastos, P., Larissi, I., Paliatsos, A., 2012. Application of multiple linear
regression models and artificial neural networks on the surface ozone forecast in the
greater Athens area, Greece. Adv. Meteorol. 2012.
Mukaka, M.M., 2012. Statistics corner: a guide to appropriate use of correlation
coefficient in medical research. Malawi Med. J.: J. Med. Assoc. Malawi 24, 69–71.
Nassikas, N., Spangler, K., Fann, N., Nolte, C.G., Dolwick, P., Spero, T.L., Sheffield, P.,
Wellenius, G.A., 2020. Ozone-related asthma emergency department visits in the US
in a warming climate. Environ. Res. 183, 109206.
Nazif, A., Mohammed, I., Malakahmad, A., Abualqumboz, M., 2018. Multivariate
analysis of monsoon seasonal variation and prediction of particulate matter episode
using regression and hybrid models. Int. J. Environ. Sci. Technol. 16.
Ng, K.Y., Awang, N., 2018. Multiple linear regression and regression with time series
error models in forecasting PM10 concentrations in Peninsular Malaysia. Environ.
Monit. Assess. 190, 63.
Nguyen, G.T.H., Shimadera, H., Uranishi, K., Matsuo, T., Kondo, A., 2019. Numerical
assessment of PM2.5 and O3 air quality in Continental Southeast Asia: impacts of
potential future climate change. Atmos. Environ. 215, 116901.
Nolte, C.G., Gilliland, A.B., Hogrefe, C., Mickley, L.J., 2008. Linking global to regional
models to assess future climate impacts on surface ozone levels in the United States.
J. Geophys. Res. Atmos. 113.
Nur Shaziayani, W., Zia Ul-Saufie, A., Libasin, Z., Norsyiha Ahmad Shukri, F., Sarimah
Syed Abdullah, S., Mohamed Noor, N., 2020. A review of PM10 concentrations
modelling in Malaysia. IOP Conf. Ser. Earth Environ. Sci. 616, 012008.
Oleniacz, R., Bogacki, M., Szulecka, A., Rzeszutek, M., Mazur, M., 2016. Assessing the
impact of wind speed and mixing-layer height on air quality in Krakow (Poland) in
the years 2014–2015. J. Civil Eng. Environ. Arch. 63, 315–342.
Organization, W. H., 2013. Review of Evidence on Health Aspects of Air
Pollution–REVIHAAP Project: Final Technical Report. WHO European Centre for
Environment and Health, Bonn.
Organization, W. H., 2016. Ambient Air Pollution: A Global Assessment of Exposure and
Burden of Disease.
Orru, H., Ebi, K., Forsberg, B., 2017. The interplay of climate change and air pollution on
health. Curr. Environ. Health Rep. 4, 504–513.
Plocoste, T., Calif, R., Jacoby-Koaly, S., 2019. Multi-scale time dependent correlation
between synchronous measurements of ground-level ozone and meteorological
parameters in the Caribbean Basin. Atmos. Environ. 211, 234–246.
Pu, X., Wang, T., Huang, X., Melas, D., Zanis, P., Papanastasiou, D., Poupkou, A., 2017.
Enhanced surface ozone during the heat wave of 2013 in Yangtze River Delta region,
China. Sci. Total Environ. 603, 807–816.
Rahmati, O., Tahmasebipour, N., Haghizadeh, A., Pourghasemi, H.R., Feizizadeh, B.,
2017. Evaluation of different machine learning models for predicting and mapping
the susceptibility of gully erosion. Geomorphology 298, 118–137.
Rani, N.L.A., Azid, A., Khalit, S.I., Juahir, H., Samsudin, M.S., 2018. Air pollution index
trend analysis in Malaysia, 2010-15. Pol. J. Environ. Stud. 27.
Rovira, J., Domingo, J.L., Schuhmacher, M., 2020. ’Air quality, health impacts and
burden of disease due to air pollution (PM10, PM2.5, NO2 and O3): application of
10
A.-L. Balogun and A. Tella
Chemosphere 299 (2022) 134250
Wang, W., Liu, C., Ying, Z., Lei, X., Wang, C., Huo, J., Zhao, Q., Zhang, Y., Duan, Y.,
Chen, R., 2019. Particulate air pollution and ischemic stroke hospitalization: how the
associations vary by constituents in Shanghai, China. Sci. Total Environ. 695,
133780.
Watson, Gregory L., Telesca, Donatello, Reid, Colleen E., Pfister, Gabriele G.,
Jerrett, Michael, 2019. Machine learning models accurately predict ozone exposure
during wildfire events. Environ. Pollut. 254, 112792.
Wen, C., Liu, S., Yao, X., Peng, L., Li, X., Hu, Y., Chi, T., 2019. A novel spatiotemporal
convolutional long short-term neural network for air pollution prediction. Sci. Total
Environ. 654, 1091–1099.
Wong, L.P., Alias, H., Aghamohammadi, N., Ghadimi, A., Sulaiman, N.M.N., 2017.
Control measures and health effects of air pollution: a survey among public
transportation commuters in Malaysia. Sustainability 9, 1616.
Wu, B., Li, T., Baležentis, T., Štreimikienė, D., 2019. Impacts of income growth on air
pollution-related health risk: exploiting objective and subjective measures. Resour.
Conserv. Recycl. 146, 98–105.
Wu, S., Mickley, L.J., Leibensperger, E.M., Jacob, D.J., Rind, D., Streets, D.G., 2008.
Effects of 2000–2050 global change on ozone air quality in the United States.
J. Geophys. Res. Atmos. 113.
Xie, M., Shu, L., Wang, T.-j., Liu, Q., Gao, D., Li, S., Zhuang, B.-l., Han, Y., Li, M.-m.,
Chen, P.-l., 2017. Natural emissions under future climate condition and their effects
on surface ozone in the Yangtze River Delta region, China. Atmos. Environ. 150,
162–180.
Xu, J., Yan, F., Xie, Y., Wang, F., Wu, J., Fu, Q., 2015. Impact of meteorological
conditions on a nine-day particulate matter pollution event observed in December
2013, Shanghai, China. Particuology 20, 69–79.
Yang, J., Shi, B., Shi, Y., Marvin, S., Zheng, Y., Xia, G., 2020. Air pollution dispersal in
high density urban areas: research on the triadic relation of wind, air pollution, and
urban form. Sustain. Cities Soc. 54, 101941.
Ye, Z., Yang, J., Zhong, N., Tu, X., Jia, J., Wang, J., 2020. Tackling environmental
challenges in pollution controls using artificial intelligence: a review. Sci. Total
Environ. 699, 134279.
Yusoff, M.F., Latif, M.T., Juneng, L., Khan, M.F., Ahamad, F., Chung, J.X., Mohtar, A.A.A.,
2019. Spatio-temporal assessment of nocturnal surface ozone in Malaysia. Atmos.
Environ. 207, 105–116.
Zeinalnezhad, M., Chofreh, A.G., Goni, F.A., Klemeš, J.J., 2020. Air pollution prediction
using semi-experimental regression model and Adaptive Neuro-Fuzzy Inference
System. J. Clean. Prod. 261, 121218.
Zhan, Y., Luo, Y., Deng, X., Grieneisen, M.L., Zhang, M., Di, B., 2018. Spatiotemporal
prediction of daily ambient ozone levels across China using random forest for human
exposure assessment. Environ. Pollut. 233, 464–473.
Zhao, W., Fan, S., Guo, H., Gao, B., Sun, J., Chen, L., 2016. Assessing the impact of local
meteorological variables on surface ozone in Hong Kong during 2000–2015 using
quantile and multiple line regression models. Atmos. Environ. 144, 182–193.
AirQ+ model to the Camp de Tarragona County (Catalonia, Spain). Sci. Total
Environ. 703, 135538.
Sayad, Y.O., Mousannif, H., Al Moatassime, H., 2019. Predictive modeling of wildfires: a
new dataset and machine learning approach. Fire Saf. J. 104, 130–146.
Seinfeld, J.H., Pandis, S.N., 2016. Atmospheric Chemistry and Physics: from Air
Pollution to Climate Change. John Wiley & Sons.
Sharma, S., Chatani, S., Mahtta, R., Goel, A., Kumar, A., 2016. Sensitivity analysis of
ground level ozone in India using WRF-CMAQ models. Atmos. Environ. 131, 29–40.
Steinwart, I., Christmann, A., 2008. Support Vector Machines. Springer Science &
Business Media.
Suárez Sánchez, A., García Nieto, P.J., Riesgo Fernández, P., del Coz Díaz, J.J.,
Iglesias-Rodríguez, F.J., 2011. Application of an SVM-based regression model to the air
quality study at local scale in the Avilés urban area (Spain). Math. Comput. Model.
54, 1453–1466.
Suhaimi, N., Ghazali, N.A., Nasir, M.Y., Mokhtar, M.I.Z., Ramli, N.A., Yusof, N.F.F.M.,
Ul-Saufie, A.Z., 2019. Daytime ozone concentration prediction using statistical models.
J. Sustain. Sci. Manag. 14 (3), 7–11.
Tan, K.C., San Lim, H., Jafri, M.Z.M., 2016. Prediction of column ozone concentrations
using multiple regression analysis and principal component analysis techniques: a
case study in peninsular Malaysia. Atmos. Pollut. Res. 7, 533–546.
Tang, K.H.D., 2019. Climate change in Malaysia: trends, contributors, impacts,
mitigation and adaptations. Sci. Total Environ. 650, 1858–1871.
Tang, X., Gao, X., Li, C., Zhou, Q., Ren, C., Feng, Z., 2020. Study on spatiotemporal
distribution of airborne ozone pollution in subtropical region considering
socioeconomic driving impacts: a case study in Guangzhou, China. Sustain. Cities
Soc. 54, 101989.
Tella, A., Balogun, A.-L., 2021. GIS-based air quality modelling: spatial prediction of
PM10 for Selangor State, Malaysia using machine learning algorithms. Environ. Sci.
Pollut. Control Ser. 1–17.
Tella, A., Balogun, A.-L., Adebisi, N., Abdullah, S., 2021a. Spatial assessment of PM10
hotspots using random forest, K-nearest neighbour and Naïve Bayes. Atmos. Pollut.
Res. 12, 101202.
Tella, A., Balogun, A.-L., Faye, I., 2021b. Spatio-temporal modelling of the influence of
climatic variables and seasonal variation on PM10 in Malaysia using multivariate
regression (MVR) and GIS. Geomatics, Nat. Hazards Risk 12, 443–468.
Tong, C.H.M., Yim, S.H.L., Rothenberg, D., Wang, C., Lin, C.-Y., Chen, Y.D., Lau, N.C.,
2018. Projecting the impacts of atmospheric conditions under climate change on air
quality over the Pearl River Delta region. Atmos. Environ. 193, 79–87.
Tong, W., 2020. Chapter 5 - machine learning for spatiotemporal big data in air
pollution. In: Li, Lixin, Zhou, Xiaolu, Tong, Weitian (Eds.), Spatiotemporal Analysis
of Air Pollution and its Application in Public Health. Elsevier.
Ueno, H., Tsunematsu, N., 2019. Sensitivity of ozone production to increasing
temperature and reduction of precursors estimated from observation data. Atmos.
Environ. 214, 116818.
Wang, H.-W., Li, X.-B., Wang, D., Zhao, J., He, H.-d., Peng, Z.-R., 2020. Regional
prediction of ground-level ozone using a hybrid sequence-to-sequence deep learning
approach. J. Clean. Prod. 253, 119841.