ARTICLE

Real-time winter road surface condition monitoring using an improved residual CNN

Guangyuan Pan, Matthew Muresan, Ruifan Yu, and Liping Fu

Abstract: This paper proposes a real-time winter road surface condition (RSC) monitoring solution that automatically generates descriptive RSC information in terms of snow and ice coverage by using images from fixed traffic and weather cameras. Several state-of-the-art pre-trained deep neural networks are customized and fine-tuned to address a specific domain: classifying the amount of snow coverage on a road surface. A thorough evaluation is conducted to identify and select the best model. This evaluation uses an extensive set of experiments to test the accuracy and generalization of each model and uses transfer learning to fine-tune each of the pre-trained models on independent images from different traffic and weather cameras. The transferability of each model, the relationship between model performance and data size, and the system settings of each model are then examined. Lastly, three online weight calibration methods are proposed to automatically update the model in new environments. The results show that re-training the model using images from a mixed set of cameras gives the most promising results.

Key words: road surface condition, real-time recognition, deep learning, convolutional neural network.

1. Introduction

In countries with severe winter seasons such as Canada, road surface conditions (RSC) on highways during snow events can vary remarkably from location to location and over time. Due to the vast distances covered by the highway network and the uncertain nature of weather events, such variations are often hard to monitor and predict, making both winter road maintenance and public travel extremely challenging. Maintenance agencies often struggle to obtain up-to-date information on the RSC of the highway network, which is essential to making effective and efficient decisions on maintenance operations such as salting and plowing. Providing RSC data in real time to the general public can also be beneficial, as it allows travellers to consider current roadway conditions when planning travel.

Traditionally, RSC monitoring is done either through manual patrolling by highway agencies and maintenance contractors or by using road weather information systems (RWIS). Both methods, however, have limitations.
Manual observation by patrollers is subjective, inaccurate, and time-consuming, while data from RWIS are limited to the sparse points where RWIS stations are installed (Buchanan and Gwartz 2005; Kwon et al. 2015, 2017; Gu et al. 2019). Some new technologies, for example, in-vehicle video recorders, smartphone-based systems, and high-end imaging systems, have been developed to collect RSC data; however, the application of these technologies still requires manual observation, as no reliable image recognition solutions are available to automate the process (Hong et al. 2009; Omer and Fu 2010; Jonsson et al. 2015; Linton and Fu 2015). Researchers have also attempted to apply traditional machine learning models, including artificial neural networks (ANN), random forests (RF), and support vector machines (SVM), to classify winter RSC images; however, these models have not been shown to be practically applicable in terms of accuracy and transferability (Linton and Fu 2016).

This has changed with the development of a novel machine learning technique: deep learning (DL), or deep neural networks (DNN). DNN models have been extensively studied and have shown excellent performance on a variety of problems, including unsupervised-learning based classification, object observation, forecasting, and reinforcement learning based game playing (LeCun et al. 2015; Qiao et al. 2019). Deep learning and big-data based approaches have also been adopted for traffic flow forecasting, accident prediction, and other applications, showing satisfying improvements (Lv et al. 2015; Basso et al. 2018; Zhang et al. 2017, 2018; Lippi et al. 2013). In our previous effort, we showed some promising results when applying this technique to the RSC recognition problem (Pan et al. 2018, 2019). This paper describes the results of our continuing effort, with a specific focus on exploring the potential of applying one of the most successful convolutional neural networks, the deep residual network (ResNet), to classify winter road surface conditions.

The research makes the following contributions: (1) it is the first effort to customize and compare several state-of-the-art pre-trained convolutional neural network (CNN) models for monitoring winter RSC using images from traffic and RWIS cameras; (2) it extensively investigates how the performance of the customized pre-trained CNN models and the training data size relate to the transferability of the trained models to images from new cameras; and (3) three new model updating methods are proposed and tested to automate model updating with new data for real-world implementation.

Received 19 June 2019. Accepted 13 October 2020.
G. Pan. Department of Civil & Environmental Engineering, University of Waterloo, Waterloo, ON, Canada; Shenzhen Garry Intelligent Technology Limited, Shenzhen, China.
M. Muresan and L. Fu.* Department of Civil & Environmental Engineering, University of Waterloo, Waterloo, ON, Canada.
R. Yu. Google, Kitchener, ON, Canada.
Corresponding author: Liping Fu (email: lfu@uwaterloo.ca).
*Liping Fu served as an Associate Editor at the time of manuscript review and acceptance; peer review and editorial decisions regarding this manuscript were handled by another Editorial Board Member.
Copyright remains with the author(s) or their institution(s). Permission for reuse (free in most cases) can be obtained from copyright.com.
Can. J. Civ. Eng. 48: 1215–1222 (2021) dx.doi.org/10.1139/cjce-2019-0367. Published at www.cdnsciencepub.com/cjce on 20 October 2020.

2. Literature review

The image recognition problem has been studied extensively in the literature. Many machine learning models have been proposed in the past decades.
In particular, convolutional neural networks (CNN) have been applied successfully to pattern recognition problems, including large-scale image classification and video analysis. A CNN is a kind of feed-forward neural network that contains convolutional computing elements in a deep structure, and it is one of the most typical algorithms applied in deep learning. This success has been made possible by both the availability of large public image repositories (such as ImageNet) and advances in computational hardware such as graphics processing units (GPU), tensor processing units (TPU), and recent neural-network processing units (NPU) (Glorot and Bengio 2010; He and Sun 2015). In recent years, CNN has become one of the most popular techniques in the machine learning field, and many variations have been developed for improved performance (Sermanet et al. 2014; Howard 2014). One of the successful models, "AlexNet", was developed by Krizhevsky et al. (2012). Since then, more research has been done to advance the state of the art, and many successful models have also been developed by commercial entities, such as Microsoft's ResNet and Google's Xception and Inception (Simonyan and Zisserman 2014; He et al. 2016a; Szegedy et al. 2016; Chollet 2016).

Although deep learning provides excellent performance in many settings, there are many situations where it still faces difficulties. Deep learning, like many other big data technologies, is unable to learn useful features without significant data input. When data are limited, approaches common to traditional machine learning methods must be employed, such as the preprocessing of raw images (Linton and Fu 2016; Liu et al. 2013; Breiman 2001; Marsh and Bearfield 2004). Furthermore, while previously developed models excel in many situations, some tasks may pose additional challenges that reduce their effectiveness (Veit et al. 2016; He et al. 2016b; Chen-McCaig et al. 2017; Soon et al. 2018; Mondal et al. 2017; Yi and Shirk 2018; Xue and Li 2018). One key strategy that has been developed recently to address this is transfer learning, which is the idea of adapting a trained model to a new task (Pan and Yang 2010; Zheng et al. 2018).

In a previous work, we attempted to apply a number of machine learning methods to RSC classification (Linton and Fu 2016). We further extended this work by developing a VGG16-based RSC recognition model using images from in-vehicle cameras and showed that the proposed model had the best performance when compared to other traditional machine learning techniques (Pan et al. 2018). In another work, we studied the testing accuracy of the model using different training dataset sizes and showed the potential of achieving much higher accuracy with a larger training dataset. In a further study (Pan et al. 2019), we evaluated four of the most successful CNN models in recent years, namely, VGG16 (Oxford), ResNet50 (Microsoft), Inception-V3 (Google), and Xception (Google), for their potential to address the particular challenges that hinder RSC classification.

To summarize, (1) CNN has been shown to have superior performance for image recognition; (2) transfer learning based on pre-trained CNNs has greatly increased learning efficiency; and (3) several pre-trained CNNs have been tested on road surface condition monitoring using a small number of images, but the performance of these models is yet to be evaluated using large-scale datasets from real-world applications. Hence, building on those works, in this paper we aim to establish a relatively complete RSC recognition system based on ResNet50 using transfer learning and to construct an automatic updating method for the system.

3. Deep residual neural network

3.1.
Model structure

Deep residual neural networks are a type of convolutional neural network that has been shown to be considerably more powerful than other models for image recognition, based on testing with two standard benchmark image datasets: CIFAR-10 and ImageNet-2012 (Göring et al. 2013). The advantage of using one of the successful pre-trained models is that it has already been trained with millions of images, which means that it could have "remembered" the major common features of various objects; therefore, transferring the model to a new application with some additional training on a smaller data set is expected to be much easier than training a completely new model on a large data set. In our previous research, we have shown that ResNet50 is the best performer, providing the most accurate predictions.

In this research, the model is trained using raw images that are resized into three-channel (red, green, blue) images with dimensions of 224 × 224. RGB values are normalized by subtracting the mean RGB value from the value of each pixel. Although this causes some minor information loss, it reduces the complexity of the learning process and lowers the computation time. The original images are also classified into one of four classes: bare pavement, partly covered pavement, fully covered pavement, and unrecognizable images. Four nodes are used in the output layer, with the output from each node representing the probability that the image matches the corresponding category. This structure is identical to the one used in a previous work (Pan et al. 2019) and uses two fully connected layers with 1000 neurons in each layer. As outlined in the next section, all nodes in the hidden layers are rectified linear units (ReLU). All the models used in this research have been pre-trained on general-purpose images; we therefore solve the RSC problem by using the highest probability value provided by the SoftMax output layer to select the category for the image.
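The preprocessing and class-selection steps described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the channel means, the class order, and the function names `subtract_mean`, `softmax`, and `predict_class` are assumptions introduced only for illustration, since the paper does not list the mean values it uses.

```python
import math

# Class order is an assumption made for this sketch.
CLASSES = [
    "bare pavement",
    "partly covered pavement",
    "fully covered pavement",
    "unrecognizable",
]

# Assumed per-channel (R, G, B) means; the paper does not state the values.
CHANNEL_MEANS = (123.68, 116.779, 103.939)


def subtract_mean(image, means=CHANNEL_MEANS):
    """Subtract a per-channel mean from every pixel of an H x W x 3 image."""
    return [
        [tuple(value - mean for value, mean in zip(pixel, means)) for pixel in row]
        for row in image
    ]


def softmax(scores):
    """Numerically stable softmax over a list of raw output scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(scores):
    """Return (label, probability) for the most probable of the four classes."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]
```

For example, `predict_class([0.2, 2.5, 0.1, -1.0])` selects the second class because its score, and therefore its SoftMax probability, is the largest of the four.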
As the ImageNet database used to train the ResNet model does not directly include RSC classes and images, it is necessary to add these classifications by further training and fine-tuning the model with our domain-specific data (RSC images). The structure of the customized ResNet for RSC recognition includes two fully connected layers of 1000 nodes each, which must be trained from scratch, and an output layer with 4 nodes.

Published by Canadian Science Publishing

Fig. 1. Development of the winter road condition monitoring system. [Colour online.]

3.2. Algorithm: training and fine-tuning

Figure 1 shows the process involved in developing the proposed model. The key strategy of the development process includes steps to train the top classifier (Steps 1 and 2) and to fine-tune the whole network (Steps 3 and 4).

In Step 1, all layers of the pre-trained model except the final two customized fully connected layers are initialized using the pre-stored weights. The model is then fed with the training dataset and validation dataset. In this step, all outputs from the convolutional functions, ReLU transfer functions, and max pooling layers are used.

In Step 2, the convolutional layers and max pooling layers calculated in Step 1 are all frozen, and the extra fully connected layers on top are trained. Weights are updated using the stochastic gradient descent (SGD) method. This approach of storing features offline and training only the last fully connected layers helps to increase the computational speed, as tuning the model from a randomly initialized state is computationally expensive, especially if training is done on the CPU.

In Step 3, fine-tuning of the model's upper layers and the top-level classifier is conducted. This process uses small weight updates while re-training the model's layers with an additional dataset. During this process, the lower layers of the model are frozen and gradient descent is used to update the upper layers only.

Step 4 is introduced to automate the model updating process. In real-world applications, new cameras could be installed over time, and camera settings could be changed to address new condition monitoring needs. This step allows the model to be further updated automatically with a completely new data set without significant loss in prediction performance. This is done by training the model for a few epochs using data from the newly installed cameras or cameras with new settings.

4. Experiment

4.1. Experimental design

The image dataset used is collected from a number of fixed weather cameras installed at various Road Weather Information System (RWIS) stations throughout the Ontario road network. In addition to images, these stations also provide highway maintenance staff with real-time weather and road surface conditions so that they can proactively make the most appropriate winter maintenance decisions. The RWIS network is divided into five regions: western Ontario (WR), eastern Ontario (ER), central (CR), north eastern (NER), and north western Ontario (NWR). The RWIS cameras typically take pictures at three different angles: left, middle, and right, as shown in Table 1. In this experiment, each image is first manually classified (ground truth) according to the four-class classification scheme shown in Table 1 and then used to train and test the image recognition model. The four classes are bare pavement (BP), in which the road surface including lanes and shoulders is completely clear; partly snow covered (PSC); fully snow covered (FSC); and not recognizable (the image quality is too poor to be used for classifying road surface conditions). It should be noted that this RSC classification system follows the Canadian standard for reporting winter highway conditions as proposed by the Transportation Association of Canada.
It is typically used to convey RSC information about maintenance routes to the general public. Sample images for each category are provided in Table 1.

The model is implemented using open-source software available online. All experiments are conducted on a Windows 10 machine with a GTX1070 GPU and 16 GB of memory. Four experiments are designed to evaluate the performance and transferability of the trained prototype under different conditions. The first assesses the overall performance of the ResNet50-based model as compared to other models using data from all 60 cameras. The second experiment focuses on the transferability of the model, with images from some cameras being used for training and those from the remaining cameras being held out for validation. In the third experiment, based on the results of the second experiment, two tests are designed to analyze the relationship between model performance, the number of cameras used, and data size. Finally, the fourth experiment tests the effectiveness of a model updating process based on a sequential training and fine-tuning approach.

4.2. Performance assessment

In this experiment, the performance of ResNet50 is evaluated in comparison to some other well-known deep neural network models, including VGG16 (Simonyan and Zisserman 2014), InceptionV3 (Szegedy et al. 2016), and Xception (Chollet 2016). As the convolutional layers in each of these models are pre-trained and specified, we are only able to modify the fully connected layers and only alter the number of units (or nodes) in those layers. Although structural changes are important, the effectiveness of the re-training process depends heavily on how the training process is configured. Model training is divided into "epochs", each of which represents the weight updates conducted during a single pass through the training images. The size of these weight updates is controlled by the learning rate, and a different rate can be specified for each of the pre-training and fine-tuning stages. As mentioned previously, some layers can also be frozen, preventing weight updates on their nodes. Based on a sensitivity analysis conducted in our previous effort (Pan et al. 2018, 2019), we determined that using two fully connected layers each containing 1000 neurons, a learning rate of 0.001, a fine-tuning learning rate of 0.0005, and restricting the fine-tuning to the last 48 convolutional layers provided the best performance. These settings are adopted in this research.

The data were downloaded from Ontario's RWIS system and include a total of 24 779 images, of which 80% (19 864 images) are randomly chosen for training and the remaining 20% (4915 images) for testing. The results in Fig. 2 and Table 2 show the evolving performance of the training and fine-tuning process and the final performance of the four models. Accuracy is used to evaluate the performance of the model; it is defined as the ratio of the number of correct predictions to the total number of testing images. ResNet50 had the best final performance, starting with an accuracy of 62.17% that reached 81.4% after 20 training epochs. Its final accuracy reached 95.18% after fine-tuning. Additional results are shown in Table 2. However, the difference in testing accuracy between the models is relatively small. In a previous study (Linton and Fu 2016), additional machine learning models were tested on an in-vehicle image classification problem with three classes. The accuracy of the traditional artificial neural networks was 83.6%, random tree 85.3%, random forest 85.4%, and traditional convolutional neural networks only 84.8%. ResNet50's high accuracy is in line with the findings of another recent study, which showed it was the most robust model with the lowest variance (Pan et al. 2019).

Table 1. Definition of different types of snow coverage.
Four-class description                              Description
Bare                                                At least 3 m of the pavement cross-section in all lanes clear of snow or ice.
Partly snow covered                                 Only part of wheel path is clear of snow or ice.
Fully snow covered (more than 90% snow coverage)    No wheel path clear of snow or ice.
Not recognizable                                    Not recognizable because the image is too dark, too bright, or too blurry.

As a comprehensive comparison, a confusion matrix using ResNet (shown in Table 3) is also provided to show different aspects of the model performance.

Fig. 2. Training (first 20 epochs) and fine-tuning (last 20 epochs) using four pre-trained models. [Colour online.]

Table 2. Model comparisons in training and testing.

Model             Training time/epoch (s)   Fine-tuning time/epoch (s)   Training accuracy   Testing accuracy
VGG16             220                       332                          95.28%              93.08%
Xception          220                       438                          96.30%              94.67%
Inception-V3      215                       260                          96.05%              94.93%
ResNet50          211                       320                          97.53%              95.18%
Testing samples   /                         /                                                /

Table 3. Confusion matrix on ResNet model testing performance.

           Predicted Class 1   Predicted Class 2   Predicted Class 3   Predicted Class 4   True positives   False negatives
Class 1    95.33%              1.50%               0.28%               1.95%               95.33%           4.67%
Class 2    3.57%               96.14%              15.41%              0.39%               96.14%           3.86%
Class 3    0.00%               2.15%               82.91%              0.78%               82.91%           17.09%
Class 4    1.10%               0.21%               1.40%               96.88%              96.88%           3.12%

4.3. Effect of camera mix on model transferability

This experiment is designed to assess the transferability of the proposed model. We start by training the ResNet50-based RSC model using images from a varying number of cameras and then test its performance using data from five holdout cameras (CR-01, CR-21, ER-26, NER-21, and WR-11; see Table 4). The experiment starts with images from one camera (CR-04), which is a Highway 400 camera near Bradford in central Ontario. The road at this site has two directions with three lanes in each direction. The five cameras in the testing set have similar angles and snow-covered areas, but differ somewhat in roadway curvature, lane count, weather conditions, etc. The results are shown in Table 4, with the last column providing the accuracy for each camera's image set.

Table 4. Results of transferability testing.

           Site name   Testing accuracy
Training   CR-04       96.42%
Testing    CR-01       100%
Testing    CR-21       87.23%
Testing    ER-26       99.01%
Testing    NER-21      92.15%
Testing    WR-11       92.74%

4.4. Effect of training data size

In this experiment, we further analyze the effect of data size on the performance of the proposed model.

(1) Accuracy test: increasing mixed cameras and training data

In this experiment, the performance of ResNet50 on a fixed traffic camera with different training dataset sizes is evaluated. First, a subset of data of a specific size is randomly drawn from the training data set (e.g., 10, 35, or 60 of the 60 training cameras) and then used to train and fine-tune the model. The selected dataset is split into two subsets: a training set and a testing set. The training set includes 80% of the data, while the testing set includes the remaining 20%. The trained model is subsequently used to classify the testing data set. The training and fine-tuning epochs are set to 15 in the 10- and 35-camera experiments and to 20 in the 60-camera experiment. As shown in Table 5, the classification accuracy of the model is 92.14% at low data sizes but increases quickly as the data size increases. The improvement trend suggests that the model could reach a higher level (over 95%) of classification accuracy if more training data were available.

Table 5. Testing result with increased training cameras.

Case                               Training size   Training accuracy   Testing accuracy   Variance
Case 1: Train on 10, test on 10    3439            95.79%              92.14%             0.0109
Case 2: Train on 35, test on 35    10 975          97.69%              93.63%             0.0050
Case 3: Train on 60, test on 60    19 864          97.53%              95.18%             0.0027

Table 6. Transferability performance versus data size.
Test site        Train 10 cameras   Train 20 cameras   Train 30 cameras   Train 40 cameras   Train 55 cameras
CR-01            74.07%             78.57%             76.45%             77.77%             80.15%
CR-21            68.89%             79.95%             82.25%             78.80%             82.48%
ER-26            73.89%             78.81%             83.25%             78.57%             81.77%
NER-21           59.92%             74.31%             75.48%             75.09%             75.48%
WR-11            81.67%             78.62%             76.33%             70.22%             73.28%
Total accuracy   71.79%             78.35%             79.33%             76.68%             79.39%

Table 7. Results of sequential training.

Test site   Without re-training   Re-trained on one camera   Re-trained using images from all new cameras but with only one class of RSC (only bare pavement)   Re-trained using data from all new cameras with all RSC classes
CR-01       84.12%                92.83%                     83.95%                                                                                             94.19%
CR-21       82.25%                85.79%                     86.09%                                                                                             87.27%
ER-26       80.04%                90.41%                     79.94%                                                                                             92.21%
NER-21      68.87%                82.69%                     71.15%                                                                                             80.76%
WR-11       84.73%                87.68%                     80.78%                                                                                             88.66%
Accuracy    80.54%                88.29%                     81.16%                                                                                             89.23%

Fig. 3. Improvements after re-finetuning. [Colour online.]

(2) Transferability test: mixed cameras vs. training data

To examine the effect of camera mix on model performance, ResNet50 is trained on a varying number of cameras while the number of training images is fixed. This experiment tests the transferability of the ResNet model as related to the variety of the camera mix. The same five cameras used in the camera-mix transferability experiment of Section 4.3 (CR-01, CR-21, ER-26, NER-21, and WR-11) are used to test model transferability. Images are divided into five different training groups, which are created by randomly drawing a subset of images from specific cameras among the remaining sets (e.g., 10/55, 20/55, 30/55, 40/55, and 55/55 of the remaining training cameras). To assess the importance of the number of cameras and data size on the testing performance, each group contains the same number of images regardless of camera count (Table 6).
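The group construction described above, with a varying number of cameras but a fixed number of training images, might be sketched as follows. This is only an illustration under stated assumptions: the paper does not say how images are drawn within the chosen cameras, so the sketch samples uniformly from the pooled images, and the names `build_group`, `build_groups`, and `images_by_camera` are hypothetical.

```python
import random


def build_group(images_by_camera, n_cameras, n_images, seed=0):
    """Draw n_images in total from n_cameras randomly chosen cameras.

    images_by_camera maps a camera id to a list of image identifiers.
    Uniform sampling over the pooled images is an assumption of this sketch.
    """
    rng = random.Random(seed)
    cameras = rng.sample(sorted(images_by_camera), n_cameras)
    pool = [(cam, img) for cam in cameras for img in images_by_camera[cam]]
    if n_images > len(pool):
        raise ValueError("not enough images in the selected cameras")
    return rng.sample(pool, n_images)


def build_groups(images_by_camera, camera_counts, n_images, seed=0):
    """Build one fixed-size training group per requested camera count."""
    return {
        count: build_group(images_by_camera, count, n_images, seed + count)
        for count in camera_counts
    }
```

Calling `build_groups(images_by_camera, [10, 20, 30, 40, 55], n_images)` would give five groups of equal size, so that differences in testing accuracy can be attributed to the camera mix and not to the amount of training data.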
The ResNet50 model is then trained and fine-tuned separately using the previously chosen data, and the trained models are subsequently used to classify the testing data set. The training and fine-tuning epochs are set at 15 for each group due to the small sample size of the training dataset. Table 6 shows the trends in accuracy and correct classifications of the five trained ResNet50 models as the number of cameras used in training increases while the training sample size is kept the same. The trained models all perform identically to previous results, as all the cameras, samples, and model parameters are set up the same way. As the number of cameras used in training increases, the transferability accuracy rises at first to an average of approximately 78% and drops slightly before finally stopping at 79.39% when 55 cameras are used. Looking at the results for individual testing cameras, most of them perform better when more cameras are used for training, which suggests that the more cameras are used, the more useful features are captured.

4.5. Sequential training and automatic updating

In this section, we experiment with a fine-tuning scheme to quickly retrain the model using image data from new cameras. A sample-based approach is introduced, with the idea of using a few labelled images from newly installed cameras to update the model. The ResNet50 model is first trained with data from all the cameras (except the five cameras held out for testing) and then re-trained using a few new images drawn from the new cameras (the five selected cameras). In this experiment, the re-training set includes only 20% of the data while the testing set includes the remaining 80%. We explore three re-training methods, in which the re-training data are (1) from each of the new cameras added, (2) from a mixed dataset of the five cameras, or (3) images of the bare pavement category only (image samples in the snow-covered state may sometimes not be immediately available). Table 7 and Fig. 3 compare the results of the different re-training methods. Large improvements are achieved when re-training on one extra camera and on mixed cameras, especially with CR-01, ER-26, and NER-21. Significant improvements are also achieved for CR-21 and WR-11. A performance degradation was also observed when testing NER-21 using mixed data compared with using only one specific camera (Fig. 3). Overall, the prototype model shows a 2% to 3% improvement when the model is re-trained with images of only the bare pavement condition (in which ER-26 and NER-21 show a decrease in accuracy), and an 8% to 9% improvement in all conditions after re-training with very few epochs; the results become much better after updating (the retraining and updating process is shown in Fig. 1).

5. Conclusions

In this paper, we have discussed the results of an extensive investigation into applying a pre-trained deep convolutional neural network called ResNet to winter road surface condition (RSC) recognition. A set of customized models is trained with our problem-specific training data, an RSC image dataset. By analyzing the model parameters and sensitivity, the best model and structure are identified. The results show that the proposed solution reduces the need for large training datasets and reduces computational time while maintaining high accuracy. Meanwhile, the model is shown to be highly transferable and can adapt to new tasks with only minor tuning using new data. This finding suggests that, compared to the traditional approach of developing several local models separately, the proposed method of developing a single model using ResNet50 has the advantage of significantly reducing the training effort. Furthermore, the automatic updating technique is shown to have the flexibility to make use of newly available data for improving model performance, which would otherwise be a time-consuming process.
Acknowledgement This research is supported by Natural Sciences and Engineering Research Council of Canada (NSERC), Ontario Research Fund – Research Excellence (ORF-RE), and the Ministry of Transportation Ontario (MTO) through its Highway Infrastructure Innovation Funding Program (HIIFP). References Basso, F., Basso, L.J., Bravo, F., and Pezoa, R. 2018. Real-time crash prediction in an urban expressway using disaggregated data. Transportation Research Part C: Emerging Technologies, 86: 202–219. doi:10.1016/j.trc.2017.11.014. Breiman, L. 2001. Random forests. Machine Learning, 45: 5–32. doi:10.1023/ A:1010933404324. Buchanan, F., and Gwartz, S.E. 2005. Road weather information systems at the Ministry of Transportation, Ontario. In Proceedings of the 2005 Annual Conference of the Transportation Association of Canada, Calgary, Alta. Chen-McCaig, Z., Hoseinnezhad, R., and Hadiasha, A. 2017. Convolutional neural networks for texture recognition using transfer learning. In Proceedings of the 2017 International Conference on Control, Automation and Information Sciences (ICCAIS), Chiang Mai, Thailand, 31 October–1 November 2007. IEEE. pp. 187–192. doi:10.1109/ICCAIS.2017.8217573. Chollet, F. 2016. Xception: deep learning with depthwise separable convolutions. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 21–26 July 2017. pp. 1800– 1807. doi:10.1109/CVPR.2017.195. Glorot, X., and Bengio, Y. 2010. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS), Chia Laguna Resort, Sardinia, Italy. pp. 249–256. Göring, C., Freytag, A., and Rodner, E. 2013. Fine-grained categorization – short summary of our entry for the ImageNet Challenge 2012. In Proceedings of the Conference on Computer Vision and Pattern Recognition. arXiv:1310.4759 [cs.CV]. Gu, L., Kwon, T.J., and Qiu, Z.J. 2019. 
A geostatistical approach to winter road surface condition estimation using mobile RWIS data. Canadian Journal of Civil Engineering, 46(6): 511–521. doi:10.1139/cjce-2018-0341.
He, K., and Sun, J. 2015. Convolutional neural networks at constrained time cost. In Proceedings of the Conference on Computer Vision and Pattern Recognition. pp. 5353–5360. arXiv:1412.1710 [cs.CV].
He, K., Zhang, X., and Ren, S. 2016a. Identity mappings in deep residual networks. In European Conference on Computer Vision. Cham, pp. 630–645.
He, K., Zhang, X., Ren, S., and Sun, J. 2016b. Deep residual learning for image recognition. In Proceedings of the Conference on Computer Vision and Pattern Recognition. pp. 770–778. arXiv:1512.03385 [cs.CV].
Hong, L., Lin, J., and Feng, Y. 2009. Road surface condition recognition method based on color models. In Proceedings of the 2009 First International Workshop on Database Technology and Applications, Wuhan, China, 25–26 April 2009. pp. 61–63.
Howard, A.G. 2014. Some improvements on deep convolutional neural network based image classification. In Proceedings of the Conference on Computer Vision and Pattern Recognition. arXiv:1312.5402 [cs.CV].
Jonsson, P., Vaa, T., Dobslaw, F., and Thörnberg, B. 2015. Road condition imaging – model development. In Proceedings of the Transportation Research Board 94th Annual Meeting, Washington, DC, 11–15 January 2015. The National Academies of Sciences, Engineering, and Medicine, Washington, DC.
Krizhevsky, A., Sutskever, I., and Hinton, G. 2012. ImageNet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems 25 (NIPS 2012). Curran Associates Inc. pp. 1097–1105.
Kwon, T.J., Fu, L., and Jiang, C. 2015. Road weather information system stations – where and how many to install: a cost benefit analysis approach. Canadian Journal of Civil Engineering, 42(1): 57–66. doi:10.1139/cjce-2013-0569.
Kwon, T.J., Fu, L., and Melles, S.J. 2017.
Location optimization of road weather information system (RWIS) network considering the needs of winter road maintenance and the traveling public. Computer-Aided Civil and Infrastructure Engineering, 32(1): 57–71. doi:10.1111/mice.12222.
LeCun, Y., Bengio, Y., and Hinton, G. 2015. Deep learning. Nature, 521(7553): 436–444. doi:10.1038/nature14539.
Linton, M.A., and Fu, L. 2015. Winter road surface condition monitoring: field evaluation of a smartphone-based system. Transportation Research Record: Journal of the Transportation Research Board, 2482: 46–56. doi:10.3141/2482-07.
Linton, M.A., and Fu, L. 2016. Connected vehicle solution for winter road surface condition monitoring. Transportation Research Record: Journal of the Transportation Research Board, 2551: 62–72. doi:10.3141/2551-08.
Lippi, M., Bertini, M., and Frasconi, P. 2013. Short-term traffic flow forecasting: an experimental comparison of time-series analysis and supervised learning. IEEE Transactions on Intelligent Transportation Systems, 14(2): 871–882. doi:10.1109/TITS.2013.2247040.
Liu, M., Wang, M., Wang, J., and Li, D. 2013. Comparison of random forest, support vector machine and back propagation neural network for electronic tongue data classification: application to the recognition of orange beverage and Chinese vinegar. Sensors and Actuators B: Chemical, 177: 970–980. doi:10.1016/j.snb.2012.11.071.
Lv, Y., Duan, Y., and Kang, W. 2015. Traffic flow prediction with big data: a deep learning approach. IEEE Transactions on Intelligent Transportation Systems, 16(2): 865–873. doi:10.1109/TITS.2014.2345663.
Marsh, W., and Bearfield, G. 2004. Using Bayesian networks to model accident causation in the UK railway industry. In Probabilistic Safety Assessment and Management. Springer, London. pp. 3597–3602.
Mondal, M., Mondal, P., Saha, N., and Chattopadhyay, P. 2017. Automatic number plate recognition using CNN based self-synthesized feature learning.
In Proceedings of the 2017 IEEE Calcutta Conference (CALCON), Kolkata, India, 2–3 December 2017. pp. 378–381. doi:10.1109/CALCON.2017.8280759.
Omer, R., and Fu, L. 2010. An automatic image recognition system for winter road surface condition classification. In Proceedings of the 13th International IEEE Conference on Intelligent Systems, Funchal, Portugal, 19–22 September 2010. pp. 19–22. doi:10.1109/ITSC.2010.5625290.
Pan, J., and Yang, Q. 2010. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22: 1345–1359. doi:10.1109/TKDE.2009.191.
Pan, G., Fu, L., Yu, R., and Muresan, M. 2018. Winter road surface condition recognition using a pre-trained deep convolutional neural network. In Proceedings of the Transportation Research Board 97th Annual Meeting. Washington, DC. arXiv:1812.06858 [eess.IV].
Pan, G., Fu, L., Yu, R., and Muresan, M. 2019. Evaluation of alternative pretrained convolutional neural networks for winter road surface condition monitoring. In Proceedings of the 2019 5th International Conference on Transportation Information and Safety (ICTIS), Liverpool, UK, 14–17 July 2019. IEEE. pp. 614–620. doi:10.1109/ICTIS.2019.8883540.
Qiao, J., Pan, G., and Han, H. 2019. A regularization-reinforced DBN for digital recognition. Natural Computing, 18(4): 721–733. doi:10.1007/s11047-016-9597-7.
Sermanet, P., Eigen, D., and Zhang, X. 2014. OverFeat: integrated recognition, localization and detection using convolutional networks. In Proceedings of the Conference on Computer Vision and Pattern Recognition. arXiv:1312.6229 [cs.CV].
Simonyan, K., and Zisserman, A. 2014. Very deep convolutional networks for large-scale image recognition. In Proceedings of the Conference on Computer Vision and Pattern Recognition. arXiv:1409.1556 [cs.CV].
Soon, F.C., Khaw, H.Y., Chuah, J.H., and Kanesan, J. 2018. Hyper-parameters optimisation of deep CNN architecture for vehicle logo recognition. IET Intelligent Transport Systems, 12(8): 939–946.
doi:10.1049/iet-its.2018.5127.
Szegedy, C., Vanhoucke, V., and Ioffe, S. 2016. Rethinking the inception architecture for computer vision. In Proceedings of the Conference on Computer Vision and Pattern Recognition. pp. 2818–2826. arXiv:1512.00567 [cs.CV].
Veit, A., Wilber, M., and Belongie, S. 2016. Residual networks behave like ensembles of relatively shallow networks. In Proceedings of the Conference on Computer Vision and Pattern Recognition. pp. 550–558. arXiv:1605.06431 [cs.CV].
Xue, Y., and Li, Y. 2018. A fast detection method via region-based fully convolutional neural networks for shield tunnel lining defects. Computer-Aided Civil and Infrastructure Engineering, 33(8): 638–654. doi:10.1111/mice.12367.
Yi, Z., and Shirk, M. 2018. Data-driven optimal charging decision making for connected and automated electric vehicles: a personal usage scenario. Transportation Research Part C: Emerging Technologies, 86: 37–58. doi:10.1016/j.trc.2017.10.014.
Zhang, A., Wang, K.C.P., Li, B., Yang, E., Dai, X., Peng, Y., et al. 2017. Automated pixel-level pavement crack detection on 3D asphalt surfaces using a deep-learning network. Computer-Aided Civil and Infrastructure Engineering, 32(10): 805–819. doi:10.1111/mice.12297.
Zhang, Z., He, Q., Gao, J., and Ni, M. 2018. A deep learning approach for detecting traffic accidents from social media data. Transportation Research Part C: Emerging Technologies, 86: 580–596. doi:10.1016/j.trc.2017.11.027.
Zheng, Q., Yang, M., Yang, J., Zhang, Q., and Zhang, X. 2018. Improvement of generalization ability of deep CNN via implicit regularization in two-stage training process. IEEE Access, 6: 15844–15869. doi:10.1109/ACCESS.2018.2810849.
Published by Canadian Science Publishing