Deep Federated Learning for IoT-based Decentralized Healthcare Systems

Haya Elayan1, Moayad Aloqaily2, Mohsen Guizani2
1 xAnalytics Inc., Ottawa, ON, Canada. 2 Qatar University, Doha, Qatar.
E-mails: h.elayan@xanalytics.ca, maloqaily@ieee.org, mguizani@ieee.org

2021 International Wireless Communications and Mobile Computing (IWCMC) | 978-1-7281-8616-0/21/$31.00 ©2021 IEEE | DOI: 10.1109/IWCMC51323.2021.9498820

Abstract: Recent trends in the healthcare industry, such as the use of wearable IoT for continuous health monitoring, are setting new requirements for healthcare systems that boost data analysis. Because healthcare data is sensitive, these systems should support decentralization and maintain the privacy and ownership of users' data. Federated learning techniques are therefore recommended for systems with such requirements. This paper proposes a Deep Federated Learning framework for decentralized healthcare systems that maintains user privacy in a distributed architecture. It also proposes an algorithm for an automated training data acquisition process. Furthermore, it presents an experiment that uses deep federated learning to detect skin diseases and uses Transfer Learning to address the limited availability of healthcare data for building deep learning models. The evaluation results show that federated learning increased the Area Under the Curve (AUC) of the centrally trained model to as much as 0.97, and that the model performed well during the federated rounds in terms of accuracy, precision, recall, and F1-score. Moreover, although the FL system affected the quality of service to the user in terms of model conversion time, it meets the requirement of building models in a decentralized manner without sharing users' private data.

Index Terms: Deep Federated Learning, Healthcare, IoT, Transfer Learning, Distributed systems, Privacy.

I. INTRODUCTION

Over the past few years, traditional technologies have become insufficient for building privacy-preserving applications as more and more records are exposed, especially with the spread of internet-connected devices: the global average of mobile internet data traffic is expected to reach 48,72 PB per month in 2021 [1]. This is especially true for industrial healthcare applications, since the healthcare industry has the highest average cost of data breaches worldwide [2]. Therefore, it is highly recommended to use technologies that support decentralization and maintain data ownership to achieve more privacy.

Federated learning (FL) is a machine learning technique used to train machine learning models collaboratively in a decentralized manner on multiple devices or servers, thus maintaining data privacy and data ownership for the device or server owner [3]. FL is hugely beneficial for highly decentralized healthcare data, especially given the increasing popularity of IoT devices for capturing data and monitoring health continuously; the healthcare IoT market is expected to reach $158 billion in 2022 according to Deloitte estimates [4]. It is also well suited to this data because users' health information is very sensitive to disclose [5].

FL trains machine learning models locally without sharing data, unlike centralized learning techniques, where the training process happens in a centralized unit. Instead, local models share their updates after each local training process, and those updates are aggregated to train a global model. Since FL relies on on-device training, which means less human interaction with the local model or the captured data, it becomes essential to use advanced machine learning algorithms such as Deep Learning to handle this process.
Deep Federated Learning (DFL) uses deep neural networks with FL to train models, as deep learning can build robust models and reduce feature engineering. This is helpful for healthcare data, which is large and ever-changing and would otherwise require feature engineering techniques to extract the most useful features for building models. Healthcare data is also often unavailable or limited in sample size [6], so other machine learning techniques, such as transfer learning, are needed to build high-performance models. Transfer Learning is a method that reuses the knowledge gained while training a model on one problem to solve a related problem with another model. Deep learning has shown its effectiveness in various healthcare applications, especially in computer vision for medical imaging, such as the detection of diabetic retinopathy in fundus images [7] and the classification of skin cancer images [8].

Integrating DFL with IoT will help build robust privacy-preserving applications in a decentralized manner, but it also has to meet a series of challenges, as IoT devices still face some AI reliability and compatibility issues. However, digitization and AI-market growth will boost the IoT industry [9].

This article integrates these technologies and applies various techniques for analyzing healthcare data that aid in patients' health monitoring. The contributions of this article can be summarized as follows:
• Propose a DFL framework for healthcare data analysis using IoT,
• Introduce an algorithm for DFL process automation that handles the training data acquisition stage,
• Use Transfer Learning for building a skin-disease detection model,
• Implement an experiment for skin-disease detection using Deep Federated Learning.

The remainder of this paper is organized as follows. Section II discusses the related work. Section III describes the proposed DFL framework for decentralized healthcare systems. Section IV proposes the data acquiring algorithm.
Section V shows the system implementation and setup. Section VI discusses the evaluation metrics and results. Lastly, Section VII presents the conclusion of this paper.

II. RELATED WORK

In integrating FL with IoT in healthcare, the authors of [10], [11], and [12] introduced FL for Human Activity Recognition (HAR) for remote healthcare monitoring by proposing FL systems using IoT devices. The authors of [13] integrated FL with digital twins for IoT to preserve users' privacy. Sun et al. [14] proposed a framework for improving the learning efficiency of IoT digital twins using FL. In the healthcare monitoring context, the authors of [15] presented a digital twin framework using IoT for healthcare monitoring without adding FL to the process. They tested their work with several algorithms on ECG data, and the results showed that deep learning performed best among all implemented algorithms.

Others, such as [16] and [17], proposed dedicated FL frameworks for detecting COVID-19 infection using X-ray images. The latter applied transfer learning to several pre-trained models; based on the results, ResNet18 showed the best performance. Rahman et al. [18] proposed an FL framework for health IoT by adding an edge layer for performing deep learning tasks and blockchain for more security and trustworthiness. Likewise, the authors of [19] introduced a mechanism for sharing industrial IoT data using FL and blockchain. They evaluated their proposed work on a text categorization dataset.

Also in the healthcare domain, the authors of [20]–[23] proposed FL approaches for EHRs, and all of them have shown promising results in this area. The first performed a clustering technique to create community-based data that had clinical meaning. The results showed that the clustering-based FL model exceeded the performance of the normal FL model.

Among contributions in other industries, Yu et al. [24] proposed an abnormal-weights-clipping FL algorithm based on the Federated Average technique to improve its performance and applied it to a general object detection dataset. In another study [25], the authors proposed an FL framework and applied it to vehicle image classification. Finally, the author of [26] proposed an FL classifier for dog and cat images; the FL model outperformed the centralized one. A summary of the related work can be found in Table I.

TABLE I
RELATED WORK SUMMARY

Paper   Use FL   Use IoT   Implemented Healthcare Application
[10]    Yes      Yes       HAR
[11]    Yes      Yes       HAR
[12]    Yes      Yes       HAR
[14]    Yes      Yes       -
[15]    No       Yes       ECG Classification
[16]    Yes      No        X-ray Classification
[17]    Yes      No        X-ray Classification
[19]    Yes      Yes       -
[20]    Yes      No        EHR
[21]    Yes      No        EHR
[22]    Yes      No        EHR
[24]    Yes      No        -
[25]    Yes      Yes       -
[26]    Yes      No        -

III. A FRAMEWORK: DFL FOR IOT-BASED DECENTRALIZED HEALTHCARE SYSTEMS

In order to build healthcare systems that support decentralization and maintain user privacy at the same time, FL and IoT technologies will be employed in one framework [27]. Also, deep learning will be used to build robust, high-performance smart models that require less feature engineering. This framework will keep the users' healthcare data on the IoT devices that captured it. Moreover, it will use this data to train a machine learning model locally. Thus, the data will be preserved on the devices and only accessible by the data owners themselves.

Figure 1 shows an overview of the framework and how the technologies collaborate with each other. The IoT devices will capture users' data and train a local deep learning model, which is a copy of a previously received global model. After completing the local training process, the models will work collaboratively on training a global model using their updates rather than the users' raw data. These model updates are the changes to the models' weights during the training process, and they do not reflect any private or personal data of the users.

Fig. 1. Framework Overview (IoT devices perform local training of the federated local models and send their updates to an AI-ready cloud server, which performs global training and returns the global model to the devices)

All participating models will send their updates to an AI-ready cloud server, where these updates will be aggregated to train the global model. Once the global model training process is over, each device will receive a new copy of the updated global model. Accordingly, the models will be trained and updated constantly without sharing any private data. Thus, the framework will support an IoT-based decentralized architecture in which models are distributed on IoT devices without requiring a centralized server to run the model and serve the users. Moreover, it will maintain privacy by processing and analyzing users' data on the IoT devices without sharing it.

IV. TRAINING DATA ACQUISITION ALGORITHM

Automating a DFL system is a huge task, as this type of system deals with a decentralized architecture in which each participant must perform its work independently, from data acquisition to using the model for prediction at its best. This section proposes an algorithm to handle the data acquisition stage.

According to Algorithm 1, a user is considered a participant in the DFL process after using the previously shared global model for the detection of skin diseases. When the user captures an image and uses the model for prediction, the user will be asked to rate the quality of the prediction on a scale from 1 to 5. If the rating is higher than 2, the image and its predicted label will be treated as a training sample and used in the local training process. This provides training samples for the local training process and also makes it possible to check the quality of the model's predictions over the long term.
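The acquisition rule described above can be illustrated with a short Python sketch. This is only a minimal illustration under assumptions, not the authors' implementation: the class name `LocalTrainingBuffer`, the `ask_rating` callback, and the example labels are hypothetical, and the model is represented as any callable that maps an image to a label. Only the 1 to 5 rating scale and the "higher than 2" threshold come from the text.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple

RATING_THRESHOLD = 2  # from the text: keep the sample if the rating is higher than 2


@dataclass
class LocalTrainingBuffer:
    """Hypothetical on-device store of (image, predicted label) training samples."""
    samples: List[Tuple[Any, str]] = field(default_factory=list)

    def acquire(self,
                model: Callable[[Any], str],
                img: Any,
                ask_rating: Callable[[str], int]) -> bool:
        """Run one data acquisition round; return True if the sample was kept."""
        label = model(img)           # prediction from the local copy of the global model
        rating = ask_rating(label)   # user rates the prediction on a 1-5 scale
        if not 1 <= rating <= 5:
            raise ValueError("rating must be between 1 and 5")
        if rating > RATING_THRESHOLD:
            self.samples.append((img, label))  # accepted as a local training sample
            return True
        return False                 # rejected: never used for local training


# Usage with stand-in callables (the labels are made up for illustration).
buffer = LocalTrainingBuffer()
buffer.acquire(lambda img: "eczema", "image-1", lambda label: 4)  # kept
buffer.acquire(lambda img: "acne", "image-2", lambda label: 2)    # discarded
```

Passing the rating as a callback rather than a plain number reflects the order of events in the text: the user can only rate a prediction after it has been shown.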
Algorithm 1: Training Data Acquiring
    TS: (img, label)
    Input: F(), img
    Output: TS
    function DataAcquisitionRound(F(img))
        Prediction label = F(img)
        if (Rating of label > 2) then
            TS ← (img, label)
        end
        return TS
    end

Fig. 2. Implementation Process (the skin diseases dataset and Transfer Learning produce the original model; in each federated round FR1 to FR4, devices 1 to n send their updates, yielding Model R1 to Model R4)

V. SYSTEM IMPLEMENTATION AND SETUP

The implemented experiment is a skin-disease detection system that works on images captured by devices' cameras. A deep learning model was therefore built to detect skin diseases using transfer learning; to support privacy and decentralization, the federated learning technique was then applied to the same model for four training rounds to update it. Figure 2 illustrates the implementation process in detail.

A. Dataset

The Atlas Dermatology dataset [28] was used to build the first global model. The dataset contains ≈10,000 images for 361 classes. The images were fed into the model with a shape of 224×224×3. Most of the data (≈90%) was used as the training dataset and the rest was used as the testing dataset.

B. Original Model and Transfer Learning

Since the dataset was insufficient to build a robust deep learning model, the dataset used was doubled to increase the number of samples. Also, the Transfer Learning technique was used to obtain good model performance. In the Transfer Learning phase, one of the Keras Applications deep learning models was used: the ResNet50 model was fine-tuned on the dermatology dataset for 50 epochs using the Adam optimizer and a learning rate of 0.0001.

C. Federated Learning

As mentioned in Section III, a copy of the global model will be sent to each participating device in the process; this is where the FL process begins.
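To make the round structure concrete (broadcast the global weights, train locally on each device, average the resulting weights, as outlined in Section III), the following is a minimal pure-Python sketch of one federated averaging round. It is an illustration under assumptions and not the code used in the experiment: model weights are simplified to a flat list of floats, the function names `federated_average`, `run_round`, and `toy_local_train` are hypothetical, and the local training step is a toy stand-in. The sketch uses an unweighted mean, which coincides with the sample-weighted federated average whenever every client holds the same number of samples.

```python
from typing import Callable, List, Sequence

Weights = List[float]  # simplification: a model is a flat list of weights


def federated_average(client_weights: Sequence[Weights]) -> Weights:
    """Element-wise unweighted mean of the clients' weight vectors."""
    if not client_weights:
        raise ValueError("at least one client is required")
    n = len(client_weights)
    return [sum(column) / n for column in zip(*client_weights)]


def run_round(global_weights: Weights,
              client_data: Sequence[float],
              local_train: Callable[[Weights, float], Weights]) -> Weights:
    """One federated round: broadcast, local training, aggregation."""
    # Each client trains on its own copy of the global weights; only the
    # resulting weights leave the client, never the data itself.
    local_models = [local_train(list(global_weights), data) for data in client_data]
    return federated_average(local_models)


def toy_local_train(weights: Weights, target: float) -> Weights:
    """Toy stand-in for on-device training: move each weight halfway to a target."""
    return [w + 0.5 * (target - w) for w in weights]


# Two hypothetical clients whose private data is summarised by one number each.
new_global = run_round([0.0, 0.0], [1.0, 3.0], toy_local_train)
print(new_global)
```

In a real system the server would call `run_round` once per federated round and feed the returned weights back in as the next global model.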
The FL process was conducted for 4 rounds; each round involved 50 users with 2 samples per user. Initially, the original skin-disease detection model is used as the global model and its weights are broadcast to all users. Then the following steps are repeated in each federated round:
• After receiving the global model weights, a local training process is initiated on each user's data separately, and the local models' weights are updated.
• All models' weights are aggregated and averaged.
• The global model weights are updated using the average of all local models' updates.

This phase was conducted using the TensorFlow Federated framework on a Google Colab TPU with 64 GiB of memory. Also, each federated averaging training round used the Adam optimizer and the default learning rate of 0.001 for both client and server.

VI. EVALUATION AND RESULTS

In this section, we discuss three metrics for evaluation: first the classification report, then the AUC, and finally the QoS.

A. Classification Report Metrics

The classification report is a summary of the main classification metrics that describe model performance: Precision, Recall, F1-Score, and Accuracy. Accuracy is the percentage of samples that the model classifies correctly; it is the number of correctly predicted samples (true positives and true negatives) over all predicted samples. Precision is the ratio of true positives to all samples predicted as positive (true positives plus false positives). Recall is the percentage of actual positives that were identified correctly; it is the number of true positives over the sum of true positives and false negatives. F1-Score is the harmonic mean of the model's precision and recall. It is used for better evaluation in unbalanced dataset scenarios.

Figure 3 illustrates how FL changed the original model's results for the weighted average and macro average of each classification metric. During the four federated rounds, the original model's accuracy decreased by 1% from 85% and then recovered to 85% in the final round R4. The macro averages of precision, recall, and F1-Score decreased during the federated rounds, while the weighted averages of the same metrics varied: precision increased to as much as 87%, recall decreased by 1% from the original model and then recovered in the fourth round, and F1-Score stayed at the same percentage.

Fig. 3. Classification report metrics (accuracy and the macro and weighted averages of precision, recall, and F1 for the original model and rounds R1 to R4)

B. Area Under The Curve (AUC)

AUC is a measurement of the model performance at various threshold settings. The area under the Receiver Operating Characteristic (ROC) curve shows how well the model can distinguish between the data classes. An AUC closer to 1 is better, as it indicates good performance in classifying the data samples. FL improved the original model's AUC, as shown in Figure 4, where it increased to as much as 97.4% during the federated rounds.

Fig. 4. Models' AUC (percent, for the original model and rounds R1 to R4)

C. Quality of Service (QoS)

Building systems that are decentralized and maintain privacy is a critical issue because these two factors may affect the complexity of the system and thus the quality of service. This section evaluates the effect of these factors on the QoS of Federated Learning (FL) systems and Centralized Learning (CL) systems.

Defining the Privacy-Preserving (P) factor as not sharing users' personal data, FL systems satisfy the factor while CL systems do not. Therefore, FL systems are assigned P1 for the P factor because they are privacy-preserving and do not share personal data, unlike CL systems, which are assigned P0 because they require sharing users' data with a single cloud storage.

Also, since FL systems achieve the Decentralization (D) factor while CL systems do not, FL systems are assigned D1, as they support distributed learning across several devices. This is in contrast to CL systems, which are assigned D0 since they require a centralized database and server to store the data and run the model.

In the performed experiment, the FL model increased the conversion time for one sample by 1.08 seconds: it was 0.12 seconds for the original model and 1.20 seconds on average over the four federated rounds. Since the original model was built with a centralized learning technique, the CL model's conversion time was lower than that of the FL model. However, the CL system did not achieve the D and P factors while the FL system did. Figure 5 illustrates the relationship between these factors.

Fig. 5. Conversion Time (CT) vs. Decentralization (D) vs. Privacy-Preserving (P) for FL and CL systems

In conclusion, the use of FL improved the AUC of the original model while maintaining the model's accuracy. Also, the other classification metrics, such as accuracy, recall, and F1-Score, showed good model performance during the federated rounds. Finally, although the FL system has a higher conversion time, which may affect the quality of service for the user, it achieved the decentralization and privacy-preserving factors, which the CL system did not achieve even though its conversion time is lower.

In our future work, we expect to continue investigating the following open issues in this area:
• Testing the capabilities of IoT devices: This is important since there is no minimum standard for running FL models on IoT devices.
• Evaluating the experiment on a larger dataset: We obtained initial results due to the limited availability of the skin diseases dataset, but we need to build and evaluate the model using a much larger dataset.
• Network capacity and latency: We believe this will open new research avenues for both IoT device dropout and data transmission aspects.

VII. CONCLUSION

This paper proposed a Deep Federated Learning framework for decentralized healthcare systems using IoT that does not share private data, thus preserving user privacy. Moreover, an algorithm has been proposed to cover the training data acquisition stage for federated averaging, as a step toward a fully automated federated learning process. Furthermore, the paper described the implementation of an experiment that uses deep federated learning to detect skin diseases. The results indicated a higher AUC for the model after the federated rounds. The model's accuracy was maintained after federated averaging, and the other classification metrics showed good results. Furthermore, the objective of not sharing data in a decentralized architecture was achieved, at the cost of an increased model conversion time.

ACKNOWLEDGEMENT

This work was supported by Qatar University under project No: IRCC [2020-003]. The findings achieved herein are solely the responsibility of the authors.

REFERENCES

[1] Y. Al Mtawa, A. Haque, and B. Bitar, "The mammoth internet: Are we ready?" IEEE Access, 2019.
[2] "How much would a data breach cost your business?" https://www.ibm.com/security/data-breach, accessed: 2021-02-28.
[3] S. Otoum, I. Al Ridhawi, and H. T. Mouftah, "Blockchain-supported federated learning for trustworthy vehicular networks," in GLOBECOM 2020 - 2020 IEEE Global Communications Conference, 2020.
[4] "Medtech and the internet of medical things," https://www2.deloitte.com/global/en/pages/life-sciences-and-healthcare/articles/medtech-internet-of-medical-things.html, accessed: 2021-02-28.
[5] B. D. Deebak, F.
Al-Turjman, M. Aloqaily, and O. Alfandi, "An authentic-based privacy preservation protocol for smart e-healthcare systems in IoT," IEEE Access, 2019.
[6] T. Shaikhina and N. A. Khovanova, "Handling limited datasets with neural networks in medical applications: A small-data approach," Artificial Intelligence in Medicine, 2017.
[7] V. Gulshan, L. Peng, M. Coram, M. C. Stumpe, D. Wu, A. Narayanaswamy, S. Venugopalan, K. Widner, T. Madams, J. Cuadros et al., "Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs," JAMA, 2016.
[8] A. Esteva, B. Kuprel, R. A. Novoa, J. Ko, S. M. Swetter, H. M. Blau, and S. Thrun, "Dermatologist-level classification of skin cancer with deep neural networks," Nature, 2017.
[9] "How intelligent 5G will drive IoT growth in 2021," https://www.pipelinepub.com/tech-trends-2020-2021/5G-and-IoT/2, accessed: 2021-02-28.
[10] X. He, X. Su, Y. Chen, and P. Hui, "Federated learning on wearable devices: demo abstract," in Proceedings of the 18th Conference on Embedded Networked Sensor Systems, 2020, pp. 613–614.
[11] Y. Zhao, H. Haddadi, S. Skillman, S. Enshaeifar, and P. Barnaghi, "Privacy-preserving activity and health monitoring on databox," in Proceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking, 2020, pp. 49–54.
[12] Q. Wu, K. He, and X. Chen, "Personalized federated learning for intelligent IoT applications: A cloud-edge based framework," IEEE Open Journal of the Computer Society, 2020.
[13] Y. Lu et al., "Communication-efficient federated learning for digital twin edge networks in industrial IoT," IEEE Transactions on Industrial Informatics, 2020.
[14] W. Sun, S. Lei, L. Wang, Z. Liu, and Y. Zhang, "Adaptive federated learning and digital twin for industrial internet of things," IEEE Transactions on Industrial Informatics, 2020.
[15] H. Elayan, M. Aloqaily, and M. Guizani, "Digital twin for intelligent context-aware IoT healthcare systems," IEEE Internet of Things Journal, 2021.
[16] W. Zhang, T. Zhou, Q. Lu, X. Wang, C. Zhu, Z. Wang, and F. Wang, "Dynamic fusion based federated learning for COVID-19 detection," arXiv preprint arXiv:2009.10401, 2020.
[17] B. Liu, B. Yan, Y. Zhou, Y. Yang, and Y. Zhang, "Experiments of federated learning for COVID-19 chest X-ray images," arXiv preprint arXiv:2007.05592, 2020.
[18] M. A. Rahman, M. S. Hossain, M. S. Islam, N. A. Alrajeh, and G. Muhammad, "Secure and provenance enhanced internet of health things framework: A blockchain managed federated learning approach," IEEE Access, 2020.
[19] Y. Lu, X. Huang, Y. Dai, S. Maharjan, and Y. Zhang, "Blockchain and federated learning for privacy-preserved data sharing in industrial IoT," IEEE Transactions on Industrial Informatics, 2019.
[20] L. Huang et al., "Patient clustering improves efficiency of federated machine learning to predict mortality and hospital stay time using distributed electronic medical records," Journal of Biomedical Informatics, 2019.
[21] T. S. Brisimi, R. Chen, T. Mela, A. Olshevsky, I. C. Paschalidis, and W. Shi, "Federated learning of predictive models from federated electronic health records," International Journal of Medical Informatics, 2018.
[22] H. Chen, H. Li, G. Xu, Y. Zhang, and X. Luo, "Achieving privacy-preserving federated learning with irrelevant updates over e-health applications," in ICC 2020 - 2020 IEEE International Conference on Communications (ICC), 2020, pp. 1–6.
[23] O. Choudhury, A. Gkoulalas-Divanis, T. Salonidis, I. Sylla, Y. Park, G. Hsu, and A. Das, "Differential privacy-enabled federated learning for sensitive health data," arXiv preprint arXiv:1910.02578, 2019.
[24] P. Yu and Y. Liu, "Federated object detection: Optimizing object detection model with federated learning," in Proceedings of the 3rd Conference on Vision, Image and Signal Processing, 2019.
[25] H. Jiang, M. Liu, B. Yang, Q. Liu, J. Li, and X. Guo, "Customized federated learning for accelerated edge computing with heterogeneous task targets," Computer Networks, 2020.
[26] J. T. Raj, "Building decentralized image classifiers with federated learning," in 2020 IEEE Region 10 Symposium, 2020, pp. 489–494.
[27] I. Al Ridhawi, S. Otoum, M. Aloqaily, and A. Boukerche, "Generalizing AI: Challenges and opportunities for plug and play AI solutions," IEEE Network, 2020.
[28] "Dermatology atlas," http://www.atlasdermatologico.com.br/index.jsf, accessed: 2021-02-28.