Available online at www.sciencedirect.com
ScienceDirect
Procedia Computer Science 192 (2021) 642–649
www.elsevier.com/locate/procedia

25th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems

A Deep Learning Approach to Subject Identification Based on Walking Patterns

Cezara Benegui
Department of Computer Science, University of Bucharest, Romania (e-mail: cezara.benegui@fmi.unibuc.ro)

Abstract

For the time being, smartphone devices rely on direct interaction from the users for unlocking and authentication purposes through explicit authentication systems such as PINs, facial recognition or fingerprint scanning. While different passive two-factor authentication systems based on machine learning were explored in recent work, all of them still require an explicit authentication step. In this study, the focus is to develop and introduce a passive authentication system based on walking patterns. In this scenario, the authentication system continuously authenticates the user in the background, without any further action. To the best of our knowledge, this is the first study in which the data sets are processed with the aim of generating better performing gait-based motion signals. Compared to previously studied work, we employ a processing stage in which we extract tiny frames of data from the motion signals. Our contribution of processing gait data allows for more robust learning of the subject movement and lowers the number of samples required to classify a user thereafter. Hence, our approach is more robust compared to using raw gait signals. Further, we transform the frames into gray-scale images for deep neural network training and feature extraction. The empirical results demonstrate that subjects can be identified with very high accuracy through walking patterns employing the presented techniques, outlining that a system based on gait data can be utilized as a passive authentication system. Therefore, it is concluded that deep neural networks employing the technique described in this work for gait-based feature representation are well suited for continuous and unobtrusive authentication systems.

© 2021 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)
Peer-review under responsibility of the scientific committee of KES International.
DOI: 10.1016/j.procs.2021.08.066

Keywords: motion data; deep neural networks; user identification; continuous authentication

1. Introduction

The user identification task has been explored thoroughly in recent years. With the advancement of technology, more and more techniques are introduced with the aim of solving such tasks. From standard PINs to unlocking patterns and facial recognition systems, user identification requires interaction with a device in all cases.
Nowadays, a great variety of components are included in smartphones: high performance cameras, magnetometers, proximity and motion sensors, to name a few. Recent works [1, 2, 3, 4] proposed unobtrusive authentication systems that rely on motion sensors to identify users through machine learning techniques, such as deep neural networks. Although such systems are developed to improve explicit authentication systems, user interaction is not entirely eliminated.

In this work, we study new methods for continuous unobtrusive background authentication for smartphone devices, based on walking patterns. Using gait data, a subject can be continuously registered and authenticated, without any explicit interaction. This work explores a novel approach that processes gait motion signals and utilizes them in the subject identification task. To the best of our knowledge, we are the first to process walking-based motion signals into tiny data frames and further transform them into gray-scale images. The goal of this work is to demonstrate that our technique of processing walking pattern data can attain substantial results in the subject identification task through machine learning models.

To accomplish our work, we use the data set introduced by Vajdi et al. [5]. The data set consists of motion sensor values and additional meta information collected from 93 subjects that perform two walking sessions between two predefined points. Each recording is performed using two iPhone 6s devices, placed on the left waist and the right thigh of the subject. During each walking session, the motion sensors available on the device, namely the accelerometer and the gyroscope, yield data points on three axes (X, Y, Z). In our experiments, we further process the data set into tiny frames of walking patterns, with the aim of attaining more robust signals. Following the experimental settings described by Benegui et al. [1], the resulting images from our processed data set are fed as inputs to CNN models. Compared to their work, which employs 150 samples of motion data collected while users perform taps on a smartphone screen, our approach utilizes gait data, which is represented by more distinct features: compared to a short tap gesture, the biomechanics of walking yield very different signals. The resulting frames are transformed into gray-scale images. Furthermore, we extract embeddings from the CNN models and pass them to SVM classifiers for the subject identification task. Therefore, each processed tiny frame of gait data can be utilized to identify a subject. We present empirical results for each of the conducted experiments, outlining the performance of subject identification.

The rest of this paper is organized as follows. Related work is discussed in Section 2. The methods and techniques utilized are described in Section 3. The gait-based user identification experiments are detailed in Section 4. Finally, conclusions are drawn in Section 5.

2. Related Work

2.1. Access control systems and protocols

Nowadays, traditional authentication systems available for smartphone devices, such as PIN unlocking, fingerprint scanning and face detection, are prone to well known attacks [6, 7, 8, 9, 10, 11]. With smartphone usage increasing on a global scale, novel authentication systems are required to protect data integrity.
Authentication systems based on the camera PRNU fingerprint [12, 13, 14] or on motion sensors [15, 3, 4, 16] were recently introduced with the aim of offering continuous and unobtrusive user identification. Furthermore, recent research shows that such systems can be included in complex authentication protocols. While one strategy alone might not provide high gains, the literature shows that multiple techniques combined yield large improvements. Zhongjie et al. [12] introduced the ABC authentication protocol, which is based on the smartphone camera fingerprint (PRNU). In addition to the traditional credentials exchange, the protocol takes advantage of passive identifiers in the process. However, the aforementioned protocol requires additional steps to be performed by the user.

Smartphone devices embed a large number of sensors. Of all of them, motion sensors are the best candidates when it comes to the generation of large amounts of passive data. Additionally, natural walking patterns, called gait, are regarded as biometric traits [17, 18]. Therefore, authentication systems based on motion sensors do not require any explicit action from the user, the data being collected in the background. In comparison to multiple other data streams, such information is not usually stored or shared in open contexts like social networks or shared folders. Hence, motion data can be regarded as a secure and private data stream, making it a suitable candidate for user identification tasks.

2.2. User identification based on motion data

Recent studies have shown that user identification based on motion sensors [19, 5, 20, 21, 22] can attain high accuracy rates, therefore making those methods great candidates for continuous or two-factor authentication systems. In the literature, a broad variety of methods have been developed to identify users through different movement patterns. Notable accuracy was attained using different techniques, starting from micro-movements recorded while users perform a signature on a smartphone device [23], to eye movement identification systems [24, 25, 26], to continuous recording of hand movement and user profiling based on accelerometer and gyroscope data [2]. Other studies rely on a combination of multiple sensors, including motion, magnetometer and pressure sensors [22]. While some works gather sensor information from simple movements captured during smartphone use, gait or more complex movements can also be used for subject identification. The best performing recent methods using motion data are based on convolutional neural networks or recurrent neural networks [1, 19, 5]. One of the best performing methods was introduced by Vajdi et al. [5], which obtains an accuracy of up to 99.10%.

3. Method

In this section, we describe our signal recording procedure alongside the data processing technique. Further, we review the deep learning model architectures and the feature extraction process. For each network type, we describe the architecture and the user identification process.

3.1. Signal Recording and Processing

To conduct our experiments, we employ the comprehensive gait database introduced by Vajdi et al. [5]. The data set is composed of accelerometer and gyroscope sensor values, collected from 93 subjects, utilizing two iPhone 6s devices. Each subject carries one device on the right wrist and one on the left waist. The sensor values are collected at a frequency of 100 Hz while the subjects perform two walking sessions over a distance of approximately 320 meters. For the experiments conducted in this work, the accelerometer and gyroscope values are extracted from the database and processed. Our contribution on the data set is described next.

While the biomechanics of walking yields complex data frames, our approach of collecting tiny frames generates more robust data points that put an emphasis on distinctive biometric features of the subject. For each subject, the 6 axes (3 for the accelerometer and 3 for the gyroscope) are concatenated into a single data sequence, each axis being regarded as a 1D vector. With the aim of creating more data points and demonstrating our novel approach to gait pattern identification, the concatenated result is split into sub-sequences of 150 discrete values. Since the data is collected at 100 Hz, this results in tiny sequences of walking data corresponding to time frames of 1.5 seconds. Lastly, each data point of the resulting sequences is normalized and transformed such that its value is represented by an integer between 0 and 255. For training and evaluation of our method, we utilized data from a subset of 50 subjects. The remaining samples, representing a disjoint subset of 43 users, are utilized to simulate impersonation attacks. We employ our CNN models within the identification task following the experimental settings described by Benegui et al. [1], utilizing the same de Bruijn [27] sequence to compose gray-scale images with dimensions of 25 × 150 pixels.
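As an illustration of the processing stage described above, the minimal sketch below splits the concatenated sensor sequence into 150-value frames, normalizes each frame to integers in [0, 255] and arranges it into a 25 × 150 gray-scale image. The helper names are ours, and the final row arrangement is a simple placeholder: the actual row ordering follows the de Bruijn-based scheme of Benegui et al. [1], which is not reproduced here.

import numpy as np

def extract_frames(axes, frame_len=150):
    # axes: six 1D arrays (3 accelerometer + 3 gyroscope axes) of one subject.
    # Concatenate all axes into a single sequence, then split it into
    # sub-sequences of 150 discrete values (1.5 seconds at 100 Hz).
    sequence = np.concatenate(axes)
    n_frames = len(sequence) // frame_len
    return sequence[:n_frames * frame_len].reshape(n_frames, frame_len)

def frame_to_image(frame, rows=25):
    # Min-max normalize the frame to integers in [0, 255], then tile it into
    # a 25 x 150 gray-scale image. The tiling is a placeholder; the paper
    # orders the rows according to a de Bruijn sequence [1, 27].
    lo, hi = frame.min(), frame.max()
    pixels = ((frame - lo) / (hi - lo + 1e-8) * 255).astype(np.uint8)
    return np.tile(pixels, (rows, 1))

Under this scheme, two walking sessions sampled at 100 Hz yield several hundred frames per subject, each of which becomes one training image.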
3.2. Subject Classification Task

In order to perform the subject classification task, we employ a binary classifier, namely an SVM [28]. Throughout the experiments, we implement the classifier utilizing two types of kernels: linear and Radial Basis Function (RBF). During the optimisation phase, the model determines a hyperplane that separates the samples by a maximum margin; the RBF kernel allows this separation to take place in a non-linearly transformed feature space. To streamline our work and conduct experiments in a replicable manner, we employ the SVM implementation from Scikit-learn [29]. Different values of the regularization parameter C, namely 1, 10 and 100, are selected to experiment and obtain the best results in the pattern classification task. Therefore, after applying the SVM on the data set, subjects are classified either as legitimate users, if the classifier predicts a positive label, or as adversaries, if the model predicts a negative label. Further, the CNN architectures described in the following section are employed as feature extractors.
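A minimal sketch of this classification stage, assuming the 256-dimensional embeddings of Section 3.3.3 are already available as NumPy arrays; the random placeholders and variable names below are ours:

import numpy as np
from sklearn.svm import SVC

# Random stand-ins for the 256-dimensional embeddings extracted by the CNN;
# label 1 marks the legitimate user, label 0 an adversary.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(320, 256)), rng.integers(0, 2, 320)
X_val, y_val = rng.normal(size=(80, 256)), rng.integers(0, 2, 80)

# Grid over the two kernels and the regularization values used in the paper.
for kernel in ("linear", "rbf"):
    for C in (1, 10, 100):
        clf = SVC(kernel=kernel, C=C).fit(X_train, y_train)
        print(kernel, C, clf.score(X_val, y_val))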
3.3. Deep learning models and feature extraction

Deep learning models are known as the best performing systems in object recognition and computer vision tasks [30, 31, 32, 33, 34, 35]. In order to perform the experiments described further, two different deep neural network types are employed, namely CNNs and LSTMs. The networks described below are used in our experimental setting as embedding extractors, with the aim of providing robust inputs for the SVM.

3.3.1. CNN Architectures

In this paper, we propose four CNN architectures of different depths to be utilised as embedding extractors (a code sketch is provided after Section 3.3.3). In general, all networks share a similar structure: each network utilizes a Softmax activation for the classification layer, while Rectified Linear Units (ReLU) [36] are utilized as the activation functions for all other layers. The first CNN architecture is composed of a convolutional (conv) layer followed by 2 fully connected (fc) layers; each fc layer has 256 neurons and an applied dropout rate of 0.4, and the classification layer follows. Similarly, the second architecture, with 6 layers, is composed of 3 conv layers, 2 fc layers and the Softmax layer. Correspondingly, two more networks of 9 and 12 layers, respectively, are utilised in the experiments. The 9-layer CNN has 6 conv layers, 2 fc layers and the classification layer, while the deepest network has 9 conv layers, 2 fc layers and a Softmax layer. Training and evaluation of the models are performed using the Adam optimizer with a categorical cross-entropy loss function. While other methods converge more slowly, Kingma et al. [37] demonstrated that the Adam optimizer is faster, thus rendering it the best choice.

3.3.2. LSTM Architectures

Since the gait data set is represented by time-series data, we find it suitable to also employ ConvLSTM models in the experiments. Because ConvLSTM models can work directly with sequential data, less pre-processing is required on the data set. However, in order to keep the same number of samples as in the case of the CNN models, frames of data with a size of 6 × 150 (6 rows, 150 data points) are created. Each row represents one sensor axis, and sequences of 150 values per axis are extracted to match the frame lengths described in the image generation process. The ConvLSTM networks therefore have an input of size 6 × 150. In comparison to the CNN models, only one LSTM architecture is selected for the experiments, namely a 6-layer ConvLSTM, described next. The first convolutional layer of the network utilizes a kernel size of 1 × 3, 64 filters and ReLU as the activation function. The next layer utilizes a higher number of filters, namely 128, yet the same activation function and kernel size. The third convolutional layer has the same structure as the second, except for the number of filters, which is 256. The following two layers are fc layers with 256 neurons and ReLU activations. Finally, as in the case of the CNN models, the classification layer is a Softmax layer, which contains 50 neurons, equal to the number of classes used in the experiments.

3.3.3. Embeddings Extraction

Both the ConvLSTM and CNN network types are exploited as feature extractors. After training, each network is utilized as a prediction model. During each prediction, the values generated in the second-to-last layer of the network are extracted as a feature vector. Since each fully connected layer is composed of 256 neurons, the output is composed of 256 values. Each extracted feature vector is stored and associated with its input data sample. Further, the resulting vectors are used as inputs for the SVM models, with the aim of performing the user classification task.
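The sketch below, written with the Keras API of TensorFlow, illustrates the 6-layer CNN and the extraction of 256-dimensional embeddings from its second-to-last fc layer. The paper does not list the filter counts, kernel sizes or pooling placement of the conv layers, so those values are assumptions; the ConvLSTM variant would be built analogously from the layer description in Section 3.3.2.

import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn6(num_classes=50):
    # 6-layer CNN: 3 conv layers, 2 fc layers of 256 neurons with dropout
    # 0.4, and a Softmax classification layer over the 50 subjects.
    # Filter counts, kernel sizes and pooling are placeholder assumptions
    # (pooling layers are not counted in the network depth).
    return models.Sequential([
        layers.Input(shape=(25, 150, 1)),               # 25 x 150 gray-scale image
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(128, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.4),
        layers.Dense(256, activation="relu", name="embedding"),
        layers.Dropout(0.4),
        layers.Dense(num_classes, activation="softmax"),
    ])

model = build_cnn6()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="categorical_crossentropy", metrics=["accuracy"])
# After model.fit(...), the second-to-last fc layer serves as a
# 256-dimensional feature extractor for the SVM stage.
extractor = tf.keras.Model(model.input, model.get_layer("embedding").output)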
4. Experiments

4.1. Data set

We employed the gait data set introduced by Vajdi et al. [5], composed of motion sensor data and additional meta information collected from 93 subjects that performed two walking sessions. Each recording is performed using two iPhone 6s devices, placed on the left waist and the right thigh of the subject. During each walking session, the motion sensors on the device, the accelerometer and the gyroscope, produce values on three axes (X, Y, Z) at a sample rate of 100 Hz. For each user, the data collected from the two walking sessions, covering a distance of approximately 320 meters, is further processed. More robust data points that put an emphasis on distinctive biometric features of the subject are generated by collecting tiny frames of data from the walking session signals. Each frame corresponds to 1.5 seconds of walking and consists of 6 rows (one for each sensor axis) with 150 values each. Our contribution of processing gait data allows for more robust learning of the subject movement and lowers the number of samples required to classify a user thereafter.

The resulting data set is randomly split into two disjoint sets of users, of 50 and 43 subjects, respectively. The two sets are used for model training, validation and the test classification task. The first subset, consisting of 50 subjects, is utilized for the user recognition experiments. In the conducted experiments, we employ 400 data samples per user. For each user, we select 200 samples for training and validation, namely 160 samples for training and 40 samples for validation, based on an 80%–20% split ratio. The remaining 200 samples are used for the testing experiments, employing SVM classification. The second subset of 43 users is utilized for the impersonation attack experiments.

4.2. Evaluation Metrics

To evaluate our technique, we employ classification accuracy as the main evaluation metric. In the subject classification task, we compute and track the accuracy of the models (ACC), the false acceptance rate (FAR) and the false rejection rate (FRR). The FAR is the ratio between the number of false acceptances and the total number of negative samples, while the FRR is the ratio between the number of false rejections and the total number of positive samples.
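In terms of the entries of the binary confusion matrix (with the legitimate user as the positive class), these definitions can be written explicitly as:

\mathrm{FAR} = \frac{FP}{FP + TN}, \qquad \mathrm{FRR} = \frac{FN}{FN + TP}, \qquad \mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN},

where FP counts impostor samples accepted as legitimate, FN counts legitimate samples rejected, and TN and TP count the correctly rejected and correctly accepted samples, respectively.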
4.3. Parameter Tuning

The conducted experiments are based on the optimal hyper-parameters described by Benegui et al. [1] in a similar motion-based user identification task using convolutional neural networks. Therefore, a learning rate of 10^-3 and a batch size of 32 are selected. Regarding the training process, all models are trained for 50 epochs using Adam as the optimizer of choice [37].

4.4. Experiments structure

For each of the selected network types, training and validation are performed using an 80%–20% split ratio: each model is trained using 160 samples per user, and validation is done using the remaining 40 samples. By eliminating the classification layer from both network types and extracting the output of the last fully connected layer, a feature vector consisting of 256 values is obtained. The resulting feature vector is used as input for an SVM classifier, with the aim of identifying the subjects during the experiments.
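Putting the previous pieces together, a hypothetical per-user evaluation could look as follows. The embeddings are random stand-ins, and the way negative (impostor) samples are drawn for SVM training is our assumption rather than a detail stated in the paper:

import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# 256-dimensional embedding stand-ins: positives from the target user,
# negatives assumed to come from other users / the 43-user attack set.
X_train = rng.normal(size=(320, 256))
y_train = np.r_[np.ones(160), np.zeros(160)]   # 160 positive, 160 negative
X_test = rng.normal(size=(400, 256))
y_test = np.r_[np.ones(200), np.zeros(200)]    # 200 positive, 200 negative

clf = SVC(kernel="rbf", C=10).fit(X_train, y_train)
tn, fp, fn, tp = confusion_matrix(y_test, clf.predict(X_test)).ravel()
far = fp / (fp + tn)            # false acceptance rate
frr = fn / (fn + tp)            # false rejection rate
acc = (tp + tn) / len(y_test)   # classification accuracy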
4.5. Results

4.5.1. Subject Classification Based on Gait Motion Data

The empirical results obtained using different CNN architectures are presented in Table 1. We can observe that, as the depth of the architecture increases, both the validation accuracy and the training accuracy tend to drop slightly when using a mini-batch size equal to 32. However, as Jastrzebski et al. [38] noted, the mini-batch size and the learning rate are strongly correlated. Throughout the experiments, a fixed learning rate equal to 10^-3 was selected, while the mini-batch size was varied to take values equal to 32, 64 or 128. As seen in Table 1, employing a larger mini-batch size and a deeper network yields the best results, with a training accuracy of 99.64% and a validation accuracy of 99.92%. While mini-batches of 128 samples are optimal for the 12-layer and 6-layer CNNs, the 9-layer network attains better results using 64 samples.

Table 1. Train and validation accuracy rates of CNN architectures of various depths on the multi-class subject classification task. Each architecture is trained with three different mini-batch sizes. Train and validation accuracy are an indicator of the robustness of the embeddings extracted by the network.

Model          Batch size   Training accuracy   Validation accuracy
6-layer CNN    32           99.12%              98.95%
6-layer CNN    64           99.25%              99.87%
6-layer CNN    128          99.37%              99.92%
9-layer CNN    32           98.77%              98.65%
9-layer CNN    64           99.53%              99.50%
9-layer CNN    128          98.53%              98.45%
12-layer CNN   32           97.73%              93.20%
12-layer CNN   64           99.06%              99.12%
12-layer CNN   128          99.64%              99.92%

The empirical results presented in Table 1 show that the 6-layer and 12-layer CNN networks have the strongest generalization capacity, both attaining a validation accuracy of 99.92%. Nonetheless, the shallower 6-layer network generalises more consistently than the 12-layer network across the different batch sizes. Hence, the 6-layer CNN trained with mini-batches of 128 samples is utilized for the subsequent experiments.

Both the CNN and ConvLSTM networks are used as feature extractors within the experiments; thus, the ConvLSTM is trained in a similar fashion to the CNN networks. Table 2 shows a comparison between the best performing CNN and the ConvLSTM. It can be observed that the ConvLSTM architecture does not surpass the accuracy and generalization capacity of the 6-layer CNN: the validation accuracy of the ConvLSTM (94.5%) is 5.42% lower than that of the best performing CNN.

Table 2. Train and validation accuracy rates of the 6-layer CNN architecture versus the 6-layer ConvLSTM on the multi-class subject classification task. All models are trained with mini-batches of 128 samples and produce 256-dimensional embeddings.

Model              Training accuracy   Validation accuracy
6-layer ConvLSTM   97.91%              94.5%
6-layer CNN        99.37%              99.92%

After extracting the embeddings from both networks, an SVM classifier is applied. Table 3 highlights the empirical results obtained by employing the SVM classifier on the extracted embeddings. The results show that the CNN and ConvLSTM extracted features yield equal results: across the different kernels and regularization parameters, the accuracy difference between configurations is at most 0.12%. It can be noted that better accuracy is obtained by employing an RBF kernel; however, the false rejection rate increases compared to a linear kernel.

Table 3. Subject classification results employing an SVM classifier based on CNN and ConvLSTM features. Two kernel functions, linear and RBF, are applied to the SVM, along with different values of the regularization parameter C. The reported accuracy, FAR and FRR values represent the average test results determined on the 50 subjects involved in the user identification task.

Model        Kernel   C    Accuracy   FAR     FRR
LSTM + SVM   linear   1    98.67%     1.94%   0.72%
LSTM + SVM   RBF      10   98.79%     1.04%   1.38%
CNN + SVM    linear   1    98.67%     1.94%   0.72%
CNN + SVM    RBF      10   98.79%     1.04%   1.38%

5. Conclusion

In this paper we studied the subject identification task based on gait data. Employing pre-trained CNN and LSTM networks as feature extractors and SVMs as classifiers produces strong results in the subject identification task. The experiments show that gait-based data sets are very well suited for this task, attaining an accuracy rate of up to 98.79%. Passive collection of gait data combined with continuous user identification constitutes a strong two-factor authentication layer. We hereby conclude that deep neural networks for gait-based subject identification are very promising. Compared to [5], the presented method performs similarly, the difference being an accuracy decrease of 0.31%. We thus conclude that our gait-based user identification system is suitable for industry usage, having a high accuracy and a low misclassification rate. Furthermore, our system does not require any user interaction, therefore rendering it a great unobtrusive security layer. In future work, we aim to identify other solutions to further reduce the FAR and FRR values, since we believe that such a system still has enough potential to become a reliable authentication protocol.

References

[1] C. Benegui and R. T. Ionescu, “Convolutional Neural Networks for User Identification based on Motion Sensors Represented as Images,” IEEE Access, vol. 8, no. 1, pp. 61255–61266, 2020.
[2] A. Buriro, B. Crispo, and Y. Zhauniarovich, “Please Hold On: Unobtrusive User Authentication using Smartphone’s built-in Sensors,” in Proceedings of ISBA, 2017, pp. 1–8.
[3] Z. Sitová, J. Šedenka, Q. Yang, G. Peng, G. Zhou, P. Gasti, and K. S. Balagani, “HMOG: New Behavioral Biometric Features for Continuous Authentication of Smartphone Users,” IEEE Transactions on Information Forensics and Security, vol. 11, no. 5, pp. 877–892, 2016.
[4] L. Sun, Y. Wang, B. Cao, S. Y. Philip, W. Srisa-An, and A. D. Leow, “Sequential keystroke behavioral biometrics for mobile user identification via multi-view deep learning,” in Proceedings of ECML-PKDD, 2017, pp. 228–240.
[5] A. Vajdi, M. R. Zaghian, S. Farahmand, E. Rastegar, K. Maroofi, S. Jia, M. Pomplun, N. Haspel, and A. Bayat, “Human gait database for normal walk collected by smart phone accelerometer,” arXiv preprint arXiv:1905.03109, 2019.
[6] G. Ye, Z. Tang, D. Fang, X. Chen, K. I. Kim, B. Taylor, and Z. Wang, “Cracking Android Pattern Lock in Five Attempts,” in Proceedings of NDSS, 2017.
[7] H. won Kwon, J.-W. Nam, J. Kim, and Y. K. Lee, “Generative adversarial attacks on fingerprint recognition systems,” in 2021 International Conference on Information Networking (ICOIN), IEEE, 2021, pp. 483–485.
[8] L. Yang, Q. Song, and Y. Wu, “Attacks on state-of-the-art face recognition using attentional adversarial attack generative network,” Multimedia Tools and Applications, vol. 80, no. 1, pp. 855–875, 2021.
[9] H. Shin, S. Sim, H. Kwon, S. Hwang, and Y. Lee, “A new smart smudge attack using CNN,” International Journal of Information Security, pp. 1–12, 2021.
[10] J. Fei, Z. Xia, P. Yu, and F. Xiao, “Adversarial attacks on fingerprint liveness detection,” EURASIP Journal on Image and Video Processing, vol. 2020, no. 1, pp. 1–11, 2020.
[11] L. J. González-Soler, M. Gomez-Barrero, L. Chang, A. Pérez-Suárez, and C. Busch, “Fingerprint presentation attack detection based on local features encoding for unknown attacks,” IEEE Access, vol. 9, pp. 5806–5820, 2021.
[12] B. Zhongjie, P. Sixu, F. Xinwen, K. Dimitrios, M. Aziz, and R. Kui, “ABC: Enabling Smartphone Authentication with Built-in Camera,” in Proceedings of NDSS, 2018.
[13] I. Amerini, P. Bestagini, L. Bondi, R. Caldelli, M. Casini, and S. Tubaro, “Robust smartphone fingerprint by mixing device sensors features for mobile strong authentication,” in Media Watermarking, Security, and Forensics, Ingenta, 2016, pp. 1–8.
[14] D. Valsesia, G. Coluccia, T. Bianchi, and E. Magli, “User Authentication via PRNU-Based Physical Unclonable Functions,” IEEE Transactions on Information Forensics and Security, vol. 12, no. 8, pp. 1941–1956, 2017.
[15] C. Shen, T. Yu, S. Yuan, Y. Li, and X. Guan, “Performance Analysis of Motion-Sensor Behavior for User Authentication on Smartphones,” Sensors, vol. 16, no. 3, p. 345, 2016.
[16] E. Vildjiounaite, S.-M. Mäkelä, M. Lindholm, R. Riihimäki, V. Kyllönen, J. Mäntyjärvi, and H. Ailisto, “Unobtrusive multimodal biometrics for ensuring privacy and information security with personal devices,” in Proceedings of PERVASIVE, 2006, pp. 187–201.
[17] I. Olade, C. Fleming, and H.-N. Liang, “BioMove: Biometric user identification from human kinesiological movements for virtual reality systems,” Sensors, vol. 20, no. 10, p. 2944, 2020.
[18] M. Kos and I. Kramberger, “A wearable device and system for movement and biometric data acquisition for sports applications,” IEEE Access, vol. 5, pp. 6411–6420, 2017.
[19] N. Neverova, C. Wolf, G. Lacey, L. Fridman, D. Chandra, B. Barbello, and G. Taylor, “Learning Human Identity from Motion Patterns,” IEEE Access, vol. 4, pp. 1810–1820, 2016.
[20] Y. Ku, L. H. Park, S. Shin, and T. Kwon, “Draw it as shown: Behavioral pattern lock for mobile user authentication,” IEEE Access, vol. 7, pp. 69363–69378, 2019.
[21] H. Li, J. Yu, and Q. Cao, “Intelligent Walk Authentication: Implicit Authentication When You Walk with Smartphone,” in Proceedings of BIBM, 2018, pp. 1113–1116.
[22] R. Wang and D. Tao, “Context-Aware Implicit Authentication of Smartphone Users Based on Multi-Sensor Behavior,” IEEE Access, vol. 7, pp. 119654–119667, 2019.
[23] A. Buriro, B. Crispo, F. Delfrari, and K. Wrona, “Hold and sign: A novel behavioral biometrics for smartphone user authentication,” in Proceedings of SPW, 2016, pp. 276–285.
[24] D. J. Lohr, S. Aziz, and O. Komogortsev, “Eye movement biometrics using a new dataset collected in virtual reality,” in ACM Symposium on Eye Tracking Research and Applications, 2020, pp. 1–3.
[25] X. Wang, X. Zhao, and Y. Zhang, “Deep-learning-based reading eye-movement analysis for aiding biometric recognition,” Neurocomputing, 2020.
[26] S. N. A. Seha, D. Hatzinakos, A. S. Zandi, and F. J. Comeau, “Improving eye movement biometrics in low frame rate eye-tracking devices using periocular and eye blinking features,” Image and Vision Computing, p. 104124, 2021.
[27] A. Ralston, “De Bruijn Sequences–A Model Example of the Interaction of Discrete Mathematics and Computer Science,” Mathematics Magazine, vol. 55, no. 3, pp. 131–143, 1982.
[28] C. Cortes and V. Vapnik, “Support-Vector Networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
[29] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg et al., “Scikit-learn: Machine learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
[30] K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” in Proceedings of CVPR, 2016, pp. 770–778.
[31] M.-I. Georgescu, R. T. Ionescu, and M. Popescu, “Local Learning with Deep and Handcrafted Features for Facial Expression Recognition,” IEEE Access, vol. 7, pp. 64827–64836, 2019.
[32] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” in Proceedings of NIPS, 2015, pp. 91–99.
[33] R. T. Ionescu, B. Alexe, M. Leordeanu, M. Popescu, D. Papadopoulos, and V. Ferrari, “How hard can it be? Estimating the difficulty of visual search in an image,” in Proceedings of CVPR, 2016, pp. 2157–2166.
[34] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You Only Look Once: Unified, Real-Time Object Detection,” in Proceedings of CVPR, 2016, pp. 779–788.
[35] N. Wahab, A. Khan, and Y. S. Lee, “Transfer learning based deep CNN for segmentation and detection of mitoses in breast cancer histopathological images,” Microscopy, vol. 68, no. 3, pp. 216–233, 2019.
[36] V. Nair and G. E. Hinton, “Rectified Linear Units Improve Restricted Boltzmann Machines,” in Proceedings of ICML, 2010, pp. 807–814.
[37] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in Proceedings of ICLR, 2015.
[38] S. Jastrzebski, Z. Kenton, D. Arpit, N. Ballas, A. Fischer, Y. Bengio, and A. Storkey, “Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio,” in Proceedings of ICANN, vol. 11141, 2018, pp. 392–402.