Analysis of Tools Available for Use in Cybersecurity

Zachary Sherman-Burke and Aadam Bodunrin
Department of Computer Science
East Carolina University
Greenville, NC, 27858, USA
shermanburkez12@students.ecu.edu
bodunrina22@students.ecu.edu

Abstract

Cybersecurity is a critical part of almost all organizations' infrastructure in virtually every industry. With recent advancements in technology, specifically Large Language Models, many new threats have arisen, along with many new adversaries who previously did not exist due to a lack of skill or knowledge. Prior to the rise of large language models (LLMs), cybersecurity experts already struggled to keep up with attackers. Now that numerous LLMs, such as ChatGPT, have been released to the public, and models continue to be improved and released, the struggle cybersecurity experts face has grown even more dire. As a result, there is an even greater need for cybersecurity experts to assess, understand, and fully utilize the tools available to them. The first part of this study examines the current tools that exist for use by experts to help defend and fortify a given network. The second part proposes an approach for utilizing reinforcement learning as a tool for further protecting a cybersecurity system via anomalous activity detection.

Keywords:

1. Introduction

With recent advancements in Artificial Intelligence (AI) and Machine Learning (ML), there has been an increase in both cybersecurity tools and threats. Developments in ML, especially Large Language Models (LLMs) and derived tools such as ChatGPT, have not only given malicious actors the means to be much more efficient and dangerous, but have also created a new market that did not exist before. With tools like ChatGPT, it is now possible for people without technical knowledge or training to create tools and methods for attacking networks and proprietary systems with only a description of what they want to do. The tools examined in this paper are Knowledge Graphs (KG), Docker, Long Short-Term Memory models (LSTM), and Transformer models. This paper aims to evaluate these tools, their uses, and new developments that have arisen as a result. Once these tools have been examined, this paper proposes a new approach using reinforcement learning to be utilized by cybersecurity experts.

This paper is organized as follows: in Section 2 we discuss the research and applications that utilize these tools in cybersecurity and how they are implemented. Section 3 provides the underlying architecture of the tools discussed in this paper. Section 4 introduces a new tool utilizing reinforcement learning in cybersecurity. Section 5 outlines how the environment will be addressed. Section 6 addresses how the reward function will be created. Section 7 discusses the agent and how it all comes together. Section 8 discusses the containerization of the reinforcement learning program. We close the document with conclusions and suggested avenues of research.

2. Related Work

The experiments described in this study are aimed at examining how machine learning tools can be used in the defense of a network against attackers.

2.1 Detecting Insider Threats with LSTM

When a malicious actor attempts to attack an organization, they typically first target the network in some way, such as through phishing emails, vulnerability exploits, or improper use of credentials. Given the breadth of network activity (web browsing, authentication, daily job use, etc.), one of the most important aspects of a cybersecurity expert's job is understanding and fully utilizing the tools available to them. With the rise in power and capability of ML, researchers have started looking at ways it can be implemented in the defense of a system.
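Before turning to specific systems, the core mechanism these systems rely on can be sketched: a recurrent cell consumes a sequence of encoded log events one step at a time while carrying state forward. The following is a minimal illustration only, not any author's implementation: a single scalar LSTM cell in pure Python with fixed, made-up weights, showing the gated, additive state update that the Background section describes.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class LSTMCell:
    """Single-unit LSTM cell (scalar input and state) for illustration.

    Weights are fixed small constants rather than learned values, since
    the point is to show the gated state update, not a trained model.
    """

    def __init__(self):
        # (input weight, recurrent weight, bias) for each gate
        self.forget = (0.5, 0.1, 0.0)
        self.input_ = (0.6, 0.2, 0.0)
        self.cand = (0.7, 0.3, 0.0)
        self.output = (0.4, 0.1, 0.0)

    def _gate(self, params, x, h, act):
        w, u, b = params
        return act(w * x + u * h + b)

    def step(self, x, h, c):
        f = self._gate(self.forget, x, h, sigmoid)  # how much old state to keep
        i = self._gate(self.input_, x, h, sigmoid)  # how much new info to write
        g = self._gate(self.cand, x, h, math.tanh)  # candidate new information
        o = self._gate(self.output, x, h, sigmoid)  # how much state to expose
        c_new = f * c + i * g                       # additive update to the cell state
        h_new = o * math.tanh(c_new)
        return h_new, c_new

# Run a toy "sequence of log events" (already encoded as numbers) through the cell.
cell = LSTMCell()
h = c = 0.0
for event in [0.2, 0.9, 0.1, 0.8]:
    h, c = cell.step(event, h, c)
print(round(h, 4))
```

In a real detector the scalar input would be a learned embedding of a log record, the cell would be vector-valued, and the hidden state would feed a classifier that scores each event; only the gating structure is shown here.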
One tool, developed by Lopez and Sartipi, utilizes an LSTM model to analyze electronic logs in order to create a probabilistic model that is used to analyze activity and determine the likelihood that a given event is a threat [1]. If an event is determined to be a threat, it can then be analyzed individually by an end-user without having to go through all of the logs. This work shows the viability of LSTM models for parsing very large electronic logs to detect anomalous behavior.

2.2 Paying Attention to the Insider Threat

In addition to the research performed with LSTMs, Lopez and Sartipi also researched the viability of using Bidirectional Encoder Representations from Transformers (BERT) as a tool for cybersecurity by means of anomaly detection and user behavior prediction. The transformer is a newer type of machine-learning model than the LSTM and differs in underlying architecture, each with its own strengths and weaknesses. BERT is a special type of transformer that consists only of an encoder, which can pass information forward and backward in the network. This research was performed by training BERT on the Los Alamos cybersecurity events data set [2]. Once BERT was initially trained on the Los Alamos data set, the model was updated and fine-tuned on 14 days of data, with a delay of one day. Finally, the model was updated and tested on a time frame of one second. The results showed that a model could be built that goes through activity logs, identifies data, and alerts a user to a threat within a second of the threat occurring, with a high degree of accuracy [3].

2.3 Knowledge Graphs

Knowledge graphs are a powerful tool in the fight against hackers and are capable of network data aggregation, data integration, and knowledge discovery [4]. Knowledge graphs are also inherently helpful in cybersecurity by allowing the tracking of a network and possible hacker activity in order to understand the breadth and depth of an attack. While a knowledge graph is more of a concept, tools like Neo4j exist that allow for easy knowledge graph creation, with querying, adding, removing, and filtering of nodes. A tool like Neo4j would be helpful as it allows for the visualization of a knowledge graph with connected nodes, which could be used to aid in visually spotting outliers and tracking their activity throughout a network. Overall, knowledge graphs can serve as an excellent tool for those who need to defend a network from attackers by providing visual information to support the decisions that can isolate a hacker and keep the rest of the network safe.

2.4 Hybrid Approaches

While knowledge graphs and ML models are both powerful tools on their own merits, recent research has been performed in utilizing them together for a variety of purposes. One area of research, led by Antoine Bosselut, is the utilization of a transformer to automatically build a knowledge base from the relations between different concepts, attributes, etc., in effect creating a knowledge graph. The transformers created in that work, referred to as COMmonsense Transformers (COMET), have shown promise for the automatic construction of knowledge graphs [5].

3. Background

3.1 Neural Networks

Artificial neural networks (ANN), also known as simple neural networks (NN), are a component of most machine learning models. NNs work by mimicking the way that biological neurons signal to one another. At its most basic, a neural network takes inputs through the input layer, sends them sequentially through a series of processing steps in the hidden layers, and then produces an output at the final layer, known as the output layer (Figure 1). The input layer takes external information, processes and analyzes it, and passes it to the next layer. A hidden layer takes input from the input layer or another hidden layer, processes it further, and passes it to the next layer. The output layer takes the final processed data and provides the final result. A simple ANN utilizes a feed-forward mechanism in which each layer always passes information to the next.

Figure 1. Overview of a Neural Network

An advancement on the simple neural network is the recurrent neural network (RNN). RNNs follow the same fundamental principles as the ANN but introduce the concept of memory by allowing the output of one layer to be used as the input of previous layers.

Figure 2. Overview of a Simple Recurrent Neural Network

A shortcoming of RNNs is their inability to preserve information for later use due to what are known as the gradient vanishing and gradient exploding problems [7]. The source of both problems is that, over time, the weights for each hidden-layer neuron are updated, but if the weights get too small, the first part of the network will be overshadowed by the later half and all of the initial information could be lost. The gradient exploding problem is the exact opposite: the weights get too big, and the first part of the network dominates the rest. LSTMs were developed, in part, to address these issues.

3.2 LSTM

An LSTM is a type of recurrent neural network that has "long-term memory" and "short-term memory", saving information for later use to avoid the gradient vanishing problem. At a high level, LSTMs analyze sequences both as a whole and element by element, and by doing this can support both long- and short-term memory to prevent gradient issues. Mathematically, LSTMs work by adding an additive update function, which provides better-defined behavior, as well as a gating function, which provides a direct mechanism for controlling how much the gradient vanishes or grows at each step [8].

3.3 Transformers

One of the newest advances in AI and ML is the Transformer, introduced by Vaswani et al. [9]. Transformers take the concepts of an RNN and build on them using a mechanism known as self-attention to allow for parallel processing. Self-attention works by "modulating the representation of a token by using the representations of related tokens in the sequence" [6]. Transformers are able to parallelize by performing multiple self-attention operations together in a mechanism known as Multi-Head Attention (Figure 3). The two main components of a transformer are the encoder and the decoder. The encoder extracts important features for use, and the decoder takes the output from the encoder, processes the data in multi-head attention layers, shifts the data by one position, and produces an output.

Figure 3. Transformer Architecture

A further refinement of the transformer model is bidirectional encoder representations from transformers (BERT). BERT differs from a typical transformer in a few ways: BERT drops the decoder and uses only an encoder; BERT is typically pre-trained on an unsupervised learning task, whereas a transformer typically uses supervised training; and BERT was developed specifically for NLP-based tasks, whereas transformers focus on generating an output sequence from an input sequence.

In addition to machine learning-based tools, there also exists a tool known as Docker.

3.4 Docker

Docker is software that provides OS-level virtualization to create environments, also called containers, that are almost completely separate from the hardware and host operating system (OS). A Docker container can contain scripts, programs, or even its own OS, existing entirely within the container, separated from the host. By providing this separation, Docker also allows for the creation and use of multiple OSs on a single set of hardware. Docker's primary purposes are creating test environments that protect your system, complete separation of systems with minimal hardware, and easy, quick deployment of all necessary tools, programs, and packages for a given role. All of these uses can be applied to cybersecurity, making Docker a useful tool. Docker can be used to create environments that simulate networks, attacks, or malware; it can also be used to create a complete separation between critical infrastructure even when it is located on the same hardware. Docker is also useful for creating entire deployments for cybersecurity employees or students wanting to learn more about the field.

In addition to Docker, another non-ML-based tool is the knowledge graph.

3.5 Knowledge Graph

Knowledge graphs are collections of data points (nodes) connected according to given relations (edges). KGs serve purposes in many fields, including social media, organizational structure, question-answering systems (QAS), and cybersecurity. The ability of KGs to track and maintain relations, as well as communicate them to humans and computers, is where their strength lies. In cybersecurity, one of the most valuable things an expert could have access to is an overview of their network and its connections. Understanding a network and the communications that take place helps to build a strong, layered defense that can be adjusted per layer as needed. By having an overview of your network, you can also better assess the potential risk associated with a compromised system based on where it lies in the network and what connections it makes.

3.6 Machine Learning Integrated Knowledge Graphs

Another approach that has been proposed is the combination of ML models and KGs, used in conjunction so that each helps overcome the weaknesses of the other. As previously discussed, ML models and KGs are useful cybersecurity tools on their own, but the ability to use them together could provide even greater ones. There have been recent research developments in the integration of ML models and KGs [5][10]. The strength of knowledge graphs is that they have clear, structured connections between data based on a given set of relations. This strength can also serve as a weakness: when a user does not have enough knowledge or information for the knowledge graph to create or find the relations needed, or when a user makes spelling or grammatical errors, the search results might be completely irrelevant to the user's desired inquiry. The converse is true of ML models: they excel at inference, understanding, and finding relevant information based on the context of a query, but the information might be only partially relevant, only partially true, or sometimes entirely fabricated, since the model is creating a response rather than pulling it from a database. KGs could provide a foundation for ML models so that they are bound by a set of rules, making the information provided significantly more likely to be accurate and relevant rather than irrelevant or fabricated. Similarly, ML models can allow for better interpretation and understanding of a query. Using ML-integrated KGs in cybersecurity could allow for the automation and combination of the most commonly used tools in cybersecurity. As shown by Lopez and Sartipi [1][3], ML models can be built to parse logs, find anomalous activity, and predict user behavior in near real time. Using KGs, the known databases used for storing and tracking malicious sites, emails, software, etc., could be pulled in and combined with the ML models to improve accuracy, continually build out databases, and help keep track of activity.

4. Approach

The remaining sections outline how the cybersecurity tools discussed in Section 3 can be used to expand on the work introduced in Section 2. This study proposes the use of reinforcement learning to create a model that analyzes and detects anomalous network activity. The first step is to outline the environment to operate in and then design a reward function and agent. Once the model is complete and runs successfully, the next step is to take the model and its components and create a Docker container for use as a demonstration and teaching tool.

4.1 Reinforcement Learning

The approach proposed in this paper is to use reinforcement learning (RL) as a means of anomaly detection. The ML techniques discussed so far have fallen into one of two categories, supervised or unsupervised learning. RL, however, is a type of machine learning that falls outside of both of these categories (Figure 4).

Figure 4. Types of Machine Learning [11]

In supervised learning, the model is presented with a set of correct actions to compare its performance against. In unsupervised learning, the data is unlabeled and the goal is to group data based on inherent similarities and differences. RL differs from supervised learning in that the feedback provided consists of rewards and punishments as signals for positive and negative behaviors. When compared to unsupervised learning, RL's difference comes from the goal: in RL the goal is to maximize the agent's total cumulative reward, whereas in unsupervised learning the goal is to group data based on its inherent properties.

The main components of an RL model are the agent, environment, state, action, and reward. The agent is the component that makes decisions, receives rewards and punishments, and interacts with the environment. The environment is the world in which the agent operates. The state is the observation the agent makes of the environment after performing an action and represents the current situation of the agent. An action is what the agent performs on the environment based on its observation. The reward is the feedback the agent receives based on the action performed; it can be either positive or negative.
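The interaction between these components can be illustrated with a deliberately tiny, hypothetical example. The sketch below uses tabular Q-learning rather than the deep network proposed later: the environment is a handful of invented log events, the actions are "label normal" and "label anomalous", and the reward is positive for a correct label and negative for an incorrect one. All feature encodings and reward values here are made up for illustration and are not the paper's data set or reward design.

```python
import random

random.seed(0)

# Toy environment: each "state" is a coarse feature of a log entry
# (0 = ordinary working-hours logon, 1 = rare off-hours logon).
# Actions: 0 = "label normal", 1 = "label anomalous".
EVENTS = [(0, 0), (1, 1), (0, 0), (0, 0), (1, 1)]  # (state, true label)

def reward(action, true_label):
    """Positive feedback for a correct label, negative for an incorrect one."""
    return 1.0 if action == true_label else -1.0

# Tabular Q-values: Q[state][action] estimates the reward for that choice.
Q = [[0.0, 0.0], [0.0, 0.0]]
alpha, epsilon = 0.5, 0.1  # learning rate and exploration rate

for episode in range(200):
    for state, true_label in EVENTS:
        # The agent observes the state and picks an action (epsilon-greedy).
        if random.random() < epsilon:
            action = random.randrange(2)
        else:
            action = 0 if Q[state][0] >= Q[state][1] else 1
        # The environment returns a reward; the agent updates its estimate.
        r = reward(action, true_label)
        Q[state][action] += alpha * (r - Q[state][action])

# After training, the greedy policy labels each state.
policy = [0 if Q[s][0] >= Q[s][1] else 1 for s in (0, 1)]
print(policy)  # → [0, 1]: normal logons labeled 0, off-hours logons labeled 1
```

A DQN, as proposed in Section 7, replaces the lookup table with a neural network over much richer state features, but the agent-environment-reward loop is the same.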
These components remain the same for each of the three types of reinforcement learning implementation: value-based, policy-based, and model-based. The approach behind value-based reinforcement learning is to find the optimal value function. In policy-based methods, the agent works to develop a policy so that the actions performed in each state help to maximize future rewards. In model-based approaches, a virtual model is created for each environment and the agent explores that model to learn it. The algorithms behind these implementations fall into one of two categories: model-free RL algorithms and model-based RL algorithms.

This paper proposes the utilization of Deep Q-Learning, a model-free, value-based algorithm that maximizes future rewards. The reason for focusing on Deep Q-Learning in this study is the desire to create a model that is able to quickly and efficiently parse authentication records to correctly identify activity. The goal of the model will be to classify whether an activity is normal or anomalous while building the most efficient algorithm for parsing the log.

5. Environment

The environment will consist of a pseudo-continuous network communication space, the data log, in which the agent will be responsible for navigating entries to classify and identify anomalous behavior. At each state, the environment will consist of a timestamp, the communication (source and destination computer and user), the domain, the authentication and logon types, the orientation, and the communication outcome. While the primary goal of the agent is to detect anomalous activity, the key parameter monitored will be time. The goal will be to find the most efficient way to parse such a large file, but since we also heavily value accuracy, a time penalty will be applied to incorrectly labeled data. This leads to the next component of our model, the reward function.

6. Reward Function

The reward function will be built around the time parameter described in Section 5. The agent will receive positive feedback for each communication it labels quickly and correctly, while each incorrectly labeled communication will incur a time penalty that reduces the cumulative reward. Because the agent's objective is to maximize its total cumulative reward, this design pushes the model toward labeling that is both fast and accurate.

7. Agent

The agent will be a Deep Q-Network (DQN) agent, a value-based agent that operates in a discrete action space but can operate in a continuous or discrete observation space. A DQN agent is best suited to the goal of this study, as the actions our model performs are deciding whether an activity is normal or anomalous and deciding which communication activity to examine next. The primary goal of the agent is to examine each communication quickly and label it accurately. To simplify the agent's goals, the agent will look solely at time, and each incorrect labeling will incur a time penalty. We believe this is the best approach for the agent since, in a real situation, if the model falsely alerts someone to an issue, that person would need to spend time verifying the false alarm and would thus be delayed in the event of a real threat coming shortly after. Overall, the agent will operate in a network communication environment that occurs in real time, attempting to classify all communication in an efficient and accurate manner and alerting a third party when an anomaly is detected.

8. Containerization

The goal of this study is to create a tool that can be utilized by cybersecurity experts and educators to demonstrate the capabilities of reinforcement learning, provide a model to experiment with, and encourage further research in this area. In order to share this research, the code and data will be containerized in a Docker container that holds all of the resources needed to duplicate the work performed in this study. Docker was chosen because it provides a portable solution for others to reproduce this work and allows for the installation of all necessary packages and dependencies once, upon creation.

9. Conclusion

To conclude this study, the principles outlined in Section 3 provide the basis for the tools outlined in Section 2. The proposed research in Sections 4, 5, 6, and 7 provides a basis for continued research. Section 8 outlines the plan and methodology to make the work that will be performed as a result of this paper accessible and repeatable. ML and KGs form the foundation for the majority of the cybersecurity tools put forth in this study. Due to the complex and interwoven nature of network communications, ML and KGs show great promise for the future of network defense.

10. References

[1] E. Lopez and K. Sartipi, "Detecting the Insider Threat with Long Short Term Memory (LSTM) Neural Networks," arXiv:2007.11956 [cs], Jul. 2020. [Online]. Available: https://arxiv.org/abs/2007.11956
[2] A. D. Kent, "Comprehensive, Multi-Source Cybersecurity Events," Los Alamos National Laboratory, 2015.
[3] E. Lopez and K. Sartipi, "Paying Attention to the Insider Threat," Proceedings of the 34th International Conference on Software Engineering and Knowledge Engineering, Jul. 2022, doi: 10.18293/seke2022-059.
[4] L. F. Sikos, "Cybersecurity knowledge graphs," Knowledge and Information Systems, vol. 65, no. 9, pp. 3511–3531, Apr. 2023, doi: 10.1007/s10115-023-01860-3.
[5] A. Bosselut, H. Rashkin, M. Sap, C. Malaviya, A. Celikyilmaz, and Y. Choi, "COMET: Commonsense Transformers for Automatic Knowledge Graph Construction," 2019. [Online]. Available: https://aclanthology.org/P19-1470.pdf
[6] F. Chollet, Deep Learning with Python. Shelter Island, NY: Manning, 2018.
[7] S. Hochreiter, "The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions," International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 06, no. 02, pp. 107–116, Apr. 1998, doi: 10.1142/s0218488598000094.
[8] N. Arbel, "How LSTM networks solve the problem of vanishing gradients," Medium, May 16, 2020. [Online]. Available: https://medium.datadriveninvestor.com/how-do-lstm-networks-solve-the-problem-of-vanishing-gradients-a6784971a577
[9] A. Vaswani et al., "Attention Is All You Need," arXiv:1706.03762, 2017. [Online]. Available: https://arxiv.org/abs/1706.03762
[10] N. Rohrseitz, "Knowledge Graphs and Machine Learning," Medium, Feb. 13, 2022. [Online]. Available: https://towardsdatascience.com/knowledge-graphs-and-machine-learning-3939b504c7bc
[11] S. Bhatt, "Reinforcement Learning 101," Medium, Apr. 19, 2019. [Online]. Available: https://towardsdatascience.com/reinforcement-learning-101-e24b50e1d292