Computers & Security 116 (2022) 102639

Contents lists available at ScienceDirect
Computers & Security
journal homepage: www.elsevier.com/locate/cose

CAVP: A context-aware vulnerability prioritization model

Bill Jung∗, Yan Li, Tamir Bechor
Claremont Graduate University, United States

Article info
Article history: Received 4 June 2021; Revised 8 January 2022; Accepted 1 February 2022; Available online 4 February 2022.
Keywords: CVSS; Vulnerability prioritization; CVEs; Risk management; Exploitability; Temporal vulnerability.

Abstract
With the growing number of vulnerabilities and increasingly advanced attacks, known software security vulnerabilities need to be managed more efficiently through prioritization and contextualization. The current industry-standard approach to vulnerability management at a large scale is limited for two reasons. First, it does not automatically capture the temporal characteristics of Common Vulnerabilities and Exposures (CVEs) (i.e., how CVEs change over time). Second, it requires manual labor to prioritize identified vulnerabilities. To address these limitations, this research designs a context-aware vulnerability prioritization (CAVP) model to calculate temporal-enabled vulnerability scores of CVEs and prioritize these vulnerabilities visually. The CAVP model includes an enhanced Context-Aware Vulnerability Scoring System (CAVSS) that automatically derives temporal metric values of CVEs through a set of expert-validated heuristic rules. The CAVP model is the first attempt to provide a step-by-step process of vulnerability prioritization that can be integrated within the risk management workflow of an organization. The implementation of the CAVP model in two organizations validates its usefulness.
Published by Elsevier Ltd.

1. Introduction

Information technology (IT) advances at an incredible pace and is digitally transforming organizations' business operations.
Examples abound, such as remote health diagnosis and mobile payments. As many IT-driven products and services become more essential, cybersecurity risks are proliferating and cyber-attacks are becoming more sophisticated, exemplified by the prevalence of Advanced Persistent Threats (APTs) (Li et al., 2019) and zero-day exploits (Vakilinia and Sengupta, 2020). In managing cybersecurity risks, vulnerability management is significant because IT tools inevitably expose software security vulnerabilities that could potentially turn into risks. With the growing number of vulnerabilities and increasingly advanced attacks, known software security threats or vulnerabilities need to be managed more efficiently through prioritization and contextualization.

To assist vulnerability management, the National Vulnerability Database (NVD) provides helpful vulnerability information as data feeds (National Institute of Standards and Technology (NIST), 2021) that organizations can download free of charge. When a new Common Vulnerabilities and Exposures (CVE) entry is first publicly announced, a unique ID is assigned by CVE Numbering Authorities (CNAs) (Dong et al., 2019), and the CVE is captured in the NVD. The NVD also provides the Common Vulnerability Scoring System (CVSS) (Mell et al., 2007), through which a vulnerability Base Score (Torkura et al., 2021) quantitatively measures the severity of an IT asset's vulnerability. However, Base Scores alone are insufficient for vulnerability management because the base metrics only cover the stable attributes of exploits and do not capture their dynamic characteristics, that is, how exploits change over time (Singh et al., 2016).

∗ Corresponding author. E-mail address: bill.jung@cgu.edu (B. Jung).
https://doi.org/10.1016/j.cose.2022.102639
0167-4048/Published by Elsevier Ltd.
Currently, the CVSS requires the Security Operations Center (SOC) team of an organization to manually input the temporal metric value for each CVE. Such an approach is inefficient in managing vulnerabilities at a large scale. This leads to our first research question: how can Temporal scores of CVSS be calculated automatically based on existing data about CVEs?

The ability to automatically calculate temporal-enabled vulnerability scores alone is not sufficient to manage security risks. Traditional vulnerability management solutions often produce thousands of high-severity vulnerabilities using automated vulnerability scanners, such as Nessus (Figueira et al., 2020). The SOC team would then rely mostly on manual human labor to prioritize these vulnerabilities for risk management. This leads to confusing priorities and complicates remediation efforts (Zhang et al., 2020). It is also time-consuming, and the chances of human mistakes or oversights are high. A notorious example is the Equifax data breach incident, which was caused by an individual who failed to patch the vulnerability later exploited by the malicious actors (United States Government Accountability Office, 2018). Such a mistake could have been avoided if CVEs had been better prioritized so that vulnerabilities with the highest priorities were remediated promptly. Thus, our second research question investigates how to improve the existing CVSS for better vulnerability prioritization.

To answer the research questions, we design a context-aware vulnerability prioritization (CAVP) model to calculate temporal-enabled vulnerability scores of CVEs and prioritize these vulnerabilities. The CAVP model includes an enhanced CAVSS that automatically derives temporal metric values of CVEs through a set of expert-validated heuristic rules.

The rest of the paper is organized as follows.
Section 2 presents related work and Section 3 presents the research background, followed by the description of the proposed CAVP model in Section 4. Section 5 evaluates the CAVP model using case studies in two real-world organizations. Section 6 concludes the paper and discusses future research directions.

2. Related work

Early research on vulnerability management focused on how to quantify vulnerabilities and understand the exploitation and remediation of vulnerabilities at a large scale. Among these early studies, Frei et al. (2006) were the first to investigate the relationship between the time of the discovery of vulnerabilities and the time of the availability of security patches. Their study showed how exploits could change over time throughout the vulnerability life cycle, which includes discovery time, exploitation time, and patching time. Their study was extended by Shahzad et al. (2012) to include the functionality of the exploit, the access requirement for exploitation, the risk level of vulnerabilities, and software vendors and products. Both studies included the NVD as one of their data sources and highlighted the importance of contextualization in large-scale vulnerability analysis.

There are different security metrics (Nayak et al., 2014; Dobrovoljc et al., 2017) for vulnerability analysis and prioritization. Based on the count of vulnerabilities exploited in the wild and the exploit ratio, Nayak et al. (2014) proposed two security metrics: attack volume (i.e., the average number of attacks experienced) and exercised attack surface (i.e., the number of distinct exploited vulnerabilities). Economic-driven metrics were also proposed to measure potential economic damages and expected post-response costs (Ghani et al., 2013). Other security metrics for measuring software vulnerabilities include tendency-to-be-exploited, topology metrics (i.e., how the topological properties of attack graphs affect network attacks), and effort metrics (i.e., the degree of effort required by a defender to mitigate vulnerability exploits) (Pendleton et al., 2016). Among these different metrics, CVSS was created and improved over time (Mell et al., 2007; Scarfone and Mell, 2009) as an effort to establish a standard framework for vulnerability evaluation with an emphasis on prioritizing patching operations.

However, research has shown significant differences between the scores provided by experts and the CVSS base score (Holm and Afridi, 2015), largely due to the lack of information about the environment and the temporal characteristics of the vulnerability. Thus, many studies aimed to improve the existing CVSS metrics. For example, Fruhwirth and Mannisto (2009) proposed new time-aware metrics to measure Exploitability and Remediation Level. Using a sample of 720 CVEs from the NVD, their study found that adding context information significantly improved the prioritization and selection of the vulnerability response process. Singh and Joshi (2016) created a new Temporal metric by including two categories: availability of patches and maturity of exploit code (determined by the publication date of the vulnerability from the NVD). They then proposed a modified overall risk level metric as the sum of the CVSS score and an estimated frequency of the vulnerability. Munaiah and Meneely (2016) compared CVSS scores and vulnerability bounties, a closely related measure of security impact, and found that CVSS lacks explicit references to code execution and privilege escalation. A more recent study by Figueroa-Lorenzo et al. (2020) expanded CVSS to both IT and operational technology (OT) assets by adding other open data sources (e.g., Common Weakness Enumeration and Common Attack Pattern Enumeration and Classification) in addition to CVEs, though the CVSS scores were still calculated using the CVSS v3.1 metrics. Allodi and Massacci (2014) used a case-control study methodology by considering risk factors other than CVSS scores alone. None of these existing studies supported automatic Temporal score calculation.

Another stream of research focuses on different vulnerability prioritization methods, especially their contextualization. For example, Liu and Zhang (2011) designed the VRSS (Vulnerability Rating and Scoring System), which uses qualitative rating methods to separate vulnerability risk levels and then applies a quantitative scoring method to produce a numeric score. The VRSS was improved by adding a vulnerability type factor (VTF) to the Exploitability Sub-Score; the VTF was created using the Analytic Hierarchy Process (AHP) (Liu et al., 2012). In a related study, Huang et al. (2013) proposed an evaluation framework for vulnerability prioritization using CVSS metrics as the evaluation criteria. The framework utilized the Fuzzy Delphi method to filter the evaluation criteria and define their weights, and then prioritized the vulnerabilities through a fuzzy analytic hierarchy process. Another methodology, WIVSS (Weighted Impact Vulnerability Scoring System) (Spanos et al., 2013), ranks and scores vulnerabilities using the same factors as CVSS but assigns different weights to the impact metrics. Another study (Houmb et al., 2010) used a subset of attributes from the CVSS to estimate the frequency and impact of vulnerabilities and then derived risk levels from these estimates. All these approaches are time-consuming and cannot be automated within an organization's vulnerability assessment process.

Worth mentioning is related research that includes both vulnerability prioritization and mitigation in vulnerability management. For example, VULCON (VULnerability CONtrol) (Farris et al., 2018), a vulnerability management strategy, used two metrics: Time-to-Vulnerability Remediation (TVR) and Total Vulnerability Exposure (TVE). The first metric relates to the organization's mitigation effort and the second to un-remediated vulnerabilities. A recent study (Dissanayaka et al., 2020) added root cause analysis after vulnerability prioritization and then suggested available mitigation techniques. While we acknowledge that vulnerability mitigation is an important step in security risk management, it is out of the scope of this paper based on our research questions.

3. Background: common vulnerability scoring system

The Common Vulnerability Scoring System (CVSS) is an industry standard to quantitatively measure the severity of an asset's vulnerability and characterize the risk (Mell et al., 2007). CVSS was first introduced by Schiffman et al. (2004). It was improved to version 2.0 by Mell and Scarfone (2007) and revised again by Abraham et al. (2015) in 2015. This paper is based on CVSS v3.1, a recent version released in June 2019.

CVSS consists of three metric groups: Base, Temporal, and Environmental. The Base metric group identifies "fundamental characteristics of a vulnerability that are constant over time and user environment" (Mell et al., 2007). The CVSS Base Score (B) is computed as follows:

CVSS(B) = ISS × ESS (1)

where ISS (Impact Sub-Score) measures the impact the exploitation of the vulnerability has on the IT asset's Confidentiality, Integrity, and Availability (CIA), and ESS (Exploitability Sub-Score) measures the likelihood of exploit.

Table 1. Temporal metrics and related values.
Exploit Code Maturity (ECM). Metric values: Unproven (U), Proof-of-Concept (P), Functional (F), High (H), Not Defined (X). Metric values are at an increasing maturity level.
Remediation Level (RL). Metric values: Unavailable (U), Workaround (W), Temporary Fix (T), Official Fix (O), Not Defined (X). Metric values are at an increasing certainty of vendors' official remedies.
Report Confidence (RC). Metric values: Unknown (U), Reasonable (R), Confirmed (C), Not Defined (X). Metric values are the levels of the exploit's existence.

The Temporal metric group measures "the current state of exploit techniques or code availability, the existence of any patches
or workarounds, or the confidence that one has in the description of a vulnerability" (Abraham et al., 2015). The Temporal scores capture an exploit's evolution (Singh et al., 2016) or changes of a vulnerability over time (Sheehan et al., 2019). The group includes three sub-metric groups: Exploit Code Maturity (ECM), Remediation Level (RL), and Report Confidence (RC). Each sub-metric group has a set of metric values that define its score.

ECM reflects how mature the exploit code is in a vulnerability life cycle (Frei et al., 2006), or the maturity level of the exploit mechanism (Joh, 2019). If no exploit code is available or the exploit is purely theoretical, the ECM level is considered Unproven (U); if exploit code is available but not practical or requires modifications, the ECM level is considered Proof-of-Concept (P); if functional exploits exist, the ECM level is considered Functional (F); and if details about exploits are widely available and autonomous functional exploit code exists, the ECM level is considered High (H) (Mézešová and Bahsi, 2019).

RL captures the existence or effect of remediation (Yang et al., 2016). If remediation is not available, the RL value is considered Unavailable (U); if there is a workaround related to the exploit, the RL value is considered Workaround (W); if temporary remediation is available, the RL value is considered Temporary Fix (T); and if official remediation is available, the RL value is considered Official Fix (O).

RC indicates the confidence level in the existence of exploits. If the presence of a vulnerability is indicated but reports differ or are uncertain, the confidence level is considered Unknown (U); if exploits have been replicated and explained, the confidence level is considered Reasonable (R); and if exploits have independently verified software components, vendor confirmation, and replicability, the confidence level is considered Confirmed (C) (Mézešová and Bahsi, 2019).
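Numerically, each of these metric values maps to a multiplier, and the Temporal score is the Base score scaled by those multipliers. The following is a minimal sketch; the multiplier values and the Roundup convention come from the CVSS v3.1 specification (FIRST.org), not from this paper, and the function names are our own.

```python
import math

# Multipliers from the CVSS v3.1 specification (keyed by metric value).
EXPLOIT_CODE_MATURITY = {"X": 1.0, "H": 1.0, "F": 0.97, "P": 0.94, "U": 0.91}
REMEDIATION_LEVEL = {"X": 1.0, "U": 1.0, "W": 0.97, "T": 0.96, "O": 0.95}
REPORT_CONFIDENCE = {"X": 1.0, "C": 1.0, "R": 0.96, "U": 0.92}

def roundup(value: float) -> float:
    """Simplified CVSS Roundup: smallest one-decimal number >= value."""
    return math.ceil(value * 10) / 10

def temporal_score(base: float, ecm: str, rl: str, rc: str) -> float:
    """Temporal = Roundup(Base x ECM x RL x RC)."""
    return roundup(base
                   * EXPLOIT_CODE_MATURITY[ecm]
                   * REMEDIATION_LEVEL[rl]
                   * REPORT_CONFIDENCE[rc])

# A 9.8 base score with functional exploit code (F), an official fix (O),
# and confirmed reports (C): 9.8 * 0.97 * 0.95 * 1.0 = 9.03..., rounded up.
print(temporal_score(9.8, "F", "O", "C"))  # -> 9.1
```

Note that a fully undefined temporal vector (X, X, X) leaves the Base score unchanged, which is why Base-only scoring silently ignores the temporal dimension.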
Table 1 depicts the metric values for each Temporal metric subgroup (FIRST.org, 2018). The CVSS Temporal score (T) is calculated as:

CVSS(T) = B × ECM × RL × RC

The Environmental metric group "represents the characteristics of a vulnerability that are relevant and unique to a particular user's environment" (FIRST.org, 2018). It allows the SOC team to define the exploit's impact for an IT asset based on its security requirements: CR (Confidentiality Requirement), IR (Integrity Requirement), and AR (Availability Requirement). That is, if Availability is most important for an IT asset, a greater value is assigned to AR. Additionally, the SOC team can define environmental impact modifiers for an IT asset as MC (Modified Confidentiality), MI (Modified Integrity), and MA (Modified Availability). This allows the SOC team to make a qualitative judgment on the environmental metric values based on the mitigation measures in place. Once environmental impact requirements and modifiers are defined, the CVSS Environmental score (E) can be calculated using the formula provided by the NVD (National Institute of Standards and Technology (NIST), 2021). Lastly, the CVSS calculates an Overall Score (O) as:

CVSS(O) = MIN(B, T, E) (2)

The current CVSS has two limitations. First, it does not provide an automatic Temporal score (T) calculation; organizations need to provide the values for each Temporal sub-metric group manually. This is not scalable for security risk management at a large scale (Stellios et al., 2021). Second, the CVSS does not support context-aware vulnerability prioritization. Such prioritization is important because it allows the organization to turn CVEs into prioritized risks based on the Temporal and Environmental characteristics of its IT assets. Below, we describe the CAVP model designed to address these limitations.

4. Proposed methodology: context-aware vulnerability prioritization model

To facilitate better prioritization in managing vulnerabilities, we propose a context-aware vulnerability prioritization (CAVP) model. The model has two key design considerations. First, it improves existing CVSS metrics by incorporating temporal characteristics of vulnerabilities. Second, it provides a step-by-step process of vulnerability prioritization that can be integrated within the organization's risk management workflow. The CAVP model consists of five steps, as shown in Fig. 1. We explain each step as follows.

Fig 1. CAVP (context-aware vulnerability prioritization) model.

4.1. Step 1: CVE database construction and analysis

The first step is to construct a database for CVEs, starting with an initial load of known vulnerabilities and related attributes. Because new CVEs are added daily, the CVE database should include a daily batch load to capture new CVEs as they are added, using the API (Byers and Owen, 2019) provided by the NVD.

The NVD data feed includes useful information that can be used to calculate temporal scores. First, the NVD original data feed (in JSON format) includes an object called references. The references object has two keys that are used to capture contextual information about a CVE: url and tags (see Fig. 2).

Fig 2. An example of the references field in NVD JSON data.

The url key is paired with a Uniform Resource Locator (URL) to a referenced website related to vulnerabilities, such as vendor websites, security bulletins, media coverage, and so forth. The referenced URLs can be used to infer the credibility and authority of the exploit references, an important factor in deriving values for CVSS temporal metrics. A useful first step is to analyze the frequency distribution of these URLs. Table 2 lists the top 15 referenced URLs based on 75,690 CVEs from 2016 to 2020 in the NVD database.

Table 2. Frequency distribution of URLs.
URL: Frequency
securityfocus.com: 20,591
github.com: 18,600
redhat.com: 12,751
securitytracker.com: 10,847
debian.org: 9362
opensuse.org: 7737
apple.com: 6837
ubuntu.com: 6582
oracle.com: 5591
gentoo.org: 4973
fedoraproject.org: 4820
apache.org: 4332
microsoft.com: 4182
openwall.com: 3648
exploit-db.com: 3319

Within the list, most URLs are from IT vendors such as Opensuse, GitHub (a cloud code repository), Red Hat, Oracle, and Apple. These are CNAs, i.e., organizations authorized to assign CVE IDs to vulnerabilities. According to The MiTRE Corporation (2019), 161 organizations were participating as CNAs as of March 30, 2021. These organizations are classified into six CNA types (The MiTRE Corporation, 2019), one of which is "Vendors and Projects". This CNA type "assigns CVE IDs for vulnerabilities found in their own products and projects" (The MiTRE Corporation, 2021). This means that, for this CNA type, software vendors have checked the reported vulnerability, assessed its legitimacy, and acknowledged the vulnerability by assigning a unique CVE ID. Thus, we argue that, if a URL link is from a CNA of the type "Vendors and Projects", it is a more authoritative source.

The URL list also includes three referenced vulnerability databases, namely Securityfocus (an online computer security news portal and purveyor of information security services), Securitytracker (a web portal that tracks the latest security vulnerabilities), and Exploit-db (a CVE-compliant archive of public exploits and corresponding vulnerable software). These vulnerability databases have been cited often in the literature as representative websites for obtaining the vulnerability reports linked by CVEs (Adebiyi et al., 2013; Mu et al., 2018; Dong et al., 2019). Thus, we consider these vulnerability databases more authoritative references than other non-vendor websites.

The tags key is paired with values related to different categories of the referenced resources, such as vendor advisory, third party advisory, technical description, VDB (Vulnerability Database) entry, mitigation, exploit, and patch. They are also called resources on each CVE detail web page of the NVD. Similarly, an exploratory analysis of the most frequently used tags was performed. Table 3 lists the frequency distribution of tags using the same 2016 to 2020 NVD data. Note that a CVE might contain one or more tags, as shown in Fig. 2. The frequency distribution shows that the "Third Party Advisory", "Vendor Advisory", and "VDB Entry" tags comprised more than half of the total tags.

Table 3. Frequency distribution of tags.
Tag: Frequency
Third Party Advisory: 95,551
Vendor Advisory: 57,150
VDB Entry: 31,162
Patch: 28,250
Exploit: 21,052
Mailing List: 11,924
Issue Tracking: 10,523
Release Notes: 6117
Permissions Required: 2027
US Government Resource: 1972
Mitigation: 1908
Product: 1375
Technical Description: 1168
Broken Link: 1065
Not Applicable: 330
Press/Media Coverage: 98
Tool Signature: 24

4.2. Step 2: environmental vulnerability scan

This step concerns an environmental vulnerability scan that matches the known CVEs in the CVE database against the IT assets within an organization. Because there are thousands of CVEs relating to thousands of vendor products, this step helps filter the CVEs to only those relevant to the organization's technological environment.
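The matching of known CVEs against an organization's assets can be sketched as a filter over platform identifiers. The snippet below uses CPE 2.3-style prefixes, as found in the NVD configurations data; the asset inventory, CVE IDs, and helper name are our own hypothetical illustration, not from the paper.

```python
# Each CVE lists affected platforms as CPE 2.3 prefixes (shape borrowed
# from the NVD feed); the CVE IDs and the asset inventory are invented.
cve_affected = {
    "CVE-0000-0001": ["cpe:2.3:o:microsoft:windows_server_2019"],
    "CVE-0000-0002": ["cpe:2.3:a:apache:http_server"],
    "CVE-0000-0003": ["cpe:2.3:o:canonical:ubuntu_linux"],
}

assets = {
    "web-01": ["cpe:2.3:a:apache:http_server:2.4.49"],
    "dc-01": ["cpe:2.3:o:microsoft:windows_server_2019:10.0"],
}

def environmental_scan(cve_affected, assets):
    """Return {asset: [relevant CVE ids]} via CPE prefix matching."""
    hits = {}
    for asset, cpes in assets.items():
        relevant = [cve for cve, prefixes in cve_affected.items()
                    if any(cpe.startswith(p) for cpe in cpes for p in prefixes)]
        if relevant:
            hits[asset] = relevant
    return hits

print(environmental_scan(cve_affected, assets))
# -> {'web-01': ['CVE-0000-0002'], 'dc-01': ['CVE-0000-0001']}
```

Production scanners such as OpenVAS perform this matching with version-range logic rather than simple prefix tests, but the filtering effect, reducing thousands of CVEs to those present in the environment, is the same.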
The potential risk of a vulnerability depends not only on its base metric attributes, but also on the security requirements of the affected system and the distribution and collateral damage potential associated with the vulnerability (Houmb et al., 2010). Thus, an important initial step for the SOC team is to rate each IT asset for its security requirements (i.e., CR, IR, and AR) based on its business criticality. The SOC team should also rate each IT asset for its potential impact on the base metric attributes based on mitigation measures that are currently in place. For example, a server protected by a firewall is less likely to be exploited, and hence the potential impact of its exploit is lower. Once the security requirements of the IT assets are assigned, the environmental vulnerability scan can be performed weekly or monthly, depending on the organization's vulnerability assessment requirements. For each IT asset that is affected by a CVE, an overall environmental score E can be calculated as described in Section 3.

4.3. Step 3: context-aware vulnerability scoring system (CAVSS)

In this step, we present a context-aware vulnerability scoring system that improves the existing CVSS through an expert-validated, heuristic rule-based approach for automatically deriving temporal metric values.

As discussed earlier, the three categories of temporal metrics are Exploit Code Maturity (ECM), Remediation Level (RL), and Report Confidence (RC). Currently, there are no defined measures for these temporal metrics. Thus, we first developed a set of heuristic rules to define the different temporal metric values through rounds of internal discussions with security risk management experts and practitioners. An initial version of the rules was then presented to a group of four cybersecurity experts, and their feedback was incorporated. The process resulted in the final set of heuristic rules shown in Table 4. In general, the rules apply more stringent criteria to the higher metric values in each metric group. We elaborate below on the justifications for these heuristic rules and how they are used to calculate the overall context-aware vulnerability score.

Table 4. Heuristic rules for defining temporal metrics.
Exploit Code Maturity (ECM):
Not Defined: not used.
High: a CNA (Vendors and Projects) link & Vendor Advisory tag & Third Party Advisory tag.
Functional: Third Party Advisory tag & ECM value is NOT High.
Proof of Concept: (VDB Entry tag OR US Government Resource tag OR URL contains SecurityFocus, SecurityTracker, or Exploit-Db) & ECM value is NOT High & ECM value is NOT Functional.
Unproven: does not satisfy any rule above (default).
Remediation Level (RL):
Unavailable: not used.
Official Fix: a CNA (Vendors and Projects) link & Patch tag.
Temporary Fix: excluded.
Workaround: (Patch tag OR Mitigation tag) & RL value is NOT Official Fix.
Not Defined: does not satisfy any rule above (default).
Report Confidence (RC):
Not Defined: not used.
Confirmed: a CNA (Vendors and Projects) link & Vendor Advisory tag.
Reasonable: (VDB Entry tag OR Issue Tracking tag) & RC value is NOT Confirmed.
Unknown: does not satisfy any rule above (default).

4.3.1. Exploit code maturity (ECM)

To measure the ECM, we first considered the types of URL links. As discussed earlier, a URL link from a CNA of the type "Vendors and Projects" is considered the most authoritative (Jøsang et al., 2015). Thus, all links from the "Vendors and Projects" CNA type are flagged in the database, and this flag is used to derive the High ECM metric value. Additionally, we reason that an exploit with both the "Third Party Advisory" and "Vendor Advisory" tags is the most mature. Thus, the rule for High ECM is defined as those CVEs that reference "Vendors and Projects" links and also have both the "Third Party Advisory" and "Vendor Advisory" tags.

The Functional ECM rule excludes any CVEs that satisfy the High ECM rule. It also removes the requirement for the CNA type and only includes those CVEs with the "Third Party Advisory" tag. The Proof of Concept ECM rule excludes any CVEs that are High or Functional. Additionally, it includes the CVEs that reference any of the three vulnerability databases discussed earlier or have either a "VDB Entry" tag or a "US Government Resource" tag. Lastly, a CVE is assigned the "Unproven" metric value by default if it does not satisfy any of the three rules described above.

4.3.2. Remediation level (RL)

Because a software vendor or a project (i.e., a "Vendors and Projects" CNA) would assign a CVE ID for a new vulnerability and start to develop official remediation (Almukaynizi et al., 2017), an exploit with a CNA link from the "Vendors and Projects" type is considered to have a higher remediation level. Additionally, if the exploit's reference links indicate that the vulnerability was patched (i.e., with a "Patch" tag), it is an indication of official remediation. Thus, the rule for Official Fix, the highest RL metric value, is defined as having both a "Vendors and Projects" CNA link and a "Patch" tag.

The metric value Workaround means that there is no official fix for the exploit, but vendors or experts have started some efforts to mitigate the vulnerability, such as coding changes or specification changes. Thus, the RL rule for "Workaround" includes CVEs that have a "Mitigation" tag or a "Patch" tag but do not satisfy the Official Fix rule. The metric value "Temporary Fix" is excluded from the ruleset because publicly available CVE data feeds do not differentiate between an Official Fix and a Temporary Fix; only software vendors would be able to provide such a differentiation. Similarly, the "Unavailable" metric value was not used. Instead, any CVE that does not satisfy the rules for Official Fix or Workaround is assigned a default metric value of Not Defined.

4.3.3. Report confidence (RC)

For the RC, we reason that if an exploit has a CNA link from the "Vendors and Projects" type and also a "Vendor Advisory" tag, the software vendor has confirmed the exploit and provided some advice about it. Thus, the RC rule for "Confirmed" is defined as those CVEs that reference "Vendors and Projects" links and also have the "Vendor Advisory" tag. For an exploit's existence to be considered reasonable, it should have been recorded in some vulnerability database (i.e., have a VDB Entry) or its related issues should have been tracked. Thus, the RC "Reasonable" rule includes those CVEs with a "VDB Entry" tag or an "Issue Tracking" tag that are not "Confirmed". Lastly, any CVE that does not satisfy the RC rules for "Confirmed" or "Reasonable" is assigned a default metric value of "Unknown"; the "Not Defined" metric value was not used.

4.3.4. Calculate the CAVSS score

Based on these defined temporal metric values, a temporal score can be calculated from the data in the CVE database using the equation below:

CAVSS(T) = B × ECM × RL × RC (3)
Case study A B Research site A local government agency vulnerability scanner IT assets scanned OpenVAS Desktops (250+) Windows server (3) Networked printers (5) 45 3 A higher education institution OpenVAS Public-facing web services Unique CVEs Interviews B’s SOC team were assigned to facilitate the model implementation and evaluation. Table 5 lists details of the two case studies. For each study, the following steps were performed following the CAVP model. In Step 1, a CVE database was created from an initial load of known vulnerabilities for each organization, followed by an initial analysis of URLs and tags. In Step 2, an environmental vulnerability scan was performed on the IT assets for each organization. While we were able to apply the CAVP model to two real-world organizations, the scope of each case study was limited. Organization A provided a test environment for the scan and Organization B provided all the publicfacing web services. Because only limited IT assets were scanned, 45 and 10 unique CVEs were found within the provided environment of Organization A and Organization B, respectively. In Step 3, each CVE was analyzed to obtain a CVASS score. 35 out of 45 CVEs and six out of 10 CVEs had CVSS v3 base scores for Organization A and Organization B respectively. For organization A, the base score ranged from 3.7 to 9.8, the temporal score ranged from 3.6 to 9.4, and the environmental score ranged from 4.2 to 9.4. When comparing the overall CVASS scores to the based CVSS score, 14% (5 CVEs) remained unchanged, 35% (12 CVEs) had a decrease, and 51% (18 CVEs) had an increase. Of these CVEs with increased overall CVASS scores, 5 had a severity escalation – 2 from High to Critical and 3 from Medium to High. Additionally, 4 out of 5 CVEs with Critical base scores remained as Critical. Thus, 9 out of 45 CVEs, or 20% of the CVEs, were highlighted for future investigation and mitigation by the SCO team. 
For organization B, a relatively small number of CVEs were identified, largely because the scanned IT assets were limited to pubic-facing web services. The base score ranged from 0 to 7.5, the temporal score ranged from 0 to 6.5, and the environmental score ranged from 0 to 8. After comparing the overall CVASS scores with the based CVSS score, we found that 50% (5 CVEs) remained unchanged or had a decrease, and 50% (5 CVEs) had an increase. Of these CVEs with increased overall CVASS scores, 2 had a severity escalation – one from Medium to High and one from Low to Medium. Additionally, one CVE with a High base score remained as High. Thus, 3 out of 10 CVEs, or 30% of the CVEs, were highlighted for future investigation and mitigation. By applying the CAVP model, only 20% to 30% of the known CVEs were prioritized for further investigation. This would allow the SOC teams to better allocate limited resources for mitigating these high-risk CVEs. In Step 4, a visualization tool was developed and deployed at both organizations. It included three tabs (see Fig. 4). The Main Tab displays two pie charts showing the base and the overall CAVSS score summaries and a table below showing details of each CVE. The CVE Reference Tab displays a user-selected CVE description and related reference information. If the selected CVE has a severity elevation, the tab will describe the reason for the severity elevation. The Assets tab displays the imported environmental scan result. The visualization tool allowed the SOC team to not only review the summary statistics of all CVEs but also drill down details Fig 3. An example of severity elevation description. An overall CAVSS score is then calculated using the formula below: CAV SS (O ) = MIN (B, T , E ) 10 1 (4) Same as the base score, the overall CAVSS score is then normalized into five levels: Critical, High, Medium, Low, and Not Available. 4.4. 
Step 4: vulnerability prioritization visualization

After the vulnerability scores are improved via CAVSS, the decision makers need to see prioritized vulnerability information reflecting the evolving exploit characteristics and the contextual information pertaining to their organization. This step is important for the SOC team to prioritize the resources needed to mitigate these vulnerabilities. The vulnerability score visualization should display both the temporal and environmental scores, as well as the overall score, for each CVE from the environmental scan. More advanced visualization techniques can be applied to further assist the decision makers. For example, the vulnerability visualization can include mashups of summary tables and charts, where the user can interactively filter, slice, and drill down into the needed information. Additionally, the visualization should include hyperlinks to the NVD CVE information or other reference links. In this way, the decision makers can quickly review the relevant CVE information rather than searching for each item individually. Last but not least, the visualization should help identify a severity elevation (e.g., a change of severity level from High to Critical) and explain the reasons for such an elevation (see Fig. 3 for an example).

5. Case study evaluation

To evaluate the proposed CAVP model, two case studies were conducted in two different organizations. Organization A was a local government agency, and Organization B was a higher education institution. Three security analysts from Organization A's SOC team and one vulnerability management analyst from Organization B's SOC team participated in the evaluation.
The informants (i.e., the security analysts from Organization A and the vulnerability management analyst from Organization B) were debriefed about the CAVP model and then given sufficient time to use the CAVSS visualization tool. We then conducted semi-structured interviews with these analysts regarding the usefulness of the CAVP model and the enhanced CAVSS scores for vulnerability assessment and security incident responses. The findings are detailed below.

5.1. Usefulness of the CAVP model

All interviewees indicated that the CAVP model is useful in providing a vulnerability reference point with reliable information sources to mitigate vulnerabilities. For example, Participant One (Organization A) stated:

It helps identify. First, you have links to the NIST websites to really help understand what the vulnerability is so that we can do a little bit of research and come up with certain standards, procedures, and protocols.

Similarly, Participant Two (Organization A) stated:

Seeing the CVE ID with a link to the description and solution to mitigate the vulnerability, I will be able to quickly forward my query to the appropriate group that manages the server or device and give them the info provided by the Vulnerability Management Console so they can quickly patch the device.

Participant Four (Organization B) felt the same and also indicated that it was an improvement over their existing tool (OpenVAS):

This … fill[s] out a lot of gaps if you are using a vulnerability scanner that doesn't provide much detailed information. … It lets me look at the actually published information you know about the vulnerabilities so they can do research specific to it and try to either mitigate or fix it.

Participants especially appreciated the hyperlinks embedded in the visualization tool that point to the latest vulnerability information from trusted sources to help them identify potential remedies.

5.2. Usefulness of the CAVSS scores

All participants considered that the CAVSS scores added more contextualization to the CVSS Base scores and enabled them to see more vulnerability factors for assessing risks accurately. For example, Participant Three (Organization A) stated:

Although the Base Score gives me an overall picture of the department's state in patches and updates, E.S [environmental score] and temporal [score] pinpoints the critical devices (or what I deem critical) that pose a bigger threat to the department.

Participants One and Two added the importance of having vulnerability scores about the organization's environment, as stated below:

There are a couple of 'high' base severities switched to critical in the overall severity. And then, there are a couple of mediums that switched to high. Also, there are mediums and stayed as mediums. But, if I want to look at what's critical, then potentially I may not identify all the vulnerabilities because eight vulnerabilities would be missed. It identifies what assets to be aware of.

Participant Four considered the Temporal score particularly important in helping the risk assessment:

The added benefit of that being is that it takes into [account] your changes [added Temporal scores] to the score to help us better determine the actual risks… It's [adding the Temporal and Overall scores] definitely helpful in a way that I can get a more accurate result as to what is the actual risk score, or scores assigned to that vulnerability.

5.3. Usefulness of the CAVSS visualization tool

Compared with existing vulnerability assessment tools, all participants highlighted the effectiveness of the visualization tool in prioritizing vulnerabilities and providing useful mitigation information. For example, Participant Four stated:

Convenient visualization board in the dashboard format, charts, displaying already prioritized vulnerabilities in percentage correlation.

He elaborated on the tool's improvement on the mitigatory aspect:

Also, it's useful in helping with Remediation, as it extracts good information about vulnerability from database and shows possible Remediation 'Solution' and 'Summary' Recommendation how to fix vulnerability.

Fig. 4. A snippet of the Main tab of the CAVSS visualization tool.

Reviewing the details of each CVE is useful when the team plans for mitigation strategies. The ability to review the reasons for a severity escalation is also important for determining the time and approach towards vulnerability remediation.

5.4. Other suggestions

Participants also provided some additional suggestions to improve the CAVP model and the CAVSS scoring system. One suggestion was to include exception handling for contextual vulnerabilities where the security experts from the SOC team know the strategies to mitigate these risks, as stated by Participant Four below:

But because they run an application that you know maybe only works on that [version] of the operating system. We know that we have to run it and accept the risk so we mitigate it another way.

The exception handling can be added through expert input, as indicated by the same participant:

So, it did have no way of knowing that unless I can provide the data to it.

Another suggestion is to enable the integration of the CAVSS scoring system within the organization's IT infrastructure, as commented by Participant One:

If this [the vulnerability score tool] was connected to our inventory, then we have better ideas where it [the vulnerability] is. Using this tool, if I were to remediate, it would be nice to know that the tool is connected to our inventory to know exactly where it is, like within the building, opposed to roundabout this IP address is being pulled off this building.
But, if it's connected to our inventory (system), it would be a comprehensive tool that, one, we identified it, then we can mark it.

References

Abraham, R., et al., "Common vulnerability scoring system v3.0: specification document." Forum of Incident Response and Security Teams, Jun. 2015. Accessed: Apr. 05, 2021. [Online]. Available: https://www.first.org/cvss/v3.0/cvss-v30-specification_v1.9.pdf. Adebiyi, A., Arreymbi, J., Imafidon, C., 2013. A neural network based security tool for analyzing software. In: Proceedings of the Doctoral Conference on Computing, Electrical and Industrial Systems, pp. 80–87. Allodi, L., Massacci, F., 2014. Comparing vulnerability severity and exploits using case-control studies. ACM Trans. Inf. Syst. Secur. (TISSEC) 17 (1), 1. Almukaynizi, M., Grimm, A., Nunes, E., Shakarian, J., Shakarian, P., 2017. Predicting cyber threats through the dynamics of user connectivity in darkweb and deepweb forums. ACM Comput. Soc. Sci. Byers, B. and Owen, H., "Automation support for CVE retrieval," p. 21, Sep. 2019. Dissanayaka, A.M., Mengel, S., Gittner, L., Khan, H., 2020. Vulnerability prioritization, root cause analysis, and mitigation of secure data analytic framework implemented with mongodb on singularity linux containers. In: Proceedings of the 4th International Conference on Compute and Data Analysis, pp. 58–66. Dobrovoljc, A., Trcek, D., Likar, B., 2017. Predicting exploitations of information systems vulnerabilities through attackers' characteristics. IEEE Access 5, 26063–26075. doi:10.1109/ACCESS.2017.2769063. Dong, Y., Guo, W., Chen, Y., Xing, X., Zhang, Y., Wang, G. "Towards the detection of inconsistencies in public security vulnerability reports," in 28th USENIX Security Symposium (USENIX Security 19), 2019, pp. 869–885. Farris, K.A., Shah, A., Cybenko, G., Ganesan, R., Jajodia, S., 2018. VULCON: a system for vulnerability prioritization, mitigation, and management. ACM Trans. Priv. Secur. (TOPS) 21 (4), 16.
Figueira, P.T., Bravo, C.L., López, J.L.R., 2020. Improving information security risk analysis by including threat-occurrence predictive models. Comput. Secur. 88, 101609. Figueroa-Lorenzo, S., Añorga, J., Arrizabalaga, S., 2020. A survey of IIoT protocols: a measure of vulnerability risk analysis based on CVSS. ACM Comput. Surv. 53 (2), 1–53. doi:10.1145/3381038. FIRST.org, Inc., "CVSS v3.0 specification document," FIRST — Forum of Incident Response and Security Teams, 2018. https://www.first.org/cvss/v3.0/specification-document (accessed Jul. 29, 2018). Frei, S., May, M., Fiedler, U., Plattner, B., 2006. Large-scale vulnerability analysis. In: Proceedings of the SIGCOMM Workshop on Large-Scale Attack Defense, pp. 131–138. Fruhwirth, C., Mannisto, T., 2009. Improving CVSS-based vulnerability prioritization and response with context information. In: Proceedings of the 3rd International Symposium on Empirical Software Engineering and Measurement, pp. 535–544. Ghani, H., Luna, J., Suri, N., 2013. Quantitative assessment of software vulnerabilities based on economic-driven security metrics. In: Proceedings of the International Conference on Risks and Security of Internet and Systems (CRiSIS), pp. 1–8. Holm, H., Afridi, K.K., 2015. An expert-based investigation of the common vulnerability scoring system. Comput. Secur. 53, 18–30. Houmb, S.H., Franqueira, V.N., Engum, E.A., 2010. Quantifying security risk level from CVSS estimates of frequency and impact. J. Syst. Softw. 83 (9), 1622–1634. Huang, C.C., Lin, F.Y., Lin, F.Y.S., Sun, Y.S., 2013. A novel approach to evaluate software vulnerability prioritization. J. Syst. Softw. 86 (11), 2822–2840. doi:10.1016/j.jss.2013.06.040. Jøsang, A., Ødegaard, M., Oftedal, E., 2015. Cybersecurity through secure software development. In: Proceedings of the IFIP World Conference on Information Security Education, pp. 53–63. Joh, H., 2019. Software risk assessment for windows operating systems with respect to CVSS. Eur. J. Eng. Res.
Sci. 4 (11), 41–45. Li, Y., Dai, W., Bai, J., Gan, X., Wang, J., Wang, X., 2019. An intelligence-driven security-aware defense mechanism for advanced persistent threats. IEEE Trans. Inf. Forensics Secur. 14 (3), 646–661. Liu, Q., Zhang, Y., 2011. VRSS: a new system for rating and scoring vulnerabilities. Comput. Commun. 34 (3), 264–273. Liu, Q., Zhang, Y., Kong, Y., Wu, Q., 2012. Improving VRSS-based vulnerability prioritization using analytic hierarchy process. J. Syst. Softw. 85 (8), 1699–1708. doi:10.1016/j.jss.2012.03.057. Mézešová, T., Bahsi, H., 2019. Expert knowledge elicitation for skill level categorization of attack paths. In: Proceedings of the International Conference on Cyber Security and Protection of Digital Services (Cyber Security), pp. 1–8. Mell, P., Scarfone, K., 2007. Improving the common vulnerability scoring system. IET Inf. Secur. 1 (3), 119–127. Mell, P., Scarfone, K., Romanosky, S. "A complete guide to the common vulnerability scoring system version 2.0," p. 23, Jun. 2007. Mu, D., et al., 2018. Understanding the reproducibility of crowd-reported security vulnerabilities. In: Proceedings of the 27th USENIX Security Symposium (USENIX Security 18), pp. 919–936. Munaiah, N., Meneely, A., 2016. Vulnerability severity scoring and bounties: why the disconnect? Proc. Int. Workshop Softw. Anal. 8–14. National Institute of Standards and Technology (NIST), "NVD - data feeds," n.d. https://nvd.nist.gov/vuln/data-feeds (accessed Apr. 23, 2021). National Institute of Standards and Technology (NIST), "NVD - CVSS v3 calculator," Common Vulnerability Scoring System Calculator Version 3, n.d. https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator (accessed Sep. 17, 2021). Nayak, K., Marino, D., Efstathopoulos, P., Dumitraş, T., 2014. Some vulnerabilities are different than others. Int. Workshop Recent Adv. Intrusion Detect. 426–446.
Future research may investigate how to add the locations of the CVE assets within the organization's IT infrastructure when constructing the initial CVE database.

6. Conclusion

This paper presents a context-aware vulnerability prioritization (CAVP) model that can be integrated within an organization's risk management workflow. The CAVP model was implemented and evaluated in two different organizations. The evaluation results showed that the CAVP model was useful for vulnerability prioritization, especially with its result visualization. Additionally, the CAVP model includes a new context-aware scoring system that adds useful contextualization to the CVSS Base scores. The contribution of this research is two-fold. First, the CAVP model is the first attempt to provide a step-by-step process of vulnerability prioritization. It can be easily adopted by researchers and practitioners in vulnerability management. Second, the CAVSS enhances the existing CVSS by automatically calculating temporal metric values using a set of expert-validated heuristic rules. This research also has several limitations. First, the Temporal scores calculated by the CAVSS were not validated because there are currently no gold standards against which to compare different time-aware metrics. Future research may investigate how to better compare and validate Temporal scores. Second, as the findings from the case studies suggested, the SOC team has implicit knowledge about environmental vulnerability. Such knowledge is not captured by the CAVP model; instead, the model requires the SOC team to manually assign the vulnerability sub-metric values to each IT asset. Future research may include an additional feedback step so that the SOC team's knowledge can be learned. Third, our case study evaluation was mostly qualitative, drawing on one source of information: interviews.
We were not able to perform triangulation by incorporating other data sources, because research investigating automatic, context-aware, temporal-enabled vulnerability prioritization is scarce. While the CAVSS scores were compared with the CVSS v3 base scores, we were not able to compare them with other existing vulnerability scoring systems, such as those reviewed in the related work section. For example, WIVSS was based on CVSS v2, while our CAVSS is based on CVSS v3. From CVSS v2 to v3, two new metrics were added, and additional allowed values were introduced for some existing metrics. Thus, CAVSS and WIVSS scores could not be directly compared. We hope our research will spur more similar studies so that future research may include a more rigorous evaluation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

CRediT authorship contribution statement

Bill Jung: Conceptualization, Methodology, Data curation, Software, Formal analysis, Writing – original draft, Visualization. Yan Li: Conceptualization, Methodology, Validation, Writing – original draft, Writing – review & editing, Visualization, Supervision. Tamir Bechor: Conceptualization, Supervision.

Pendleton, M., Garcia-Lebron, R., Cho, J.H., Xu, S., 2016. A survey on systems security metrics. ACM Comput. Surv. (CSUR) 49 (4), 1–35. Scarfone, K., Mell, P., 2009. An analysis of CVSS version 2 vulnerability scoring. In: Proceedings of the 3rd International Symposium on Empirical Software Engineering and Measurement. Lake Buena Vista, FL, USA, pp. 516–525. doi:10.1109/ESEM.2009.5314220.
Schiffman, M., Wright, A., Ahmad, D., and Eschelbeck, G., "The common vulnerability scoring system," National Infrastructure Advisory Council, Vulnerability Disclosure Working Group, Vulnerability Scoring Subgroup, 2004. Shahzad, M., Shafiq, M.Z., Liu, A.X., 2012. A large scale exploratory analysis of software vulnerability life cycles. In: Proceedings of the 34th International Conference on Software Engineering (ICSE), pp. 771–781. Sheehan, B., Murphy, F., Mullins, M., Ryan, C., 2019. Connected and autonomous vehicles: a cyber-risk classification framework. Transp. Res. Part A Policy Pract. 124, 523–536. Singh, U.K., Joshi, C., 2016. Quantitative security risk evaluation using CVSS metrics by estimation of frequency and maturity of exploit. In: Proceedings of the World Congress on Engineering and Computer Science, 1, pp. 19–21. Singh, U.K., Joshi, C., Gaud, N., 2016. Information security assessment by quantifying risk level of network vulnerabilities. Int. J. Comput. Appl. 156 (2), 37–44. Spanos, G., Sioziou, A., Angelis, L., 2013. WIVSS: a new methodology for scoring information systems vulnerabilities. In: Proceedings of the 17th Panhellenic Conference on Informatics, pp. 83–90. Stellios, I., Kotzanikolaou, P., Grigoriadis, C., 2021. Assessing IoT enabled cyber-physical attack paths against critical systems. Comput. Secur., 102316. The MITRE Corporation, "CVE - CVE numbering authorities," Aug. 29, 2019. https://cve.mitre.org/cve/cna.html (accessed Mar. 03, 2021). The MITRE Corporation, "CVE - request CVE IDs," Apr. 23, 2021. https://cve.mitre.org/cve/request_id.html#key_cna_roles_and_types (accessed Apr. 23, 2021). Torkura, K.A., Sukmana, M.I., Cheng, F., Meinel, C., 2021. Continuous auditing and threat detection in multi-cloud infrastructure. Comput. Secur. 102, 102124. United States Government Accountability Office, "Data protection: actions taken by Equifax and federal agencies in response to the 2017 Breach." Aug. 30, 2018. Accessed: Apr. 23, 2021.
[Online]. Available: https://www.warren.senate.gov/imo/media/doc/2018.09.06%20GAO%20Equifax%20report.pdf. Vakilinia, I., Sengupta, S., 2020. Vulnerability market as a public-good auction with privacy preservation. Comput. Secur. 93, 101807. Yang, G., et al., 2016. An assessment method of vulnerabilities in electric CPS cyber space. In: Proceedings of the 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), pp. 397–402. Zhang, F., Huff, P., McClanahan, K., Li, Q., 2020. A machine learning-based approach for automated vulnerability remediation analysis. IEEE Conf. Commun. Netw. Secur. (CNS) 1–9. doi:10.1109/CNS48642.2020.9162309.

Bill Jung is a software engineer working in the Los Angeles metropolitan area in the public sector and teaches Information Technology at California State University, Fullerton. He received M.S. degrees in software engineering and information technology from California State University, Fullerton in 2010 and 2013, and then received the M.S. and Ph.D. degrees in information systems and technology from Claremont Graduate University in 2020. His main research interests include vulnerability prioritization, risk management, and cybersecurity.

Yan Li is an Associate Professor in the Center for Information Systems and Technology at Claremont Graduate University (CGU). Driven by her intellectual curiosity for data and emergent information technologies, and her passion for designing and building things, she has oriented her career in a direction that integrates research, teaching, and practice in the realm of information science. Her research focuses on data and knowledge management areas such as machine learning, natural language processing, data warehousing, and semantic technologies, with an emphasis on exploring the synergies between information systems and data analytics.
Her other research stream focuses on developing Information and Communication Technology (ICT) solutions for underserved populations in low-resource areas and on improving social inclusion in health care. Prior to joining CISAT, she was a data scientist in industry with hands-on experience in advanced analytics, machine learning, and big data platforms. She is a member of AIS and serves as an editorial board member of Information Technology for Development. You can contact Yan by email: Yan.Li@cgu.edu.

Tamir Bechor is a clinical associate professor in Claremont Graduate University's Center for Information Systems and Technology (CISAT). Bechor's interests focus on the competencies and practices needed to unlock and strengthen managers' abilities to deliver desired business value through the use of disruptive digital technologies. Bechor examines the nature of the cyber-threats that shape the digital economy and offers strategies by which companies build digital trust. Bechor is a co-founder of the automotive cyber security company CYMOTIVE Technologies, which develops the next generation of cyber security solutions for connected and autonomous vehicles.