Instituto Superior Técnico
Network Science, 1st semester 2023/2024

Identification of Efficient Spreaders in Complex Networks

António Coelho, no. 95535
Cristi Savin, no. 95549
Duarte Almeida, no. 95565

Abstract

Spreading processes are a crucial aspect of several domains. However, identifying the most influential spreaders without testing every node remains an open challenge. Kitsak et al. [1] showed that metrics other than the degree or betweenness of a node can yield better predictions. Since then, the number of publications on the subject has grown massively. In this work, we provide a systematic survey of methods designed to discriminate influential nodes. We also compare these methods on public real-world datasets, which, to our knowledge, has not been done to this extent. The results show that measures that simultaneously consider both k-shell and node degree achieve the best performance. Our work is intended both as a complete introduction to the problem and as a framework that lets future works measure against each other more easily.

Keywords: Efficient Dissemination, Influential Spreaders, K-shell, Complex Networks

1 Introduction

Spreading processes are ubiquitous in several domains, such as societal interactions and information dissemination. Identifying influential spreaders is of great importance, since it enables methods that either hinder spreading (as in pandemic control) [2] or accelerate it (as is desirable in information dissemination) [1]. However, quantifying a node's influence is very challenging, as it often requires making accurate predictions about the topology of very large networks.
In this paper, we provide a systematic survey of methods to identify influential nodes and compare them on highly differentiated public real-world datasets. We focus mainly on topology-based measures, which are pragmatic and efficient, two desirable properties when working with very large and highly heterogeneous networks. To our knowledge, this type of work has not yet been done in the literature, since other papers compare only a subset of the methods or use private or hard-to-access datasets.

We start by examining the groundbreaking work of Kitsak et al. [1] on the k-shell measure, which establishes that, in some contexts, it can surpass more established metrics such as degree or betweenness in evaluating a node's spreading capability. However, contrary to the authors' claim, we observe that, within a k-shell, nodes with higher degree might be more influential (see Figure 1). This motivates combining both metrics with configurable weights, as in [3]. The drawback of this proposal is that the weights need calibration: a setting that is ideal in one scenario may be sub-optimal on a different network.

Another proposal that further explores and improves the k-shell method is the one by Liu et al. [4], which combines it with the distance to the core. However, it only obtained a significant improvement over the k-shell (more than 0.01 points of imprecision) in the Flights dataset (see Figure 4). Moreover, it requires computing shortest distances, which may be prohibitively expensive in very large graphs when performance is the main concern.

Furthermore, Bae and Kim [5] propose another method, the extended coreness, which identifies influential nodes with the assistance of the k-shell metric. In our experiments, it was consistently more precise than the other methods (see Figure 4).
It is also attractive because it requires no parameter calibration.

Finally, we also analyze two more recent works, MCDE and WNC, by Sheikhahmadi et al. [6] and Wang et al. [7], respectively. Even though these fell short in our experiments (see Figure 4), they present interesting new ideas for dealing with the problem. The first uses entropy [8] to measure how uniformly a node's neighbors are spread among the shells, while the second quantifies the importance of edges by attributing more weight to those that connect hubs.

The rest of this work is structured as follows. In Section 2, we provide descriptions and motivations for each measure of spreading efficiency, along with the definition of the performance metrics. In Section 3, we evaluate these measures using real-world datasets.

2 Methods

2.1 Selection of single spreaders

We now explore methods for evaluating the spreading efficiency of individual nodes when they are the sole initially infected node. In addition to the measures presented in the following sections, we also consider node degree as an indicator of this efficiency; given its straightforward nature, it requires no further elaboration.

The k-shell decomposition

The k-shell is an integer index $k_s$ assigned to each node that expresses its "coreness", that is, its location within successive layers of the network. It is designed so that smaller values of $k_s$ are associated with the periphery of the network, while larger values of $k_s$ correspond to the innermost core. The coreness values are determined through an iterative pruning process. Initially, nodes with a degree of one ($k = 1$) are removed, along with their edges. This pruning continues iteratively until no nodes with a degree of one remain in the network.
All the nodes and links removed during this process collectively form the 1-shell of the network ($k_s = 1$). This procedure is then repeated on the remaining subgraph, setting $k_s = 2$, and it continues iteratively until all nodes have been assigned a coreness value.

Kitsak et al. motivate the usefulness of this measure by constructing instances where hubs turn out to be bad spreaders due to their peripheral location within the network. They also show through simulations that nodes in high-$k_s$ layers are more susceptible to infection during a typical epidemic event and are infected earlier than nodes in lower-$k_s$ layers, sustaining an infection in the early stages of an outbreak. Thus, these nodes contribute to the epidemic's ability to reach a critical mass and fully develop within the network, and hence they are expected to exhibit the best spreading capacity. Kitsak et al. [1] also show that nodes with the same degree have very diverse spreading capabilities, while nodes in the same k-shell have homogeneous spreading capabilities.

Mixed degree decomposition

The k-shell decomposition comes with several notable downsides. Firstly, it only considers the links between the remaining nodes (i.e., to inner cores) and disregards the connections to the removed nodes, although these turn out to be important in real networks [3]. Additionally, the k-shell decomposition frequently assigns the same k-shell index to many nodes, leading to ambiguity in identifying the efficient spreaders within the network [3][5][7].

To tackle both these problems, the mixed degree decomposition (MDD) uses the mixed degree $k_i^{(m)}$ of a node $i$ as the criterion for removing nodes from the network and adding them to shells [3]. It is defined as:

$$k_i^{(m)} = k_i^{(r)} + \lambda \cdot k_i^{(e)} \qquad (1)$$

where $\lambda$ denotes a parameter between 0 and 1, $k^{(r)}$ denotes the residual degree (i.e., the number of edges connected to non-removed nodes) and $k^{(e)}$ denotes the exhausted degree (i.e., the number of edges connected to removed nodes).
Initially, $k^{(m)} = k^{(r)} = k$ for all the nodes. Then, all nodes with the smallest $k^{(m)}$, say $k_{\min}$, are assigned to the $k_{\min}$-shell, and the value of $k^{(m)}$ is updated for the remaining nodes according to (1). These two steps are repeated, with the removed nodes added to the $k_{\min}$-shell, until there are no nodes with $k^{(m)}$ less than or equal to $k_{\min}$. This iterative procedure continues until all nodes are assigned to a shell. It is important to note that, when $\lambda = 0$, the MDD method reduces to the conventional k-shell method, while $\lambda = 1$ corresponds to the standard degree-based approach. Although this measure tackles the aforementioned problems, it requires a value for $\lambda$ to be set, and different network topologies give rise to different optimal values of $\lambda$ [3].

Improved k-shell

To address the degeneracy issue of the k-shell, the improved k-shell method [4] aims to distinguish nodes within the same k-shell by favouring nodes that are closer to the core on average:

$$\theta(i) = -\left(k_s^{\max} - k_s(i) + 1\right) \sum_{j \in \Gamma(k_s^{\max})} d_{ij} \qquad (2)$$

where $k_s^{\max}$ denotes the maximum k-shell value of the network, $\Gamma(k_s^{\max})$ denotes the network core (i.e., the set of nodes in the $k_s^{\max}$-shell) and $d_{ij}$ is the shortest distance between nodes $i$ and $j$. The negative sign at the beginning of the expression ensures that higher values correspond to more efficient nodes, in line with the other measures. A drawback of this measure is its computational complexity, since it requires computing the shortest distance between the core and all the nodes.

Neighborhood Coreness

Based on the fact that the spreading quality of a node is also determined by the spreading quality of its neighbors [5], we can define a new measure, the neighborhood coreness, as:

$$C_{nc}(v) = \sum_{w \in N(v)} k_s(w) \qquad (3)$$

where $N(v)$ denotes the set of nodes adjacent to a node $v$.
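As an illustration of the definitions above, the following minimal Python sketch computes the k-shell index through the pruning procedure described earlier and then evaluates the neighborhood coreness of Equation (3). The adjacency-dictionary representation, the function names `k_shell` and `neighborhood_coreness`, and the toy graph are assumptions made for this example only; this is not necessarily how the measures were implemented for the experiments.

```python
def k_shell(adj):
    """Return {node: k-shell index} via iterative pruning.

    adj: dict mapping each node to the set of its neighbours
         (undirected graph, no self-loops); hypothetical input format.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # The next shell index is the smallest residual degree still present.
        k = max(k, min(degree[v] for v in remaining))
        to_remove = [v for v in remaining if degree[v] <= k]
        # Keep stripping nodes until none has residual degree <= k.
        while to_remove:
            v = to_remove.pop()
            if v not in remaining:
                continue  # already pruned through another neighbour
            remaining.discard(v)
            shell[v] = k
            for w in adj[v]:
                if w in remaining:
                    degree[w] -= 1
                    if degree[w] <= k:
                        to_remove.append(w)
    return shell


def neighborhood_coreness(adj, shell):
    """Equation (3): sum of the k-shell indices of each node's neighbours."""
    return {v: sum(shell[w] for w in adj[v]) for v in adj}


# Toy graph (assumption): a triangle a-b-c with a tail a-d-e.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a", "e"},
    "e": {"d"},
}
shell = k_shell(adj)
c_nc = neighborhood_coreness(adj, shell)
```

In this toy graph, `b` and `c` share the k-shell index of `a`, yet `a` obtains a larger neighborhood coreness because of its extra neighbor, which is the kind of tie-breaking that Equation (3) is meant to provide.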
The effectiveness of this measure, as emphasized by its creators, lies in its ability to consider both the degree and the coreness of neighboring nodes. We can also construct a higher-order version of this measure, the extended neighborhood coreness:

$$C_{nc+}(v) = \sum_{w \in N(v)} C_{nc}(w) \qquad (4)$$

Weighted Neighborhood Centrality

Unlike the previous measures, this approach takes into account the importance of links in facilitating the spreading process. It builds upon two major assumptions: a node's spreading capacity is determined both by its own intrinsic qualities and by the collective influence of its neighboring nodes; moreover, the spreading power of these neighboring nodes is weighted by the importance of the edges connecting them. The weighted neighborhood centrality is thus defined as:

$$C(i) = k_s(i) + \sum_{j \in N(i)} \frac{w_{ij}}{\langle w \rangle} k_s(j) \qquad (5)$$

where $w_{ij}$ denotes the weight of edge $(i, j)$, defined as $w_{ij} = k_i k_j$, which quantifies the diffusion importance by favouring edges that connect hubs, and $\langle w \rangle$ is the mean value of all edge weights. The expression in Equation (5) can be adapted to incorporate alternative benchmark measures, although here we focus exclusively on its variant based on the k-shell measure for the sake of simplicity.

MCDE

Simulations conducted by [last paper's authors] revealed that the presence of core-like groups can undermine the accuracy of influential spreader identification using k-shell decomposition. These core-like groups consist of nodes that have the highest k-shell values but display poor connectivity with the rest of the network. Consequently, they do not turn out to be the most effective spreaders. Based on this, the mixed core, degree and entropy (MCDE) measure [6] considers not only the degree and the k-shell of a node, but also the distribution of its neighbors among network cores.
To favour this dispersion, for each node $i$, MCDE extends the previous MDD measure by employing Shannon's entropy $E(i)$, which is maximal when the neighbors are uniformly spread among the shells:

$$E(i) = -\sum_{j=1}^{k_s^{\max}} p_j(i) \log\left(p_j(i)\right)$$

where $p_j(i)$ is the proportion of neighbors of node $i$ that are in core $j$. MCDE is subsequently defined as a weighted combination of the node's k-shell, degree and entropy:

$$\mathrm{MCDE}(i) = \alpha k_s(i) + \beta k(i) + \gamma E(i)$$

Similarly to the MDD, this measure also requires the parameters $\alpha$, $\beta$ and $\gamma$ to be adequately set.

2.2 Datasets, Models and Metrics

In order to compare the presented methods comprehensively, we use datasets from different domains and with different properties. The first is a network of jazz musicians [9] (Jazz), which connects two musicians if they have played in the same band. The second is a protein-protein interaction network in yeast [10] (Yeast), where each node corresponds to a protein and the edges are interactions between different proteins. The third is a flights network [11] (Flights), where the nodes are airports and the edges indicate flights between them. Finally, we consider Social Circles in Ego Networks [12] (Facebook), where, for a given user (the central node, which is not itself included in the network), their friends are represented as nodes and two of them are connected if they are also friends with each other, making it possible to identify "circles" of common attributes between them. For simplicity, we remove self-loops and only consider the largest connected component of each network.

To simulate spreading processes, we employ the Susceptible-Infectious-Recovered (SIR) model with a fixed recovery rate $\gamma = 0.1$ and a network-dependent infection rate $\beta$. In our analysis, we choose small values for $\beta$, since the spreading always reaches a large proportion of the network when large values are used [1].
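The SIR process described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: time is discretized into synchronous steps, each infected node independently infects each susceptible neighbor with probability `beta` and recovers with probability `gamma` per step, and the outbreak size counts every node that was ever infected, including the seed. The names `sir_outbreak_size` and `spreading_efficiency`, as well as the toy graph, are hypothetical and are not taken from the code used for the experiments.

```python
import random


def sir_outbreak_size(adj, seed_node, beta, gamma=0.1, rng=None):
    """Run one SIR simulation and return the fraction of ever-infected nodes.

    adj: dict mapping each node to the set of its neighbours (assumed format).
    The simulation stops when no infected node is left.
    """
    if rng is None:
        rng = random.Random()
    infected = {seed_node}
    recovered = set()
    while infected:
        newly_infected = set()
        for v in infected:
            for w in adj[v]:
                susceptible = w not in infected and w not in recovered
                if susceptible and rng.random() < beta:
                    newly_infected.add(w)
        # Only nodes that were infectious during this step may recover now.
        newly_recovered = {v for v in infected if rng.random() < gamma}
        infected = (infected | newly_infected) - newly_recovered
        recovered |= newly_recovered
    return len(recovered) / len(adj)


def spreading_efficiency(adj, node, beta, gamma=0.1, runs=1000, seed=0):
    """Average outbreak size over repeated simulations seeded at `node`."""
    rng = random.Random(seed)
    total = sum(sir_outbreak_size(adj, node, beta, gamma, rng) for _ in range(runs))
    return total / runs


# Toy graph (assumption): a triangle a-b-c with a tail a-d-e.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a", "e"},
    "e": {"d"},
}
efficiency_a = spreading_efficiency(adj, "a", beta=0.3, runs=200)
```

A continuous-time (event-driven) simulation would be an equally valid reading of the model; the synchronous variant is used here only because it is the shortest to write down.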
Nonetheless, we must set $\beta$ such that the expected number of infected nodes is greater than zero, i.e., above the epidemic threshold [13]:

$$\beta > \beta_c = \frac{\gamma \langle k \rangle}{\langle k^2 \rangle - \langle k \rangle}$$

In all networks, we set $\beta$ to be $\beta_c$ rounded up at the decimal place of its first non-zero digit (e.g., if $\beta_c = 0.0024$, $\beta$ is set to 0.003). Each SIR simulation ends when there are no infected nodes left.

To measure a node's spreading capacity, we use the spreading efficiency $M_i$, defined as the proportion of infected nodes in a simulation when node $i$ is the only initially infected node, averaged over 1000 simulations.

To evaluate the effectiveness of the various measures that rank individual spreaders, we introduce the imprecision function [1], defined as

$$\epsilon_m(p) = 1 - \frac{M_m(p)}{M_{\mathrm{eff}}(p)}$$

where $p \in (0, 1)$, $M_m(p)$ denotes the sum of the efficiencies of the best $pN$ nodes (as ranked by the measure $m$) and $M_{\mathrm{eff}}(p)$ denotes the sum of the efficiencies of the actual most efficient $pN$ nodes. Values of $\epsilon(p)$ near 0 for all values of $p$ indicate a good ranking measure, while a low imprecision for values of $p$ near zero indicates that the measure is effective at identifying the best spreaders. If the lowest rank among the $pN$ selected nodes is $\alpha$ and there are $n_\alpha$ such nodes in the selected set, we average the imprecision over 1000 random subsets of $n_\alpha$ elements drawn from the $N_\alpha$ network nodes with rank $\alpha$. Note that this calculation assumes that SIR simulations have previously been conducted with every node in the network as the initially infected node.

3 Results and Discussion

Table 1 contains several properties of the explored networks for future reference.
Network Name              Jazz       Yeast     Flights    Facebook
$N$                       198        2224      2905       4039
$E$                       2742       6609      15645      88234
$\langle k \rangle$       27.697     5.943     10.771     43.691
$\langle k^2 \rangle$     1070.242   98.994    601.453    4656.144
$H$                       38.641     16.657    55.840     106.570
$r$                       0.0202     -0.105    0.0489     0.0636
$\langle C \rangle$       0.617      0.138     0.456      0.606
$k_{\max}$                100        64        242        1045
$k_s^{\max}$              29         10        28         115
$\beta_c$                 0.00266    0.00639   0.00182    0.00095
$\beta$                   0.003      0.007     0.002      0.001

Table 1: Several properties of each network. We record the number of nodes $N$, the number of edges $E$, the first and second degree moments $\langle k \rangle$ and $\langle k^2 \rangle$, the heterogeneity $H = \langle k^2 \rangle / \langle k \rangle$, the network assortativity coefficient $r$, the clustering coefficient $\langle C \rangle$, the maximum degree and k-shell $k_{\max}$ and $k_s^{\max}$, the threshold infection rate $\beta_c$ and the used infection rate $\beta$.

Figure 1: Heatmaps of the average efficiency $M$ for each range of $k$ and $k_s$ (one panel per network: Jazz, Yeast, Flights, Facebook; horizontal axis $k_s$, vertical axis $k$). The range $[k_{\min}, k_{\max}]$ was partitioned into 10 equally spaced bins on a logarithmic scale and the range $[k_s^{\min}, k_s^{\max}]$ was partitioned into 10 equally spaced bins on a linear scale. Then, for each resulting two-dimensional bin, the average efficiency of the nodes falling into the corresponding range of degree and k-shell values was computed.

Next, we analyse the relationship between the degree, the k-shell and the corresponding average efficiency in each network through the heatmaps in Figure 1. We note that there is an agreement with Kitsak et al.
[1] in the sense that a node's degree and its k-shell value are not perfectly correlated: nodes whose degrees fall in a single bin can be dispersed among all shells, as shown by the rows that are almost entirely filled with colored cells. These figures may suggest that there is greater variability in efficiency for each degree value than for each k-shell value, given the preponderance of vertical color bands and the color gradients along the rows. However, there are instances where variability is also significant within a single k-shell, which happens to be the case in the two highest k-shells of all networks. This indicates that, even though k-shells provide a degree of organization, they do not perfectly determine each node's efficiency. Moreover, while the heatmaps provide evidence for homogeneity of efficiency within each k-shell, they do not support the claim that efficiency correlates well with the k-shell. For instance, in the Yeast network, the heatmap counters the claim that the innermost k-shell contains the most efficient spreaders.

Figure 2: Average efficiency $M(0.05)$ of the $0.05N$ nodes with the highest MDD values for each value of $\lambda$ in $\{0, 0.1, 0.2, \ldots, 0.9, 1.0\}$ (one panel per network: Jazz, Flights, Yeast, Facebook).

In Figure 2, we analyze the average efficiency of the $0.05N$ nodes with the highest MDD values as a function of $\lambda$. We first note that the optimal value of $\lambda$ differs across the various networks, thereby validating its dependency on the network topology.
We highlight the fact that, while the Jazz and Facebook networks exhibit similar assortativities, clustering coefficients and network heterogeneities, and have first and second degree moments of the same order of magnitude, the corresponding optimal $\lambda$ values differ by 0.5 (the largest difference between any pair of networks). This shows that it is difficult to determine the optimal $\lambda$ a priori based solely on a network's characteristics. Moreover, we can also conclude that, for the Jazz network, a measure that assigns more weight to the exhausted degree (i.e., one that is closer to the usual degree) is more efficient, while a measure that is closer to the k-shell is more efficient for the three other networks. Overall, this analysis suggests that a combined approach considering both the exhausted and the residual degree leads to enhanced efficiency.

We proceed in a similar fashion to find the optimal values of $(\beta, \gamma)$ for the MCDE measure in each network, this time considering all possible combinations of $\beta$ and $\gamma$ in $\{0, 0.1, 0.2, \ldots, 0.9, 1.0\} \times \{0, 0.1, 0.2, \ldots, 0.9, 1.0\}$. A notable observation from Figure 3 is that the parameter $\beta$ plays a pivotal role in determining the efficiency of the MCDE measure within these networks, since the efficiency values remain relatively constant for a fixed value of $\beta$.
However, we note that $\gamma$ is still a relevant parameter, since the optimal configurations have $\gamma$ values different from 0 (a value which corresponds in practice to the MDD), with the exception of the one found for the Jazz network. This also shows that finding the parameters a priori, without performing simulations, remains a daunting task.

Figure 3: Average efficiency $M$ of the $0.05N$ nodes with the highest MCDE values for all pairs of possible values of $(\beta, \gamma)$ in $\{0, 0.1, 0.2, \ldots, 0.9, 1.0\} \times \{0, 0.1, 0.2, \ldots, 0.9, 1.0\}$ (one panel per network: Jazz, Flights, Yeast, Facebook; axis ticks in units of 0.1).

Figure 4: Imprecision function $\epsilon(p)$ of all 8 considered measures (k-shell, degree, MDD, improved k-shell, coreness, extended coreness, weighted neighborhood, MCDE) for values of $p$ in $\{0, 0.1, 0.2, \ldots, 0.9, 1.0\}$ (one panel per network: Jazz, Flights, Yeast, Facebook).

We now delve into the imprecision function plots for all eight considered measures.
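For reference when reading these plots, the imprecision function defined in Section 2.2 can be sketched in Python as below. The sketch makes two simplifying assumptions that should be kept in mind: the number of selected nodes is taken as $pN$ rounded to the nearest integer (at least one), and ties at the cut-off are broken by sort order instead of being averaged over random subsets as described in Section 2.2. The function name `imprecision` and the toy values are hypothetical.

```python
def imprecision(scores, efficiency, p):
    """Imprecision of a ranking measure for a fraction p of the nodes.

    scores:     {node: value assigned by the ranking measure}
    efficiency: {node: spreading efficiency obtained from SIR simulations}
    Ties at the cut-off are broken by sort order (simplifying assumption).
    """
    n_top = max(1, round(p * len(scores)))
    top_by_measure = sorted(scores, key=scores.get, reverse=True)[:n_top]
    top_by_efficiency = sorted(efficiency, key=efficiency.get, reverse=True)[:n_top]
    m_measure = sum(efficiency[v] for v in top_by_measure)
    m_eff = sum(efficiency[v] for v in top_by_efficiency)
    return 1 - m_measure / m_eff


# Toy values (assumption): four nodes with known efficiencies.
efficiency = {"a": 0.5, "b": 0.4, "c": 0.3, "d": 0.1}
perfect_ranking = {"a": 4, "b": 3, "c": 2, "d": 1}
reversed_ranking = {"a": 1, "b": 2, "c": 3, "d": 4}

eps_perfect = imprecision(perfect_ranking, efficiency, p=0.5)
eps_reversed = imprecision(reversed_ranking, efficiency, p=0.5)
```

A measure that selects exactly the most efficient nodes yields an imprecision of 0, whereas the reversed ranking in this toy example selects `d` and `c` and therefore loses more than half of the attainable efficiency.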
We first note that, in the Facebook dataset (where the MDD measure's optimal $\lambda$ parameter was the highest), measures that solely consider node degree consistently underperform compared to the others, irrespective of the value of the parameter $p$. This suggests that there are structural properties of these networks that link efficiency with either the node degree or the k-shell, giving rise to a bias toward a subset of measures. Furthermore, the measures that consistently perform best across all datasets are those that take into account both degree and k-shell centrality. Specifically, the extended coreness measure stands out as the top performer for all datasets except Jazz, where it still achieves the lowest imprecision values for $p \leq 0.02$. This underscores the advantage of combining degree and coreness measures to assess node efficiency. Nevertheless, our analysis does not provide significant evidence regarding the impact of considering the dispersion of neighbors among cores, as the MCDE both outperforms and underperforms MDD depending on the dataset. Lastly, while it remains uncertain whether weighted neighborhood centrality surpasses its unweighted counterpart, these results suggest that, in the context of identifying efficient spreaders, it is more advantageous to consider the coreness of a broader neighborhood than to explicitly model the diffusion importance of edges. Additionally, the weighted neighborhood centrality appears to find a "sweet spot" by automatically combining degree and coreness without the need for parameter tuning.

4 Concluding Remarks

In this paper, we surveyed some of the most cited papers on how to identify influential nodes in complex networks, especially those based on topological measures. In addition, we provided experiments and comparisons between them on four distinct networks.
From this study, it is possible to conclude that, while the performance of some metrics depends on the scenario (i.e., on the characteristics of the network in question), the results of some methods seem to consistently exceed those of others. That is, algorithms such as the extended coreness, which take into consideration not only the location of nodes in the network but also the properties of their neighbors, achieve higher spreading rates on average than those that consider only the former. These methods also appear to bypass the structural biases that networks have in favouring degree or k-shell as a discriminative feature of spreading efficiency.

Looking forward, it will be interesting to observe in which novel ways researchers will be able to obtain better results than those mentioned in this paper. Another equally important path forward is the improvement of the computational complexity of the studied methods, so that they become feasible on networks that are many orders of magnitude larger.

Finally, we expect our work to provide a comprehensive description of the state of the art, as well as a framework for past and future works to transparently compare against each other on diverse types of networks. Hopefully, this will propel forward this area of research, which has been garnering more and more interest due to phenomena such as social networks and the recent viral epidemics.

References

[1] Maksim Kitsak et al. “Identification of influential spreaders in complex networks”. In: Nature Physics 6.11 (Aug. 2010), pp. 888–893. doi: 10.1038/nphys1746. url: https://doi.org/10.1038%2Fnphys1746.
[2] Christian M. Schneider, Tamara Mihaljev, Shlomo Havlin, and Hans J. Herrmann. “Suppressing epidemics with a limited amount of immunization units”. In: Phys. Rev. E 84 (6 Dec. 2011), p. 061911. doi: 10.1103/PhysRevE.84.061911. url: https://link.aps.org/doi/10.1103/PhysRevE.84.061911.
[3] An Zeng and Cheng-Jun Zhang. “Ranking spreaders by decomposing complex networks”. In: Physics Letters A 377.14 (2013), pp. 1031–1035. issn: 0375-9601. doi: 10.1016/j.physleta.2013.02.039. url: https://www.sciencedirect.com/science/article/pii/S0375960113002260.
[4] Jian-Guo Liu, Zhuo-Ming Ren, and Qiang Guo. “Ranking the spreading influence in complex networks”. In: Physica A: Statistical Mechanics and its Applications 392.18 (2013), pp. 4154–4159. issn: 0378-4371. doi: 10.1016/j.physa.2013.04.037. url: https://www.sciencedirect.com/science/article/pii/S0378437113003506.
[5] Joonhyun Bae and Sangwook Kim. “Identifying and ranking influential spreaders in complex networks by neighborhood coreness”. In: Physica A: Statistical Mechanics and its Applications 395 (2014), pp. 549–559. issn: 0378-4371. doi: 10.1016/j.physa.2013.10.047. url: https://www.sciencedirect.com/science/article/pii/S0378437113010406.
[6] Amir Sheikhahmadi and Mohammad Ali Nematbakhsh. “Identification of multi-spreader users in social networks for viral marketing”. In: Journal of Information Science 43.3 (2017), pp. 412–423. doi: 10.1177/0165551516644171. url: https://doi.org/10.1177/0165551516644171.
[7] Junyi Wang, Xiaoni Hou, Kezan Li, and Yong Ding. “A novel weight neighborhood centrality algorithm for identifying influential spreaders in complex networks”. In: Physica A: Statistical Mechanics and its Applications 475 (2017), pp. 88–105. issn: 0378-4371. doi: 10.1016/j.physa.2017.02.007. url: https://www.sciencedirect.com/science/article/pii/S0378437117301218.
[8] Claude Elwood Shannon. “A mathematical theory of communication”. In: The Bell System Technical Journal 27.3 (1948), pp. 379–423.
[9] Pablo M. Gleiser and Leon Danon. “Community structure in jazz”. In: Advances in Complex Systems 06.04 (Dec. 2003), pp. 565–573. doi: 10.1142/s0219525903001067.
url: https://doi.org/10.1142%2Fs0219525903001067.
[10] D. Bu et al. “Topological structure analysis of the protein-protein interaction network in budding yeast”. In: Nucleic Acids Research 31.9 (2003), pp. 2443–2450. issn: 1362-4962. doi: 10.1093/nar/gkg340. url: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC154226/.
[11] Ryan A. Rossi and Nesreen K. Ahmed. “The Network Data Repository with Interactive Graph Analytics and Visualization”. In: AAAI. 2015. url: https://networkrepository.com.
[12] Jure Leskovec and Julian Mcauley. “Learning to Discover Social Circles in Ego Networks”. In: Advances in Neural Information Processing Systems. Ed. by F. Pereira, C.J. Burges, L. Bottou, and K.Q. Weinberger. Vol. 25. Curran Associates, Inc., 2012. url: https://proceedings.neurips.cc/paper_files/paper/2012/file/7a614fd06c325499f1680b9896beedeb-Paper.pdf.
[13] Albert-László Barabási. “Network science”. In: Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 371.1987 (2013), p. 20120375.
[14] Linyuan Lü, Tao Zhou, Qian-Ming Zhang, and H. Eugene Stanley. “The H-index of a network node and its relation to degree and coreness”. In: Nature Communications 7.10168 (Jan. 2016). doi: 10.1038/ncomms10168. url: https://doi.org/10.1038/ncomms10168.
[15] Tian Bian and Yong Deng. “Identifying influential nodes in complex networks: A node information dimension approach”. In: Chaos: An Interdisciplinary Journal of Nonlinear Science 28.4 (Apr. 2018), p. 043109. issn: 1054-1500. doi: 10.1063/1.5030894. eprint: https://pubs.aip.org/aip/cha/article-pdf/doi/10.1063/1.5030894/10314679/043109_1_online.pdf. url: https://doi.org/10.1063/1.5030894.
[16] Ahmad Zareie, Amir Sheikhahmadi, and Mahdi Jalili. “Influential node ranking in social networks based on neighborhood diversity”. In: Future Generation Computer Systems 94 (2019), pp. 120–129. issn: 0167-739X.
doi: 10.1016/j.future.2018.11.023. url: https://www.sciencedirect.com/science/article/pii/S0167739X18319009.
[17] Min Wang, Wanchun Li, Yuning Guo, Xiaoyan Peng, and Yingxiang Li. “Identifying influential spreaders in complex networks based on improved k-shell method”. In: Physica A: Statistical Mechanics and its Applications 554 (2020), p. 124229. issn: 0378-4371. doi: 10.1016/j.physa.2020.124229. url: https://www.sciencedirect.com/science/article/pii/S0378437120300558.
[18] Lei Guo, Jian-Hong Lin, Qiang Guo, and Jian-Guo Liu. “Identifying multiple influential spreaders in term of the distance-based coloring”. In: Physics Letters A 380.7 (2016), pp. 837–842. issn: 0375-9601. doi: 10.1016/j.physleta.2015.12.031. url: https://www.sciencedirect.com/science/article/pii/S0375960115010671.
[19] Ying Liu, Ming Tang, Tao Zhou, and Do Younghae. “Core-like groups result in invalidation of identifying super-spreader by k-shell decomposition”. In: Scientific Reports 5.9602 (2015). issn: 2045-2322. doi: 10.1038/srep09602.