Article

Vital nodes identification in complex networks

Authors:
  • The Institute of Service-Oriented Manufacturing
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

Real networks are heterogeneous, with nodes playing very different roles in structure and function. Identifying vital nodes is therefore highly significant, allowing us to control the outbreak of epidemics, to conduct advertisements for e-commercial products, to predict popular scientific publications, and so on. Vital nodes identification is attracting increasing attention from both the computer science and physics communities, with algorithms ranging from simply counting the immediate neighbors to complicated machine learning and message passing approaches. In this review, we clarify the concepts and metrics, classify the problems and methods, review the important progress, and describe the state of the art. Furthermore, we provide extensive empirical analyses to compare well-known methods on disparate real networks, and highlight future directions. Despite the emphasis on physics-rooted approaches, the unification of the language and comparison with cross-domain methods should trigger interdisciplinary solutions in the near future.


... [18] using NetDraw-assisted UCINET Ver.6. The results of this analysis are the values of 5 aspects, namely: (1) eigenvector centrality, (2) degree centrality, (3) closeness centrality, (4) betweenness centrality, and (5) network density [19]. ...
... Closeness centrality is the average distance a node requires to reach all nodes in the network, so this measure describes the closeness between nodes and the vital role of nodes in the network [33], [34]. The smaller the node distance, the tighter the communication network is, so actors with high closeness centrality will spread information faster and more expensive to all actors [19]. Based on the closeness centrality analysis of independent learning classes, five actors with the highest closeness centrality values are obtained, which can be seen in Table 5. ...
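The closeness measure described in this excerpt can be illustrated with a small sketch. It assumes an undirected, unweighted, connected network and uses the common convention of scoring a node by the reciprocal of its average shortest-path distance, so that shorter distances give higher closeness. The adjacency-dictionary input and the toy network are hypothetical and are not taken from the cited study, which used UCINET.

```python
from collections import deque


def closeness_centrality(adj):
    """Return {node: closeness} for an undirected, unweighted graph.

    adj maps each node to the set of its neighbours (assumed input format).
    """
    scores = {}
    for source in adj:
        # Breadth-first search gives shortest-path lengths from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        reachable = len(dist) - 1
        # Reciprocal of the average distance to the other reachable nodes.
        scores[source] = reachable / total if total > 0 else 0.0
    return scores


# Hypothetical toy network: a path a - b - c.
toy = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
scores = closeness_centrality(toy)
```

In this toy path the middle node `b` reaches both others in one step, so it gets the highest score.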
... Betweenness centrality is a measurement to determine how far a node can control the flow of information between other actors and how well actors can facilitate communication with other actors [19], [35]. Actors with high betweenness centrality have an enormous capacity to facilitate interactions between actors. ...
Article
Full-text available
As one of the 21st-century competencies, mathematical communication ability must be achieved through interactions created between teachers and students, and among students. Discussion groups are an alternative that generates interaction between students. Currently, not many teachers design discussion groups based on communication networks. This study aims to describe the results of Social Networking Analysis in independent mathematics learning through group discussions using graph representation. This network analysis is a complete communication network analysis with a quantitative descriptive method using UCINET Ver.6. This study uses five aspects to analyze the data, namely: (1) eigenvector centrality, (2) degree centrality, (3) closeness centrality, (4) betweenness centrality, and (5) network density. The subjects of this study were 32 students at a junior high school in Yogyakarta, Indonesia, who were selected based on suggestions from the mathematics teacher. The data in this study were collected using questionnaires, observation, and interviews. Hence, the validity and reliability of each one has been examined. According to the study, 43.8% of students' independent arithmetic learning falls into the medium category. It implies that students frequently decide on study sessions with discussion partners and take the initiative to identify and arrange the answers. Based on the data, four groups were created, each with eight pupils. This study is anticipated to serve as a benchmark for other investigations into the efficacy of discussion groups created in conformity with 21st-century skills.
... Within this realm of study, the identification of key nodes has persistently remained a crucial and foundational issue. Given the heterogeneity of network structures, it often transpires that a small number of nodes exert significant influence within a network, and certain nodes can even wield substantial impact on the structure and functionality of the network [7]. For instance, in networks modeling the spread of infectious diseases, the phenomenon of "super-spreaders" is prevalent. ...
... Currently, a multitude of methods for identifying crucial nodes in complex networks have been proposed [7,11], and researchers have introduced some novel approaches in recent years [12][13][14]. However, existing methods often neglect to consider the clustering coefficient of nodes, which is an essential attribute [15]. ...
... Therefore, considering the clustering coefficients of first and second-order neighbors is akin to simultaneously taking into account the interaction information of nodes' third-order or even higher-order neighbors. This allows us to incorporate as much information as possible without increasing algorithm complexity significantly, thus compensating for some local centrality measures' limitations in terms of inadequate information consideration. As mentioned earlier, in the SNC method, a node's degree is regarded as its volume, and its clustering coefficient is seen as density. The product of volume and density yields mass. ...
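The volume, density, and mass analogy in this excerpt can be sketched as follows. This covers only the single step stated in the excerpt (degree times local clustering coefficient); it is not the full SNC score, whose remaining terms are not given here. The function names, the adjacency-dictionary format, and the toy graph are assumptions made for illustration.

```python
def local_clustering(adj, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def node_mass(adj):
    """Degree ("volume") times clustering coefficient ("density") per node."""
    return {v: len(adj[v]) * local_clustering(adj, v) for v in adj}


# Hypothetical toy graph: triangle a-b-c plus a pendant node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
mass = node_mass(toy)
```

Here `a` and `b` sit in a closed triangle and get the largest mass, while the pendant node `d` has no neighbour pairs and gets zero.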
Article
Full-text available
Identifying key spreaders in a network is one of the fundamental problems in the field of complex network research, and accurately identifying influential propagators in a network holds significant practical implications. In recent years, numerous effective methods have been proposed and widely applied. However, many of these methods still have certain limitations. For instance, some methods rely solely on the global position information of nodes to assess their propagation influence, disregarding local node information. Additionally, certain methods do not consider clustering coefficients, which are essential attributes of nodes. Inspired by the quality formula, this paper introduces a method called SNC (Structure-based Node Centrality) that takes into account the neighborhood information of nodes. SNC measures the propagation power of nodes based on first and second-order neighborhood degrees, local clustering coefficients, structural hole constraints, and other information, resulting in higher accuracy. A series of pertinent experiments conducted on twelve real-world datasets demonstrate that, in terms of accuracy, SNC outperforms methods like CycleRatio and KSGC. Additionally, SNC demonstrates heightened monotonicity, enabling it to distinguish subtle differences between nodes. Furthermore, when it comes to identifying the most influential Top-k nodes, SNC also displays superior capabilities compared to the aforementioned methods. Finally, we conduct a detailed analysis of SNC and discuss its advantages and limitations.
... These diverse complex networks encompass almost every aspect of human life and even the natural world [3], including widely observed biological networks [4], power systems [5], social networks [6], and even virus transmission networks [7], among others [8]. Due to the heterogeneity of network structures, a small number of nodes often play a dominant role in complex networks, and certain important nodes can significantly influence the structure and functionality of the network [9]. Therefore, the fast and accurate identification of key nodes in the network is an urgent and significant problem. ...
... However, as our understanding of complex networks deepened, along with the expansion of network scale and complexity, new challenges emerged in the field of important node mining. Confronted with progressively complex network structures, researchers have put forth numerous new methods for mining important nodes [9,12]. For instance, the K-Shell (KS) method [13] determines node centrality based on its ...
... As pointed out by Lu et al, weighted networks carry richer information compared to unweighted networks [9]. For example, studying social networks with weighted information helps in measuring and analyzing the complex functionality and evolution of real-world societies [33]. ...
Article
Full-text available
The identification of important nodes in complex networks has always been a prominent topic in the field of network science. Nowadays, the emergence of large-scale networks has sparked our research interest in complex network centrality methods that balance accuracy and efficiency. Therefore, this paper proposes a novel centrality method called Spon (Sum of the Proportion Of Neighbors) Centrality, which combines algorithmic efficiency and accuracy. Spon only requires information within the three-hop neighborhood of a node to assess its centrality, thereby exhibiting lower time complexity and suitability for large-scale networks. To evaluate the performance of Spon, we conducted connectivity tests on 16 empirical unweighted networks and compared the monotonicity and algorithmic efficiency of Spon with other methods. Experimental results demonstrate that Spon achieves both accuracy and algorithmic efficiency, outperforming eight other methods, including CycleRatio, Collective Influence, and Social Capital. Additionally, we present a method called W-Spon to extend Spon to weighted networks. Comparative experimental results on 10 empirical weighted networks illustrate that W-Spon also possesses advantages compared to methods such as I-Core and M-Core.
... An indirect and trivial way to quantify simplices' influence involves using average node-level metrics, such as defining the degree of a triangle as the average of its three nodes' degrees. Specifically, there are generally three categories of methods mainly employed to identify vital nodes: neighborhood-based centralities, path-based centralities, and iterative refinement centralities [11]. ...
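The node-averaged baseline mentioned in this excerpt is simple enough to write out. The sketch below assumes an undirected graph stored as an adjacency dictionary; the function name and toy graph are hypothetical.

```python
def triangle_degree(adj, triangle):
    """Average degree of the three nodes of a triangle (2-simplex)."""
    a, b, c = triangle
    return (len(adj[a]) + len(adj[b]) + len(adj[c])) / 3


# Hypothetical toy graph: triangle a-b-c plus a pendant node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
avg = triangle_degree(toy, ("a", "b", "c"))
```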
... They play a crucial role in various network tasks such as information spreading [6], synchronization, and control [16]. Notably, a small fraction of vital nodes can influence a large number of nodes within the entire network [11]. To identify these vital nodes, different methods have been developed, including structural centralities, iterative refinement centralities, and deep-learning-based approaches. ...
... Additionally, some algorithms cannot be classified as above; these include the entanglement models [23], [24] and the random walk-based gravity model [25]. A more detailed overview can be found in the following reference [11]. In Table I, we list some widely used centrality metrics along with their formulas and explanations, and some of them serve as baselines for comparison with our model. ...
Preprint
Simplicial complexes have recently been in the limelight of higher-order network analysis, where a minority of simplices play crucial roles in structures and functions due to network heterogeneity. We find a significant inconsistency between identifying influential nodes and simplices. Therefore, it remains elusive how to characterize simplices' influence and identify influential simplices, despite the relative maturity of research on influential nodes (0-simplices) identification. Meanwhile, graph neural networks (GNNs) are potent tools that can exploit network topology and node features simultaneously, but they struggle to tackle higher-order tasks. In this paper, we propose a higher-order graph learning model, named influential simplices mining neural network (ISMnet), to identify vital h-simplices in simplicial complexes. It can tackle higher-order tasks by leveraging novel higher-order presentations: hierarchical bipartite graphs and higher-order hierarchical (HoH) Laplacians, where targeted simplices are grouped into a hub set and can interact with other simplices. Furthermore, ISMnet employs learnable graph convolutional operators in each HoH Laplacian domain to capture interactions among simplices, and it can identify influential simplices of arbitrary order by changing the hub set. Empirical results demonstrate that ISMnet significantly outperforms existing methods in ranking 0-simplices (nodes) and 2-simplices. In general, this novel framework excels in identifying influential simplices and promises to serve as a potent tool in higher-order network analysis.
... Complex networks are based on the usage of nodes in the analysis process [12]. It often leads to incompatible nodes assessment taking into account the centrality measures. ...
... This can be expressed using an Eq. (12). ...
... Sunil 15 provided a GNN-based (Graph Neural Network) inductive framework to approximate BC using the message passing mechanism. These methods are almost all based on the macro-statistical characteristics of graphs 16 , and pay less attention to the routing characteristics of the actual Internet. Unlike the above methods, another research takes certain characteristics of the actual Internet into account. ...
... The algorithm first uses the probe source V_v to measure the target area A continuously and at high frequency, and extracts the nodes and edges of target area A based on the measurement paths (a path from the probe source to the target node) to construct the initial topology structure of target area A (lines 1-5). Then, the number of occurrences of all measurement paths is counted, and the path with the highest number of occurrences is retained as a stable path to construct a denoised graph G_s (lines 6-16). Afterwards, the number of times each edge in G_s appears in the stable paths is counted as the weight of the edge to obtain a weighted graph (lines 17-22). ...
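The denoising and weighting steps in this excerpt can be sketched roughly as below. The sketch assumes that measured paths are grouped by their target node (the last hop) and that one stable path is kept per target; the excerpt does not state the grouping rule, so this is an assumption, as are the function name and the sample paths.

```python
from collections import Counter, defaultdict


def stable_weighted_graph(measured_paths):
    """Keep the most frequent path per target and weight edges by usage."""
    # Count how often each distinct path to each target was observed.
    by_target = defaultdict(Counter)
    for path in measured_paths:
        by_target[path[-1]][tuple(path)] += 1

    # The most frequently observed path per target is treated as stable.
    stable = [counts.most_common(1)[0][0] for counts in by_target.values()]

    # Edge weight = number of stable paths that traverse the edge.
    weights = Counter()
    for path in stable:
        for u, v in zip(path, path[1:]):
            weights[(u, v)] += 1
    return stable, dict(weights)


# Hypothetical probing results from a single probe source.
paths = [
    ["probe", "r1", "r2", "t1"],
    ["probe", "r1", "r2", "t1"],
    ["probe", "r3", "r2", "t1"],  # rarer route, treated as noise
    ["probe", "r1", "t2"],
]
stable, weights = stable_weighted_graph(paths)
```

In the sample data the rarer route through `r3` is dropped, and the edge from `probe` to `r1` gets weight 2 because both stable paths use it.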
Article
Full-text available
Vital node discovery is a hotspot in network topology research. The key is using the Internet’s routing characteristics to remove noisy paths and accurately describe the network topology. In this manuscript, a vital regional routing nodes discovery algorithm based on routing characteristics is proposed. We analyze the stability of multiple rounds of measurement results to overcome the single vantage point’s path deviation. The unstable paths are eliminated from the regional network which is constructed through probing for target area, and the pruned topology is more in line with real routing rules. Finally, we weight the edge based on the actual network’s routing characteristics and discover vital nodes in combination with the weighting degree. Unlike existing algorithms, the proposed algorithm reconstructs the network topology based on communication and transforms unweighted network connections into weighted connections. We can evaluate the node importance in a more realistic network structure. Experiments on the Internet measurement data (275 million probing results collected in 107 days) demonstrate that: the proposed algorithm outperforms four existing typical algorithms. Among 15 groups of comparison in 3 cities, our algorithm found more (or the same number) backbone nodes in 10 groups and found more (or the same number) national backbone nodes in 13 groups.
... Edge Clustering Coefficient: Edge clustering (NC) is defined as the addition of edge features along with node features [18]. NC is a method for edge or node clustering that appraises centrality by taking into account the centrality of a node's neighbours as well. ...
... According to the rules of lethality-centrality hypothesis, hub nodes participate in most routes and these pathways are conserved. In a scale-free network like Wireless network, the centrality-lethality principle is valid [18]. As a result, we compare the different centrality metrics for biological networks in this research. ...
Article
Full-text available
Analysis of protein interaction is important for detailing the cell physiology and predicting disease conditions and drug optimizations. The detection of the crucial proteins in Protein Protein Interaction (PPI) networks is made easier by the accession of these interaction data. The revelation of essential protein nodes in PPI networks is possible using a variety of centrality methods. The hub nodes are decisive in a biological structure because these nodes adjoin profoundly and operate as regulatory hubs. The majority of techniques, however, focus on the topological characteristics of PPI. For determining essential proteins, topology and gene annotation are rarely combined. Graph-theoretic methods are used to infer this biological framework in PPI networks. The proteins, their interconnections, and the subnetworks are the main subjects of the topological study. In this study, we examine the standard centrality metrics. In order to identify the PPI's prominent nodes and the influence of topological features on centrality metrics, we carefully examined each node's centrality aspect. In this research, we consider the Mammalian Protein Database (MIPS) and Biological General Repository for Interaction Networks (BioGRID) datasets, and the empirical analysis of individual centrality measures is performed on PPI networks. The experimental interpretation shows the behavior of centrality measures on the datasets.
... Studying complex networks can reveal numerous potential associations between elements. Traditional network models represent individual elements as nodes and capture the relationships between elements through links connecting the nodes [1][2][3]. In this way, the relationships between pairs of individuals can be represented as links in a social network, the functions of pairs of proteins can be abstracted as links in a protein network, and an email network can also serve as a representation of interactions among pairs of colleagues or other people [4][5][6][7]. ...
Article
Full-text available
Link prediction is a crucial area of study within complex networks research. Mapping nodes to low-dimensional vectors through network embeddings is a vital technique for link prediction. Most of the existing methods employ “node–edge”-structured networks to model the data and learn node embeddings. In this paper, we initially introduce the Clique structure to enhance the existing model and investigate the impact of introducing two Clique structures (LECON: Learning Embedding based on Clique Of the Network) and nine motifs (LEMON: Learning Embedding based on Motif Of the Network), respectively, on experimental performance. Subsequently, we introduce a hypergraph to model the network and reconfigure the network by mapping hypermotifs to two structures: open hypermotif and closed hypermotif, respectively. Then, we introduce hypermotifs as supernodes to capture the structural similarity between nodes in the network (HMRLH: HyperMotif Representation Learning on Hypergraph). After that, taking into account the connectivity and structural similarity of the involved nodes, we propose the Depth and Breadth Motif Random Walk method to acquire node sequences. We then apply this method to the LEMON (LEMON-DB: LEMON-Depth and Breadth Motif Random Walk) and HMRLH (HMRLH-DB: HMRLH-Depth and Breadth Motif Random Walk) algorithms. The experimental results on four different datasets indicate that, compared with the LEMON method, the LECON method improves experimental performance while reducing time complexity. The HMRLH method, utilizing hypernetwork modeling, proves more effective in extracting node features. The LEMON-DB and HMRLH-DB methods, incorporating new random walk approaches, outperform the original methods in the field of link prediction. Compared with state-of-the-art baselines, the proposed approach in this paper effectively enhances link prediction accuracy, demonstrating a certain level of superiority.
... Numerous academics have developed centrality-based approaches to diverse topological structures that restrict performance and adaptability. These centrality-based techniques have been further classified into a variety of categories [35]. The distance-based methods focus on the smallest route between nodes and include betweenness centrality [36] and proximity centrality [37]. ...
... Numerous research efforts within the field of network science have been dedicated to the exploration of techniques for identifying influential nodes within networks [8]. Various centrality measures, such as degree centrality, betweenness centrality, eigenvector centrality, and closeness centrality, have been introduced to capture distinct features of node influence [9]. However, these traditional centrality measures focus on local network properties and may overlook the combined effects of global information that contribute to node significance. ...
Article
Full-text available
Over the past decade, there has been extensive research conducted on complex networks, primarily driven by their crucial role in understanding the various real-world networks such as social networks, communication networks, transportation networks, and biological networks. Ranking influential nodes is one of the fundamental research problems in the areas of rumor spreading, disease research, viral marketing, and drug development. Influential nodes in any network are used to disseminate the information as fast as possible. Centrality measures are designed to quantify the node’s significance and rank the influential nodes in complex networks. However, these measures typically focus on either the local or global topological structure within and outside network communities. In particular, many measures limit their ability to capture the node’s overall impact on small-scale networks. To address these challenges, we develop a novel centrality measure called Isolating Clustering Distance Centrality (ICDC) by integrating the isolating and clustering coefficient centrality measures. The proposed metric gives a more thorough assessment of the node’s importance by integrating the local isolation and global topological influence in large-scale complex networks. We employ the SIR and ICM epidemic models to study the efficiency of ICDC against traditional centrality measures across real-world complex networks. Our experimental findings consistently highlight the superior efficacy of ICDC in terms of fast spreading and computational efficiency when compared to existing centrality measures.
... Social network analysis is widely utilized to study urban spatial network structures. In the article "Vital Nodes Identification in Complex Networks" [37], the identification of vital nodes in real networks and their roles in structure and function were explored. The article provides a comprehensive review of concepts, metrics, and methods, comparing well-known techniques across networks. ...
Article
Full-text available
The Chengdu–Chongqing urban agglomeration (CCUA), as the only national-level city cluster in southwestern China, serves as a strategic support for the Yangtze River Economic Belt and an important demonstration area for promoting new urbanization in the country. The study of the networked characteristics of the CCUA contributes to a systematic understanding of its spatial connectivity patterns, optimization of spatial structure and layout, and promotion of high-quality regional development. In this study, we constructed models for traffic flow, information flow, migration flow, and composite flow to calculate the strength of connections between cities and the total flow of various elements in the CCUA. ArcGIS spatial visualization tools were used to depict the spatial connectivity patterns of the element flows within the CCUA. Additionally, social network analysis methods, including network density, centrality, and cohesive subgroups analysis, were employed to reveal the spatial network structure characteristics of the CCUA. The findings are as follows: (1) The overall structure of the cities within the CCUA is relatively loose, with significant differences in connectivity strength. It exhibits a west-strong and east-weak pattern, with Chengdu-Chongqing, Chengdu-Deyang, Chengdu-Mianyang, and Chengdu-Meishan occupying the top tier, while Zigong and Ya'an have relatively weak connections with other cities. Chengdu and Chongqing have prominent positions in the CCUA, with Chengdu having a more prominent core position compared to Chongqing, resulting in an overall hierarchical distribution of “1 + 1+7 + 7”. (2) The network density of the element flows in the CCUA is relatively low, indicating a generally weak element connectivity. The centrality of cities other than Chengdu and Chongqing is at a moderate to lower level, suggesting a weak overall resource connectivity capacity in the CCUA. 
(3) Most cities tend to form cohesive subgroups based on geographic proximity, while the cohesive subgroup in Chongqing is still in its early stages of development due to administrative boundaries. The research results quantitatively depict the spatial network structure characteristics of the CCUA, providing theoretical references for its development.
... Centrality analysis investigates the most influential nodes in a network. There are multiple definitions of centrality that exploit either local or global characteristics of the networks [33,34]. Here, we perform a comparative analysis of the strength (number of flights in an airport) and degree (number of routes in an airport) centralities of the various components. ...
Article
Full-text available
The topological structure of the world air transportation network has been the subject of much research. However, to better understand the reality of air networks, one can consider the traffic, the number of passengers, or the distance between flights. This paper studies the weighted world air transportation network through the component structure, recently introduced in the network literature, by using the number of flights. The component structure is based on the community or multiple core-periphery structures and splits the network into local and global components. The local components capture the regional flights of these two mesoscopic structures (dense parts). The global components capture the inter-regional flights (links between the dense parts). We perform a comparative analysis of the world air transportation network and its components with their weighted counterparts. Moreover, we explore the strength and the s-core of these networks. Results display fewer local components well delimited and more global components covering the world than the unweighted world air transportation network. Centrality analysis reveals the difference between the top airports with high traffic and the top airports with high degrees. This difference is more pronounced in the global air network and the largest global component. Core analysis shows similitude between the s-core and the k-core for the local and global components, even though the latter includes more airports. For the world air network, the North and Central America-Caribbean airports dominate the s-core, whereas the European airports dominate the k-core.
... When considering the diffusion of social influence through social networks, we first need to establish an appropriate diffusion model. In the literature, diverse diffusion models have been presented [41]. Since this paper only deals with connection-specific diffusion models, the widely used model IC is adopted to imitate the spreading influence on a network. ...
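For readers unfamiliar with the IC model named in this excerpt, the sketch below shows the standard independent cascade process: each newly activated node gets one chance to activate each inactive out-neighbour with some probability. The uniform probability `p`, the function name, and the toy graph are assumptions for illustration and are not taken from the cited paper.

```python
import random


def independent_cascade(adj, seeds, p, rng=None):
    """Simulate one run of the independent cascade model.

    adj maps each node to an iterable of out-neighbours; p is a uniform
    activation probability (an assumed simplification).
    """
    rng = rng or random.Random()
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj.get(u, ()):
                # Each newly active node gets a single attempt per neighbour.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


# Hypothetical directed toy network.
toy = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
everyone = independent_cascade(toy, ["a"], p=1.0)
only_seed = independent_cascade(toy, ["a"], p=0.0)
```

With `p=1.0` the cascade reaches every node reachable from the seed, and with `p=0.0` it never leaves the seed set; intermediate values are usually averaged over many runs to estimate the expected spread.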
Article
Full-text available
The influence maximization problem, which has attracted great attention in social network analysis, aims at selecting a small set of influential spreaders so that the information cascade triggered by the seed set is maximized. The majority of existing works focus on developing single-stage seeding strategies that ignite all the seeds before the influence spreads. However, such strategies cannot depict practical scenarios, where one would like to make further decisions based on observed activation. In this paper, we investigate policies for the intractable sequential influence maximization problem. A Q-learning-driven discrete differential evolution algorithm is proposed, in which the reinforcement Q-learning model is treated as a parameter controller that adaptively adjusts the parameters during the evolution of the algorithm. The policy distributes the seeding actions over the spreading process by dynamically estimating the latest node status of the network. Extensive simulations are conducted on six practical social networks, and the findings demonstrate the superiority and effectiveness of the hybrid meta-heuristic algorithm compared with state-of-the-art methods.
... A detailed discussion on node ranking method for weighted networks, random walk based methods and iterative enhancement based methods is given in [25]. Also, a useful summary of state of the art of node ranking methods is given in [26]. ...
Article
Full-text available
Contagion spread is a common phenomenon observable on a variety of complex networks. The knowledge of key spreaders and contagion dynamics facilitates the design of applications that can either reduce the spread of unwanted contagion or amplify the proliferation of desired ones. Hence, it is essential to identify and rank the influential (key) spreaders in complex networks. Extended neighbourhood coreness (Cnc+) is one such method that uses the k-shell index to identify and rank the influential spreaders. The neighborhood of a node plays a very important role in contagion spread and the combination of local and global topological information of a node can better capture the spreading influence of the nodes. In this paper, a measure, namely, hybrid Cnc+ coreNess (HCN) is proposed that extends Cnc+ by including first and second order neighbourhood of a node (local information) along with the k-shell index. In experiments, HCN is compared with state of the art methods for both real and artificial datasets. The results show that HCN is accurate and better than state of the art methods. Further, least variation in ranking accuracy is observed in experiments of parameter variation for artificial networks. Computational complexity analysis shows that the proposed method achieves high accuracy incurring a small computational penalty.
... Complex networks hold substantial significance given their extensive reach and impact on diverse aspects of our lives. At the heart of complex networks lie the key players, also known as influential players [1], vital players [2], or critical players [3]. These players represent certain nodes, edges, or substructures that, when removed, can substantially degrade a network's specific functionality [4]. ...
Article
Full-text available
The problem of finding key players in a graph, also known as network dismantling, or network disintegration, aims to find an optimal removal sequence of nodes (edges, substructures) through a certain algorithm, ultimately causing functional indicators such as the largest connected component (GCC) or network pair connectivity in the graph to rapidly decline. As a typical NP-hard problem on graphs, recent methods based on reinforcement learning and graph representation learning have effectively solved such problems. However, existing reinforcement-learning-based key-player-identification algorithms often need to remove too many nodes in order to achieve the optimal effect when removing the remaining network until no connected edges remain. The use of a minimum number of nodes while maintaining or surpassing the performance of existing methods is a worthwhile research problem. To this end, a novel algorithm called MiniKey was proposed to tackle such challenges, which employs a specific deep Q-network architecture for reinforcement learning, a novel reward-shaping mechanism based on network functional indicators, and the graph-embedding technique GraphSage to transform network nodes into latent representations. Additionally, a technique dubbed ‘virtual node technology’ is integrated to grasp the overarching feature representation of the whole network. This innovative algorithm can be effectively trained on small-scale simulated graphs while also being scalable to large-scale real-world networks. 
Importantly, experiments on six simulated datasets and six real-world datasets demonstrate that MiniKey can achieve optimal performance, striking a balance between the effectiveness of key node identification and the minimization of the number of nodes used, which holds potential for real-world applications such as curbing misinformation spread in social networks, optimizing traffic in transportation systems, and identifying key targets in biological networks for targeted interventions.
... This dataset can be used to further investigate information diffusion in social networks and could also be useful in the study of epidemiological models. There is a vast literature on identifying 'influential spreaders' within social or epidemiological networks [3], with much work focused on developing centrality measures to more accurately and efficiently perform such identification [4]. These studies typically propose a novel centrality measure and evaluate its performance on various real-world networks. ...
Article
Full-text available
We present a social network dataset based on interactions between members of the 117th United States Congress between Feb. 9, 2022, and June 9, 2022. The dataset takes the form of a directed, weighted network in which the edge weights are empirically obtained “probabilities of influence” between all pairs of Congresspeople. Twitter's application programming interface (API) V2 was used to determine the number of times each member of Congress retweeted, quote tweeted, replied to, or mentioned other Congressional members, and the probability of influence was found by normalizing the summed influence by the number of tweets issued by each Congressperson. This network may be of particular interest to the study of information diffusion within social networks.
... The chosen seed nodes have a great influence on the identified communities and on the performance and efficiency of the community-detection methods [7,8]. Until now, different centrality measures tied to the network topology have been introduced to solve the issue of finding these initial community seeds, as the locally most influential nodes in a complex network [9]. ...
Article
Full-text available
One of the most important problems in complex networks is the location of nodes that are essential or play a main role in the network. Nodes with main local roles are the centers of real communities. Communities are sets of nodes of complex networks and are densely connected internally. Choosing the right nodes as seeds of the communities is crucial in determining real communities. We propose a new centrality measure named density-based entropy centrality for the local identification of the most important nodes. It measures the entropy of the sum of the sizes of the maximal cliques to which each node and its neighbor nodes belong. The proposed centrality is a local measure for explaining the local influence of each node, which provides an efficient way to locally identify the most important nodes and for community detection because communities are local structures. It can be computed independently for individual vertices, for large networks, and for not well-specified networks. The use of the proposed density-based entropy centrality for community seed selection and community detection outperforms other centrality measures.
... NEP, which is a significant metric utilized to describe the carbon sequestration effect of vegetation in a region, can indicate the discrepancy between the net primary productivity (NPP) of vegetation and soil heterotrophic respiration. In this study, we employed the MODIS MOD17A3HGF product to acquire NPP data in the Eyu region [47]. This product has been extensively utilized in vegetation NPP research in northwestern China. ...
Article
Full-text available
Optimizing the connectivity-carbon sequestration coupling coordination of forest and grassland ecological spaces (F&GES) is a crucial measure to enhance carbon sequestration effectively in mining areas. However, the prevailing strategies for optimizing F&GES often overlook the connectivity-carbon sequestration coupling coordination of the network. Therefore, this study aimed to propose a novel restoration plan to improve the connectivity-carbon sequestration coupling coordination of existing networks. Taking a typical mining area in northwestern China (Eyu County) as an example, we extracted the existing F&GES based on remote sensing ecological indicators and ecological risk assessments. Subsequently, we optimized the network using the connectivity-carbon sequestration coupling coordination degree (CSCCD) model from the perspective of connectivity-carbon sequestration coupling coordination, proposed potential alternative optimization schemes, and evaluated the optimization effects. The results showed that the range of Eyu County's F&GES structure had been determined. Ecological source sites with better carbon sequestration effects were primarily distributed in the central and northeastern parts of Eyu County. After optimization, the network added 26 ecological patches, and the added area reached 641.57 km². Furthermore, the connectivity robustness, edge restoration robustness, and node restoration robustness of the optimized network were significantly improved, and the carbon sequestration effect of the forest and grassland ecological space was increased by 6.78%. The contribution rate of ecological source sites was 97.66%, and that of ecological corridors was 2.34%. The CSCCD model proposed in this study can effectively improve the carbon sequestration effect in mining areas, promote carbon neutrality, and save network optimization time while improving efficiency.
This restoration strategy is also applicable to forest and grassland ecosystem management and optimization of ecological spaces in other mining areas, which has positive implications for promoting ecological civilization construction and sustainable development.
... Various centrality measures have been proposed to identify such influential spreaders in the SIR model with homogeneous transmission probability, including degree, k-core, betweenness centrality, eigenvector centrality, and PageRank, among others [3]. Such classic approaches suffer from two major drawbacks, however: they do not generally take into account specific details of the spreading model, and they do not apply to WD networks. ...
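As a rough illustration of the setting mentioned here, the following Python sketch simulates a discrete-time SIR process with a homogeneous transmission probability and averages the outbreak size for a given seed node. The parameter names (`beta`, `gamma`), the synchronous update scheme, and the helper names are assumptions made for illustration, not the procedure of any cited paper.

```python
import random


def sir_outbreak_size(adj, seed_node, beta, gamma=1.0, rng=None):
    """Run one SIR realisation and return the number of nodes ever infected.

    beta  : per-contact, per-step infection probability (same for every edge)
    gamma : per-step recovery probability (must be > 0 so the process ends)
    """
    rng = rng or random.Random()
    infected = {seed_node}
    recovered = set()
    while infected:
        newly_infected = set()
        for v in infected:
            for u in adj[v]:
                susceptible = u not in infected and u not in recovered
                if susceptible and rng.random() < beta:
                    newly_infected.add(u)
        healed = {v for v in infected if rng.random() < gamma}
        recovered |= healed
        infected = (infected - healed) | newly_infected
    return len(recovered)


def mean_outbreak_size(adj, seed_node, beta, runs=200, seed=0):
    """Monte Carlo estimate of a node's spreading influence."""
    rng = random.Random(seed)
    total = sum(sir_outbreak_size(adj, seed_node, beta, rng=rng) for _ in range(runs))
    return total / runs


# Hypothetical toy graph: a path 0-1-2-3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

Ranking nodes by `mean_outbreak_size` gives a simulation-based reference against which a centrality-based ranking can be compared, for example with a rank correlation coefficient.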
... For small-world networks or larger community networks, the effects of the above algorithms are not satisfactory. For more relevant node importance evaluation index methods, please refer to the review literature [27]. ...
Article
Full-text available
In complex networks, identifying influential nodes is of great significance because of its wide range of applications. The proposed method integrates local and global correlation properties. For global features, a K-shell decomposition method fused with degree is used to improve the discrimination between nodes. For local characteristics, the Salton index is introduced to capture the association between each node and its adjacent nodes. Analysis and comparison with multiple existing methods show that the proposed method identifies key nodes more accurately and thus helps to disintegrate the network quickly. The final verification on manually constructed networks also shows that the method is suitable for identifying important nodes in small-world networks and community networks.
... The problems encountered in node importance ranking have been thoroughly discussed before [47]. Here, we focus on two concerned issues: (i) it is difficult to find an index that best quantifies the importance of nodes in all possible measures, and (ii) most known methods were essentially designed to rank individual vital nodes instead of a set of vital nodes, while the latter is more relevant to many real-world applications such as group infection-testing and immunization. In the process of ranking cliques, we find that, in terms of network robustness, synchronization, and propagation, almost all simulation results satisfy the ranking of HOP > HOC > HOD > HOH, where ">" means "better than." ...
Article
Traditional network analysis focuses on the representation of complex systems with only pairwise interactions between nodes. However, the higher-order structure, which is beyond pairwise interactions, has a great influence on both network dynamics and function. Ranking cliques could help understand more emergent dynamical phenomena in large-scale complex networks with higher-order structures, regarding important issues, such as behavioral synchronization, dynamical evolution, and epidemic spreading. In this paper, motivated by multi-node interactions in a topological simplex, several higher-order centralities are proposed, namely, higher-order cycle (HOC) ratio, higher-order degree, higher-order H-index, and higher-order PageRank (HOP), to quantify and rank the importance of cliques. Experiments on both synthetic and real-world networks support that, compared with other traditional network metrics, the proposed higher-order centralities effectively reduce the dimension of a large-scale network and are more accurate in finding a set of vital nodes. Moreover, since the critical cliques ranked by the HOP and the HOC are scattered over a complex network, the HOP and the HOC outperform other metrics in ranking cliques that are vital in maintaining the network connectivity, thereby facilitating network dynamical synchronization and virus spread control in applications.
... In previous studies, traditional centrality methods [6,7] focus on a network's structural properties and use top ranked nodes by different centralities as influential spreaders. In general, centralities can be classified as local and global measures [8]. Local measures are skewed towards the information of a neighborhood, such as degree centrality [9]. ...
Article
Full-text available
Network epidemiology plays a fundamental role in understanding the relationship between network structure and epidemic dynamics, among which identifying influential spreaders is especially important. Most previous studies aim to propose a centrality measure based on network topology to reflect the influence of spreaders, which manifest limited universality. Machine learning enhances the identification of influential spreaders by combining multiple centralities. However, several centrality measures utilized in machine learning methods, such as closeness centrality, exhibit high computational complexity when confronted with large network sizes. Here, we propose a two-phase feature selection method for identifying influential spreaders with a reduced feature dimension. Depending on the definition of influential spreaders, we obtain the optimal feature combination for different synthetic networks. Our results demonstrate that when the datasets are mildly or moderately imbalanced, for Barabasi–Albert (BA) scale-free networks, the centralities’ combination with the two-hop neighborhood is fundamental, and for Erdős–Rényi (ER) random graphs, the centralities’ combination with the degree centrality is essential. Meanwhile, for Watts–Strogatz (WS) small world networks, feature selection is unnecessary. We also conduct experiments on real-world networks, and the features selected display a high similarity with synthetic networks. Our method provides a new path for identifying superspreaders for the control of epidemics.
... The importance of nodes can be viewed from two perspectives: location in the network and network propagation dynamics [49]. Starting from the topology of the network, Lü et al. classified node importance ranking algorithms for single-layer networks into three categories according to the number of neighbors, paths, and eigenvectors of the adjacency matrix of nodes [50]. ...
Article
Complex systems widely exist in nature and human society. There are complex interactions between system elements in a complex system, and systems show complex features at the macro level, such as emergence, self-organization, uncertainty, and dynamics. These complex features make it difficult to understand the internal operation mechanism of complex systems. Networked modeling of complex systems is a favorable means of understanding complex systems. It not only represents complex interactions but also reflects essential attributes of complex systems. This paper summarizes the research progress of complex systems modeling and analysis from the perspective of network science, including networked modeling, vital node analysis, network invulnerability analysis, network disintegration analysis, resilience analysis, complex network link prediction, and the attacker-defender game in complex networks. In addition, this paper presents some points of view on the trend and focus of future research on network analysis of complex systems.
... These central nodes are important for pathway analysis. The centrality-lethality principle states that hub nodes are involved in the majority of the pathways and that these pathways are conserved [23][24]. The centrality-lethality principle holds for any scale-free network [25]. ...
Article
Full-text available
Analysis of protein interactions is widely recognized as a way to understand cell physiology and disease conditions. The growing accumulation of interaction data facilitates the recognition of essential proteins in Protein Protein Interaction (PPI) networks. An array of centrality measures is available to uncover essential proteins in PPI networks. However, the majority of approaches are centered around the topological properties of the PPI network. Few approaches integrate gene annotation with topology for predicting essential proteins. The biological framework of a PPI network is inferred using graph-theoretic approaches. The topological analysis focuses on proteins, their interactions, and the subnetworks. In this research, we review the common centrality measures. We thoroughly studied the centrality aspect of each node in the PPI network to detect the influential nodes and the impact of topological features on centrality measures. We applied centrality measures to the PPI networks obtained from the Biological General Repository for Interaction Networks (BioGRID) and Mammalian Protein Protein Database (MIPS) datasets. The experimental evaluation shows the behavior of the centrality measures on these datasets.
... The problem of vital node identification has attracted increasing attention in different fields [9]. Typically, researchers build user social networks based on participant interaction data collected over a period of time and then work to identify key nodes in the network. ...
Preprint
Backbone members are recognized as essential parts of an organization, yet their role and mechanisms of functioning in networks are not fully understood. In this paper, we propose a new framework called Twotier to analyze the evolution of community sports organizations (CSOs) and the role of backbone members. Tier-one establishes a dynamic user interaction network based on grouping relationships, and weighted k-shell decomposition is used to select backbone members. We perform community detection and capture the evolution of two separate sub-networks: one formed by backbone members and the other formed by other members. In Tier-two, the sub-networks are abstracted, revealing a core-periphery structure in the organization where backbone members serve as bridges connecting all parts of the network. Our findings suggest that relying on backbone members can keep newcomers actively involved in rewarding activities, while non-rewarding activities solidify relations between backbone members.
Article
In social network analysis, identifying the important nodes (key nodes) is a significant task in various applications. There are three most popular related tasks named influential node ranking, influence maximization, and network dismantling. Although these studies are different due to their own motivation, they share many similarities, which could confuse the non-domain readers and users. Moreover, few studies have explored the correlations between key nodes obtained from different tasks, hindering our further understanding of social networks. In this paper, we contribute to the field by conducting an in-depth survey of different kinds of key nodes through comparing these key nodes under our proposed framework and revealing their deep relationships. First, we clarify and formalize three existing popular studies under a uniform standard. Then we collect a group of crucial metrics and propose a fair comparison framework to analyze the features of key nodes identified by different research fields. From a large number of experiments and deep analysis on twenty real-world datasets, we not only explore correlations between key nodes derived from the three popular tasks, but also summarize insightful conclusions that explain how key nodes differ from each other and reveal their unique features for the corresponding tasks. Furthermore, we show that Shapley centrality could identify key nodes with more generality, and these nodes could also be applied to the three popular tasks simultaneously to a certain extent.
Article
In a manufacturing network, fluctuations in process operation time can delay the total completion time, which poses a huge challenge to controlling the entire production schedule. For aircraft manufacturing industries, the processes that have a significant impact on the completion time of the assembly line need to be identified and monitored in order to ensure the final assembly efficiency and reliability. However, due to the large number of process nodes and the complexity of resource constraints and relationship constraints, traditional node centrality algorithms cannot identify influential nodes accurately. Therefore, based on complex network theory, this paper studies the influential node identification problem in the context of assembly lines. First, the Resource-Process Coupling Network for the assembly line is constructed based on the production background and network resource characteristics. Then, based on the PageRank algorithm and the idea of resource iteration, an Improved PageRank Algorithm (IPRA) is proposed to identify influential process nodes, which brings the resource and time parameters into the allocation rules. Finally, a comparative analysis of IPRA and existing algorithms is conducted on simulation results, which concludes that the proposed method can better identify influential nodes in actual complex production networks. Furthermore, this paper identifies influential process nodes of an aircraft assembly line based on the case of a commercial aircraft manufacturer.
Article
Hypergraph is the model of relations lying in clusters of objects. Identifying vital nodes is a fundamental problem in the analysis of the hypergraph. To reflect the multilayer feature of the hypergraph, in this paper, we deconstruct the hypergraph into a simplicial complex and analyze the homological dual relations of boundary and coboundary between simplices. For clarity, these two relations are summarized into a bidirectional graph, called the simplicial diagram, which provides a global framework for the exploration of the hypergraph. To determine the node importance in the hypergraph, we propose a parameter-free eigenvector centrality for weighted hypergraphs in terms of a simplicial complex, named Simplicial DualRank centrality. For each simplex, we define two indices of importance, the inner centrality and the outer centrality. Inner centrality transmits according to the relation of coboundary, which converts to outer centrality at the hyperlinks; in duality, outer centrality transmits according to the relation of boundary, which converts to inner centrality at the nodes. Therefore, a circuit of centrality is constructed on the simplicial diagram, the steady state of which defines the Simplicial DualRank centrality of all the simplices in the hypergraph. Moreover, we apply the Simplicial DualRank centrality to weighted complex networks, which results in a variant of the classical eigenvector centrality. Finally, experimental results in a science collaboration dataset show that the Simplicial DualRank can identify Nobel laureates from the prize-winning papers in Physics, top scientists should select collaborators more carefully to maintain their research quality, and scholars tend to find relatively effective collaborations in their future research.
Article
Identifying influential spreaders has theoretical and practical significance in complex networks. Traditional centrality methods can efficiently find a single spreader, but it could lead to influence redundancy and high initializing costs when used to identify a set of multiple spreaders. A cycle structure is one of the most crucial reasons for the complexity of a network and the cornerstone of the feedback effect. From this novel perspective, we propose a new method based on basic cycles in networks to identify multiple influential spreaders with superior spreading performance and low initializing costs. Experiments on six empirical networks show that the spreaders selected by the proposed method are more scattered in the network and yield the best spreading performance compared with those on seven well-known methods. Importantly, the proposed method is the most cost effective under the same spreading performance. The cycle-based method has the advantage of generating multiple solutions. Our work provides new insights into identifying multiple spreaders and hence can benefit wide applications in practical scenarios.
Article
Key nodes play a vital role in the transportation network by significantly influencing its structure, function, and reliability. Their failure can severely limit the transportation efficiency and passenger flow of urban agglomerations. To address the limitations of a single traffic mode and incomplete evaluation indexes, this paper proposes a weighted multi-layer network node importance evaluation method that considers node connectivity, diverse transportation modes, and service capabilities (CDSM). Firstly, based on the long-distance bus stations, railway stations, and airports in the Beijing-Tianjin-Hebei urban agglomeration (BTHUA), the comprehensive transportation network (CTN) is constructed. Secondly, evaluation indexes including degree, closeness, capacity, and grade are selected to build the weighted model and obtain the comprehensive importance of nodes. Finally, the effectiveness of the method is verified by the susceptible-infected-recovered model. The results show that: 1) The CTN in this paper can reflect real traffic links among different transportation modes in urban agglomerations. 2) When the top 5 and top 10 key nodes identified by the CDSM are used as initial infected nodes, the infection and propagation rates of the CTN are higher than those obtained with traditional methods such as degree, betweenness, closeness, and k-shell. Specifically, the infection rate is 2.8% and 3.2% higher than the average of these methods, and the propagation is 7 and 2.3 time steps faster, respectively. 3) The more key nodes initially infected, the higher the infection and propagation rates of the CTN. 4) The key nodes identified in this paper include important transportation hubs of BTHUA, aligned with reality.
Article
Full-text available
Across the three domains of life, the circadian clock is known to regulate vital physiological processes such as growth, development, and defence by anticipating environmental cues. In this work, we report an integrated network theoretic methodology comprising random walk with restart and graphlet degree vectors to characterize genome-wide core circadian clock and clock associated raw candidate proteins in a plant for which protein interaction information is available. As a case study, we have implemented this framework in Ocimum tenuiflorum (Tulsi), one of the most valuable medicinal plants, which has been utilized since ancient times in the management of a large number of diseases. For that, 24 core clock (CC) proteins were mined in 56 template plant genomes to build their hidden Markov models (HMMs). These HMMs were then used to identify 24 core clock proteins in O. tenuiflorum. The local topology of the interologous Tulsi protein interaction network was explored to predict the CC associated raw candidate proteins. Statistical and biological significance of the raw candidates was determined using permutation and enrichment tests. A total of 66 putative CC associated proteins were identified and their functional annotation was performed.
Article
Full-text available
Designing network systems able to sustain functionality after random failures or targeted attacks is a crucial aspect of networks. This paper investigates several strategies of link selection aiming at enhancing the robustness of a network by optimizing the effective graph resistance. In particular, we study the problem of optimizing this measure through two different strategies: the addition of a non-existing link to the network and the protection of an existing link whose removal would result in a severe network compromise. For each strategy, we exploit a genetic algorithm as optimization technique, and a computationally efficient technique based on the Moore–Penrose pseudoinverse matrix of the Laplacian of a graph for approximating the effective graph resistance. We compare these strategies to other state-of-the art methods over both real-world and synthetic networks finding that our proposals provide a higher speedup, especially on large networks, and results closer to those provided by the exhaustive search.
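For orientation, the standard identities linking the effective graph resistance to the Laplacian pseudoinverse can be written as follows. The symbols are chosen here for illustration and assume a connected, undirected graph with $N$ nodes, Laplacian $L$, Moore–Penrose pseudoinverse $L^{+}$, and Laplacian eigenvalues $0 = \mu_1 < \mu_2 \le \dots \le \mu_N$. The approximation scheme used in the paper itself is not reproduced.

```latex
% Pairwise effective resistance from the pseudoinverse of the Laplacian
R_{ij} = (L^{+})_{ii} + (L^{+})_{jj} - 2\,(L^{+})_{ij}

% Effective graph resistance: sum over all node pairs
R_G = \sum_{1 \le i < j \le N} R_{ij}
    = N \,\operatorname{tr}\!\left(L^{+}\right)
    = N \sum_{k=2}^{N} \frac{1}{\mu_k}
```

The second identity follows because the rows of $L^{+}$ sum to zero, so the cross terms $(L^{+})_{ij}$ cancel when summed over all pairs. Adding a link can only lower $R_G$ (or leave it unchanged), which is why link addition and link protection are natural levers for this robustness measure.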
Article
Full-text available
Most transportation networks are determined not only by their topology but also by the traffic taking place on the links. It is therefore crucial to characterize the traffic and its possible correlations with the network’s topology. We first define and introduce some of the tools that allow the characterization of the traffic and the structure of a weighted network. We illustrate these measures on the example of the airport network, in which the nodes are airports and the links represent direct connections. The weight on each link is then given by the number of passengers. The main results are the following: (i) the traffic is very heterogeneous and is distributed according to a broad law; (ii) the number of passengers per connection is not constant and increases with the number of connections of an airport, which implies that traffic and topology are not independent. More generally, these measures show that the traffic cannot in general be ignored and that the modeling of transportation networks has to integrate both the topology and the weights simultaneously. We thus propose a model that explains some of the features observed in real-world networks. The main ingredient in this weighted network growth model is a dynamical coupling between weights and links: every time a new link enters the system, the traffic is perturbed. We show that this simple ingredient makes it possible to understand the structure of some real-world weighted networks as well as the interplay between traffic flows and the network’s architecture.
Article
Full-text available
This paper develops an analytical model of contagion in financial networks with arbitrary structure. We explore how the probability and potential impact of contagion is influenced by aggregate and idiosyncratic shocks, changes in network structure and asset market liquidity. Our findings suggest that financial systems exhibit a robust-yet-fragile tendency: while the probability of contagion may be low, the effects can be extremely widespread when problems occur. And we suggest why the resilience of the system in withstanding fairly large shocks prior to 2007 should not have been taken as a reliable guide to its future robustness.
Article
Full-text available
A number of centrality measures are available to determine the relative importance of a node in a complex network, and betweenness is prominent among them. However, the existing centrality measures are not adequate in network percolation scenarios (such as during infection transmission in a social network of individuals, spreading of computer viruses on computer networks, or transmission of disease over a network of towns) because they do not account for the changing percolation states of individual nodes. We propose a new measure, percolation centrality, that quantifies relative impact of nodes based on their topological connectivity, as well as their percolation states. The measure can be extended to include random walk based definitions, and its computational complexity is shown to be of the same order as that of betweenness centrality. We demonstrate the usage of percolation centrality by applying it to a canonical network as well as simulated and real world scale-free and random networks.
Article
Full-text available
Matrix and tensor completion aim to recover a low-rank matrix / tensor from limited observations and have been commonly used in applications such as recommender systems and multi-relational data mining. A state-of-the-art matrix completion algorithm is Soft-Impute, which exploits the special "sparse plus low-rank" structure of the matrix iterates to allow efficient SVD in each iteration. Though Soft-Impute is a proximal algorithm, it is generally believed that acceleration destroys the special structure and is thus not useful. In this paper, we show that Soft-Impute can indeed be accelerated without compromising this structure. To further reduce the iteration time complexity, we propose an approximate singular value thresholding scheme based on the power method. Theoretical analysis shows that the proposed algorithm still enjoys the fast $O(1/T^2)$ convergence rate of accelerated proximal algorithms. We further extend the proposed algorithm to tensor completion with the scaled latent nuclear norm regularizer. We show that a similar "sparse plus low-rank" structure also exists, leading to low iteration complexity and fast $O(1/T^2)$ convergence rate. Extensive experiments demonstrate that the proposed algorithm is much faster than Soft-Impute and other state-of-the-art matrix and tensor completion algorithms.
Article
Scientific Reports 6 : Article number: 27823; 10.1038/srep27823 published online: 14 June 2016 ; updated: 25 August 2016 This Article contains errors in the Acknowledgements section.
Article
Identifying a set of influential spreaders in complex networks plays a crucial role in effective information spreading. A simple strategy is to choose the top-$r$ ranked nodes as spreaders according to an influence ranking method such as PageRank, ClusterRank or $k$-shell decomposition. In addition, heuristic methods such as hill-climbing, SPIN, degree discount and independent-set-based selection have been proposed. However, these approaches are either time consuming or liable to select spreaders so close together that their spheres of influence overlap. In this report, we present a simple yet effective iterative method named VoteRank to identify a set of decentralized spreaders with the best spreading ability. In this approach, all nodes vote for a spreader in each turn, and the voting ability of the neighbors of the elected spreader is decreased in subsequent turns. Experimental results on four real networks show that under the Susceptible-Infected-Recovered (SIR) model, VoteRank outperforms the traditional benchmark methods in both spreading speed and final affected scale. Moreover, VoteRank is also superior to other group-spreader identifying methods in computational time.
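The voting procedure described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract does not say by how much voting ability is decreased, so the sketch assumes the commonly cited rule that each neighbor of an elected spreader loses 1/<k> of voting ability (with <k> the average degree, floored at zero). The names `build_adjacency` and `vote_rank` are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency dict from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def vote_rank(adj, r):
    """Return up to r spreaders chosen by iterative voting.

    Assumption (not stated in the abstract): after a node is elected,
    each of its neighbors loses 1/<k> of voting ability, floored at 0.
    """
    if not adj:
        return []
    avg_degree = sum(len(nb) for nb in adj.values()) / len(adj)
    ability = {u: 1.0 for u in adj}   # every node starts with one full vote
    elected, chosen = [], set()
    for _ in range(r):
        # A node's score is the total voting ability of its neighbors.
        scores = {u: sum(ability[v] for v in adj[u])
                  for u in adj if u not in chosen}
        if not scores:
            break
        best = max(scores, key=scores.get)
        if scores[best] <= 0:
            break                      # nobody has votes left to cast
        elected.append(best)
        chosen.add(best)
        ability[best] = 0.0            # an elected node no longer votes
        for v in adj[best]:
            ability[v] = max(0.0, ability[v] - 1.0 / avg_degree)
    return elected


# Two hubs (0 and 4) joined by an edge, each with its own leaves.
example_edges = [(0, 1), (0, 2), (0, 3), (0, 8), (0, 4),
                 (4, 5), (4, 6), (4, 7)]
print(vote_rank(build_adjacency(example_edges), 2))
```

Because the neighbors of an elected node lose voting ability, nodes next to an already chosen spreader score lower in later rounds, which is what pushes the selected set apart.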
Article
We elaborate on a linear time implementation of the Collective Influence (CI) algorithm introduced by Morone, Makse, Nature 524, 65 (2015) to find the minimal set of influencers in a network via optimal percolation. We show that the computational complexity of CI is O(N log N) when removing nodes one-by-one, with N the number of nodes. This is made possible by using an appropriate data structure to process the CI values, and by the finite radius l of the CI sphere. Furthermore, we introduce a simple extension of CI when l is infinite, the CI propagation (CI_P) algorithm, that considers the global optimization of influence via message passing in the whole network and identifies a slightly smaller fraction of influencers than CI. Remarkably, CI_P is able to reproduce the exact analytical optimal percolation threshold obtained by Bau, Wormald, Random Struct. Alg. 21, 397 (2002) for cubic random regular graphs, leaving little room for improvement on random graphs. We also introduce the Collective Immunization Belief Propagation algorithm (CI_BP), a belief-propagation (BP) variant of CI based on optimal immunization, which has the same performance as CI_P. However, this small performance gain, of the order of 1-2 % in the low influencers tail, comes at the expense of increasing the computational complexity from O(N log N) to O(N^2 log N), rendering both CI_P and CI_BP prohibitive for finding influencers in modern-day big data. The same nonlinear running time drawback pertains to a recently introduced BP-decimation (BPD) algorithm by Mugisha, Zhou, arXiv:1603.05781. For instance, we show that for big-data social networks of typically 200 million users (e.g., active Twitter users sending 500 million tweets per day), CI finds the influencers in less than 3 hours running on a single CPU, while the BP algorithms (CI_P, CI_BP and BPD) would take more than 3,000 years to accomplish the same task.
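As a rough illustration of what the "CI sphere" of finite radius l means, the sketch below computes the CI score of a single node. The abstract does not state the formula; the sketch assumes the definition from the cited Morone and Makse paper, CI_l(i) = (k_i - 1) * sum of (k_j - 1) over the nodes j at shortest-path distance exactly l from i, where k denotes degree. The function name `collective_influence` is hypothetical, and the heap-based bookkeeping that the abstract credits for the O(N log N) running time is not reproduced here.

```python
from collections import deque


def collective_influence(adj, node, radius):
    """CI score of `node`, assuming CI_l(i) = (k_i - 1) * sum_j (k_j - 1),
    with j ranging over nodes at distance exactly `radius` from `node`.

    adj: dict mapping each node to an iterable of its neighbors.
    """
    dist = {node: 0}
    queue = deque([node])
    frontier = []                      # nodes on the surface of the ball
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            frontier.append(u)
            continue                   # do not expand beyond the radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return (len(adj[node]) - 1) * sum(len(adj[j]) - 1 for j in frontier)


# Small example: path 0-1-2 with three extra leaves (3, 4, 5) on node 2.
example_adj = {0: [1], 1: [0, 2], 2: [1, 3, 4, 5], 3: [2], 4: [2], 5: [2]}
for v in example_adj:
    print(v, collective_influence(example_adj, v, 1))
```

Because only nodes within distance l of the target are visited, the cost of one score depends on the size of the ball rather than on the whole network, which is consistent with the role the abstract assigns to the finite radius.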
Article
Recently, the abundance of digital data is enabling the implementation of graph-based ranking algorithms that provide system level analysis for ranking publications and authors. Here, we take advantage of the entire Physical Review publication archive (1893-2006) to construct authors' networks where weighted edges, as measured from opportunely normalized citation counts, define a proxy for the mechanism of scientific credit transfer. On this network, we define a ranking method based on a diffusion algorithm that mimics the spreading of scientific credits on the network. We compare the results obtained with our algorithm with those obtained by local measures such as the citation count and provide a statistical analysis of the assignment of major career awards in the area of physics. A website where the algorithm is made available to perform customized rank analysis can be found at the address http://www.physauthorsrank.org.
Article
The study of network disintegration has attracted much attention due to its wide applications, including suppressing epidemic spreading, destabilizing terrorist networks, preventing financial contagion, controlling rumor diffusion and perturbing cancer networks. The crux of the matter is to find the critical nodes whose removal will lead to network collapse. This paper studies the disintegration of networks with incomplete link information. An effective method is proposed to find the critical nodes with the assistance of link prediction techniques. Extensive experiments on both synthetic and real networks suggest that, by using a link prediction method to recover some of the missing links in advance, the method can largely improve the network disintegration performance. Moreover, to our surprise, we find that when the amount of missing information is relatively small, our method even outperforms the results based on complete information. We refer to this phenomenon as the “comic effect” of link prediction, which means that the network is reshaped through the addition of some links that are identified by link prediction algorithms, and the reshaped network is like an exaggerated but characteristic comic of the original one, where the important parts are emphasized.
Article
Complex networks with inhomogeneous topology are very fragile to intentional attacks on the "hub nodes". It is therefore important to evaluate node importance and find these "hub nodes". The network agglomeration is first defined. A node contraction method for evaluating node importance in complex networks is then proposed based on a new evaluation criterion, i.e., the most important node is the one whose contraction results in the largest increase of the network agglomeration. With the node contraction method, both the degree and the position of a node are considered and the disadvantage of the node deletion method is avoided. An algorithm whose time complexity is $O(n^3)$ is proposed. Final experiments verify its efficiency.
Article
Identifying influential nodes in dynamical processes is crucial in understanding network structure and function. Degree, H-index and coreness are widely used metrics, but previously treated as unrelated. Here we show their relation by constructing an operator, in terms of which degree, H-index and coreness are the initial, intermediate and steady states of the sequences, respectively. We obtain a family of H-indices that can be used to measure a node's importance. We also prove that the convergence to coreness can be guaranteed even under an asynchronous updating process, allowing a decentralized local method of calculating a node's coreness in large-scale evolving networks. Numerical analyses of the susceptible-infected-removed spreading dynamics on disparate real networks suggest that the H-index is a good tradeoff that in many cases can better quantify node influence than either degree or coreness.
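The operator mentioned in this abstract can be illustrated with a short sketch. It assumes the standard reading of the construction: the operator H returns the largest h such that at least h of its inputs are at least h, the zeroth-order value of a node is its degree, and each further order applies H to the neighbors' previous values until nothing changes, at which point the values equal the coreness. The function names `h_operator` and `h_index_sequence` are hypothetical, and the synchronous update used here is only one of the update schedules the abstract allows.

```python
def h_operator(values):
    """Largest h such that at least h of the given values are >= h."""
    h = 0
    for rank, x in enumerate(sorted(values, reverse=True), start=1):
        if x >= rank:
            h = rank
        else:
            break
    return h


def h_index_sequence(adj):
    """Return the list [h^(0), h^(1), ...] of per-node value dicts.

    h^(0) is the degree; iteration stops at the first fixed point,
    which is the steady state (coreness) described in the abstract.
    """
    current = {u: len(nb) for u, nb in adj.items()}
    history = [current]
    while True:
        updated = {u: h_operator([current[v] for v in adj[u]]) for u in adj}
        if updated == current:
            return history
        history.append(updated)
        current = updated


# Triangle 0-1-2 with a tail 0-3-4 attached.
example_adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0, 4], 4: [3]}
for order, values in enumerate(h_index_sequence(example_adj)):
    print(order, values)
```

In this example the triangle nodes settle at 2 and the tail nodes at 1, matching the 2-core and 1-shell of the graph; the intermediate dictionaries are the family of H-indices the abstract refers to.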
Article
PLAD (plasma doping) is promising for both evolutionary and revolutionary doping options because of its unique advantages which can overcome or minimize many of the issues of the beam-line (BL) based implants. In this talk, I present developments of PLAD on both planar and non-planar 3D device structures. Comparing with the conventional BL implants, PLAD shows not only a significant production enhancement, but also a significant device performance improvement and 3D structure doping capability, including an 80% contact resistance reduction, more than 25% drive current increase on planar devices, and 23% series resistance reduction, 25% drive current increase on non-planar 3D devices.
Article
In complex networks, it is of great theoretical and practical significance to identify a set of critical spreaders which help to control the spreading process. Some classic methods are proposed to identify multiple spreaders. However, they sometimes have limitations for the networks with community structure because many chosen spreaders may be clustered in a community. In this paper, we suggest a novel method to identify multiple spreaders from communities in a balanced way. The network is first divided into a great many super nodes and then k spreaders are selected from these super nodes. Experimental results on real and synthetic networks with community structure show that our method outperforms the classic methods for degree centrality, k-core and ClusterRank in most cases.
Article
Similarity is a fundamental measure in network analyses and machine learning algorithms, with wide applications ranging from personalized recommendation to socio-economic dynamics. We argue that an effective similarity measurement should guarantee the stability even under some information loss. With six bipartite networks, we investigate the stabilities of fifteen similarity measurements by comparing the similarity matrixes of two data samples which are randomly divided from original data sets. Results show that, the fifteen measurements can be well classified into three clusters according to their stabilities, and measurements in the same cluster have similar mathematical definitions. In addition, we develop a top-$n$-stability method for personalized recommendation, and find that the unstable similarities would recommend false information to users, and the performance of recommendation would be largely improved by using stable similarity measurements. This work provides a novel dimension to analyze and evaluate similarity measurements, which can further find applications in link prediction, personalized recommendation, clustering algorithms, community detection and so on.
Article
Background: Computational approaches aided by computer science have been used to predict essential proteins and are faster than expensive, time-consuming, laborious experimental approaches. However, the performance of such approaches is still poor, making practical applications of computational approaches difficult in some fields. Hence, the development of more suitable and efficient computing methods is necessary for identification of essential proteins. Method: In this paper, we propose a new method for predicting essential proteins in a protein interaction network, local interaction density combined with protein complexes (LIDC), based on statistical analyses of essential proteins and protein complexes. First, we introduce a new local topological centrality, local interaction density (LID), of the yeast PPI network; second, we discuss a new integration strategy for multiple bioinformatics. The LIDC method was then developed through a combination of LID and protein complex information based on our new integration strategy. The purpose of LIDC is discovery of important features of essential proteins with their neighbors in real protein complexes, thereby improving the efficiency of identification. Results: Experimental results based on three different PPI (protein-protein interaction) networks of Saccharomyces cerevisiae and Escherichia coli showed that LIDC outperformed classical topological centrality measures and some recent combinational methods. Moreover, when predicting MIPS datasets, LIDC achieved better performance than all nine reference methods (i.e., DC, BC, NC, LID, PeC, CoEWC, WDC, ION, and UC). Conclusions: LIDC is more effective for the prediction of essential proteins than other recently developed methods.
Article
Social networks constitute a new platform for information propagation, but its success is crucially dependent on the choice of spreaders who initiate the spreading of information. In this paper, we remove edges in a network at random and the network segments into isolated clusters. The most important nodes in each cluster then form a group of influential spreaders, such that news propagating from them would lead to an extensive coverage and minimal redundancy. The method well utilizes the similarities between the pre-percolated state and the coverage of information propagation in each social cluster to obtain a set of distributed and coordinated spreaders. Our tests on the Facebook networks show that this method outperforms conventional methods based on centrality. The suggested way of identifying influential spreaders thus sheds light on a new paradigm of information propagation on social networks.
Article
Most centralities proposed for identifying influential spreaders on social networks, whether to spread a message or to stop an epidemic, require the full topological information of the network on which spreading occurs. In practice, however, collecting all connections between agents in social networks is hardly achievable. As a result, such metrics can be difficult to apply to real social networks. Consequently, a new approach for identifying influential people without explicit network information is needed in order to provide an efficient immunization or spreading strategy in a practical sense. In this study, we seek a possible way of finding influential spreaders by using the social mechanisms of how social connections are formed in real networks. We find that a reliable immunization scheme can be achieved by asking people how they interact with each other. From these surveys we find that the probabilistic tendency to connect to a hub has the strongest predictive power for influential spreaders among the tested social mechanisms. Our observation also suggests that people who connect different communities are more likely to be influential spreaders when a network has a strong modular structure. Our finding implies that not only the effect of network location but also the behavior of individuals is important for designing optimal immunization or spreading schemes.
Article
The whole frame of interconnections in complex networks hinges on a specific set of structural nodes, much smaller than the total size, which, if activated, would cause the spread of information to the whole network [1]; or, if immunized, would prevent the diffusion of a large scale epidemic [2,3]. Localizing this optimal, i.e. minimal, set of structural nodes, called influencers, is one of the most important problems in network science [4,5]. Despite the vast use of heuristic strategies to identify influential spreaders [6-14], the problem remains unsolved. Here, we map the problem onto optimal percolation in random networks to identify the minimal set of influencers, which arises by minimizing the energy of a many-body system, where the form of the interactions is fixed by the non-backtracking matrix [15] of the network. Big data analyses reveal that the set of optimal influencers is much smaller than the one predicted by previous heuristic centralities. Remarkably, a large number of previously neglected weakly-connected nodes emerges among the optimal influencers. These are topologically tagged as low-degree nodes surrounded by hierarchical coronas of hubs, and are uncovered only through the optimal collective interplay of all the influencers in the network. Eventually, the present theoretical framework may hold a larger degree of universality, being applicable to other hard optimization problems exhibiting a continuous transition from a known phase [16].
Article
A recent study shows that the accuracy of the k-shell method in determining node coreness in a spreading process is largely impaired by the existence of core-like groups, which have a large k-shell index but a low spreading efficiency. Based on an analysis of the structure of core-like groups in real-world networks, we discover that nodes in a core-like group are mutually densely connected, with very few links leaving the group. By defining a measure of diffusion importance for each edge based on the number of out-leaving links of both its ends, we are able to identify redundant links in the spreading process, which have a relatively low diffusion importance but lead to the formation of the locally densely connected core-like group. After filtering out the redundant links and applying the k-shell method to the residual network, we obtain a renewed coreness for each node, which is a more accurate index of its location importance and spreading influence in the original network. Moreover, we find that the performance of the ranking algorithms based on the renewed coreness is also greatly enhanced. Our findings help to more accurately decompose the network core structure and identify influential nodes in spreading processes.
Article
Identifying the most influential spreaders is an important issue in understanding and controlling spreading processes on complex networks. Recent studies showed that nodes located in the core of a network, as identified by the k-shell decomposition, are the most influential spreaders. However, through a great deal of numerical simulations, we observe that nodes in high shells are not very influential in all real networks: in some networks the core nodes are the most influential, which we call the true core, while in others nodes in high shells, even the innermost core, are not good spreaders, which we call a core-like group. By analyzing the k-core structure of the networks, we find that the true core of a network links diversely to the shells of the network, while the core-like group links very locally within the group. For nodes in the core-like group, the k-shell index cannot reflect their location importance in the network. We further introduce a measure based on the link diversity of shells to effectively distinguish the true core from the core-like group, and identify core-like groups throughout the networks. Our findings help to better understand the structural features of real networks and influential nodes.
Article
Nodes in real-world networks organize into densely linked communities where edges appear with high concentration among the members of the community. Identifying such communities of nodes has proven to be a challenging task due to a plethora of definitions of network communities, intractability of methods for detecting them, and the issues with evaluation which stem from the lack of a reliable gold-standard ground-truth. In this paper, we distinguish between structural and functional definitions of network communities. Structural definitions of communities are based on connectivity patterns, like the density of connections between the community members, while functional definitions are based on (often unobserved) common function or role of the community members in the network. We argue that the goal of network community detection is to extract functional communities based on the connectivity structure of the nodes in the network. We then identify networks with explicitly labeled functional communities to which we refer as ground-truth communities. In particular, we study a set of 230 large real-world social, collaboration, and information networks where nodes explicitly state their community memberships. For example, in social networks, nodes explicitly join various interest-based social groups. We use such social groups to define a reliable and robust notion of ground-truth communities. We then propose a methodology, which allows us to compare and quantitatively evaluate how different structural definitions of communities correspond to ground-truth functional communities. We study 13 commonly used structural definitions of communities and examine their sensitivity, robustness and performance in identifying the ground-truth. We show that the 13 structural definitions are heavily correlated and naturally group into four classes. 
We find that two of these definitions, Conductance and Triad participation ratio, consistently give the best performance in identifying ground-truth communities. We also investigate a task of detecting communities given a single seed node. We extend the local spectral clustering algorithm into a heuristic parameter-free community detection method that easily scales to networks with more than 100 million nodes. The proposed method achieves 30 % relative improvement over current local clustering methods.
Book
Complex networks such as the Internet, WWW, transportation networks, power grids, biological neural networks, and scientific cooperation networks of all kinds provide challenges for future technological development. This book is the first systematic presentation of dynamical evolving networks, with many up-to-date applications and homework projects to enhance study. The authors are all very active and well-known in the rapidly evolving field of complex networks, which is becoming an increasingly important area of research. The material is presented in a logical, constructive style, from basic through to complex, covering algorithms, the construction of networks and the research challenges of the future.
Article
In this paper, we propose a network performance/efficiency measure for the evaluation of financial networks with intermediation. The measure captures risk, transaction cost, price, transaction flow, revenue, and demand information in the context of the decision-makers' behavior in multitiered financial networks that also allow for electronic transactions. The measure is then utilized to define the importance of a financial network component, that is, a node or a link, or a combination of nodes and links. Numerical examples are provided in which the efficiency of the financial network is computed along with the importance ranking of the nodes and links. The results in this paper can be used to assess which nodes and links in financial networks are the most vulnerable in the sense that their removal will impact the efficiency of the network in the most significant way. Hence, the results in this paper have relevance to national security as well as implications for the insurance industry.
Article
The book that launched the Dempster–Shafer theory of belief functions appeared 40 years ago. This intellectual autobiography looks back on how I came to write the book and how its ideas played out in my later work.
Article
The intuitive background for measures of structural centrality in social networks is reviewed and existing measures are evaluated in terms of their consistency with intuitions and their interpretability.
Article
We implement a novel method to detect systemically important financial institutions in a network. The method consists in a simple model of distress and losses redistribution derived from the interaction of banks' balance-sheets through bilateral exposures. The algorithm goes beyond the traditional default-cascade mechanism, according to which contagion propagates only through banks that actually default. We argue that even in the absence of other defaults, distressed-but-non-defaulting institutions transmit the contagion through channels other than solvency: weakness in their balance sheet reduces the value of their liabilities, thereby negatively affecting their interbank lenders even before a credit event occurs. In this paper, we apply the methodology to a unique dataset covering bilateral exposures among all Italian banks in the period 2008-2012. We find that the systemic impact of individual banks has decreased over time since 2008. The result can be traced back to decreasing volumes in the interbank market and to an intense recapitalization process. We show that the marginal effect of a bank's capital on its contribution to systemic risk in the network is considerably larger when interconnectedness is high (good times): this finding supports the regulatory work on counter-cyclical (macroprudential) capital buffers.
Article
The structural characteristics of weighted complex networks are analysed. The effect of the edge weight on the estimation of node importance is calculated. A new definition of weighted node importance is proposed, and an improved node contraction method for weighted networks is given based on the evaluation criterion, i.e., the most important node is the one whose contraction results in the largest increase of the weighted network agglomeration. The time complexity of this algorithm is $O(n^3)$, and the improved evaluation method can help to accurately find critical nodes in complex networks. Final experiments verify the efficiency and feasibility of the proposed method.
Article
In order to quantitatively calculate the invulnerability of a communication network, taking the fully connected network as a reference, an evaluation method based on disjoint paths in the topology is proposed to define the index of invulnerability and the vitality of nodes and links. Meanwhile, a method for calculating the disjoint paths is proposed. The index of invulnerability is obtained by calculating the ratio of the disjoint paths of the nodes in the target network to those in the fully connected network. Furthermore, the importance of a node or link is evaluated according to the value of the invulnerability index when that node or link fails. The correctness and the time and space complexity of the proposed method are discussed. An example and a comparison with the evaluation method based on shortest paths indicate that the proposed method is more reasonable and better reflects actual communication network performance.
Article
Online social networks have become a remarkable development with considerable social as well as economic impact within the last decade. Currently the most famous online social network, Facebook, counts more than one billion monthly active users across the globe. Therefore, online social networks attract a great deal of attention among practitioners as well as research communities. Given the huge value of the information that online social networks hold, numerous online social networks have consequently been valued at billions of dollars. Hence, this combination of technical and social phenomena has evolved worldwide with increasing socioeconomic impact. Online social networks can play an important role in viral marketing techniques, due to their power in improving web search and recommendations in various filtering systems, and in spreading a technology (product) very quickly in the market. In online social networks, it is interesting and important to identify, among all nodes, a node which can affect the behaviour of its neighbours; we call such a node an influential node. The main objective of this paper is to provide an overview of various techniques for influential user identification. The paper includes techniques that are based on structural properties of online social networks as well as techniques based on content published by the users of the social network.
Article
Large-scale websites are predominantly built as a service-oriented architecture. Here, services are specialized for a certain task, run on multiple machines, and communicate with each other to serve a user's request. An anomalous change in a metric of one service can propagate to other services during this communication, resulting in overall degradation of the request. As any such degradation is revenue impacting, maintaining correct functionality is of paramount concern: it is important to find the root cause of any anomaly as quickly as possible. This is challenging because there are numerous metrics or sensors for a given service, and a modern website is usually composed of hundreds of services running on thousands of machines in multiple data centers. This paper introduces MonitorRank, an algorithm that can reduce the time, domain knowledge, and human effort required to find the root causes of anomalies in such service-oriented architectures. In the event of an anomaly, MonitorRank provides a ranked order list of possible root causes for monitoring teams to investigate. MonitorRank uses the historical and current time-series metrics of each sensor as its input, along with the call graph generated between sensors to build an unsupervised model for ranking. Experiments on real production outage data from LinkedIn, one of the largest online social networks, show a 26% to 51% improvement in mean average precision in finding root causes compared to baseline and current state-of-the-art methods.
Article
Systems as diverse as genetic networks or the World Wide Web are best described as networks with complex topology. A common property of many large networks is that the vertex connectivities follow a scale-free power-law distribution. This feature was found to be a consequence of two generic mechanisms: (i) networks expand continuously by the addition of new vertices, and (ii) new vertices attach preferentially to sites that are already well connected. A model based on these two ingredients reproduces the observed stationary scale-free distributions, which indicates that the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems.
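The two mechanisms named in this abstract, growth and preferential attachment, can be combined in a small generator. The sketch below is illustrative only and makes assumptions the abstract does not specify: the seed graph is a complete graph on m + 1 nodes, each new vertex attaches to exactly m distinct existing vertices, and the function name `preferential_attachment_graph` is hypothetical.

```python
import random


def preferential_attachment_graph(n, m, seed=None):
    """Grow an n-node graph in which each new node links to m existing
    nodes chosen with probability proportional to their current degree.

    Assumption: the process starts from a complete graph on m + 1 nodes.
    Returns a list of undirected edges (u, v).
    """
    if m < 1 or n < m + 1:
        raise ValueError("need m >= 1 and n >= m + 1")
    rng = random.Random(seed)
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears in `endpoints` once per incident edge, so a
    # uniform draw from this list is a degree-proportional draw.
    endpoints = [u for edge in edges for u in edge]
    for new in range(m + 1, n):               # (i) growth
        targets = set()
        while len(targets) < m:               # (ii) preferential attachment
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges


example_edges = preferential_attachment_graph(200, 2, seed=1)
degree = {}
for u, v in example_edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
print("nodes:", len(degree), "edges:", len(example_edges),
      "max degree:", max(degree.values()))
```

Sampling from the list of edge endpoints is a common shortcut for degree-proportional selection; it avoids recomputing a probability distribution after every added edge.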
Article
Identifying influential nodes in complex networks is a fundamental and important issue. The existing evidential semi-local centrality modifies the evidential centrality according to the actual degree distribution, but the topological connections among the neighbors of a node in a weighted network are not taken into account. In this paper, a novel measure called evidential local structure centrality is proposed to identify influential nodes. Firstly, the value of the modified evidential centrality is calculated by taking the actual degree distribution into account. Secondly, local structure centrality combined with the modified evidential centrality is extended to weighted networks. Then, in order to evaluate the performance of the proposed method, we use the susceptible-infected-recovered (SIR) model and the susceptible-infected (SI) model to simulate the spreading process on real networks. Experimental results show that our method is effective and efficient in identifying influential nodes.
Chapter
Introduction; Voltage Delivered from a Source to a Load; Power Delivered from a Source to a Load; Impedance Conjugate Matching; Additional Effect of Impedance Matching; Appendices; Reference; Further Reading; Exercises; Answers
Article
Structural hole nodes in complex networks play important roles in network information diffusion. Unfortunately, most existing methods of ranking key nodes do not integrate structural hole nodes and other key nodes. According to the relevant research on structural hole theory and on key node ranking methods, seven metrics are selected to rank the key nodes: network constraint coefficient, betweenness centrality, hierarchy, efficiency, network size, PageRank and clustering coefficient. Based on these seven metrics, a learning-to-rank method based on ListNet is introduced to rank key nodes by multiple metrics. Comprehensive experiments are conducted on different artificial networks and real complex networks. Experimental results with manual annotation show that the ranking method can comprehensively consider the structural hole nodes and other nodes with different important features. The ranking results on different networks are highly consistent with the manual ranking results. The spreading experiments using the susceptible-infected-recovered (SIR) propagation model show that spreading initiated by the top-K key nodes selected by our method reaches the maximum propagation ratio in a shorter time than spreading initiated by the top-K key nodes selected by other methods.