Similar literature
 Found 20 similar articles (search time: 15 ms)
1.

Background

Numerous centrality measures have been introduced to identify “central” nodes in large networks. The availability of a wide range of measures for ranking influential nodes leaves the user to decide which measure may best suit the analysis of a given network. The choice of a suitable measure is furthermore complicated by the impact of the network topology on ranking influential nodes by centrality measures. To approach this problem systematically, we examined the centrality profile of nodes of yeast protein-protein interaction networks (PPINs) in order to detect which centrality measure is succeeding in predicting influential proteins. We studied how different topological network features are reflected in a large set of commonly used centrality measures.

Results

We used yeast PPINs to compare 27 common centrality measures. The measures characterize and sort influential nodes of the networks. We applied principal component analysis (PCA) and hierarchical clustering and found that the most informative measures depend on the network’s topology. Interestingly, some measures contributed more than others in all PPINs, namely Latora closeness, Decay, Lin, Freeman closeness, Diffusion, Residual closeness and Average distance centralities.

Conclusions

The choice of a suitable set of centrality measures is crucial for inferring important functional properties of a network. We concluded that undertaking data reduction using unsupervised machine learning methods helps to choose appropriate variables (centrality measures). Hence, we proposed identifying the contribution proportions of the centrality measures with PCA as a prerequisite step of network analysis before inferring functional consequences, e.g., essentiality of a node.

2.
In complex networks, it is of great theoretical and practical significance to identify a set of critical spreaders that help to control the spreading process. Several classic methods have been proposed to identify multiple spreaders. However, they can be limited on networks with community structure, because many of the chosen spreaders may be clustered in a single community. In this paper, we suggest a novel method to identify multiple spreaders from communities in a balanced way. The network is first divided into a large number of super nodes, and then k spreaders are selected from these super nodes. Experimental results on real and synthetic networks with community structure show that our method outperforms the classic methods of degree centrality, k-core and ClusterRank in most cases.

3.
4.
Protein networks, describing physical interactions as well as functional associations between proteins, have been unravelled for many organisms in the recent past. Databases such as STRING provide excellent resources for the analysis of such networks. In this contribution, we revisit the organisation of protein networks, particularly the centrality–lethality hypothesis, which hypothesises that nodes with higher centrality in a network are more likely to produce lethal phenotypes on removal, compared to nodes with lower centrality. We consider the protein networks of a diverse set of 20 organisms, with essentiality information available in the Database of Essential Genes, and assess the relationship between centrality measures and lethality. For each of these organisms, we obtained networks of high-confidence interactions from the STRING database, and computed network parameters such as degree, betweenness centrality, closeness centrality and pairwise disconnectivity indices. We observe that the networks considered here are predominantly disassortative. Further, we observe that essential nodes in a network have a significantly higher average degree and betweenness centrality, compared to the network average. Most previous studies have evaluated the centrality–lethality hypothesis for Saccharomyces cerevisiae and Escherichia coli; we here observe that the centrality–lethality hypothesis holds good for a large number of organisms, with certain limitations. Betweenness centrality may also be a useful measure to identify essential nodes, but measures like closeness centrality and pairwise disconnectivity are not significantly higher for essential nodes.

5.

Background

Living systems are associated with social networks: networks made up of nodes, some of which may be more important than others in various respects. While different quantitative measures labeled as “centralities” have previously been used in the network analysis community to find influential nodes in a network, it is debatable how valid the centrality measures actually are. In other words, the research question that remains unanswered is: how exactly do these measures perform in the real world? For example, if the centrality of a particular node identifies it as important, is the node actually important?

Purpose

The goal of this paper is not just to perform a traditional social network analysis but rather to evaluate different centrality measures by conducting an empirical study analyzing exactly how network centralities correlate with data from published multidisciplinary network data sets.

Method

We take standard published network data sets while using a random network to establish a baseline. These data sets included Zachary's Karate Club network, a dolphin social network and a neural network of the nematode Caenorhabditis elegans. Each of the data sets was analyzed in terms of different centrality measures and compared with existing knowledge from associated published articles to review the role of each centrality measure in the determination of influential nodes.

Results

Our empirical analysis demonstrates that in the chosen network data sets, nodes which had a high Closeness Centrality also had a high Eccentricity Centrality. Likewise, high Degree Centrality correlated closely with high Eigenvector Centrality, whereas Betweenness Centrality varied according to network topology and did not demonstrate any noticeable pattern. In terms of identification of key nodes, we discovered that, compared with other centrality measures, Eigenvector and Eccentricity Centralities were better able to identify important nodes.

6.
Identifying influential spreaders in networks, which contributes to optimizing the use of available resources and efficient spreading of information, is of great theoretical significance and practical value. A random-walk-based algorithm, LeaderRank, has been shown to be an effective and efficient method for recognizing leaders in social networks, and it even outperforms the well-known PageRank method. As LeaderRank was initially developed for binary directed networks, further extensions should be studied for weighted networks. In this paper, a generalized algorithm, PhysarumSpreader, is proposed by combining LeaderRank with a positive feedback mechanism inspired by an amoeboid organism called Physarum polycephalum. By taking edge weights into consideration and adding the positive feedback mechanism, PhysarumSpreader is applicable in both directed and undirected networks with weights. Taking two real networks as examples, the effectiveness of the proposed method is demonstrated by comparison with other standard centrality measures.

7.
Studies of complex networks show that nodes with high centrality scores are important to network structure and stability. Following this rationale, centrality measures can be used to (i) identify keystone species in ecological networks, a major issue in community ecology, and (ii) differentiate the keystone species concept, e.g. species may play a key role in a network for different topological reasons. In 34 pollination communities we examine the relationship between the generalization level of species (ND) and two complementary centrality indices: closeness (CC) and betweenness centrality (BC). CC measures the proximity of a species to all other species in the community, while BC describes the importance of a species as a connector. Most networks had a linear ND-CC relationship with a minimum CC value of 0.41. Hence, species were close to each other and are likely to be rapidly affected by disturbances. In contrast, in most networks, the ND-BC relationships were power-law distributed with exponents larger than one. Only 59% of the species were connectors (BC > 0). In particular, there was a connector threshold value of ND = 0.46. Species above this threshold represent ~40%, almost all of which were connectors. These results indicate that in pollination systems the most generalized species are usually network keystone species, playing at least two roles: (i) interacting closely with most other species (high CC) and (ii) connecting otherwise unconnected subnetworks (high BC). We discuss the implications of centrality measures for community-based conservation ecology.

8.
Essential proteins are indispensable for living organisms to maintain life activities and play important roles in the studies of pathology, synthetic biology, and drug design. Therefore, besides experimental methods, many computational methods have been proposed to identify essential proteins. Based on the centrality-lethality rule, various centrality methods are employed to predict essential proteins in a Protein-protein Interaction Network (PIN). However, because they neglect the temporal and spatial features of protein-protein interactions, the centrality scores calculated by centrality methods are not effective enough for measuring the essentiality of proteins in a PIN. Moreover, many methods, which overfit the features of essential proteins for one species, may perform poorly for other species. In this paper, we demonstrate that the centrality-lethality rule also exists in Protein Subcellular Localization Interaction Networks (PSLINs). To do this, a method based on Localization Specificity for Essential protein Detection (LSED) was proposed, which can be combined with any centrality method to calculate improved centrality scores by taking into consideration the PSLINs in which proteins play their roles. In this study, LSED was combined with eight centrality methods separately to calculate Localization-specific Centrality Scores (LCSs) for proteins based on the PSLINs of four species (Saccharomyces cerevisiae, Homo sapiens, Mus musculus and Drosophila melanogaster). Compared to the proteins with high centrality scores measured from the global PINs, more proteins with high LCSs measured from PSLINs are essential. This indicates that proteins with high LCSs measured from PSLINs are more likely to be essential and that the performance of centrality methods can be improved by LSED. Furthermore, LSED provides a widely applicable prediction model to identify essential proteins for different species.

9.
Disease epidemic outbreaks on human metapopulation networks are often driven by a small number of superspreader nodes, which are primarily responsible for spreading the disease throughout the network. Superspreader nodes typically are characterized either by their locations within the network, by their degree of connectivity and centrality, or by their habitat suitability for the disease, described by their reproduction number (R). Here we introduce a model that considers simultaneously the effects of network properties and R on superspreaders, as opposed to previous research which considered each factor separately. This type of model is applicable to diseases for which habitat suitability varies by climate or land cover, and to directly transmitted diseases for which population density and mitigation practices influence R. We present analytical models that quantify the superspreader capacity of a population node by two measures: probability-dependent superspreader capacity, the expected number of neighboring nodes to which the node in consideration will randomly spread the disease per epidemic generation, and time-dependent superspreader capacity, the rate at which the node spreads the disease to each of its neighbors. We validate our analytical models with a Monte Carlo analysis of repeated stochastic Susceptible-Infected-Recovered (SIR) simulations on randomly generated human population networks, and we use a random forest statistical model to relate superspreader risk to connectivity, R, centrality, clustering, and diffusion. We demonstrate that either a degree of connectivity or an R above a certain threshold is sufficient for a node to have a moderate superspreader risk factor, but both are necessary for a node to have a high risk factor. The statistical model presented in this article can be used to predict the location of superspreader events in future epidemics, and to predict the effectiveness of mitigation strategies that seek to reduce the value of R, alter host movements, or both.

10.
Identifying influential nodes in very large-scale directed networks is a big challenge relevant to disparate applications, such as accelerating information propagation, controlling rumors and diseases, designing search engines, and understanding hierarchical organization of social and biological networks. Known methods range from node centralities, such as degree, closeness and betweenness, to diffusion-based processes, like PageRank and LeaderRank. Some of these methods already take into account the influences of a node’s neighbors but do not directly make use of the interactions among its neighbors. Local clustering is known to have negative impacts on information spreading. We further show empirically that it also plays a negative role in generating local connections. Inspired by these facts, we propose a local ranking algorithm named ClusterRank, which takes into account not only the number of neighbors and the neighbors’ influences, but also the clustering coefficient. Subject to the susceptible-infected-recovered (SIR) spreading model with constant infectivity, experimental results on two directed networks, a social network extracted from delicious.com and a large-scale short-message communication network, demonstrate that ClusterRank outperforms some benchmark algorithms such as PageRank and LeaderRank. Furthermore, ClusterRank can also be applied to undirected networks, where the superiority of ClusterRank is significant compared with degree centrality and k-core decomposition. In addition, ClusterRank, making use of only local information, is much more efficient than global methods: it takes only 191 seconds for a network with about nodes, more than 15 times faster than PageRank.
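
The abstract above names three ingredients of a ClusterRank-style score: the number of neighbors, the neighbors' own influence, and a penalty for local clustering. The following is a minimal sketch for undirected graphs only. The scoring form used here (a penalty of 10 to the power of minus the clustering coefficient, multiplied by the sum of each neighbor's degree plus one) is an assumption chosen for illustration and is not stated in the abstract; the function names `local_clustering` and `cluster_rank_sketch` are hypothetical.

```python
def local_clustering(adj, node):
    """Local clustering coefficient of `node` in an undirected graph.

    `adj` maps each node to the set of its neighbors.
    """
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two neighbors: no neighbor pair can be linked
    # Count the links that exist among the neighbors of `node`.
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def cluster_rank_sketch(adj):
    """Illustrative local score; the exact formula is an assumption.

    score(i) = 10 ** (-c_i) * sum over neighbors j of (degree(j) + 1)
    """
    scores = {}
    for node, nbrs in adj.items():
        penalty = 10.0 ** (-local_clustering(adj, node))
        scores[node] = penalty * sum(len(adj[j]) + 1 for j in nbrs)
    return scores


# A star (hub "h" with three leaves) and a separate triangle.
star = {"h": {"x", "y", "z"}, "x": {"h"}, "y": {"h"}, "z": {"h"}}
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
star_scores = cluster_rank_sketch(star)
triangle_scores = cluster_rank_sketch(triangle)
```

In this toy example the hub of the star scores higher than any triangle node, because the triangle's neighbors are fully interconnected and the clustering penalty shrinks their score. Only local information (a node's neighbors and their degrees) is used, which is the property the abstract credits for the method's speed.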

11.
Different species are of different importance in maintaining ecosystem functions in natural communities. Quantitative approaches are needed to identify unusually important or influential ‘keystone’ species, particularly for conservation purposes. Since the importance of some species may largely be the consequence of their rich interaction structure, one possible quantitative approach to identify the most influential species is to study their position in the network of interspecific interactions. In this paper, I discuss the role of network analysis (and centrality indices in particular) in this process and present a new and simple approach to characterizing the interaction structures of each species in a complex network. Understanding the linkage between structure and dynamics is a condition for testing the results of topological studies; I briefly review our current knowledge on this issue. The study of key nodes in networks has become of increasingly general interest in several disciplines: I will discuss some parallels. Finally, I will argue that conservation biology needs to devote more attention to identifying and conserving keystone species and relatively less attention to rarity.

12.
Yang J, Chen Y. PLoS ONE, 2011, 6(7): e22557
Betweenness centrality is an essential index for analysis of complex networks. However, the calculation of betweenness centrality is quite time-consuming, and the fastest known algorithm uses O(N(M + N log N)) time and O(N + M) space for weighted networks, where N and M are the number of nodes and edges in the network, respectively. By inserting virtual nodes into the weighted edges and transforming the shortest path problem into a breadth-first search (BFS) problem, we propose an algorithm that can compute the betweenness centrality in O(wDN^2) time for integer-weighted networks, where w is the average weight of edges and D is the average degree in the network. Considerable time can be saved with the proposed algorithm when w < log N/D + 1, indicating that it is suitable for lightly weighted large sparse networks. A similar concept of virtual node transformation can be used to calculate other shortest-path-based indices such as closeness centrality, graph centrality, stress centrality, and so on. Numerical simulations on various randomly generated networks reveal that it is feasible to use the proposed algorithm in large network analysis.
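
The virtual-node idea in the abstract above can be illustrated with a short sketch. It is not the authors' algorithm: it shows only the transformation step (an edge of integer weight w becomes a chain of w unit edges) and that plain BFS on the expanded graph reproduces the weighted shortest-path distances. The function names and the `("virtual", u, v, i)` labelling of inserted nodes are hypothetical.

```python
from collections import deque


def expand_weighted_graph(edges):
    """Replace each undirected edge (u, v, w) by a chain of w unit edges.

    Returns the adjacency of the expanded graph and the set of original nodes.
    Weights must be positive integers.
    """
    adj = {}
    real_nodes = set()
    for u, v, w in edges:
        if not isinstance(w, int) or w < 1:
            raise ValueError("weights must be positive integers")
        real_nodes.update((u, v))
        # w - 1 virtual nodes sit between u and v.
        chain = [u] + [("virtual", u, v, i) for i in range(1, w)] + [v]
        for a, b in zip(chain, chain[1:]):
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj, real_nodes


def bfs_distances(adj, source):
    """Hop counts from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def weighted_distances(edges, source):
    """Weighted shortest-path distances obtained purely by BFS."""
    adj, real_nodes = expand_weighted_graph(edges)
    dist = bfs_distances(adj, source)
    # Report only the original nodes, not the inserted virtual ones.
    return {n: d for n, d in dist.items() if n in real_nodes}


edges = [("a", "b", 3), ("b", "c", 1), ("a", "c", 5)]
distances = weighted_distances(edges, "a")
```

Here the path a-b-c (weight 3 + 1) is shorter than the direct edge a-c (weight 5), and BFS on the expanded graph finds it without a priority queue. The expanded graph grows with the total edge weight, which is consistent with the abstract's remark that the approach suits lightly weighted sparse networks.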

13.
It is a classic topic of social network analysis to evaluate the importance of nodes and identify the node that takes on the role of core or bridge in a network. Because a single indicator is not sufficient to analyze multiple characteristics of a node, a natural solution is to apply multiple, carefully selected indicators. An intuitive idea is to select indicators with weak correlations to efficiently assess different characteristics of a node. However, this paper shows that it is much better to select indicators with strong correlations. Because indicator correlation is based on the statistical analysis of a large number of nodes, the particularity of an important node will stand out if its indicator relationship does not comply with the statistical correlation. Therefore, the paper selects multiple indicators, namely degree, ego-betweenness centrality and eigenvector centrality, to evaluate the importance and the role of a node. The importance of a node is equal to the normalized sum of its three indicators. A candidate for core or bridge is selected from the nodes with high degree or the nodes with high ego-betweenness centrality, respectively. Then, the role of a candidate is determined according to the difference between its indicators' relationship and the statistical correlation of the overall network. Based on 18 real networks and 3 kinds of model networks, the experimental results show that the proposed methods perform quite well in evaluating the importance of nodes and in identifying the node role.

14.
15.

Background

Experimental methods for the identification of essential proteins are always costly, time-consuming, and laborious. It is a challenging task to determine protein essentiality through experiments alone. With the development of high-throughput technologies, a vast number of protein-protein interactions are available, which enables the identification of essential proteins at the network level. Many computational methods for this task have been proposed based on the topological properties of protein-protein interaction (PPI) networks. However, the currently available PPI networks for each species are incomplete (false negatives) and very noisy (many false positives), and network topology-based centrality measures are often very sensitive to such noise. Therefore, exploring robust methods for identifying essential proteins would be of great value.

Method

In this paper, a new essential protein discovery method, named CoEWC (Co-Expression Weighted by Clustering coefficient), is proposed. CoEWC is based on the integration of the topological properties of the PPI network and the co-expression of interacting proteins. The aim of CoEWC is to capture the common features of essential proteins in both date hubs and party hubs. The performance of CoEWC is validated on the PPI network of Saccharomyces cerevisiae. Experimental results show that CoEWC significantly outperforms the classical centrality measures, and that it also outperforms PeC, a newly proposed essential protein discovery method which outperforms 15 other centrality measures on the PPI network of Saccharomyces cerevisiae. In particular, when predicting no more than 500 proteins, CoEWC achieves improvements of more than 50% over degree centrality (DC), a better centrality measure for identifying protein essentiality.

Conclusions

We demonstrate that a more robust essential protein discovery method can be developed by integrating the topological properties of the PPI network and the co-expression of interacting proteins. The proposed centrality measure, CoEWC, is effective for the discovery of essential proteins.

16.
In this paper, we present algorithms to find near-optimal sets of epidemic spreaders in complex networks. We extend the notion of local centrality, a centrality measure previously shown to correspond with a node's ability to spread an epidemic, to sets of nodes by introducing combinatorial local centrality. Though we prove that finding a set of nodes that maximizes this new measure is NP-hard, good approximations are available. We show that a strictly greedy approach obtains the best approximation ratio unless P = NP and then formulate a modified version of this approach that leverages qualities of the network to achieve a faster runtime while maintaining this theoretical guarantee. We perform an experimental evaluation on samples from several different network structures, which demonstrates that our algorithm maximizes combinatorial local centrality and consistently chooses the most effective set of nodes to spread infection under the SIR model, relative to selecting the top nodes using many common centrality measures. We also demonstrate that the optimized algorithm we develop scales effectively.

17.
Recently, the dependence group has been proposed to study the robustness of networks with interdependent nodes. In a dependence group, the failure of one node can lead to the failure of the whole group. Considering that in real networks one failed node may not always break the functionality of a dependence group, we study a cascading failure model in which a dependence group fails only when more than a fraction β of the nodes in the group fail. We find that the network becomes more robust as the parameter β increases. However, the type of percolation transition is always first order unless the model reduces to the classical network percolation model, which is independent of the degree distribution of the network. Furthermore, we find that a larger dependence group size does not always make the networks more fragile. We also present exact solutions for the size of the giant component and the critical point, which agree well with the simulations.

18.
19.
A core consists of a group of central and densely connected nodes which governs the overall behaviour of a network. It is recognised as one of the key meso-scale structures in complex networks. Profiling this meso-scale structure currently relies on a limited number of methods which are often complex and parameter dependent or require a null model. As a result, scalability issues are likely to arise when dealing with very large networks, together with the need for subjective adjustment of parameters. The notion of a rich-club describes nodes which are essentially the hub of a network, as they play a dominating role in structural and functional properties. The definition of a rich-club naturally emphasises high degree nodes and divides a network into two subgroups. Here, we develop a method to characterise a rich-core in networks by theoretically coupling the underlying principle of a rich-club with the escape time of a random walker. The method is fast, scalable to large networks and completely parameter free. In particular, we show that the evolution of the core in World Trade and C. elegans networks corresponds to responses to historical events and key stages in their physical development, respectively.

20.
Statistical properties of static networks have been extensively studied. However, online social networks evolve dynamically, and understanding the evolving characteristics of the core is one of the major concerns in online social networks. In this paper, we empirically investigate the evolving characteristics of the Facebook core. Firstly, we separate the Facebook-link (FL) and Facebook-wall (FW) datasets into 28 snapshots in terms of timestamps. By employing the k-core decomposition method to identify the core of each snapshot, we find that the cores of the FL and FW networks contain about 672 and 373 nodes, respectively, regardless of the exponential growth of the network sizes. Secondly, we analyze evolving topological properties of the core, including the k-core value, assortative coefficient, clustering coefficient and the average shortest path length. Empirical results show that nodes in the core become more interconnected in the evolving process. Thirdly, we investigate the life span of nodes belonging to the core. More than 50% of nodes stay in the core for more than one year, and 19% of nodes have stayed in the core since the first snapshot. Finally, we analyze the connections between the core and the whole network, and find that nodes belonging to the core prefer to connect to nodes with high k-core values rather than to nodes with high degrees. This work could provide new insights into online social network analysis.
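
The k-core decomposition mentioned in the abstract above can be sketched as follows. This is a plain peeling implementation for small undirected graphs, written for clarity rather than speed, and it is not the code used in the study. The function names `core_numbers` and `innermost_core` are hypothetical, and treating "the core" as the set of nodes with the largest k-core value is an assumption about the abstract's meaning.

```python
def core_numbers(adj):
    """Return the k-core value of every node in an undirected graph.

    `adj` maps each node to the set of its neighbors. Nodes are peeled off
    in order of their current degree; a node's k-core value is the largest
    minimum degree seen up to the moment it is removed.
    """
    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Pick a node with the smallest remaining degree.
        node = min(remaining, key=degree.get)
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        # Removing the node lowers the degree of its surviving neighbors.
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


def innermost_core(adj):
    """Nodes whose k-core value equals the maximum over the graph."""
    core = core_numbers(adj)
    k_max = max(core.values())
    return {node for node, k in core.items() if k == k_max}


# A triangle (a, b, c) with one pendant node d attached to a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
cores = core_numbers(graph)
core_nodes = innermost_core(graph)
```

In this toy graph the pendant node is peeled first with a k-core value of 1, and the triangle forms the 2-core. Applying such a function to each timestamped snapshot would give one core per snapshot, the quantity whose size and membership the abstract tracks over time.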
