Similar articles
20 similar articles found (search time: 15 ms)
1.
Information spreading in online social communities has attracted tremendous attention because of its great practical value in applications. Although several individual-level diffusion datasets have been investigated, we still lack a detailed understanding of the spreading pattern of information. Here, by comparing information flows and social links in a blog community, we find that the diffusion processes are induced by three different spreading mechanisms: social spreading, self-promotion and broadcast. Although numerous previous studies have employed epidemic spreading models to simulate information diffusion, we observe that such models fail to reproduce the realistic diffusion pattern. With respect to user behavior, strikingly, we find that most users stick to one specific diffusion mechanism. Moreover, our observations indicate that social spreading is not only crucial for the structure of diffusion trees, but also capable of inducing more subsequent individuals to acquire the information. Our findings suggest new directions for modeling information diffusion in social systems, and could inform the design of efficient propagation strategies based on users' behaviors.

2.
Inferring disease transmission networks is important in epidemiology in order to understand and prevent the spread of infectious diseases. Reconstruction of the infection transmission networks requires insight into viral genome data as well as social interactions. For the HIV-1 epidemic, current research either uses genetic information of patients' virus to infer the past infection events or uses statistics of sexual interactions to model the network structure of viral spreading. Methods for a reliable reconstruction of HIV-1 transmission dynamics, taking into account both molecular and societal data, are still lacking. The aim of this study is to combine information from both genetic and epidemiological scales to characterize and analyse a transmission network of the HIV-1 epidemic in central Italy. We introduce a novel filter-reduction method to build a network of HIV infected patients based on their social and treatment information. The network is then combined with a genetic network, to infer a hypothetical infection transmission network. We apply this method to a cohort study of HIV-1 infected patients in central Italy and find that patients who are highly connected in the network have longer untreated infection periods. We also find that the network structures for homosexual males and heterosexual populations are heterogeneous, consisting of a majority of ‘peripheral nodes’ that have only a few sexual interactions and a minority of ‘hub nodes’ that have many sexual interactions. Inferring HIV-1 transmission networks using this novel combined approach reveals remarkable correlations between high out-degree individuals and longer untreated infection periods. These findings signify the importance of early treatment and support the potential benefit of wide population screening, management of early diagnoses and anticipated antiretroviral treatment to prevent viral transmission and spread.
The approach presented here for reconstructing HIV-1 transmission networks can have important repercussions in the design of intervention strategies for disease control.

3.
The number of people using online social networks in their everyday life is continuously growing at a pace never seen before. This new kind of communication has an enormous impact on opinions, cultural trends, information spreading and even on the commercial success of new products. More importantly, online social networks have emerged as a fundamental organizing mechanism in recent country-wide social movements. In this paper, we provide a quantitative analysis of the structural and dynamical patterns emerging from the activity of an online social network around the ongoing May 15th (15M) movement in Spain. Our network is made up of users who exchanged tweets over a period of one month, which includes the birth and stabilization of the 15M movement. We characterize in depth the growth of this dynamical network and find that it is scale-free with communities at the mesoscale. We also find that its dynamics exhibit typical features of critical systems such as robustness and power-law distributions for several quantities. Remarkably, we report that the patterns characterizing the spreading dynamics are asymmetric, giving rise to a clear distinction between information sources and sinks. Our study represents a first step towards the use of data from online social media to comprehend modern societal dynamics.

4.
The rapid growth of social network data has given rise to high security awareness among users, especially when they exchange and share their personal information. However, because users have different feelings about sharing their information, they are often puzzled about who their partners for exchanging information can be and what information they can share. Is it possible to assist users in forming a partnership network in which they can exchange and share information with little worry? We propose a modified information sharing behavior prediction (ISBP) model that can help in understanding the underlying rules by which users share their information with partners in light of three common aspects: what types of items users are likely to share, what characteristics of users make them likely to share information, and what features of users’ sharing behavior are easy to predict. This model is applied with machine learning techniques in WEKA to predict users’ decisions pertaining to information sharing behavior and form them into trustable partnership networks by learning their features. In the experiment section, by using two real-life datasets consisting of citizens’ sharing behavior, we identify the effect of highly sensitive requests on sharing behavior adjacent to individual variables: the younger participants’ partners are more difficult to predict than those of the older participants, whereas the partners of people who are not computer majors are easier to predict than those of people who are computer majors. Based on these findings, we believe that it is necessary and feasible to offer users personalized suggestions on information sharing decisions, and this is pioneering work that could benefit college researchers focusing on user-centric strategies and website owners who want to collect more user information without raising their privacy awareness or losing their trustworthiness.

5.
As an archive of sequence data for over 165,000 species, GenBank is an indispensable resource for phylogenetic inference. Here we describe an informatics processing pipeline and online database, the PhyLoTA Browser (http://loco.biosci.arizona.edu/pb), which offers a view of GenBank tailored for molecular phylogenetics. The first release of the Browser is computed from 2.6 million sequences representing the taxonomically enriched subset of GenBank sequences for eukaryotes (excluding most genome survey sequences, ESTs, and other high-throughput data). In addition to summarizing sequence diversity and species diversity across nodes in the NCBI taxonomy, it reports 87,000 potentially phylogenetically informative clusters of homologous sequences, which can be viewed or downloaded, along with provisional alignments and coarse phylogenetic trees. At each node in the NCBI hierarchy, the user can display a "data availability matrix" of all available sequences for entries in a subtaxa-by-clusters matrix. This matrix provides a guidepost for subsequent assembly of multigene data sets or supertrees. The database allows for comparison of results from previous GenBank releases, highlighting recent additions of either sequences or taxa to GenBank and letting investigators track progress on data availability worldwide. Although the reported alignments and trees are extremely approximate, the database reports several statistics correlated with alignment quality to help users choose from alternative data sources.

6.
Most centralities proposed for identifying influential spreaders on social networks, whether to spread a message or to stop an epidemic, require the full topological information of the network on which spreading occurs. In practice, however, collecting all connections between agents in social networks can hardly be achieved. As a result, such metrics can be difficult to apply to real social networks. Consequently, a new approach for identifying influential people without explicit network information is needed in order to provide an efficient immunization or spreading strategy in a practical sense. In this study, we seek a possible way of finding influential spreaders by using the social mechanisms through which social connections are formed in real networks. We find that a reliable immunization scheme can be achieved by asking people how they interact with each other. From these surveys we find that the probabilistic tendency to connect to a hub has the strongest predictive power for influential spreaders among the tested social mechanisms. Our observation also suggests that people who connect different communities are more likely to be influential spreaders when a network has a strong modular structure. Our finding implies that not only the effect of network location but also the behavior of individuals is important in designing optimal immunization or spreading schemes.

7.
Dehnert M, Helm WE, Hütt MT. Gene 2005, 345(1): 81-90
We study short-range correlations in DNA sequences with methods from information theory and statistics. We find a persisting degree of identity between the correlation patterns of different chromosomes of a species. Except for the case of human and chimpanzee, inter-species differences in this correlation pattern allow robust species distinction: in a clustering tree based upon the correlation curves on the level of individual chromosomes, distinct clusters for the individual species are found. This capacity for distinguishing species persists even when the length of the underlying sequences is drastically reduced. In comparison to the standard tool for studying symbol correlations in DNA sequences, namely the mutual information function, we find that an autoregressive model for higher order Markov processes significantly improves species distinction due to an implicit subtraction of random background.
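The mutual information function named in the abstract above can be illustrated with a minimal sketch. This is the textbook definition (mutual information, in bits, between the symbols found k positions apart), not the autoregressive model the authors propose; the function name `mutual_information` and its arguments are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(seq: str, k: int) -> float:
    """Mutual information (bits) between symbols k positions apart in seq.

    Illustrative textbook estimator based on observed pair frequencies;
    it applies no correction for finite-sample (random background) bias.
    """
    if k <= 0 or k >= len(seq):
        raise ValueError("k must satisfy 0 < k < len(seq)")
    pairs = list(zip(seq, seq[k:]))          # (symbol at i, symbol at i + k)
    n = len(pairs)
    joint = Counter(pairs)                   # counts of symbol pairs
    left = Counter(a for a, _ in pairs)      # marginal counts, first symbol
    right = Counter(b for _, b in pairs)     # marginal counts, second symbol
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (left[a] * right[b]))
    return mi
```

Evaluating the function for a range of k gives a correlation curve for one sequence; curves of this kind are what a clustering of chromosomes would compare.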

8.
Online social networks have become increasingly ubiquitous and understanding their structural, dynamical, and scaling properties not only is of fundamental interest but also has a broad range of applications. Such networks can be extremely dynamic, generated almost instantaneously by, for example, breaking-news items. We investigate a common class of online social networks, the user-user retweeting networks, by analyzing the empirical data collected from Sina Weibo (a massive Twitter-like microblogging social network in China) with respect to the topic of the 2011 Japan earthquake. We uncover a number of algebraic scaling relations governing the growth and structure of the network and develop a probabilistic model that captures the basic dynamical features of the system. The model is capable of reproducing all the empirical results. Our analysis not only reveals the basic mechanisms underlying the dynamics of the retweeting networks, but also provides general insights into the control of information spreading on such networks.

9.
We present an event tree analysis for studying the dynamics of Hodgkin-Huxley (HH) neuronal networks. Our study relies on a coarse-grained projection to event trees and to the event chains that comprise these trees by using a statistical collection of spatial-temporal sequences of relevant physiological observables (such as spike sequences from multiple neurons). This projection can retain information about network dynamics that covers multiple features, swiftly and robustly. We demonstrate that, even for small differences in inputs, some dynamical regimes of HH networks contain sufficiently high-order statistics as reflected in event chains within the event tree analysis. Therefore, this analysis is effective in discriminating small differences in inputs. Moreover, we use event trees to analyze the results computed from an efficient library-based numerical method proposed in our previous work, where a pre-computed high resolution data library of typical neuronal trajectories during the interval of an action potential (spike) allows us to avoid resolving the spikes in detail. In this way, we can evolve the HH networks using time steps one order of magnitude larger than the typical time steps used for resolving the trajectories without the library, while achieving comparable statistical accuracy in terms of average firing rate and power spectra of voltage traces. Our numerical simulation results show that the library method is efficient in the sense that the results generated by using this numerical method with much larger time steps contain sufficiently high-order statistical structure of firing events that is similar to that obtained using a regular HH solver. We use our event tree analysis to demonstrate these statistical similarities.

10.
In many applications, one is interested in determining which of the properties of a network module change across conditions. For example, to validate the existence of a module, it is desirable to show that it is reproducible (or preserved) in an independent test network. Here we study several types of network preservation statistics that do not require a module assignment in the test network. We distinguish network preservation statistics by the type of the underlying network. Some preservation statistics are defined for a general network (defined by an adjacency matrix) while others are only defined for a correlation network (constructed on the basis of pairwise correlations between numeric variables). Our applications show that the correlation structure facilitates the definition of particularly powerful module preservation statistics. We illustrate that evaluating module preservation is in general different from evaluating cluster preservation. We find that it is advantageous to aggregate multiple preservation statistics into summary preservation statistics. We illustrate the use of these methods in six gene co-expression network applications including 1) preservation of cholesterol biosynthesis pathway in mouse tissues, 2) comparison of human and chimpanzee brain networks, 3) preservation of selected KEGG pathways between human and chimpanzee brain networks, 4) sex differences in human cortical networks, 5) sex differences in mouse liver networks. While we find no evidence for sex specific modules in human cortical networks, we find that several human cortical modules are less preserved in chimpanzees. In particular, apoptosis genes are differentially co-expressed between humans and chimpanzees. Our simulation studies and applications show that module preservation statistics are useful for studying differences between the modular structure of networks. 
Data, R software and accompanying tutorials can be downloaded from the following webpage: http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/ModulePreservation.

11.
New technologies make it possible to measure activity from many neurons simultaneously. One approach is to analyze simultaneously recorded neurons individually, then group together neurons which increase their activity during similar behaviors into an “ensemble.” However, this notion of an ensemble ignores the ability of neurons to act collectively and encode and transmit information in ways that are not reflected by their individual activity levels. We used microendoscopic GCaMP imaging to measure prefrontal activity while mice were either alone or engaged in social interaction. We developed an approach that combines a neural network classifier and surrogate (shuffled) datasets to characterize how neurons synergistically transmit information about social behavior. Notably, unlike optimal linear classifiers, a neural network classifier with a single linear hidden layer can discriminate network states which differ solely in patterns of coactivity, and not in the activity levels of individual neurons. Using this approach, we found that surrogate datasets which preserve behaviorally specific patterns of coactivity (correlations) outperform those which preserve behaviorally driven changes in activity levels but not correlated activity. Thus, social behavior elicits increases in correlated activity that are not explained simply by the activity levels of the underlying neurons, and prefrontal neurons act collectively to transmit information about socialization via these correlations. Notably, this ability of correlated activity to enhance the information transmitted by neuronal ensembles is diminished in mice lacking the autism-associated gene Shank3. 
These results show that synergy is an important concept for the coding of social behavior which can be disrupted in disease states, reveal a specific mechanism underlying this synergy (social behavior increases correlated activity within specific ensembles), and outline methods for studying how neurons within an ensemble can work together to encode information.

Behaviorally-specific patterns of correlated activity between prefrontal neurons normally enhance the information that neuronal ensembles transmit about social behavior. This study shows that in a mouse model of autism, individual neurons continue to encode social information, but this additional information carried by patterns of correlated activity is lost.

12.
MOTIVATION: B cells responding to antigenic stimulation can fine-tune their binding properties through a process of affinity maturation composed of somatic hypermutation, affinity-selection and clonal expansion. The mutation rate of the B cell receptor DNA sequence, and the effect of these mutations on affinity and specificity, are of critical importance for understanding immune and autoimmune processes. Unbiased estimates of these properties are currently lacking due to the short time-scales involved and the small numbers of sequences available. RESULTS: We have developed a bioinformatic method based on a maximum likelihood analysis of phylogenetic lineage trees to estimate the parameters of a B cell clonal expansion model, which includes somatic hypermutation with the possibility of lethal mutations. Lineage trees are created from clonally related B cell receptor DNA sequences. Important links between tree shapes and underlying model parameters are identified using mutual information. Parameters are estimated using a likelihood function based on the joint distribution of several tree shapes, without requiring a priori knowledge of the number of generations in the clone (which is not available for rapidly dividing populations in vivo). A systematic validation on synthetic trees produced by a mutating birth-death process simulation shows that our estimates are precise and robust to several underlying assumptions. These methods are applied to experimental data from autoimmune mice to demonstrate the existence of hypermutating B cells in an unexpected location in the spleen.

13.
The analysis of expressed sequences from a diverse set of plant species has fueled the increase in understanding of the complex molecular mechanisms underlying plant growth regulation. While representative data sets can be found for the major branches of plant evolution, fern species data are lacking. To further the availability of genetic information in pteridophytes, a normalized cDNA library of Adiantum capillus-veneris was constructed from prothallia grown under white light. A total of 10,420 expressed sequence tags (ESTs) were obtained and clustering of these sequences resulted in 7,100 nonredundant clusters. Of these, 1,608 EST clusters were found to be similar to sequences of known function and 1,092 EST clusters showed similarity to sequences of unknown function. Given the usefulness of Adiantum for developmental studies, the sequence data represented in this report stand to make a significant contribution to the understanding of plant growth regulation, particularly for pteridophytes.

14.
del Campo J, Massana R. Protist 2011, 162(3): 435-448
In recent years, a substantial amount of data on aquatic protists has been obtained from culture-independent molecular approaches, unveiling a large diversity and the existence of new lineages. However, sequences affiliated with minor groups (in terms of clonal abundance) have often been under-analyzed, and this hides a potentially relevant source of phylogenetic information. Here we have searched public databases for 18S rDNA sequences of chrysophytes, choanoflagellates and bicosoecids retrieved from molecular surveys of protists. These three groups are often considered to account for most of the heterotrophic flagellates, an important functional component in microbial food webs. They represented a significant fraction of clones in freshwater studies, whereas their relative clonal abundance was low in marine studies. The novelty displayed by this dataset was notable. Most environmental sequences were distant from sequences of cultured organisms, indicating a significant bias in the representation of taxa in culture. Moreover, they were often distant from sequences from other molecular surveys, suggesting an insufficient sequencing effort to characterize the in situ diversity of these groups. Phylogenetic trees with complete sequences present the most accurate representation of the diversity of these groups, with the emergence of several new clades formed exclusively by environmental sequences. Exhaustive data mining in sequence databases allowed the identification of new diversity hidden inside chrysophytes, choanoflagellates and bicosoecids.

15.
Phylogenetic inference based on matrix representation of trees. (Cited 14 times in total: 0 self-citations, 14 citations by others)
Rooted phylogenetic trees can be represented as matrices in which the rows correspond to termini, and columns correspond to internal nodes (elements of the n-tree). Parsimony analysis of such a matrix will fully recover the topology of the original tree. The maximum size of the represented matrix depends only on the number of termini in the tree; for a tree derived from molecular sequences, the represented matrix may be orders of magnitude smaller than the original data matrix. Representations of multiple trees (which may or may not have identical termini) can readily be combined into a single matrix; columns of discrete-character-state data can be added and, if desired, weighted differentially. Parsimony analysis of the resulting composite matrix yields a hybrid supertree which typically provides greater resolution than conventional consensus trees. Use of this method is illustrated with examples involving multiple tRNA genes in organelles and multiple protein-coding genes in eukaryotes.
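The matrix coding described in the abstract above can be sketched in a few lines. The sketch assumes trees are given as nested tuples of taxon names, omits the root column (it would be 1 for every terminus and therefore uninformative), and codes termini absent from a tree as "?", a common convention that the abstract itself does not spell out. The function names `clades`, `tree_to_matrix` and `combine` are hypothetical.

```python
def clades(tree):
    """Return (leaf set, list of clades) for a rooted tree of nested tuples."""
    if isinstance(tree, str):                # a terminus (leaf)
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)                     # the clade of this internal node
    return leaves, found


def tree_to_matrix(tree, all_taxa=None):
    """Code one tree: rows are termini, columns are non-root internal nodes."""
    leaves, found = clades(tree)
    taxa = sorted(all_taxa if all_taxa is not None else leaves)
    columns = [c for c in found if c != leaves]   # drop the root column
    return {
        t: "".join(
            ("1" if t in c else "0") if t in leaves else "?"  # "?" = absent
            for c in columns
        )
        for t in taxa
    }


def combine(trees):
    """Concatenate the codings of several trees into one composite matrix."""
    all_taxa = set()
    for tree in trees:
        all_taxa |= clades(tree)[0]
    parts = [tree_to_matrix(tree, all_taxa) for tree in trees]
    return {t: "".join(p[t] for p in parts) for t in sorted(all_taxa)}
```

For example, `combine([(("A", "B"), ("C", "D")), (("A", "C"), "E")])` gives one row per terminus across both trees; the composite matrix would then be passed to a parsimony program, which this sketch does not attempt.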

16.

Background  

Lateral genetic transfer can lead to disagreements among phylogenetic trees comprising sequences from the same set of taxa. Where topological discordance is thought to have arisen through genetic transfer events, tree comparisons can be used to identify the lineages that may have shared genetic information. An 'edit path' of one or more transfer events can be represented with a series of subtree prune and regraft (SPR) operations, but finding the optimal such set of operations is NP-hard for comparisons between rooted trees, and may be so for unrooted trees as well.

17.
Ferretti L, Raineri E, Ramos-Onsins S. Genetics 2012, 191(4): 1397-1401
Missing data are common in DNA sequences obtained through high-throughput sequencing. Furthermore, samples of low quality or problems in the experimental protocol often cause a loss of data even with traditional sequencing technologies. Here we propose modified estimators of variability and neutrality tests that can be naturally applied to sequences with missing data, without the need to remove bases or individuals from the analysis. Modified statistics include the Watterson estimator θ(W), Tajima's D, Fay and Wu's H, and HKA. We develop a general framework to take missing data into account in frequency spectrum-based neutrality tests and we derive the exact expression for the variance of these statistics under the neutral model. The neutrality tests proposed here can also be used as summary statistics to describe the information contained in other classes of data like DNA microarrays.
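For orientation, the standard complete-data Watterson estimator that the abstract above takes as its starting point can be sketched as follows. This is the classical form, θ(W) = S / a_n with a_n = 1 + 1/2 + ... + 1/(n-1), where S is the number of segregating sites and n the number of sequences; it is not the authors' modified estimator for missing data. The function names and the alignment format (a list of equal-length strings) are assumptions.

```python
def harmonic(n: int) -> float:
    """a_n = sum of 1/i for i = 1 .. n-1."""
    return sum(1.0 / i for i in range(1, n))


def segregating_sites(alignment: list[str]) -> int:
    """Number of alignment columns showing more than one distinct base."""
    return sum(1 for column in zip(*alignment) if len(set(column)) > 1)


def watterson_theta(alignment: list[str]) -> float:
    """Classical Watterson estimator for an alignment with no missing data."""
    n = len(alignment)
    if n < 2:
        raise ValueError("need at least two sequences")
    if len({len(s) for s in alignment}) != 1:
        raise ValueError("sequences must have equal length")
    return segregating_sites(alignment) / harmonic(n)
```

With missing bases the effective sample size differs from site to site, so a single a_n no longer applies; that is the situation the modified statistics in the abstract are meant to handle.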

18.
Correlations in spike-train ensembles can seriously impair the encoding of information by their spatio-temporal structure. An inevitable source of correlation in finite neural networks is common presynaptic input to pairs of neurons. Recent studies demonstrate that spike correlations in recurrent neural networks are considerably smaller than expected based on the amount of shared presynaptic input. Here, we explain this observation by means of a linear network model and simulations of networks of leaky integrate-and-fire neurons. We show that inhibitory feedback efficiently suppresses pairwise correlations and, hence, population-rate fluctuations, thereby assigning inhibitory neurons the new role of active decorrelation. We quantify this decorrelation by comparing the responses of the intact recurrent network (feedback system) and systems where the statistics of the feedback channel is perturbed (feedforward system). Manipulations of the feedback statistics can lead to a significant increase in the power and coherence of the population response. In particular, neglecting correlations within the ensemble of feedback channels or between the external stimulus and the feedback amplifies population-rate fluctuations by orders of magnitude. The fluctuation suppression in homogeneous inhibitory networks is explained by a negative feedback loop in the one-dimensional dynamics of the compound activity. Similarly, a change of coordinates exposes an effective negative feedback loop in the compound dynamics of stable excitatory-inhibitory networks. The suppression of input correlations in finite networks is explained by the population averaged correlations in the linear network model: In purely inhibitory networks, shared-input correlations are canceled by negative spike-train correlations. In excitatory-inhibitory networks, spike-train correlations are typically positive. 
Here, the suppression of input correlations is not a result of the mere existence of correlations between excitatory (E) and inhibitory (I) neurons, but a consequence of a particular structure of correlations among the three possible pairings (EE, EI, II).

19.
The primary goal of this article is to infer genetic interactions based on gene expression data. A new method for multiorganism Bayesian gene network estimation is presented based on multitask learning. When the input datasets are sparse, as is the case in microarray gene expression data, it becomes difficult to separate random correlations from true correlations that would lead to actual edges when modeling the gene interactions as a Bayesian network. Multitask learning takes advantage of the similarity between related tasks, in order to construct a more accurate model of the underlying relationships represented by the Bayesian networks. The proposed method is tested on synthetic data to illustrate its validity. Then it is iteratively applied on real gene expression data to learn the genetic regulatory networks of two organisms with homologous genes.

20.
Ontogenetic sequences are a pervasive aspect of development and are used extensively by biologists for intra- and interspecific comparisons. A tacit assumption behind most such analyses is that sequence is largely invariant within a species. However, recent embryological and experimental work emphasizes that ontogenetic sequences can be variable and that sequence polymorphism may be far more prevalent than is generally realized. We present a method that uses parsimony algorithms to map hierarchic developmental patterns that capture variability within a sample. This technique for discovering and formalizing sequences is called the "Ontogenetic Sequence Analysis" (OSA). Results of OSA include formalized diagrams of reticulating networks, describe all most parsimonious sequences, and can be used to develop statistics and metrics for comparison of both intraspecific and interspecific sequence variation. The method is tested with examples of human postnatal skeletal ossification, comprising a time-calibrated data set of human hand and wrist epiphyseal unions, and a longitudinal data set of human wrist ossification. Results illustrate the validity of the method for discovering sequence patterns and for predicting morphologies not represented in analytic samples. OSA demonstrates the potential and challenges of incorporating ontogenetic sequences of morphological information into evolutionary analyses.
