首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 717 毫秒
1.
Researchers interested in studying populations that are difficult to reach through traditional survey methods can now draw on a range of methods to access these populations. Yet many of these methods are more expensive and difficult to implement than studies using conventional sampling frames and trusted sampling methods. The network scale-up method (NSUM) provides a middle ground for researchers who wish to estimate the size of a hidden population, but lack the resources to conduct a more specialized hidden population study. Through this method it is possible to generate population estimates for a wide variety of groups that are perhaps unwilling to self-identify as such (for example, users of illegal drugs or other stigmatized populations) via traditional survey tools such as telephone or mail surveys—by asking a representative sample to estimate the number of people they know who are members of such a “hidden” subpopulation. The original estimator is formulated to minimize the weight a single scaling variable can exert upon the estimates. We argue that this introduces hidden and difficult to predict biases, and instead propose a series of methodological advances on the traditional scale-up estimation procedure, including a new estimator. Additionally, we formalize the incorporation of sample weights into the network scale-up estimation process, and propose a recursive process of back estimation “trimming” to identify and remove poorly performing predictors from the estimation process. To demonstrate these suggestions we use data from a network scale-up mail survey conducted in Nebraska during 2014. We find that using the new estimator and recursive trimming process provides more accurate estimates, especially when used in conjunction with sampling weights.  相似文献   

2.
Node-Link diagrams make it possible to take a quick glance at how nodes (or actors) in a network are connected by edges (or ties). A conventional network diagram of a “contact tree” maps out a root and branches that represent the structure of nodes and edges, often without further specifying leaves or fruits that would have grown from small branches. By furnishing such a network structure with leaves and fruits, we reveal details about “contacts” in our ContactTrees upon which ties and relationships are constructed. Our elegant design employs a bottom-up approach that resembles a recent attempt to understand subjective well-being by means of a series of emotions. Such a bottom-up approach to social-network studies decomposes each tie into a series of interactions or contacts, which can help deepen our understanding of the complexity embedded in a network structure. Unlike previous network visualizations, ContactTrees highlight how relationships form and change based upon interactions among actors, as well as how relationships and networks vary by contact attributes. Based on a botanical tree metaphor, the design is easy to construct and the resulting tree-like visualization can display many properties at both tie and contact levels, thus recapturing a key ingredient missing from conventional techniques of network visualization. We demonstrate ContactTrees using data sets consisting of up to three waves of 3-month contact diaries over the 2004-2012 period, and discuss how this design can be applied to other types of datasets.  相似文献   

3.
Subcellular localization of a protein is important to understand proteins’ functions and interactions. There are many techniques based on computational methods to predict protein subcellular locations, but it has been shown that many prediction tasks have a training data shortage problem. This paper introduces a new method to mine proteins with non-experimental annotations, which are labeled by non-experimental evidences of protein databases to overcome the training data shortage problem. A novel active sample selection strategy is designed, taking advantage of active learning technology, to actively find useful samples from the entire data pool of candidate proteins with non-experimental annotations. This approach can adequately estimate the “value” of each sample, automatically select the most valuable samples and add them into the original training set, to help to retrain the classifiers. Numerical experiments with for four popular multi-label classifiers on three benchmark datasets show that the proposed method can effectively select the valuable samples to supplement the original training set and significantly improve the performances of predicting classifiers.  相似文献   

4.
Recently, there has been much interest in the community structure or mesoscale organization of complex networks. This structure is characterised either as a set of sparsely inter-connected modules or as a highly connected core with a sparsely connected periphery. However, it is often difficult to disambiguate these two types of mesoscale structure or, indeed, to summarise the full network in terms of the relationships between its mesoscale constituents. Here, we estimate a community structure with a stochastic blockmodel approach, the Erdős-Rényi Mixture Model, and compare it to the much more widely used deterministic methods, such as the Louvain and Spectral algorithms. We used the Caenorhabditis elegans (C. elegans) nervous system (connectome) as a model system in which biological knowledge about each node or neuron can be used to validate the functional relevance of the communities obtained. The deterministic algorithms derived communities with 4–5 modules, defined by sparse inter-connectivity between all modules. In contrast, the stochastic Erdős-Rényi Mixture Model estimated a community with 9 blocks or groups which comprised a similar set of modules but also included a clearly defined core, made of 2 small groups. We show that the “core-in-modules” decomposition of the worm brain network, estimated by the Erdős-Rényi Mixture Model, is more compatible with prior biological knowledge about the C. elegans nervous system than the purely modular decomposition defined deterministically. We also show that the blockmodel can be used both to generate stochastic realisations (simulations) of the biological connectome, and to compress network into a small number of super-nodes and their connectivity. We expect that the Erdős-Rényi Mixture Model may be useful for investigating the complex community structures in other (nervous) systems.  相似文献   

5.
It is widely believed that the modular organization of cellular function is reflected in a modular structure of molecular networks. A common view is that a “module” in a network is a cohesively linked group of nodes, densely connected internally and sparsely interacting with the rest of the network. Many algorithms try to identify functional modules in protein-interaction networks (PIN) by searching for such cohesive groups of proteins. Here, we present an alternative approach independent of any prior definition of what actually constitutes a “module”. In a self-consistent manner, proteins are grouped into “functional roles” if they interact in similar ways with other proteins according to their functional roles. Such grouping may well result in cohesive modules again, but only if the network structure actually supports this. We applied our method to the PIN from the Human Protein Reference Database (HPRD) and found that a representation of the network in terms of cohesive modules, at least on a global scale, does not optimally represent the network''s structure because it focuses on finding independent groups of proteins. In contrast, a decomposition into functional roles is able to depict the structure much better as it also takes into account the interdependencies between roles and even allows groupings based on the absence of interactions between proteins in the same functional role. This, for example, is the case for transmembrane proteins, which could never be recognized as a cohesive group of nodes in a PIN. When mapping experimental methods onto the groups, we identified profound differences in the coverage suggesting that our method is able to capture experimental bias in the data, too. For example yeast-two-hybrid data were highly overrepresented in one particular group. Thus, there is more structure in protein-interaction networks than cohesive modules alone and we believe this finding can significantly improve automated function prediction algorithms.  相似文献   

6.
Complex systems have attracted considerable interest because of their wide range of applications, and are often studied via a “classic” approach: study a specific system, find a complex network behind it, and analyze the corresponding properties. This simple methodology has produced a great deal of interesting results, but relies on an often implicit underlying assumption: the level of detail on which the system is observed. However, in many situations, physical or abstract, the level of detail can be one out of many, and might also depend on intrinsic limitations in viewing the data with a different level of abstraction or precision. So, a fundamental question arises: do properties of a network depend on its level of observability, or are they invariant? If there is a dependence, then an apparently correct network modeling could in fact just be a bad approximation of the true behavior of a complex system. In order to answer this question, we propose a novel micro-macro analysis of complex systems that quantitatively describes how the structure of complex networks varies as a function of the detail level. To this extent, we have developed a new telescopic algorithm that abstracts from the local properties of a system and reconstructs the original structure according to a fuzziness level. This way we can study what happens when passing from a fine level of detail (“micro”) to a different scale level (“macro”), and analyze the corresponding behavior in this transition, obtaining a deeper spectrum analysis. The obtained results show that many important properties are not universally invariant with respect to the level of detail, but instead strongly depend on the specific level on which a network is observed. Therefore, caution should be taken in every situation where a complex network is considered, if its context allows for different levels of observability.  相似文献   

7.
Inferring connectivity in neuronal networks remains a key challenge in statistical neuroscience. The “common input” problem presents a major roadblock: it is difficult to reliably distinguish causal connections between pairs of observed neurons versus correlations induced by common input from unobserved neurons. Available techniques allow us to simultaneously record, with sufficient temporal resolution, only a small fraction of the network. Consequently, naive connectivity estimators that neglect these common input effects are highly biased. This work proposes a “shotgun” experimental design, in which we observe multiple sub-networks briefly, in a serial manner. Thus, while the full network cannot be observed simultaneously at any given time, we may be able to observe much larger subsets of the network over the course of the entire experiment, thus ameliorating the common input problem. Using a generalized linear model for a spiking recurrent neural network, we develop a scalable approximate expected loglikelihood-based Bayesian method to perform network inference given this type of data, in which only a small fraction of the network is observed in each time bin. We demonstrate in simulation that the shotgun experimental design can eliminate the biases induced by common input effects. Networks with thousands of neurons, in which only a small fraction of the neurons is observed in each time bin, can be quickly and accurately estimated, achieving orders of magnitude speed up over previous approaches.  相似文献   

8.
Functional magnetic resonance data acquired in a task-absent condition (“resting state”) require new data analysis techniques that do not depend on an activation model. In this work, we introduce an alternative assumption- and parameter-free method based on a particular form of node centrality called eigenvector centrality. Eigenvector centrality attributes a value to each voxel in the brain such that a voxel receives a large value if it is strongly correlated with many other nodes that are themselves central within the network. Google''s PageRank algorithm is a variant of eigenvector centrality. Thus far, other centrality measures - in particular “betweenness centrality” - have been applied to fMRI data using a pre-selected set of nodes consisting of several hundred elements. Eigenvector centrality is computationally much more efficient than betweenness centrality and does not require thresholding of similarity values so that it can be applied to thousands of voxels in a region of interest covering the entire cerebrum which would have been infeasible using betweenness centrality. Eigenvector centrality can be used on a variety of different similarity metrics. Here, we present applications based on linear correlations and on spectral coherences between fMRI times series. This latter approach allows us to draw conclusions of connectivity patterns in different spectral bands. We apply this method to fMRI data in task-absent conditions where subjects were in states of hunger or satiety. We show that eigenvector centrality is modulated by the state that the subjects were in. Our analyses demonstrate that eigenvector centrality is a computationally efficient tool for capturing intrinsic neural architecture on a voxel-wise level.  相似文献   

9.
The efficacy of DNA extraction protocols can be highly dependent upon both the type of sample being investigated and the types of downstream analyses performed. Considering that the use of new bacterial community analysis techniques (e.g., microbiomics, metagenomics) is becoming more prevalent in the agricultural and environmental sciences and many environmental samples within these disciplines can be physiochemically and microbiologically unique (e.g., fecal and litter/bedding samples from the poultry production spectrum), appropriate and effective DNA extraction methods need to be carefully chosen. Therefore, a novel semi-automated hybrid DNA extraction method was developed specifically for use with environmental poultry production samples. This method is a combination of the two major types of DNA extraction: mechanical and enzymatic. A two-step intense mechanical homogenization step (using bead-beating specifically formulated for environmental samples) was added to the beginning of the “gold standard” enzymatic DNA extraction method for fecal samples to enhance the removal of bacteria and DNA from the sample matrix and improve the recovery of Gram-positive bacterial community members. Once the enzymatic extraction portion of the hybrid method was initiated, the remaining purification process was automated using a robotic workstation to increase sample throughput and decrease sample processing error. In comparison to the strict mechanical and enzymatic DNA extraction methods, this novel hybrid method provided the best overall combined performance when considering quantitative (using 16S rRNA qPCR) and qualitative (using microbiomics) estimates of the total bacterial communities when processing poultry feces and litter samples.  相似文献   

10.
Graph representations of brain connectivity have attracted a lot of recent interest, but existing methods for dividing such graphs into connected subnetworks have a number of limitations in the context of neuroimaging. This is an important problem because most cognitive functions would be expected to involve some but not all brain regions. In this paper we outline a simple approach for decomposing graphs, which may be based on any measure of interregional association, into coherent “principal networks”. The technique is based on an eigendecomposition of the association matrix, and is closely related to principal components analysis. We demonstrate the technique using cortical thickness and diffusion tractography data, showing that the subnetworks which emerge are stable, meaningful and reproducible. Graph-theoretic measures of network cost and efficiency may be calculated separately for each principal network. Unlike some other approaches, all available connectivity information is taken into account, and vertices may appear in none or several of the subnetworks. Subject-by-subject “scores” for each principal network may also be obtained, under certain circumstances, and related to demographic or cognitive variables of interest.  相似文献   

11.
Many results have been obtained when studying scientific papers citations databases in a network perspective. Articles can be ranked according to their current in-degree and their future popularity or citation counts can even be predicted. The dynamical properties of such networks and the observation of the time evolution of their nodes started more recently. This work adopts an evolutionary perspective and proposes an original algorithm for the construction of genealogical trees of scientific papers on the basis of their citation count evolution in time. The fitness of a paper now amounts to its in-degree growing trend and a “dying” paper will suddenly see this trend declining in time. It will give birth and be taken over by some of its most prevalent citing “offspring”. Practically, this might be used to trace the successive published milestones of a research field.  相似文献   

12.
The world's oceans contain a complex mixture of micro-organisms that are for the most part, uncharacterized both genetically and biochemically. We report here a metagenomic study of the marine planktonic microbiota in which surface (mostly marine) water samples were analyzed as part of the Sorcerer II Global Ocean Sampling expedition. These samples, collected across a several-thousand km transect from the North Atlantic through the Panama Canal and ending in the South Pacific yielded an extensive dataset consisting of 7.7 million sequencing reads (6.3 billion bp). Though a few major microbial clades dominate the planktonic marine niche, the dataset contains great diversity with 85% of the assembled sequence and 57% of the unassembled data being unique at a 98% sequence identity cutoff. Using the metadata associated with each sample and sequencing library, we developed new comparative genomic and assembly methods. One comparative genomic method, termed “fragment recruitment,” addressed questions of genome structure, evolution, and taxonomic or phylogenetic diversity, as well as the biochemical diversity of genes and gene families. A second method, termed “extreme assembly,” made possible the assembly and reconstruction of large segments of abundant but clearly nonclonal organisms. Within all abundant populations analyzed, we found extensive intra-ribotype diversity in several forms: (1) extensive sequence variation within orthologous regions throughout a given genome; despite coverage of individual ribotypes approaching 500-fold, most individual sequencing reads are unique; (2) numerous changes in gene content some with direct adaptive implications; and (3) hypervariable genomic islands that are too variable to assemble. The intra-ribotype diversity is organized into genetically isolated populations that have overlapping but independent distributions, implying distinct environmental preference. We present novel methods for measuring the genomic similarity between metagenomic samples and show how they may be grouped into several community types. Specific functional adaptations can be identified both within individual ribotypes and across the entire community, including proteorhodopsin spectral tuning and the presence or absence of the phosphate-binding gene PstS.  相似文献   

13.
Microbiome-based stratification of healthy individuals into compositional categories, referred to as “enterotypes” or “community types”, holds promise for drastically improving personalized medicine. Despite this potential, the existence of community types and the degree of their distinctness have been highly debated. Here we adopted a dynamic systems approach and found that heterogeneity in the interspecific interactions or the presence of strongly interacting species is sufficient to explain community types, independent of the topology of the underlying ecological network. By controlling the presence or absence of these strongly interacting species we can steer the microbial ecosystem to any desired community type. This open-loop control strategy still holds even when the community types are not distinct but appear as dense regions within a continuous gradient. This finding can be used to develop viable therapeutic strategies for shifting the microbial composition to a healthy configuration.  相似文献   

14.
G Wang  Y Rong  H Chen  C Pearson  C Du  R Simha  C Zeng 《PloS one》2012,7(7):e40330
A common problem in molecular biology is to use experimental data, such as microarray data, to infer knowledge about the structure of interactions between important molecules in subsystems of the cell. By approximating the state of each molecule as “on” or “off”, it becomes possible to simplify the problem, and exploit the tools of Boolean analysis for such inference. Amongst Boolean techniques, the process-driven approach has shown promise in being able to identify putative network structures, as well as stability and modularity properties. This paper examines the process-driven approach more formally, and makes four contributions about the computational complexity of the inference problem, under the “dominant inhibition” assumption of molecular interactions. The first is a proof that the feasibility problem (does there exist a network that explains the data?) can be solved in polynomial-time. Second, the minimality problem (what is the smallest network that explains the data?) is shown to be NP-hard, and therefore unlikely to result in a polynomial-time algorithm. Third, a simple polynomial-time heuristic is shown to produce near-minimal solutions, as demonstrated by simulation. Fourth, the theoretical framework explains how multiplicity (the number of network solutions to realize a given biological process), which can take exponential-time to compute, can instead be accurately estimated by a fast, polynomial-time heuristic.  相似文献   

15.
In the network approach to psychopathology, disorders are conceptualized as networks of mutually interacting symptoms (e.g., depressed mood) and transdiagnostic factors (e.g., rumination). This suggests that it is necessary to study how symptoms dynamically interact over time in a network architecture. In the present paper, we show how such an architecture can be constructed on the basis of time-series data obtained through Experience Sampling Methodology (ESM). The proposed methodology determines the parameters for the interaction between nodes in the network by estimating a multilevel vector autoregression (VAR) model on the data. The methodology allows combining between-subject and within-subject information in a multilevel framework. The resulting network architecture can subsequently be analyzed through network analysis techniques. In the present study, we apply the method to a set of items that assess mood-related factors. We show that the analysis generates a plausible and replicable network architecture, the structure of which is related to variables such as neuroticism; that is, for subjects who score high on neuroticism, worrying plays a more central role in the network. Implications and extensions of the methodology are discussed.  相似文献   

16.
We introduce a model for the adaptive evolution of a network of company ownerships. In a recent work it has been shown that the empirical global network of corporate control is marked by a central, tightly connected “core” made of a small number of large companies which control a significant part of the global economy. Here we show how a simple, adaptive “rich get richer” dynamics can account for this characteristic, which incorporates the increased buying power of more influential companies, and in turn results in even higher control. We conclude that this kind of centralized structure can emerge without it being an explicit goal of these companies, or as a result of a well-organized strategy.  相似文献   

17.
Single-molecule techniques make it possible to investigate the behavior of individual biological molecules in solution in real time. These techniques include so-called force spectroscopy approaches such as atomic force microscopy, optical tweezers, flow stretching, and magnetic tweezers. Amongst these approaches, magnetic tweezers have distinguished themselves by their ability to apply torque while maintaining a constant stretching force. Here, it is illustrated how such a “conventional” magnetic tweezers experimental configuration can, through a straightforward modification of its field configuration to minimize the magnitude of the transverse field, be adapted to measure the degree of twist in a biological molecule. The resulting configuration is termed the freely-orbiting magnetic tweezers. Additionally, it is shown how further modification of the field configuration can yield a transverse field with a magnitude intermediate between that of the “conventional” magnetic tweezers and the freely-orbiting magnetic tweezers, which makes it possible to directly measure the torque stored in a biological molecule. This configuration is termed the magnetic torque tweezers. The accompanying video explains in detail how the conversion of conventional magnetic tweezers into freely-orbiting magnetic tweezers and magnetic torque tweezers can be accomplished, and demonstrates the use of these techniques. These adaptations maintain all the strengths of conventional magnetic tweezers while greatly expanding the versatility of this powerful instrument.  相似文献   

18.
In this study, we investigated the microbial community (bacteria and fungi) colonising an oil painting on canvas, which showed visible signs of biodeterioration. A combined strategy, comprising culture-dependent and -independent techniques, was selected. The results derived from the two techniques were disparate. Most of the isolated bacterial strains belonged to related species of the phylum Firmicutes, as Bacillus sp. and Paenisporosarcina sp., whereas the majority of the non-cultivable members of the bacterial community were shown to be related to species of the phylum Proteobacteria, as Stenotrophomonas sp. Fungal communities also showed discrepancies: the isolated fungal strains belonged to different genera of the order Eurotiales, as Penicillium and Eurotium, and the non-cultivable belonged to species of the order Pleosporales and Saccharomycetales. The cultivable microorganisms, which exhibited enzymatic activities related to the deterioration processes, were selected to evaluate their biodeteriorative potential on canvas paintings; namely Arthrobacter sp. as the representative bacterium and Penicillium sp. as the representative fungus. With this aim, a sample taken from the painting studied in this work was examined to determine the stratigraphic sequence of its cross-section. From this information, “mock paintings,” simulating the structure of the original painting, were prepared, inoculated with the selected bacterial and fungal strains, and subsequently examined by micro-Fourier Transform Infrared spectroscopy, in order to determine their potential susceptibility to microbial degradation. The FTIR-spectra revealed that neither Arthrobacter sp. nor Penicillium sp. alone, were able to induce chemical changes on the various materials used to prepare “mock paintings.” Only when inoculated together, could a synergistic effect on the FTIR-spectra be observed, in the form of a variation in band position on the spectrum.  相似文献   

19.
In the multidisciplinary field of Network Science, optimization of procedures for efficiently breaking complex networks is attracting much attention from a practical point of view. In this contribution, we present a module-based method to efficiently fragment complex networks. The procedure firstly identifies topological communities through which the network can be represented using a well established heuristic algorithm of community finding. Then only the nodes that participate of inter-community links are removed in descending order of their betweenness centrality. We illustrate the method by applying it to a variety of examples in the social, infrastructure, and biological fields. It is shown that the module-based approach always outperforms targeted attacks to vertices based on node degree or betweenness centrality rankings, with gains in efficiency strongly related to the modularity of the network. Remarkably, in the US power grid case, by deleting 3% of the nodes, the proposed method breaks the original network in fragments which are twenty times smaller in size than the fragments left by betweenness-based attack.  相似文献   

20.
Information flow during catastrophic events is a critical aspect of disaster management. Modern communication platforms, in particular online social networks, provide an opportunity to study such flow and derive early-warning sensors, thus improving emergency preparedness and response. Performance of the social networks sensor method, based on topological and behavioral properties derived from the “friendship paradox”, is studied here for over 50 million Twitter messages posted before, during, and after Hurricane Sandy. We find that differences in users’ network centrality effectively translate into moderate awareness advantage (up to 26 hours); and that geo-location of users within or outside of the hurricane-affected area plays a significant role in determining the scale of such an advantage. Emotional response appears to be universal regardless of the position in the network topology, and displays characteristic, easily detectable patterns, opening a possibility to implement a simple “sentiment sensing” technique that can detect and locate disasters.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号