共查询到20条相似文献,搜索用时 0 毫秒
1.
Wei Dai Mona A Sheikh Olgica Milenkovic Richard G Baraniuk 《EURASIP Journal on Bioinformatics and Systems Biology》2009,2009(1):162824
Compressive sensing microarrays (CSMs) are DNA-based sensors that operate using group testing and compressive sensing (CS) principles. In contrast to conventional DNA microarrays, in which each genetic sensor is designed to respond to a single target, in a CSM, each sensor responds to a set of targets. We study the problem of designing CSMs that simultaneously account for both the constraints from CS theory and the biochemistry of probe-target DNA hybridization. An appropriate cross-hybridization model is proposed for CSMs, and several methods are developed for probe design and CS signal recovery based on the new model. Lab experiments suggest that in order to achieve accurate hybridization profiling, consensus probe sequences are required to have sequence homology of at least 80% with all targets to be detected. Furthermore, out-of-equilibrium datasets are usually as accurate as those obtained from equilibrium conditions. Consequently, one can use CSMs in applications in which only short hybridization times are allowed. 相似文献
2.
Veit B 《Plant molecular biology》2006,60(6):793-810
The essential nature of meristematic tissues is addressed with reference to conceptual frameworks that have been developed
to explain the behaviour of animal stem cells. Comparisons are made between different types of plant meristems with the objective
of highlighting common themes that might illuminate underlying mechanisms. A more in depth comparison of the root and shoot
apical meristems is made which suggests a common mechanism for maintaining stem cells. The relevance of organogenesis to stem
cell maintenance is discussed, along with the nature of underlying mechanisms which help ensure that stem cell production
is balanced with the depletion of cells through differentiation. Mechanisms that integrate stem cell behaviour in the whole
plant are considered, with a focus on the roles of auxin and cytokinin. The review concludes with a brief discussion of epigenetic
mechanisms that act to stabilise and maintain stem cell populations. 相似文献
3.
Recent years have witnessed a rapid development of network reconstruction approaches, especially for a series of methods based on compressed sensing. Although compressed-sensing based methods require much less data than conventional approaches, the compressed sensing for reconstructing heterogeneous networks has not been fully exploited because of hubs. Hub neighbors require much more data to be inferred than small-degree nodes, inducing a cask effect for the reconstruction of heterogeneous networks. Here, a conflict-based method is proposed to overcome the cast effect to considerably reduce data amounts for achieving accurate reconstruction. Moreover, an element elimination method is presented to use the partially available structural information to reduce data requirements. The integration of both methods can further improve the reconstruction performance than separately using each technique. These methods are validated by exploring two evolutionary games taking place in scale-free networks, where individual information is accessible and an attempt to decode the network structure from measurable data is made. The results demonstrate that for all of the cases, much data are saved compared to that in the absence of these two methods. Due to the prevalence of heterogeneous networks in nature and society and the high cost of data acquisition in large-scale networks, these approaches have wide applications in many fields and are valuable for understanding and controlling the collective dynamics of a variety of heterogeneous networked systems. 相似文献
4.
High-throughput sequencing based techniques, such as 16S rRNA gene profiling, have the potential to elucidate the complex inner workings of natural microbial communities - be they from the world''s oceans or the human gut. A key step in exploring such data is the identification of dependencies between members of these communities, which is commonly achieved by correlation analysis. However, it has been known since the days of Karl Pearson that the analysis of the type of data generated by such techniques (referred to as compositional data) can produce unreliable results since the observed data take the form of relative fractions of genes or species, rather than their absolute abundances. Using simulated and real data from the Human Microbiome Project, we show that such compositional effects can be widespread and severe: in some real data sets many of the correlations among taxa can be artifactual, and true correlations may even appear with opposite sign. Additionally, we show that community diversity is the key factor that modulates the acuteness of such compositional effects, and develop a new approach, called SparCC (available at https://bitbucket.org/yonatanf/sparcc), which is capable of estimating correlation values from compositional data. To illustrate a potential application of SparCC, we infer a rich ecological network connecting hundreds of interacting species across 18 sites on the human body. Using the SparCC network as a reference, we estimated that the standard approach yields 3 spurious species-species interactions for each true interaction and misses 60% of the true interactions in the human microbiome data, and, as predicted, most of the erroneous links are found in the samples with the lowest diversity. 相似文献
5.
基于相互作用的蛋白质功能预测 总被引:1,自引:0,他引:1
蛋白质功能预测是后基因时代研究的热点问题。基于相互作用的蛋白质功能预测方法目前应用比较广泛,但是当"伙伴蛋白质"(interacting partners)数目k较小时,其预测准确率不高。从蛋白质相互作用网络入手,结合"小世界网络"特性,有效解决了k较小时预测准确率不高的问题。对酵母(Saccharomyces cerevisiae)蛋白质的相互作用网络进行预测,当k≤4时其预测准确率比相同条件下的GO(global optimization)方法有一定提高。实验结果表明:该方法能够有效的应用于伙伴蛋白质数目较小时的蛋白质功能预测。 相似文献
6.
Community structure detection is an important tool in graph analysis. This can be done, among other ways, by solving for the partition set which optimizes the modularity scores . Here it is shown that topological constraints in correlation graphs induce over-fragmentation of community structures. A refinement step to this optimization based on Linear Discriminant Analysis (LDA) and a statistical test for significance is proposed. In structured simulation constrained by topology, this novel approach performs better than the optimization of modularity alone. This method was also tested with two empirical datasets: the Roll-Call voting in the 110th US Senate constrained by geographic adjacency, and a biological dataset of 135 protein structures constrained by inter-residue contacts. The former dataset showed sub-structures in the communities that revealed a regional bias in the votes which transcend party affiliations. This is an interesting pattern given that the 110th Legislature was assumed to be a highly polarized government. The -amylase catalytic domain dataset (biological dataset) was analyzed with and without topological constraints (inter-residue contacts). The results without topological constraints showed differences with the topology constrained one, but the LDA filtering did not change the outcome of the latter. This suggests that the LDA filtering is a robust way to solve the possible over-fragmentation when present, and that this method will not affect the results where there is no evidence of over-fragmentation. 相似文献
7.
The incoherence between measurement and sparsifying transform matrices and the restricted isometry property (RIP) of measurement matrix are two of the key factors in determining the performance of compressive sensing (CS). In CS-MRI, the randomly under-sampled Fourier matrix is used as the measurement matrix and the wavelet transform is usually used as sparsifying transform matrix. However, the incoherence between the randomly under-sampled Fourier matrix and the wavelet matrix is not optimal, which can deteriorate the performance of CS-MRI. Using the mathematical result that noiselets are maximally incoherent with wavelets, this paper introduces the noiselet unitary bases as the measurement matrix to improve the incoherence and RIP in CS-MRI. Based on an empirical RIP analysis that compares the multichannel noiselet and multichannel Fourier measurement matrices in CS-MRI, we propose a multichannel compressive sensing (MCS) framework to take the advantage of multichannel data acquisition used in MRI scanners. Simulations are presented in the MCS framework to compare the performance of noiselet encoding reconstructions and Fourier encoding reconstructions at different acceleration factors. The comparisons indicate that multichannel noiselet measurement matrix has better RIP than that of its Fourier counterpart, and that noiselet encoded MCS-MRI outperforms Fourier encoded MCS-MRI in preserving image resolution and can achieve higher acceleration factors. To demonstrate the feasibility of the proposed noiselet encoding scheme, a pulse sequences with tailored spatially selective RF excitation pulses was designed and implemented on a 3T scanner to acquire the data in the noiselet domain from a phantom and a human brain. The results indicate that noislet encoding preserves image resolution better than Fouirer encoding. 相似文献
8.
9.
10.
11.
One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs) using high throughput genomic data, in particular microarray gene expression data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge aims to evaluate the success of GRN inference algorithms on benchmarks of simulated data. In this article, we present GENIE3, a new algorithm for the inference of GRNs that was best performer in the DREAM4 In Silico Multifactorial challenge. GENIE3 decomposes the prediction of a regulatory network between p genes into p different regression problems. In each of the regression problems, the expression pattern of one of the genes (target gene) is predicted from the expression patterns of all the other genes (input genes), using tree-based ensemble methods Random Forests or Extra-Trees. The importance of an input gene in the prediction of the target gene expression pattern is taken as an indication of a putative regulatory link. Putative regulatory links are then aggregated over all genes to provide a ranking of interactions from which the whole network is reconstructed. In addition to performing well on the DREAM4 In Silico Multifactorial challenge simulated data, we show that GENIE3 compares favorably with existing algorithms to decipher the genetic regulatory network of Escherichia coli. It doesn''t make any assumption about the nature of gene regulation, can deal with combinatorial and non-linear interactions, produces directed GRNs, and is fast and scalable. In conclusion, we propose a new algorithm for GRN inference that performs well on both synthetic and real gene expression data. The algorithm, based on feature selection with tree-based ensemble methods, is simple and generic, making it adaptable to other types of genomic data and interactions. 相似文献
12.
Evan J. Molinelli Anil Korkut Weiqing Wang Martin L. Miller Nicholas P. Gauthier Xiaohong Jing Poorvi Kaushik Qin He Gordon Mills David B. Solit Christine A. Pratilas Martin Weigt Alfredo Braunstein Andrea Pagnani Riccardo Zecchina Chris Sander 《PLoS computational biology》2013,9(12)
We present a powerful experimental-computational technology for inferring network models that predict the response of cells to perturbations, and that may be useful in the design of combinatorial therapy against cancer. The experiments are systematic series of perturbations of cancer cell lines by targeted drugs, singly or in combination. The response to perturbation is quantified in terms of relative changes in the measured levels of proteins, phospho-proteins and cellular phenotypes such as viability. Computational network models are derived de novo, i.e., without prior knowledge of signaling pathways, and are based on simple non-linear differential equations. The prohibitively large solution space of all possible network models is explored efficiently using a probabilistic algorithm, Belief Propagation (BP), which is three orders of magnitude faster than standard Monte Carlo methods. Explicit executable models are derived for a set of perturbation experiments in SKMEL-133 melanoma cell lines, which are resistant to the therapeutically important inhibitor of RAF kinase. The resulting network models reproduce and extend known pathway biology. They empower potential discoveries of new molecular interactions and predict efficacious novel drug perturbations, such as the inhibition of PLK1, which is verified experimentally. This technology is suitable for application to larger systems in diverse areas of molecular biology. 相似文献
13.
Background
The transmission networks of Plasmodium vivax characterize how the parasite transmits from one location to another, which are informative and insightful for public health policy makers to accurately predict the patterns of its geographical spread. However, such networks are not apparent from surveillance data because P. vivax transmission can be affected by many factors, such as the biological characteristics of mosquitoes and the mobility of human beings. Here, we pay special attention to the problem of how to infer the underlying transmission networks of P. vivax based on available tempo-spatial patterns of reported cases.Methodology
We first define a spatial transmission model, which involves representing both the heterogeneous transmission potential of P. vivax at individual locations and the mobility of infected populations among different locations. Based on the proposed transmission model, we further introduce a recurrent neural network model to infer the transmission networks from surveillance data. Specifically, in this model, we take into account multiple real-world factors, including the length of P. vivax incubation period, the impact of malaria control at different locations, and the total number of imported cases.Principal Findings
We implement our proposed models by focusing on the P. vivax transmission among 62 towns in Yunnan province, People''s Republic China, which have been experiencing high malaria transmission in the past years. By conducting scenario analysis with respect to different numbers of imported cases, we can (i) infer the underlying P. vivax transmission networks, (ii) estimate the number of imported cases for each individual town, and (iii) quantify the roles of individual towns in the geographical spread of P. vivax.Conclusion
The demonstrated models have presented a general means for inferring the underlying transmission networks from surveillance data. The inferred networks will offer new insights into how to improve the predictability of P. vivax transmission. 相似文献14.
Background
The recent emergence of high-throughput automated image acquisition technologies has forever changed how cell biologists collect and analyze data. Historically, the interpretation of cellular phenotypes in different experimental conditions has been dependent upon the expert opinions of well-trained biologists. Such qualitative analysis is particularly effective in detecting subtle, but important, deviations in phenotypes. However, while the rapid and continuing development of automated microscope-based technologies now facilitates the acquisition of trillions of cells in thousands of diverse experimental conditions, such as in the context of RNA interference (RNAi) or small-molecule screens, the massive size of these datasets precludes human analysis. Thus, the development of automated methods which aim to identify novel and biological relevant phenotypes online is one of the major challenges in high-throughput image-based screening. Ideally, phenotype discovery methods should be designed to utilize prior/existing information and tackle three challenging tasks, i.e. restoring pre-defined biological meaningful phenotypes, differentiating novel phenotypes from known ones and clarifying novel phenotypes from each other. Arbitrarily extracted information causes biased analysis, while combining the complete existing datasets with each new image is intractable in high-throughput screens.Results
Here we present the design and implementation of a novel and robust online phenotype discovery method with broad applicability that can be used in diverse experimental contexts, especially high-throughput RNAi screens. This method features phenotype modelling and iterative cluster merging using improved gap statistics. A Gaussian Mixture Model (GMM) is employed to estimate the distribution of each existing phenotype, and then used as reference distribution in gap statistics. This method is broadly applicable to a number of different types of image-based datasets derived from a wide spectrum of experimental conditions and is suitable to adaptively process new images which are continuously added to existing datasets. Validations were carried out on different dataset, including published RNAi screening using Drosophila embryos [Additional files 1, 2], dataset for cell cycle phase identification using HeLa cells [Additional files 1, 3, 4] and synthetic dataset using polygons, our methods tackled three aforementioned tasks effectively with an accuracy range of 85%–90%. When our method is implemented in the context of a Drosophila genome-scale RNAi image-based screening of cultured cells aimed to identifying the contribution of individual genes towards the regulation of cell-shape, it efficiently discovers meaningful new phenotypes and provides novel biological insight. We also propose a two-step procedure to modify the novelty detection method based on one-class SVM, so that it can be used to online phenotype discovery. In different conditions, we compared the SVM based method with our method using various datasets and our methods consistently outperformed SVM based method in at least two of three tasks by 2% to 5%. These results demonstrate that our methods can be used to better identify novel phenotypes in image-based datasets from a wide range of conditions and organisms.Conclusion
We demonstrate that our method can detect various novel phenotypes effectively in complex datasets. Experiment results also validate that our method performs consistently under different order of image input, variation of starting conditions including the number and composition of existing phenotypes, and dataset from different screens. In our findings, the proposed method is suitable for online phenotype discovery in diverse high-throughput image-based genetic and chemical screens. 相似文献15.
《Cell Adhesion & Migration》2013,7(3):156-158
Exosomes are small vesicles of endosomal origin that can be released by many different cells to the microenvironment. Exosomes have been shown to participate in the immune system, by mediating antigen presentation. We have recently shown the presence of both mRNA and microRNA in exosomes, specifically in exosomes derived from mast cells. This RNA can be transferred between one mast cell to another, most likely through fusion of the exosome to the recipient cell membrane. The delivered RNA is functional, as the mRNA can lead to translation of new proteins in a recipient cell. The RNA shuttled between cells via exosomes is called esRNA. We propose that several types of exosomes may exist, and that an additional function of exosomes is to communicate to neighboring cells through delivery of RNA-signals. 相似文献
16.
Phylogenetic networks are necessary to represent the tree of life expanded by edges to represent events such as horizontal gene transfers, hybridizations or gene flow. Not all species follow the paradigm of vertical inheritance of their genetic material. While a great deal of research has flourished into the inference of phylogenetic trees, statistical methods to infer phylogenetic networks are still limited and under development. The main disadvantage of existing methods is a lack of scalability. Here, we present a statistical method to infer phylogenetic networks from multi-locus genetic data in a pseudolikelihood framework. Our model accounts for incomplete lineage sorting through the coalescent model, and for horizontal inheritance of genes through reticulation nodes in the network. Computation of the pseudolikelihood is fast and simple, and it avoids the burdensome calculation of the full likelihood which can be intractable with many species. Moreover, estimation at the quartet-level has the added computational benefit that it is easily parallelizable. Simulation studies comparing our method to a full likelihood approach show that our pseudolikelihood approach is much faster without compromising accuracy. We applied our method to reconstruct the evolutionary relationships among swordtails and platyfishes (Xiphophorus: Poeciliidae), which is characterized by widespread hybridizations. 相似文献
17.
This paper presents a computationally efficient algorithm for image compressive sensing reconstruction using a second degree total variation (HDTV2) regularization. Firstly, a preferably equivalent formulation of the HDTV2 functional is derived, which can be formulated as a weighted L
1-L
2 mixed norm of second degree image derivatives under the spectral decomposition framework. Secondly, using the equivalent formulation of HDTV2, we introduce an efficient forward-backward splitting (FBS) scheme to solve the HDTV2-based image reconstruction model. Furthermore, from the averaged non-expansive operator point of view, we make a detailed analysis on the convergence of the proposed FBS algorithm. Experiments on medical images demonstrate that the proposed method outperforms several fast algorithms of the TV and HDTV2 reconstruction models in terms of peak signal to noise ratio (PSNR), structural similarity index (SSIM) and convergence speed. 相似文献
18.
Background
Phenotypically similar diseases have been found to be caused by functionally related genes, suggesting a modular organization of the genetic landscape of human diseases that mirrors the modularity observed in biological interaction networks. Protein complexes, as molecular machines that integrate multiple gene products to perform biological functions, express the underlying modular organization of protein-protein interaction networks. As such, protein complexes can be useful for interrogating the networks of phenome and interactome to elucidate gene-phenotype associations of diseases.Methodology/Principal Findings
We proposed a technique called RWPCN (Random Walker on Protein Complex Network) for predicting and prioritizing disease genes. The basis of RWPCN is a protein complex network constructed using existing human protein complexes and protein interaction network. To prioritize candidate disease genes for the query disease phenotypes, we compute the associations between the protein complexes and the query phenotypes in their respective protein complex and phenotype networks. We tested RWPCN on predicting gene-phenotype associations using leave-one-out cross-validation; our method was observed to outperform existing approaches. We also applied RWPCN to predict novel disease genes for two representative diseases, namely, Breast Cancer and Diabetes.Conclusions/Significance
Guilt-by-association prediction and prioritization of disease genes can be enhanced by fully exploiting the underlying modular organizations of both the disease phenome and the protein interactome. Our RWPCN uses a novel protein complex network as a basis for interrogating the human phenome-interactome network. As the protein complex network can capture the underlying modularity in the biological interaction networks better than simple protein interaction networks, RWPCN was found to be able to detect and prioritize disease genes better than traditional approaches that used only protein-phenotype associations. 相似文献19.
Jin Lu Jiangwen Sun Xinyu Wang Henry Kranzler Joel Gelernter Jinbo Bi 《BMC systems biology》2018,12(6):104
Background
Although substance use disorders (SUDs) are heritable, few genetic risk factors for them have been identified, in part due to the small sample sizes of study populations. To address this limitation, researchers have aggregated subjects from multiple existing genetic studies, but these subjects can have missing phenotypic information, including diagnostic criteria for certain substances that were not originally a focus of study. Recent advances in addiction neurobiology have shown that comorbid SUDs (e.g., the abuse of multiple substances) have similar genetic determinants, which makes it possible to infer missing SUD diagnostic criteria using criteria from another SUD and patient genotypes through statistical modeling.Results
We propose a new approach based on matrix completion techniques to integrate features of comorbid health conditions and individual’s genotypes to infer unreported diagnostic criteria for a disorder. This approach optimizes a bi-linear model that uses the interactions between known disease correlations and candidate genes to impute missing criteria. An efficient stochastic and parallel algorithm was developed to optimize the model with a speed 20 times greater than the classic sequential algorithm. It was tested on 3441 subjects who had both cocaine and opioid use disorders and successfully inferred missing diagnostic criteria with consistently better accuracy than other recent statistical methods.Conclusions
The proposed matrix completion imputation method is a promising tool to impute unreported or unobserved symptoms or criteria for disease diagnosis. Integrating data at multiple scales or from heterogeneous sources may help improve the accuracy of phenotype imputation.20.
The reconstruction of gene regulatory networks (GRNs) from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE)-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM), experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs. 相似文献