1.

New components of the Dictyostelium PKA pathway revealed by Bayesian analysis of expression data

Anup Parikh Eryong Huang Christopher Dinh Blaz Zupan Adam Kuspa Devika Subramanian Gad Shaulsky 《BMC bioinformatics》2010,11(1):163

Background

Identifying candidate genes in genetic networks is important for understanding regulation and biological function. Large gene expression datasets contain relevant information about genetic networks, but mining the data is not a trivial task. Algorithms that infer Bayesian networks from expression data are powerful tools for learning complex genetic networks, since they can incorporate prior knowledge and uncover higher-order dependencies among genes. However, these algorithms are computationally demanding, so novel techniques that allow targeted exploration for discovering new members of known pathways are essential.

2.

An empirical Bayesian approach for model-based inference of cellular signaling networks

David J Klinke II 《BMC bioinformatics》2009,10(1):371-18

3.

Accurate Inference of Subtle Population Structure (and Other Genetic Discontinuities) Using Principal Coordinates

Patrick A. Reeves Christopher M. Richards 《PloS one》2009,4(1)

Background

Accurate inference of genetic discontinuities between populations is an essential component of intraspecific biodiversity and evolution studies, as well as associative genetics. The most widely-used methods to infer population structure are model-based, Bayesian MCMC procedures that minimize Hardy-Weinberg and linkage disequilibrium within subpopulations. These methods are useful, but suffer from large computational requirements and a dependence on modeling assumptions that may not be met in real data sets. Here we describe the development of a new approach, PCO-MC, which couples principal coordinate analysis to a clustering procedure for the inference of population structure from multilocus genotype data.

Methodology/Principal Findings

PCO-MC uses data from all principal coordinate axes simultaneously to calculate a multidimensional “density landscape”, from which the number of subpopulations, and the membership within subpopulations, is determined using a valley-seeking algorithm. Using extensive simulations, we show that this approach outperforms a Bayesian MCMC procedure when many loci (e.g. 100) are sampled, but that the Bayesian procedure is marginally superior with few loci (e.g. 10). When presented with sufficient data, PCO-MC accurately delineated subpopulations with population F_st values as low as 0.03 (G''_st>0.2), whereas the limit of resolution of the Bayesian approach was F_st = 0.05 (G''_st>0.35).

Conclusions/Significance

We draw a distinction between population structure inference for describing biodiversity as opposed to Type I error control in associative genetics. We suggest that discrete assignments, like those produced by PCO-MC, are appropriate for circumscribing units of biodiversity whereas expression of population structure as a continuous variable is more useful for case-control correction in structured association studies.

4.

Inference of gene pathways using mixture Bayesian networks (total citations: 1; self-citations: 0; citations by others: 1)

Younhee Ko ChengXiang Zhai Sandra Rodriguez-Zas 《BMC systems biology》2009,3(1):54

Background

Inference of gene networks typically relies on measurements across a wide range of conditions or treatments. Although one network structure is predicted, the relationship between genes could vary across conditions. A comprehensive approach to infer general and condition-dependent gene networks was evaluated. This approach integrated Bayesian network and Gaussian mixture models to describe continuous microarray gene expression measurements, and three gene networks were predicted.

5.

Assessing population genetic structure via the maximisation of genetic distance

Silvia T Rodríguez-Ramilo Miguel A Toro Jesús Fernández 《Genetics Selection Evolution》2009,41(1):49

Background

The inference of the hidden structure of a population is an essential issue in population genetics. Recently, several methods have been proposed to infer population structure in population genetics.

Methods

In this study, a new method to infer the number of clusters and to assign individuals to the inferred populations is proposed. This approach does not make any assumption on Hardy-Weinberg and linkage equilibrium. The implemented criterion is the maximisation (via a simulated annealing algorithm) of the averaged genetic distance between a predefined number of clusters. The performance of this method is compared with two Bayesian approaches: STRUCTURE and BAPS, using simulated data and also a real human data set.

Results

The simulations show that with a reduced number of markers, BAPS overestimates the number of clusters and presents a reduced proportion of correct groupings. The accuracy of the new method is approximately the same as for STRUCTURE. Also, in Hardy-Weinberg and linkage disequilibrium cases, BAPS performs incorrectly. In these situations, STRUCTURE and the new method show an equivalent behaviour with respect to the number of inferred clusters, although the proportion of correct groupings is slightly better with the new method. Re-establishing equilibrium with the randomisation procedures improves the precision of the Bayesian approaches. All methods have a good precision for F_ST ≥ 0.03, but only STRUCTURE estimates the correct number of clusters for F_ST as low as 0.01. In situations with a high number of clusters or a more complex population structure, the new method (MGD, maximisation of genetic distance) performs better than STRUCTURE and BAPS. The results for a human data set analysed with the new method are congruent with the geographical regions previously found.

Conclusion

This new method for inferring the hidden structure in a population is based on the maximisation of the genetic distance and makes no assumption about Hardy-Weinberg and linkage equilibrium. It performs well under different simulated scenarios and with real data. Therefore, it could be a useful tool to determine genetically homogeneous groups, especially in situations where the number of clusters is high, the population structure is complex, or Hardy-Weinberg and/or linkage disequilibrium are present.
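The criterion described in this abstract, maximising the averaged genetic distance between a predefined number of clusters via simulated annealing, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the geometric cooling schedule, the toy distance matrix, and the exact way the between-cluster average is computed are hypothetical and are not taken from the published method.

```python
import math
import random


def mean_between_cluster_distance(dist, labels, k):
    """Average, over all pairs of clusters, of the mean distance between their members."""
    members = [[i for i, lab in enumerate(labels) if lab == c] for c in range(k)]
    totals = []
    for a in range(k):
        for b in range(a + 1, k):
            pairs = [dist[i][j] for i in members[a] for j in members[b]]
            if not pairs:  # an empty cluster makes the partition invalid
                return float("-inf")
            totals.append(sum(pairs) / len(pairs))
    return sum(totals) / len(totals)


def anneal_partition(dist, k, steps=2000, t_start=1.0, t_end=0.001, seed=0):
    """Search for a partition into k >= 2 clusters with a large between-cluster distance."""
    rng = random.Random(seed)
    n = len(dist)
    # Start from a valid partition: cycle the labels so that no cluster is empty.
    labels = [i % k for i in range(n)]
    rng.shuffle(labels)
    score = mean_between_cluster_distance(dist, labels, k)
    best_labels, best_score = labels[:], score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / (steps - 1))
        i = rng.randrange(n)
        old = labels[i]
        labels[i] = rng.choice([c for c in range(k) if c != old])
        new_score = mean_between_cluster_distance(dist, labels, k)
        delta = new_score - score
        # Always accept improvements; accept worse moves with Metropolis probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            score = new_score
            if score > best_score:
                best_labels, best_score = labels[:], score
        else:
            labels[i] = old  # reject the move
    return best_labels, best_score


def toy_distances():
    """Six individuals in two groups: distance 0.1 within a group, 1.0 between groups."""
    groups = [0, 0, 0, 1, 1, 1]
    return [[0.0 if i == j else (0.1 if groups[i] == groups[j] else 1.0)
             for j in range(6)] for i in range(6)]


if __name__ == "__main__":
    print(anneal_partition(toy_distances(), k=2))
```

The sketch returns the best partition visited rather than the final state, a common safeguard when the temperature schedule is not tuned. Choosing the number of clusters, which the abstract also mentions, is not covered here.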

6.

TimeDelay-ARACNE: Reverse engineering of gene networks from time-course data by an information theoretic approach

Pietro Zoppoli Sandro Morganella Michele Ceccarelli 《BMC bioinformatics》2010,11(1):154

Background

One of the main aims of molecular biology is to learn how molecular components interact with each other and to understand the regulation of gene function. Microarray technology makes it possible to measure thousands of genes in a single analysis step, giving a picture of gene expression in the cell. Several methods have been developed to infer gene networks from steady-state data, but much less has been published on time-course data, so developing algorithms to infer gene networks from time-series measurements is a current challenge in bioinformatics research. To detect dependencies between genes at different time delays, we propose an approach that infers gene regulatory networks from time-series measurements, starting from a well-known algorithm based on information theory.

7.

Statistical modeling of biomedical corpora: mining the Caenorhabditis Genetic Center Bibliography for genes related to life span

DM Blei K Franks MI Jordan IS Mian 《BMC bioinformatics》2006,7(1):250

Background

The statistical modeling of biomedical corpora could yield integrated, coarse-to-fine views of biological phenomena that complement discoveries made from analysis of molecular sequence and profiling data. Here, the potential of such modeling is demonstrated by examining the 5,225 free-text items in the Caenorhabditis Genetic Center (CGC) Bibliography using techniques from statistical information retrieval. Items in the CGC biomedical text corpus were modeled using the Latent Dirichlet Allocation (LDA) model. LDA is a hierarchical Bayesian model which represents a document as a random mixture over latent topics; each topic is characterized by a distribution over words.
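The generative view of LDA described in this abstract (a document as a random mixture over topics, each topic a distribution over words) can be sketched as a small sampler. This is only an illustration of the generative process, not of how the model was fitted to the CGC corpus; the function names, the toy topics, and the Dirichlet parameters are assumptions.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalised gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word, alpha, n_words, rng):
    """Generate word indices for one document under the LDA generative process.

    topic_word[z][w] is the probability of word w under topic z (assumed given).
    """
    theta = sample_dirichlet(alpha, rng)  # per-document topic mixture
    vocab_size = len(topic_word[0])
    words = []
    for _ in range(n_words):
        z = rng.choices(range(len(theta)), weights=theta)[0]  # pick a topic
        w = rng.choices(range(vocab_size), weights=topic_word[z])[0]  # pick a word
        words.append(w)
    return theta, words


if __name__ == "__main__":
    rng = random.Random(1)
    # Two hypothetical topics over a four-word vocabulary.
    topics = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
    print(generate_document(topics, alpha=[0.5, 0.5], n_words=10, rng=rng))
```

Inference runs this process in reverse: given only the words, it estimates the topic mixtures and the topic-word distributions.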

8.

Evolutionary history and molecular epidemiology of rabbit haemorrhagic disease virus in the Iberian Peninsula and Western Europe

Fernando Alda Tania Gaitero Mónica Suárez Tomás Merchán Gregorio Rocha Ignacio Doadrio 《BMC evolutionary biology》2010,10(1):347

Background

Rabbit haemorrhagic disease virus (RHDV) is a highly virulent calicivirus, first described in domestic rabbits in China in 1984. RHDV appears to be a mutant form of a benign virus that existed in Europe long before the first outbreak. In the Iberian Peninsula, the first epidemic in 1988 severely reduced the populations of autochthonous European wild rabbit. To examine the evolutionary history of RHDV in the Iberian Peninsula, we collected virus samples from wild rabbits and sequenced a fragment of the capsid protein gene VP60. These data, together with available sequences from other Western European countries, were analyzed using Bayesian Markov chain Monte Carlo methods to infer their phylogenetic relationships, evolutionary rates and demographic history.

9.

Evolution of testicular architecture in the Drosophilidae: A role for sperm length

Lukas Schärer Jean-Luc Da Lage Dominique Joly 《BMC evolutionary biology》2008,8(1):143

Background

Evolutionary biologists have so far largely treated the testis as a black box with a certain size, a matching resource demand and a resulting sperm output. A better understanding of the way that the testis responds to selection may come from recent developments in theoretical biology aimed at understanding the factors that influence the evolution of tissue architecture (i.e. the logical organisation of a tissue). Here we perform a comparative analysis of aspects of testicular architecture of the fruit fly family Drosophilidae. Specifically, we collect published information on the number of first (or primary) spermatocytes in spermatogenesis, which allows us to infer an important aspect of testicular architecture.

10.

Occupancy classification of position weight matrix-inferred transcription factor binding sites

Wright H Cohen A Sönmez K Yochum G McWeeney S 《PloS one》2011,6(11):e26160

11.

MTML-msBayes: Approximate Bayesian comparative phylogeographic inference from multiple taxa and multiple loci with rate heterogeneity

Wen Huang Naoki Takebayashi Yan Qi Michael J Hickerson 《BMC bioinformatics》2011,12(1):1

Background

MTML-msBayes uses hierarchical approximate Bayesian computation (HABC) under a coalescent model to infer temporal patterns of divergence and gene flow across codistributed taxon-pairs. Under a model of multiple codistributed taxa that diverge into taxon-pairs with subsequent gene flow or isolation, one can estimate hyper-parameters that quantify the mean and variability in divergence times or test models of migration and isolation. The software uses multi-locus DNA sequence data collected from multiple taxon-pairs and allows variation across taxa in demographic parameters as well as heterogeneity in DNA mutation rates across loci. The method also allows a flexible sampling scheme: different numbers of loci of varying length can be sampled from different taxon-pairs.

12.

An FPT haplotyping algorithm on pedigrees with a small number of sites

Duong D Doan Patricia A Evans 《Algorithms for molecular biology : AMB》2011,6(1):8

Background

Genetic disease studies investigate relationships between changes in chromosomes and genetic diseases. Single haplotypes provide useful information for these studies but extracting single haplotypes directly by biochemical methods is expensive. A computational method to infer haplotypes from genotype data is therefore important. We investigate the problem of computing the minimum number of recombination events for general pedigrees with a small number of sites for all members.

13.

Iterative Bayesian Model Averaging: a method for the application of survival analysis to high-dimensional microarray data

Amalia Annest Roger E Bumgarner Adrian E Raftery Ka Yee Yeung 《BMC bioinformatics》2009,10(1):72

Background

Microarray technology is increasingly used to identify potential biomarkers for cancer prognostics and diagnostics. Previously, we have developed the iterative Bayesian Model Averaging (BMA) algorithm for use in classification. Here, we extend the iterative BMA algorithm for application to survival analysis on high-dimensional microarray data. The main goal in applying survival analysis to microarray data is to determine a highly predictive model of patients' time to event (such as death, relapse, or metastasis) using a small number of selected genes. Our multivariate procedure combines the effectiveness of multiple contending models by calculating the weighted average of their posterior probability distributions. Our results demonstrate that our iterative BMA algorithm for survival analysis achieves high prediction accuracy while consistently selecting a small and cost-effective number of predictor genes.
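The weighted averaging over contending models mentioned in this abstract can be illustrated with a minimal sketch. It assumes that posterior model probabilities are approximated from BIC values, with weights proportional to exp(-BIC/2), which is a common approximation but not necessarily the one used by the authors. The function names and the numbers are hypothetical, and the iterative gene-selection step is not shown.

```python
import math


def bma_weights(bics):
    """Approximate posterior model probabilities from BIC values.

    Subtracting the smallest BIC first keeps the exponentials numerically stable.
    """
    best = min(bics)
    raw = [math.exp(-(b - best) / 2.0) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


def bma_predict(bics, predictions):
    """Average per-model predictions, weighted by approximate posterior probability."""
    weights = bma_weights(bics)
    return sum(w * p for w, p in zip(weights, predictions))


if __name__ == "__main__":
    # Two hypothetical models with equal support and different predicted risks.
    print(bma_predict([10.0, 10.0], [0.2, 0.6]))
```

Models with a lower BIC receive more weight, so a poorly supported model contributes little to the averaged prediction.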

14.

Molecular evolutionary rates predict both extinction and speciation in temperate angiosperm lineages

Lesley T Lancaster 《BMC evolutionary biology》2010,10(1):162

Background

A positive relationship between diversification (i.e., speciation) and nucleotide substitution rates is commonly reported for angiosperm clades. However, the underlying cause of this relationship is often unknown because multiple intrinsic and extrinsic factors can affect the relationship, and these have confounded previous attempts to infer causation. Determining which factor drives this oft-reported correlation can lend insight into the macroevolutionary process.

15.

Inference of domain-disease associations from domain-protein, protein-disease and disease-disease relationships

Wangshu Zhang Marcelo P. Coba Fengzhu Sun 《BMC systems biology》2016,10(Z1):S4

Background

Protein domains can be viewed as portable units of biological function that define the functional properties of proteins. Therefore, if a protein is associated with a disease, protein domains might also be associated and define disease endophenotypes. However, knowledge about such domain-disease relationships is rarely available. Thus, identification of domains associated with human diseases would greatly improve our understanding of the mechanism of human complex diseases and further improve the prevention, diagnosis and treatment of these diseases.

Methods

Based on phenotypic similarities among diseases, we first group diseases into overlapping modules. We then develop a framework to infer associations between domains and diseases through known relationships between diseases and modules, domains and proteins, as well as proteins and disease modules. Different methods including Association, Maximum likelihood estimation (MLE), Domain-disease pair exclusion analysis (DPEA), Bayesian, and Parsimonious explanation (PE) approaches are developed to predict domain-disease associations.

Results

We demonstrate the effectiveness of all five approaches via a series of validation experiments, and show the robustness of the MLE, Bayesian and PE approaches to the involved parameters. We also study the effects of disease modularization in inferring novel domain-disease associations. Through validation, the AUC (area under the receiver operating characteristic curve) scores for the Bayesian, MLE, DPEA, PE, and Association approaches are 0.86, 0.84, 0.83, 0.83 and 0.79, respectively, indicating the usefulness of these approaches for predicting domain-disease relationships. Finally, we choose the Bayesian approach to infer domains associated with two common diseases, Crohn’s disease and type 2 diabetes.

Conclusions

The Bayesian approach has the best performance for the inference of domain-disease relationships. The predicted landscape between domains and diseases provides a more detailed view about the disease mechanisms.
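The AUC scores used above to compare the five approaches can be computed from prediction scores alone. The following minimal sketch uses the rank interpretation of AUC: the probability that a randomly chosen true association scores higher than a randomly chosen non-association, with ties counted as one half. The function name and the example scores are hypothetical and are unrelated to the study's data.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Hypothetical scores for known and unknown domain-disease pairs.
    print(auc([0.9, 0.3], [0.5, 0.1]))
```

The pairwise loop is quadratic in the number of scores, which is fine for illustration; a sort-based rank computation would be preferable for large validation sets.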

16.

Species delimitation and phylogeography of Aphonopelma hentzi (Araneae, Mygalomorphae, Theraphosidae): cryptic diversity in North American tarantulas

Hamilton CA Formanowicz DR Bond JE 《PloS one》2011,6(10):e26207

Background

The primary objective of this study is to reconstruct the phylogeny of the hentzi species group and sister species in the North American tarantula genus, Aphonopelma, using a set of mitochondrial DNA markers that include the animal “barcoding gene”. An mtDNA genealogy is used to consider questions regarding species boundary delimitation and to evaluate timing of divergence to infer historical biogeographic events that played a role in shaping the present-day diversity and distribution. We aimed to identify potential refugial locations, directionality of range expansion, and test whether A. hentzi post-glacial expansion fit a predicted time frame.

Methods and Findings

A Bayesian phylogenetic approach was used to analyze a 2051 base pair (bp) mtDNA data matrix comprising aligned fragments of the gene regions CO1 (1165 bp) and ND1-16S (886 bp). Multiple species delimitation techniques (DNA tree-based methods, a “barcode gap” using percent of pairwise sequence divergence (uncorrected p-distances), and the GMYC method) consistently recognized a number of divergent and genealogically exclusive groups.

Conclusions

The use of numerous species delimitation methods in concert provides an effective approach to dissecting species boundaries in this spider group; these methods also seem to provide strong evidence for a number of nominal, previously undiscovered, and cryptic species. Our data also indicate that Pleistocene habitat fragmentation and subsequent range expansion events may have shaped contemporary phylogeographic patterns of Aphonopelma diversity in the southwestern United States, particularly for the A. hentzi species group. These findings indicate that future species delimitation approaches need to be analyzed in the context of a number of factors, such as the sampling distribution, loci used, biogeographic history, breadth of morphological variation, ecological factors, and behavioral data, to make truly integrative decisions about what constitutes an evolutionary lineage recognized as a “species”.
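The "barcode gap" analysis in this abstract relies on uncorrected p-distances, that is, the proportion of aligned sites at which two sequences differ. A minimal sketch follows; the function name, the decision to skip sites containing a gap or an ambiguous base, and the example sequences are assumptions made for illustration rather than details from the study.

```python
def p_distance(seq_a, seq_b, skip="-N?"):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are ignored (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip or b in skip:
            continue  # skip sites that cannot be compared
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


if __name__ == "__main__":
    print(p_distance("ACGT", "ACGA"))
```

A barcode gap is then assessed by comparing the distribution of such distances within putative species against the distribution between them.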

17.

Reconstruction of Gene Regulatory Modules in Cancer Cell Cycle by Multi-Source Data Integration

Yuji Zhang Jianhua Xuan Benildo G. de los Reyes Robert Clarke Habtom W. Ressom 《PloS one》2010,5(4)

18.

Inferring a transcriptional regulatory network of the cytokinesis-related genes by network component analysis

Shun-Fu Chen Yue-Li Juang Wei-Kang Chou Jin-Mei Lai Chi-Ying F Huang Cheng-Yan Kao Feng-Sheng Wang 《BMC systems biology》2009,3(1):110-12

19.

The identification of informative genes from multiple datasets with increasing complexity

S Yahya Anvar Peter AC 't Hoen Allan Tucker 《BMC bioinformatics》2010,11(1):32

Background

In microarray data analysis, factors such as data quality, biological variation, and the increasingly multi-layered nature of more complex biological systems complicate the modelling of regulatory networks that can represent and capture the interactions among genes. We believe that the use of multiple datasets derived from related biological systems leads to more robust models. Therefore, we developed a novel framework for modelling regulatory networks that involves training and evaluation on independent datasets. Our approach includes the following steps: (1) ordering the datasets based on their level of noise and informativeness; (2) selection of a Bayesian classifier with an appropriate level of complexity by evaluation of predictive performance on independent data sets; (3) comparing the different gene selections and the influence of increasing the model complexity; (4) functional analysis of the informative genes.

20.

Bayesian non-negative factor analysis for reconstructing transcription factor mediated regulatory networks

Meng J Zhang JM Chen Y Huang Y 《Proteome science》2011,9(Z1):S9
