20 similar articles found (search time: 0 ms)
1.
Anand K. Gavai Yury Tikunov Remco Ursem Arnaud Bovy Fred van Eeuwijk Harm Nijveen Peter J. F. Lucas Jack A. M. Leunissen 《Metabolomics : Official journal of the Metabolomic Society》2009,5(4):419-428
Clustering and correlation analysis techniques have become popular tools for the analysis of data produced by metabolomics experiments. The results obtained from these approaches provide an overview of the interactions between objects of interest. Often in these experiments, one is more interested in information about the nature of these relationships, e.g., cause-effect relationships, than in the actual strength of the interactions. Finding such relationships is of crucial importance as most biological processes can only be understood in this way. Bayesian networks allow representation of these cause-effect relationships among variables of interest in terms of whether and how they influence each other given that a third, possibly empty, group of variables is known. This technique also allows the incorporation of prior knowledge as established from the literature or from biologists. The representation of these relationships as a directed graph is highly intuitive and helps to understand these processes. This paper describes how constraint-based Bayesian networks can be applied to metabolomics data and can be used to uncover the important pathways which play a significant role in the ripening of fresh tomatoes. We also show here how this method of reconstructing pathways is intuitive and performs better than classical techniques. Methods for learning Bayesian network models are powerful tools for the analysis of data of the magnitude generated by metabolomics experiments. They allow one to model cause-effect relationships and help in understanding the underlying processes.
2.
MOTIVATION: Perhaps the greatest challenge of modern biology is to develop accurate in silico models of cells. To do this we require computational formalisms for both simulation (how according to the model the state of the cell evolves over time) and identification (learning a model cell from observation of states). We propose the use of qualitative reasoning (QR) as a unified formalism for both tasks. The two most commonly used alternative methods of modelling biochemical pathways are ordinary differential equations (ODEs), and logical/graph-based (LG) models. RESULTS: The QR formalism we use is an abstraction of ODEs. It enables the behaviour of many ODEs, with different functional forms and parameters, to be captured in a single QR model. QR has the advantage over LG models of explicitly including dynamics. To simulate biochemical pathways we have developed 'enzyme' and 'metabolite' QR building blocks that fit together to form models. These models are finite, directly executable, easy to interpret and robust. To identify QR models we have developed heuristic chemoinformatics graph analysis and machine learning procedures. The graph analysis procedure is a series of constraints and heuristics that limit the number of ways metabolites can combine to form pathways. The machine learning procedure is generate-and-test inductive logic programming. We illustrate the use of QR for modelling and simulation using the example of glycolysis. AVAILABILITY: All data and programs used are available on request.
4.
5.
Zhen Wang Guohui Ding Zhonghao Yu Lei Liu Yixue Li 《Algorithms for molecular biology : AMB》2009,4(1):2-7
Background
The identification of chromosomal homologous segments (CHS) within and between genomes is essential for comparative genomics. Various processes including insertion/deletion and inversion could cause the degeneration of CHSs.
6.
7.
8.
One of the primary aims of synthetic biology is to (re)design metabolic pathways towards the production of desired chemicals. The fast pace of developments in molecular biology increasingly makes it possible to experimentally redesign existing pathways and implement de novo ones in microbes or using in vitro platforms. For such experimental studies, the bottleneck is shifting from implementation of pathways towards their initial design. Here, we present an online tool called ‘Metabolic Tinker’, which aims to guide the design of synthetic metabolic pathways between any two desired compounds. Given two user-defined ‘target’ and ‘source’ compounds, Metabolic Tinker searches for thermodynamically feasible paths in the entire known metabolic universe using a tailored heuristic search strategy. Compared with similar graph-based search tools, Metabolic Tinker returns a larger number of possible paths owing to its broad search base and fast heuristic, and provides for the first time thermodynamic feasibility information for the discovered paths. Metabolic Tinker is available as a web service at http://osslab.ex.ac.uk/tinker.aspx. The same website also provides the source code for Metabolic Tinker, allowing it to be developed further or run on personal machines for specific applications.
9.
Background
Targeting conserved proteins of bacteria through antibacterial medications has resulted in both the development of resistant strains and changes to human health by destroying beneficial microbes which eventually become breeding grounds for the evolution of resistances. Despite the availability of more than 800 genome sequences, 430 pathways, 4743 enzymes, 9257 metabolic reactions and three-dimensional (3D) protein structures in bacteria, no pathogen-specific computational drug target identification tool has been developed.
Methods
A web server, UniDrug-Target, which combines bacterial biological information and computational methods to stringently identify pathogen-specific proteins as drug targets, has been designed. Besides predicting the essentiality, chokepoint property, etc. of pathogen-specific proteins, three new algorithms were developed and implemented by using protein sequences, domains, structures, and metabolic reactions for construction of partial metabolic networks (PMNs), determination of conservation in critical residues, and variation analysis of residues forming similar cavities in protein sequences. First, PMNs are constructed to determine the extent of disturbances in metabolite production caused by targeting a protein as a drug target. Second, conservation of a pathogen-specific protein's critical residues involved in cavity formation and biological function is determined at the domain level with low-matching sequences. Last, variation analysis of residues forming similar cavities in protein sequences from pathogenic versus non-pathogenic bacteria and humans is performed.
Results
The server is capable of predicting drug targets for any sequenced pathogenic bacteria having FASTA sequences and annotated information. The utility of the UniDrug-Target server was demonstrated for Mycobacterium tuberculosis (H37Rv). UniDrug-Target identified 265 mycobacterial pathogen-specific proteins, including 17 essential proteins which can be potential drug targets.
Conclusions/Significance
UniDrug-Target is expected to accelerate the identification of pathogen-specific drug targets, which should increase their success and durability, as drugs developed against them have less chance of inducing resistance and adverse impact on the environment. The server is freely available at http://117.211.115.67/UDT/main.html. The standalone application (source codes) is available at http://www.bioinformatics.org/ftp/pub/bioinfojuit/UDT.rar.
10.
Jové M Serrano JC Ortega N Ayala V Anglès N Reguant J Morelló JR Romero MP Motilva MJ Prat J Pamplona R Portero-Otín M 《Journal of proteome research》2011,10(8):3501-3512
Metabonomics has recently been used to study the physiological response to a given nutritional intervention, but such studies have usually been restricted to changes in either plasma or urine. In the present study, we demonstrate that the use of LC-Q-TOF-based metabolome analyses (foodstuff, plasma, urine, and caecal content metabolomes) in mice offers higher order information, including intra- and intercompartment relationships. To illustrate this, we performed an intervention study with three different phenolic-rich extracts in mice over 3 weeks. Both unsupervised (PCA) and supervised (PLS-DA) multivariate analyses used for pattern recognition revealed marked effects of diet in each compartment (plasma, urine, and caecal contents). Specifically, dietary intake of phenolic-rich extract affects pathways such as bile acid and taurine metabolism. Q-TOF-based metabonomics demonstrated that the number of correlations is higher in caecal contents and urine than in plasma. Moreover, intercompartment correlations showed that caecal contents-plasma correlations are the most frequent in mice, followed by plasma-urine ones. The number of inter- and intracompartment correlations is significantly affected by diet. These analyses reveal the complexity of interorgan metabolic relationships and their sensitivity to dietary changes.
11.
Restriction enzyme site-directed amplification PCR: a tool to identify regions flanking a marker DNA
González-Ballester D de Montaigu A Galván A Fernández E 《Analytical biochemistry》2005,340(2):330-335
An innovative combination of various recently described molecular methods was set up to efficiently identify regions flanking a marker DNA in insertional mutants of Chlamydomonas. The technique is named restriction enzyme site-directed amplification PCR (RESDA-PCR) and is based on the random distribution of frequent restriction sites in a genome and on a special design of primers. The primer design is based on the presence of a restriction site included in a low degenerated sequence at the 3' end and of a specific adapter sequence at the 5' end, with the two ends being linked by a polyinosine bridge. Specific primers of the marker DNA combined with the degenerated primers allow amplification of DNA fragments adjacent to the marker insertion by using two rounds of either short or long cycling procedures. Amplified fragments from 0.3 to 2 kb or more are routinely obtained at sufficient purity and quantity for direct sequencing. This method is fast, is reliable (87% success rate), and can be easily extrapolated to any organism and marker DNA by designing the appropriate primers. A procedure involving the PCR over enzyme digest fragments is also proposed for when, exceptionally, positive results are not obtained.
12.
13.
Vêncio RZ Patrão DF Baptista CS Pereira CA Zingales B 《Genetics and molecular research : GMR》2006,5(1):138-142
One of the goals of gene expression experiments is the identification of differentially expressed genes among populations that could be used as markers. For this purpose, we implemented a model-free Bayesian approach in a user-friendly and freely available web-based tool called BayBoots. In spite of a common misunderstanding that Bayesian and model-free approaches are incompatible, we merged them in the BayBoots implementation using the Kernel density estimator and Rubin's Bayesian Bootstrap. We used the Bayes error rate (BER) instead of the usual P values as an alternative statistical index to rank a class marker's discriminative potential, since it can be visualized by a simple graphical representation and has an intuitive interpretation. Subsequently, Bayesian Bootstrap was used to assess BER's credibility. We tested BayBoots on microarray data to look for markers for Trypanosoma cruzi strains isolated from cardiac and asymptomatic patients. We found that the three most frequently used methods in microarray analysis (t-test, non-parametric Wilcoxon test and correlation methods) yielded several markers that were discarded by a time-consuming visual check. On the other hand, the BayBoots graphical output and ranking was able to automatically identify markers for which classification performance was consistent. BayBoots is available at: http://www.vision.ime.usp.br/~rvencio/BayBoots.
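The combination described in this abstract (a kernel density estimate per class, the Bayes error rate as the ranking index, and Rubin's Bayesian bootstrap to gauge its credibility) can be illustrated with a small sketch. This is not the BayBoots implementation: the function names, the Gaussian kernel, the Silverman rule-of-thumb bandwidth, the equal class priors, and the midpoint-rule integration are all assumptions made here for illustration, and each group is assumed to contain at least two distinct values.

```python
import math
import random

def dirichlet_weights(n, rng):
    """One Bayesian-bootstrap draw: Dirichlet(1, ..., 1) weights over n observations."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]

def silverman_bandwidth(xs):
    """Rule-of-thumb bandwidth (an assumption; computed from the unweighted data)."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return 1.06 * sd * n ** (-0.2)

def weighted_kde(xs, ws, h):
    """Return a weighted Gaussian kernel density estimate as a function of x."""
    norm = 1.0 / (h * math.sqrt(2.0 * math.pi))
    def density(x):
        return sum(w * norm * math.exp(-0.5 * ((x - xi) / h) ** 2)
                   for xi, w in zip(xs, ws))
    return density

def bayes_error_rate(a, b, wa=None, wb=None, grid_points=400):
    """BER for two classes with equal priors: 0.5 * integral of min(f_a, f_b)."""
    wa = wa or [1.0 / len(a)] * len(a)
    wb = wb or [1.0 / len(b)] * len(b)
    ha, hb = silverman_bandwidth(a), silverman_bandwidth(b)
    fa, fb = weighted_kde(a, wa, ha), weighted_kde(b, wb, hb)
    pad = 4.0 * max(ha, hb)
    lo, hi = min(a + b) - pad, max(a + b) + pad
    step = (hi - lo) / grid_points
    # Midpoint rule over a grid that covers both densities.
    overlap = sum(min(fa(lo + (i + 0.5) * step), fb(lo + (i + 0.5) * step))
                  for i in range(grid_points)) * step
    return 0.5 * overlap

def ber_credible_interval(a, b, draws=200, seed=0):
    """Approximate 95% interval for the BER from Bayesian-bootstrap reweighting."""
    rng = random.Random(seed)
    samples = sorted(
        bayes_error_rate(a, b,
                         dirichlet_weights(len(a), rng),
                         dirichlet_weights(len(b), rng))
        for _ in range(draws))
    return samples[int(0.025 * draws)], samples[int(0.975 * draws) - 1]
```

A gene whose expression values barely overlap between the two groups gets a BER near 0, while a gene with identical distributions gets a BER near 0.5; a narrow bootstrap interval suggests that the ranking does not hinge on a few observations.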
14.
Gene expression data can provide a very rich source of information for elucidating the biological function on the pathway level if the experimental design considers the needs of the statistical analysis methods. The purpose of this paper is to provide a comparative analysis of statistical methods for detecting the differential expression of pathways (DEP). In contrast to many other studies conducted so far, we use three novel simulation types, producing a more realistic correlation structure than previous simulation methods. This also includes the generation of surrogate data from two large-scale microarray experiments from prostate cancer and ALL. As a result of our comprehensive analysis of 41,004 parameter configurations, we find that each method should only be applied if certain conditions of the data from a pathway are met. Further, we provide method-specific estimates for the optimal sample size for microarray experiments aiming to identify DEP in order to avoid an underpowered design. Our study highlights the sensitivity of the studied methods to the parameters of the system.
15.
16.
With the recent rapid expansion of DNA and protein sequence databases, intensive efforts are underway to interpret the linear genetic information of DNA in terms of function, structure, and control of biological processes. The systematic identification and quantification of expressed proteins has proven particularly powerful in this regard. Large-scale protein identification is usually achieved by automated liquid chromatography-tandem mass spectrometry of complex peptide mixtures and sequence database searching of the resulting spectra [Aebersold and Goodlett, Chem. Rev. 2001, 101, 269-295]. As generating large numbers of sequence-specific (collision-induced dissociation, CID) mass spectra has become a routine operation, research has shifted from the generation of sequence database search results to their validation. Here we describe in detail a novel probabilistic model and score function that ranks the quality of the match between tandem mass spectral data and a peptide sequence in a database. We document the performance of the algorithm on a reference data set and in comparison with another sequence database search tool. The software is publicly available for use and evaluation at http://www.systemsbiology.org/research/software/proteomics/ProbID.
17.
M J Kuhar 《Life sciences》1973,13(12):1623-1634
18.
Drosophila parthenogenesis: a tool to decipher centrosomal vs acentrosomal spindle assembly pathways
Development of unfertilized eggs in the parthenogenetic strain K23-O-im of Drosophila mercatorum requires the stochastic interactions of self-assembled centrosomes with the female chromatin. In a portion of the unfertilized eggs that do not assemble centrosomes, microtubules organize a bipolar anastral mitotic spindle around the chromatin like the one formed during the first female meiosis, suggesting that similar pathways may be operative. In the cytoplasm of eggs in which centrosomes do form, monastral and biastral spindles are found. Analysis by laser scanning confocal microscopy suggests that these spindles are derived from the stochastic interaction of astral microtubules directly with kinetochore regions or indirectly with kinetochore microtubules. Our findings are consistent with the idea that mitotic spindle assembly requires both acentrosomal and centrosomal pathways, strengthening the hypothesis that astral microtubules can dictate the organization of the spindle by capturing kinetochore microtubules.
19.
The sequencing and analysis of multiple housekeeping genes has been routinely used to phylogenetically compare closely related bacterial isolates. Recent studies using whole-genome alignment (WGA) and phylogenetics from >100 Escherichia coli genomes have demonstrated that tree topologies from WGA and multilocus sequence typing (MLST) markers differ significantly. A nonrepresentative phylogeny can lead to incorrect conclusions regarding important evolutionary relationships. In this study, the Phylomark algorithm was developed to identify a minimal number of useful phylogenetic markers that recapitulate the WGA phylogeny. To test the algorithm, we used a set of diverse draft and complete E. coli genomes. The algorithm identified more than 100,000 potential markers of different fragment lengths (500 to 900 nucleotides). Three molecular markers were ultimately chosen to determine the phylogeny based on a low Robinson-Foulds (RF) distance compared to the WGA phylogeny. A phylogenetic analysis demonstrated that a more representative phylogeny was inferred for a concatenation of these markers compared to all other MLST schemes for E. coli. As a functional test of the algorithm, the three markers (genomic guided E. coli markers, or GIG-EM) were amplified and sequenced from a set of environmental E. coli strains (ECOR collection) and informatically extracted from a set of 78 diarrheagenic E. coli strains (DECA collection). In the instances of the 40-genome test set and the DECA collection, the GIG-EM system outperformed other E. coli MLST systems in terms of recapitulating the WGA phylogeny. This algorithm can be employed to determine the minimal marker set for any organism that has sufficient genome sequencing.
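The selection criterion mentioned in this abstract, a low Robinson-Foulds (RF) distance between a marker tree and the WGA tree, can be sketched in a few lines. This is an illustration only and not Phylomark's code: trees are assumed to be given as nested tuples of leaf-name strings, the topology is treated as unrooted, and all function names are hypothetical.

```python
def leaves(tree):
    """Return the set of leaf names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def splits(tree):
    """Collect the non-trivial bipartitions of the leaf set induced by internal edges."""
    all_leaves = leaves(tree)
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(walk(child) for child in node))
        rest = all_leaves - clade
        # Skip trivial splits (a single leaf on one side) and the root itself.
        if len(clade) > 1 and len(rest) > 1:
            found.add(frozenset([clade, rest]))
        return clade

    walk(tree)
    return found

def robinson_foulds(tree_a, tree_b):
    """RF distance: number of bipartitions present in exactly one of the two trees."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must share the same leaf set")
    return len(splits(tree_a) ^ splits(tree_b))

def closest_marker(reference, marker_trees):
    """Pick the marker (dict key) whose tree has the lowest RF distance to the reference."""
    return min(marker_trees, key=lambda name: robinson_foulds(reference, marker_trees[name]))
```

For example, with a reference tree `((("A", "B"), "C"), ("D", "E"))`, a marker tree that swaps B and C differs in two bipartitions, so its RF distance is 2, while a marker tree with the same topology scores 0 and would be preferred.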
20.
A statistical challenge in community ecology is to identify segregated and aggregated pairs of species from a binary presence–absence matrix, which often contains hundreds or thousands of such potential pairs. A similar challenge is found in genomics and proteomics, where the expression of thousands of genes in microarrays must be statistically analyzed. Here we adapt the empirical Bayes method to identify statistically significant species pairs in a binary presence–absence matrix. We evaluated the performance of a simple confidence interval, a sequential Bonferroni test, and two tests based on the mean and the confidence interval of an empirical Bayes method. Observed patterns were compared to patterns generated from null model randomizations that preserved matrix row and column totals. We evaluated these four methods with random matrices and also with random matrices that had been seeded with an additional segregated or aggregated species pair. The Bayes methods and Bonferroni corrections reduced the frequency of false-positive tests (type I error) in random matrices, but did not always correctly identify the non-random pair in a seeded matrix (type II error). All of the methods were vulnerable to identifying spurious secondary associations in the seeded matrices. When applied to a set of 272 published presence–absence matrices, even the most conservative tests indicated a fourfold increase in the frequency of perfectly segregated “checkerboard” species pairs compared to the null expectation, and a greater predominance of segregated versus aggregated species pairs. The tests did not reveal a large number of significant species pairs in the Vanuatu bird matrix, but in the much smaller Galapagos bird matrix they correctly identified a concentration of segregated species pairs in the genus Geospiza. 
The Bayesian methods provide for increased selectivity in identifying non-random species pairs, but the analyses will be most powerful if investigators can use a priori biological criteria to identify potential sets of interacting species.
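Two ingredients of this analysis lend themselves to a short sketch: counting perfectly segregated ("checkerboard") species pairs in a binary presence-absence matrix, and generating null matrices that preserve row and column totals. The code below is a minimal illustration under assumptions made here (rows are species, columns are sites, and the null model is a simple 2x2 swap randomization); it does not implement the empirical Bayes tests from the abstract, and the function names are hypothetical.

```python
import random
from itertools import combinations

def checkerboard_pairs(matrix):
    """Count species pairs (rows) that both occur somewhere but never at the same site."""
    count = 0
    for row_a, row_b in combinations(matrix, 2):
        shared = any(a and b for a, b in zip(row_a, row_b))
        if any(row_a) and any(row_b) and not shared:
            count += 1
    return count

def swap_randomize(matrix, attempts, rng):
    """Shuffle a 0/1 matrix with 2x2 swaps, keeping every row and column total fixed."""
    m = [list(row) for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(attempts):
        r1, r2 = rng.sample(range(n_rows), 2)
        c1, c2 = rng.sample(range(n_cols), 2)
        # Only a [[1, 0], [0, 1]] or [[0, 1], [1, 0]] submatrix can be flipped.
        if (m[r1][c1] == m[r2][c2] and m[r1][c2] == m[r2][c1]
                and m[r1][c1] != m[r1][c2]):
            m[r1][c1], m[r1][c2] = m[r1][c2], m[r1][c1]
            m[r2][c1], m[r2][c2] = m[r2][c2], m[r2][c1]
    return m

def null_checkerboard_counts(matrix, n_null, attempts, seed=0):
    """Checkerboard-pair counts for n_null randomized matrices."""
    rng = random.Random(seed)
    return [checkerboard_pairs(swap_randomize(matrix, attempts, rng))
            for _ in range(n_null)]
```

Comparing the observed count from `checkerboard_pairs` with the distribution returned by `null_checkerboard_counts` gives a rough sense of whether segregated pairs are more frequent than the null expectation; the multiple-testing corrections discussed in the abstract would be applied on top of such a comparison.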