Similar literature
 Found 20 similar documents; search took 15 ms
1.
Inferring protein interactions from phylogenetic distance matrices   Total citations: 2 (self-citations: 0, citations by others: 2)
Finding the interacting pairs of proteins between two different protein families whose members are known to interact is an important problem in molecular biology. We developed and tested an algorithm that finds optimal matches between two families of proteins by comparing their distance matrices. A distance matrix provides a measure of the sequence similarity of proteins within a family. Since the protein sets of interest may have dozens of proteins each, the use of an efficient approximate solution is necessary. Therefore the approach we have developed consists of a Metropolis Monte Carlo optimization algorithm which explores the search space of possible matches between two distance matrices. We demonstrate that by using this algorithm we are able to accurately match chemokines and chemokine receptors as well as the TGF-β family of ligands and their receptors.
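
The matching idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cost function (sum of squared differences between matched distances), the swap move, the fixed `temperature`, and the names `mismatch` and `metropolis_match` are all assumptions chosen for illustration.

```python
import math
import random


def mismatch(d_a, d_b, perm):
    """Sum of squared differences between d_a[i][j] and d_b[perm[i]][perm[j]]."""
    n = len(perm)
    return sum(
        (d_a[i][j] - d_b[perm[i]][perm[j]]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def metropolis_match(d_a, d_b, steps=20000, temperature=0.05, seed=0):
    """Search for a pairing perm (protein i in A <-> protein perm[i] in B).

    Hypothetical sketch: both matrices are assumed square and of equal size.
    Returns the best pairing visited and its cost.
    """
    rng = random.Random(seed)
    n = len(d_a)
    perm = list(range(n))
    rng.shuffle(perm)
    cost = mismatch(d_a, d_b, perm)
    best_perm, best_cost = perm[:], cost
    for _ in range(steps):
        # Propose swapping the partners of two proteins.
        i, j = rng.sample(range(n), 2)
        perm[i], perm[j] = perm[j], perm[i]
        new_cost = mismatch(d_a, d_b, perm)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temperature):
            cost = new_cost
            if cost < best_cost:
                best_perm, best_cost = perm[:], cost
        else:
            perm[i], perm[j] = perm[j], perm[i]  # rejected: undo the swap
    return best_perm, best_cost
```

Calling `metropolis_match(d_a, d_b)` on two equally sized distance matrices returns a candidate pairing and its cost. Because the search is stochastic, several runs with different seeds would normally be compared.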

2.
The prospect of understanding the relationship between the genome and the physiology of an organism is an important incentive to reconstruct metabolic networks. The first steps in the process can be automated and it does not take much effort to obtain an initial metabolic reconstruction from a genome sequence. However, such a reconstruction is certainly not flawless and correction of the many imperfections is laborious. It requires the combined analysis of the available information on protein sequence, phylogeny, gene-context and co-occurrence but is also aided by high-throughput experimental data. Simultaneously, the reconstructed network provides the opportunity to visualize the "omics" data within a relevant biological functional context and thus aids the interpretation of those data.

3.
For a given set L of species and a set T of triplets on L, we seek to construct a phylogenetic network which is consistent with T, i.e. which represents all triplets of T. The level of a network is defined as the maximum number of hybrid vertices in its biconnected components. When T is dense, there exist polynomial time algorithms to construct level-0, level-1 and level-2 networks (Aho et al., 1981; Jansson, Nguyen and Sung, 2006; Jansson and Sung, 2006; Iersel et al., 2009). For higher levels, partial answers were obtained in the paper by Iersel and Kelk (2008), with a polynomial time algorithm for simple networks. In this paper, we detail the first complete answer for the general case, solving a problem proposed in Jansson and Sung (2006) and Iersel et al. (2009). For any fixed k, it is possible to construct a level-k network having the minimum number of hybrid vertices and consistent with T, if there is any, in time O(|T|^(k+1) n^([4k/3]+1)).

4.

Background  

In bio-systems, genes, proteins and compounds are related to each other, thus forming complex networks. Although each organism has its individual network, some organisms contain common sub-networks based on function. Given a certain sub-network, the distribution of organisms common to it represents the diversity of its function.

5.
We studied reproductive performance in two flea species (Parapulex chephrenis and Xenopsylla ramesis) exploiting either a principal or one of eight auxiliary host species. We predicted that fleas would produce more eggs and adult offspring when exploiting (i) a principal host than an auxiliary host and (ii) an auxiliary host phylogenetically close to a principal host than an auxiliary host phylogenetically distant from a principal host. In both flea species, egg production per female after one feeding and production of new imago after a timed period of an uninterrupted stay on a host differed significantly between host species. In general, egg and/or new imago production in fleas feeding on an auxiliary host was lower than in fleas feeding on the principal host, except for the auxiliary host that was the closest relative of the principal host. When all auxiliary host species were considered, we did not find any significant relationship between either egg or new imago production in fleas exploiting an auxiliary host and phylogenetic distance between this host and the principal host. However, when the analyses were restricted to auxiliary hosts belonging to the same family as the principal host (Muridae), new imago production (for P. chephrenis) or both egg and new imago production (for X. ramesis) in an auxiliary host decreased significantly with an increase in phylogenetic distance between the auxiliary and principal host. Our results demonstrated that a parasite achieves higher fitness in auxiliary hosts that are either the most closely related to or the most distant from its principal host. This may affect host associations of a parasite invading new areas.

6.
7.
Few issues in evolutionary biology have received as much attention over the years or have generated as much controversy as those involving evolutionary rates. One unresolved issue is whether or not shifts in speciation and/or extinction rates are closely tied to the origin of 'key' innovations in evolution. This discussion has long been dominated by 'time-based' methods using data from the fossil record. Recently, however, attention has shifted to 'tree-based' methods, in which time, if it plays any role at all, is incorporated secondarily, usually based on molecular data. Tests of hypotheses about key innovations do require information about phylogenetic relationships, and some of these tests can be implemented without any information about time. However, every effort should be made to obtain information about time, which greatly increases the power of such tests.

8.
This paper proposes a new method to identify communities in generally weighted complex networks and applies it to phylogenetic analysis. In this case, weights correspond to the similarity indexes among protein sequences, which can be used for network construction so that the network structure can be analyzed to recover phylogenetically useful information from its properties. The analyses discussed here are mainly based on the modular character of protein similarity networks, explored through the Newman-Girvan algorithm, with the help of the neighborhood matrix. The most relevant networks are found when the network topology changes abruptly, revealing distinct modules related to the sets of organisms to which the proteins belong. Sound biological information can be retrieved by the computational routines used in the network approach, without using biological assumptions other than those incorporated by BLAST. Usually, all the main bacterial phyla and, in some cases, also some bacterial classes corresponded totally (100%) or to a great extent (>70%) to the modules. We checked for internal consistency in the obtained results, and we scored close to 84% of matches for community pertinence when comparisons between the results were performed. To illustrate how to use the network-based method, we employed data for enzymes involved in the chitin metabolic pathway that are present in more than 100 organisms from an original data set containing 1,695 organisms, downloaded from GenBank on May 19, 2007. A preliminary comparison between the outcomes of the network-based method and the results of methods based on Bayesian, distance, likelihood, and parsimony criteria suggests that the former is as reliable as these commonly used methods. We conclude that the network-based method can be used as a powerful tool for retrieving modularity information from weighted networks, which is useful for phylogenetic analysis.

9.
A method is described that allows the assessment of treelikeness of phylogenetic distance data before tree estimation. This method is related to statistical geometry as introduced by Eigen, Winkler-Oswatitsch, and Dress (1988 [Proc. Natl. Acad. Sci. USA. 85:5913-5917]), and in essence, displays a measure for treelikeness of quartets in terms of a histogram that we call a delta plot. This allows identification of nontreelike data and analysis of noisy data sets arising from processes such as, for example, parallel evolution, recombination, or lateral gene transfer. In addition to an overall assessment of treelikeness, individual taxa can be ranked by reference to the treelikeness of the quartets to which they belong. Removal of taxa on the basis of this ranking results in an increase in accuracy of tree estimation. Recombinant data sets are simulated, and the method is shown to be capable of identifying single recombinant taxa on the basis of distance information alone, provided the parents of the recombinant sequence are sufficiently divergent and the mixture of tree histories is not strongly skewed toward a single tree. Delta plots and taxon rankings are applied to three biological data sets using distances derived from sequence alignment, gene order, and fragment length polymorphism.
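
The abstract does not state the quartet measure itself. The Python sketch below uses the commonly cited definition: for each quartet, the three pairwise distance sums are sorted and delta is (largest - middle) / (largest - smallest), which is 0 for a perfectly treelike quartet. The function names, the handling of the degenerate case, and the per-taxon mean used for ranking are assumptions for illustration.

```python
from itertools import combinations


def quartet_delta(d, x, y, u, v):
    """Delta value of one quartet from a symmetric distance matrix d."""
    sums = sorted([d[x][y] + d[u][v], d[x][u] + d[y][v], d[x][v] + d[y][u]])
    smallest, middle, largest = sums
    if largest == smallest:
        return 0.0  # all three sums equal: treated as treelike (assumption)
    return (largest - middle) / (largest - smallest)


def delta_values(d):
    """Delta values for every quartet; a histogram of these is the delta plot."""
    return [quartet_delta(d, *q) for q in combinations(range(len(d)), 4)]


def taxon_mean_delta(d):
    """Mean delta over all quartets containing each taxon, usable for ranking taxa."""
    n = len(d)
    totals = [0.0] * n
    counts = [0] * n
    for q in combinations(range(n), 4):
        delta = quartet_delta(d, *q)
        for taxon in q:
            totals[taxon] += delta
            counts[taxon] += 1
    return [totals[t] / counts[t] if counts[t] else 0.0 for t in range(n)]
```

Taxa with the highest mean delta would be the first candidates for removal under the ranking procedure the abstract describes.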

10.
11.
In many phylogenetic problems, assuming that species have evolved from a common ancestor by a simple branching process is unrealistic. Reticulate phylogenetic models, however, have been largely neglected because the concept of reticulate evolution has not been supported by appropriate analytical tools and software. The reticulate model can adequately describe such complicated mechanisms as hybridization between species or lateral gene transfer in bacteria. In this paper, we describe a new algorithm for inferring reticulate phylogenies from evolutionary distances among species. The algorithm is capable of detecting contradictory signals encompassed in a phylogenetic tree and identifying possible reticulate events that may have occurred during evolution. The algorithm produces a reticulate phylogeny by gradually improving upon the initial solution provided by a phylogenetic tree model. The new algorithm is compared to the popular SplitsGraph method in a reanalysis of the evolution of photosynthetic organisms. A computer program to construct and visualize reticulate phylogenies, called T-Rex (Tree and Reticulogram Reconstruction), is available to researchers at the following URL: www.fas.umontreal.ca/biol/casgrain/en/labo/t-rex.

12.
This research proposes a simplified method for estimating the mesohabitat composition that would favour members of a given set of aquatic species. The simulated composition of four types of mesohabitat units (deep pool, shallow pool, deep riffle and shallow riffle) could guide the design of in‐stream structures in creating pool‐riffle systems with ecological reference. Fish community data and an autecology matrix are used to support the development of a stream mesohabitat simulation based on regression models for reaches in mid to upper‐order streams. The fish community‐mesohabitat model results constitute a reference condition that can be used to guide stream restoration and ecological engineering decisions aimed at maintaining the natural ecological integrity and diversity of rivers.

13.
The small parsimony problem is studied for reconstructing recombination networks from sequence data. The small parsimony problem is polynomial-time solvable for phylogenetic trees. However, the problem is proved NP-hard even for galled recombination networks. A dynamic programming algorithm is also developed to solve the small parsimony problem. It takes O(dn2^(3h)) time on an input recombination network over length-d sequences in which there are h recombination nodes and n - h tree nodes.
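
For the tree case that the abstract calls polynomial-time solvable, a standard textbook method is Fitch's algorithm. The Python sketch below shows that tree-only method, not the paper's dynamic program for recombination networks. The nested-tuple tree encoding and the function names are assumptions for illustration.

```python
def fitch_score(tree, leaf_states):
    """Minimum number of state changes for one site on a rooted binary tree.

    tree: a leaf name (str) or a pair (left_subtree, right_subtree).
    leaf_states: dict mapping each leaf name to its character at this site.
    """
    def visit(node):
        if isinstance(node, str):
            return {leaf_states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        common = left_set & right_set
        if common:
            # Children agree on at least one state: no extra change needed.
            return common, left_cost + right_cost
        # Children disagree: take the union and count one change.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


def parsimony_score(tree, sequences):
    """Sum the per-site Fitch scores over aligned sequences of equal length."""
    length = len(next(iter(sequences.values())))
    return sum(
        fitch_score(tree, {name: seq[k] for name, seq in sequences.items()})
        for k in range(length)
    )
```

Each site is processed in one bottom-up pass, so the total work is proportional to the number of nodes times the sequence length, up to the cost of the small set operations.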

14.
The first sequenced marsupial genome promises to reveal unparalleled insights into mammalian evolution. We have used the Monodelphis domestica (gray short-tailed opossum) sequence to construct the first map of a marsupial major histocompatibility complex (MHC). The MHC is the most gene-dense region of the mammalian genome and is critical to immunity and reproductive success. The marsupial MHC bridges the phylogenetic gap between the complex MHC of eutherian mammals and the minimal essential MHC of birds. Here we show that the opossum MHC is gene dense and complex, as in humans, but shares more organizational features with non-mammals. The Class I genes have amplified within the Class II region, resulting in a unique Class I/II region. We present a model of the organization of the MHC in ancestral mammals and its elaboration during mammalian evolution. The opossum genome, together with other extant genomes, reveals the existence of an ancestral “immune supercomplex” that contained genes of both types of natural killer receptors together with antigen processing genes and MHC genes.

15.
16.
We present a dimensionless fit index for phylogenetic trees that have been constructed from distance matrices. It is designed to measure the quality of the fit of the data to a tree in absolute terms, independent of linear transformations on the distance matrix. The index can be used as an absolute measure to evaluate how well a set of data fits to a tree, or as a relative measure to compare different methods that are expected to produce the same tree. The usefulness of the index is demonstrated in three examples.

17.
Summary: The concept of phylogenetic denseness bears critically on the accuracy of evolutionary pathways inferred from experimentally sequenced proteins isolated from extant species. In this paper I develop an objective measure of denseness to supplement previous intuitive concepts and which permits one to use this concept in comparing the quality of different evolutionary reconstructions. This measure is used to examine several published phylogenetic trees: insulin, α-hemoglobin, β-hemoglobin, myoglobin, cytochrome c, and the parvalbumin family. The paper emphasizes 1) the importance of denseness in accurately estimating the number of nucleotide replacements which separate homologous sequences when this estimation is made by the method of parsimony, 2) the value of this concept in assessing the quality of those estimates, and 3) the use of this concept as a biologically practical heuristic method for identifying poorly studied regions in a phylogenetic tree, whether or not the tree was obtained by the parsimony method.

18.
S Edmondson  N Khan  J Shriver  J Zdunek  A Gräslund   Biochemistry, 1991, 30(47): 11271-11279
A model of the structure of the 22 amino acid residue gastrointestinal peptide hormone motilin in 30% hexafluoro-2-propanol has been obtained by using distance constraints obtained from two-dimensional nuclear Overhauser enhancements. A set of initial structures has been generated by using the distance geometry program DIANA, and 10 of these structures have been refined by using restrained molecular dynamics (AMBER). The resulting structures are virtually indistinguishable in terms of constraint violations and energies and display less than 0.5-Å root mean square deviations (RMSD) of the backbone atom positions from Tyr7 to Lys20. A comparison of back-calculated and experimental NOE intensities indicates that RMSDs are not the best indicators of the goodness of fit or of the precision with which the structure is defined. The structure was further refined by fitting the experimental NOE data using an iterative full relaxation matrix analysis. The mean error between the observed and calculated backbone NOE intensities for the final refined structure was 0.23 for the full length of the molecule, 0.18 for the region from Glu9 to Lys20, and 0.29 for the region from Phe1 to Gly8. R factors for the same regions were 0.27, 0.19, and 0.43, respectively. All of the NOE-determined structures consistently display an alpha-helix which extends from Glu9 to Lys20. Considerable lack of definition of structure exists at the amino and carboxyl ends of the molecule and also in the vicinity of Thr6-Tyr7-Gly8. A tendency to form a wide turn appears to exist over the sequence Pro3-Ile4-Phe5-Thr6, but the structure in this region is not well defined by the NOE data.

19.
20.
Distance based algorithms are a common technique in the construction of phylogenetic trees from taxonomic sequence data. The first step in the implementation of these algorithms is the calculation of a pairwise distance matrix to give a measure of the evolutionary change between any pair of the extant taxa. A standard technique is to use the log det formula to construct pairwise distances from aligned sequence data. We review a distance measure valid for the most general models, and show how the log det formula can be used as an estimator thereof. We then show that the foundation upon which the log det formula is constructed can be generalized to produce a previously unknown estimator which improves the consistency of the distance matrices constructed from the log det formula. This distance estimator provides a consistent technique for constructing quartets from phylogenetic sequence data under the assumption of the most general Markov model of sequence evolution.
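
The abstract names the log det formula without writing it out. The Python sketch below uses one commonly cited normalized form, d = -(1/4) ln( det F / sqrt(det Πx · det Πy) ), where F holds the joint frequencies of aligned nucleotide pairs and Πx, Πy are diagonal matrices of the marginal base frequencies. The choice of this variant, the `ACGT` alphabet, and the function names are assumptions. The improved estimator proposed in the paper is not shown.

```python
import math

ALPHABET = "ACGT"  # assumed nucleotide alphabet; gaps and ambiguity codes are not handled


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def logdet_distance(seq_x, seq_y):
    """LogDet distance between two aligned sequences (illustrative variant)."""
    if len(seq_x) != len(seq_y):
        raise ValueError("sequences must be aligned to the same length")
    index = {symbol: i for i, symbol in enumerate(ALPHABET)}
    k = len(ALPHABET)
    n = len(seq_x)
    # Joint frequency matrix F: F[i][j] = fraction of sites with state i in x and j in y.
    f = [[0.0] * k for _ in range(k)]
    for a, b in zip(seq_x, seq_y):
        f[index[a]][index[b]] += 1.0 / n
    row_freq = [sum(f[i]) for i in range(k)]
    col_freq = [sum(f[i][j] for i in range(k)) for j in range(k)]
    det_f = determinant(f)
    norm = math.sqrt(math.prod(row_freq) * math.prod(col_freq))
    if det_f <= 0 or norm == 0:
        raise ValueError("log det distance is undefined for this pair")
    return -math.log(det_f / norm) / k
```

With this normalization, two identical sequences that contain all four bases have distance 0. The function raises an error when the determinant is not positive, which can happen for short or highly diverged sequences.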
