Similar literature
 20 similar articles found (search time: 0 ms)
1.
The general Markov plus invariable sites (GM+I) model of biological sequence evolution is a two-class model in which an unknown proportion of sites are not allowed to change, while the remainder undergo substitutions according to a Markov process on a tree. For statistical use it is important to know if the model is identifiable; can both the tree topology and the numerical parameters be determined from a joint distribution describing sequences only at the leaves of the tree? We establish that for generic parameters both the tree and all numerical parameter values can be recovered, up to clearly understood issues of 'label swapping'. The method of analysis is algebraic, using phylogenetic invariants to study the variety defined by the model. Simple rational formulas, expressed in terms of determinantal ratios, are found for recovering numerical parameters describing the invariable sites.

2.
3.
A recently developed mathematical model for the analysis of phylogenetic trees is applied to comparative data for 48 species. The model represents a return to fundamentals and makes no hypothesis with respect to the reversibility of the process. The species have been analysed in all subsets of three, and a measure of reliability of the results is provided. The numerical results of the computations on 17,296 triples of species are made available on the Internet. These results are discussed and the development of reliable tree structures for several species is illustrated. It is shown that, indeed, the Markov model is capable of considerably more interesting predictions than has been recognized to date.
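The figure of 17,296 triples quoted above is simply the number of 3-element subsets of 48 species. A minimal sketch of that arithmetic follows; the variable names are illustrative and not taken from the paper.

```python
from itertools import combinations
from math import comb

n_species = 48

# Closed form: "48 choose 3".
n_triples = comb(n_species, 3)

# Cross-check by enumerating the triples of (hypothetical) species indices.
n_enumerated = sum(1 for _ in combinations(range(n_species), 3))

print(n_triples, n_enumerated)
```

Both counts agree with the number of triples stated in the abstract.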

4.
Summary. Sequences of 47 members of the Zn-containing alcohol dehydrogenase (ADH) family were aligned progressively, and an evolutionary tree with detailed branch order and branch lengths was produced. The alignment shows that only 9 amino acid residues (of 374 in the horse liver ADH sequence) are conserved in this family; these include eight Gly and one Val with structural roles. Three residues that bind the catalytic Zn and modulate its electrostatic environment are conserved in 45 members. Asp 223, which determines specificity for NAD, is found in all but the two NADP-dependent enzymes, which have Gly or Ala. Ser or Thr 48, which makes a hydrogen bond to the substrate, is present in 46 members. The four Cys ligands for the structural zinc are conserved except in ζ-crystallin, the sorbitol dehydrogenases, and two bacterial enzymes. Analysis of the evolutionary tree gives estimates of the times of divergence for different animal ADHs. The human class II (π) and class III (χ) ADHs probably diverged about 630 million years ago, and the newly identified human ADH6 appeared about 520 million years ago, implying that these classes of enzymes may exist or have existed in all vertebrates. The human class I ADH isoenzymes (α, β, and γ) diverged about 80 million years ago, suggesting that these isoenzymes may exist or have existed in all primates. Analysis of branch lengths shows that these plant ADHs are more conserved than the animal ones and that class III ADHs are more conserved than class I ADHs. The rate of acceptance of point mutations (PAM units) shows that selection pressure has existed for ADHs, implying that these enzymes play definite metabolic roles. Offprint requests to: B.V. Plapp

5.
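Objective: To construct a phylogenetic tree of arthropod α-amylases, explore their evolutionary relationships, and identify the signature sequences specific to the α-amylases that cluster together in the tree. Methods: Fifty-six arthropod α-amylase amino acid sequences were selected from the National Center for Biotechnology Information (NCBI) database. Sequences were aligned with CLUSTALX 2.0, the phylogenetic tree was built with MEGA 6.0, and the cluster-specific α-amylase sequences were identified with BOXSHADE. Results: The 56 α-amylases grouped into four major clusters, A, B, C, and D. The signature sequence was "VD NHD NQ" for cluster A, "ID NHD NX" for cluster B, "ID NHD NQ" for cluster C, and "XGN NHD X" for cluster D. All four clusters contain the conserved NHD (asparagine-histidine-aspartic acid) sequence but differ in the amino acids flanking it. Conclusion: The 56 arthropod α-amylases fall into four clusters, each with its own signature sequence, and all contain the conserved NHD sequence.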

6.
陈兆斌 《生物信息学》2013,11(4):317-320
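The line-dragging method (拽线法, DL) discussed in this article is a greedy algorithm. Like Fitch-Margoliash (FM), DL builds a phylogenetic tree from a distance matrix, but compared with FM it has lower complexity, higher error tolerance, and higher accuracy. When the distances contain errors, DL only increases the distances between gene sequences that do not share a parent node, so it still judges the relatedness of the sequences correctly and recovers a perfect tree topology. FM, by contrast, spreads the error across the distances between all gene sequences, so sequences that should share a parent node may be assigned to different branches.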

7.
Chromatin self-organization by mutation bias   Cited by: 3 (self-citations: 0, citations by others: 3)
Proteins, on binding to a DNA sequence, alter the frequency and quality of mutations that occur in the sequence. This represents a reverse flow of information from proteins to DNA. Nucleosome binding causes patterns of UV-induced damage which, when converted to mutations by replication, will phase nucleosomes. We propose that DNA binding proteins create their own high- or low-affinity binding sites along DNA sequences by biased mutational pressure.

8.
An attempt to use phylogenetic invariants for tree reconstruction was made at the end of the 1980s and the beginning of the 1990s by several researchers (the initial idea is due to Lake [1987] and Cavender and Felsenstein [1987]). However, the efficiency of methods based on invariants is still in doubt (Huelsenbeck 1995; Jin and Nei 1990), probably because these methods used only a few generators of the set of phylogenetic invariants. The method studied in this paper was first introduced in Casanellas et al. (2005), and it is the first method based on invariants that uses the "whole" set of generators for DNA data. The simulation studies performed in this paper show that it is a very competitive and highly efficient phylogenetic reconstruction method, especially for nonhomogeneous models on phylogenetic trees.

9.
The models of nucleotide substitution used by most maximum likelihood-based methods assume that the evolutionary process is stationary, reversible, and homogeneous. We present an extension of the Barry and Hartigan model, which can be used to estimate parameters by maximum likelihood (ML) when the data contain invariant sites and there are violations of the assumptions of stationarity, reversibility, and homogeneity. Unlike most ML methods for estimating invariant sites, we estimate the nucleotide composition of invariant sites separately from that of variable sites. We analyze a bacterial data set where problems due to lack of stationarity and homogeneity have been previously well noted and use the parametric bootstrap to show that the data are consistent with our general Markov model. We also show that estimates of invariant sites obtained using our method are fairly accurate when applied to data simulated under the general Markov model.

10.
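To elucidate the genetic variation, molecular characteristics, and reassortment patterns of H9N2 subtype avian influenza virus isolates from the Shanghai area, 14 H9N2 avian influenza viruses isolated in 2002 and 2006-2014 from live poultry markets, poultry farms, and pig slaughterhouses were selected for analysis. The 14 viruses came from throat and cloacal samples of chickens, ducks, pigeons, and pheasants and from pig lung samples. Samples were tested with an H9 subtype fluorescent reverse transcription-polymerase chain reaction (RT-PCR) kit; positive samples were inoculated into the allantoic cavity of specific-pathogen-free (SPF) chicken embryos for virus isolation, and the hemagglutinin (HA) subtype was further confirmed by hemagglutination inhibition (HI) assay. The complete genomes of the 14 viruses were amplified by RT-PCR and sequenced, and the phylogenetic relationships of the eight gene segments were analyzed. The isolates were found to be mainly reassortants of the F/98, Y280, and G1 sublineages and an unknown sublineage. Based on the combination of the eight gene segments, the 14 viruses fall into five genotypes. The five H9N2 viruses isolated in 2002 and 2006-2008 represent four different genotypes, whereas the nine H9N2 viruses isolated in 2009-2014 all belong to the fifth genotype, which may be related to selection pressure from vaccination. Strengthening molecular epidemiological surveillance of H9N2 avian influenza in future work is therefore essential.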

11.
Systematics and evolution of Malagasy lemurs have been analyzed using morphological characters, fossil evidence, ecological/ethological data, and chromosomal banding patterns. Recent developments in DNA technology have provided evolutionary biologists with additional and powerful tools for making phylogenetic inference. In recent years several studies concerning highly repeated DNA sequences (hrDNA) provided new insights about the systematic relationships among the different species of Lemuridae and Cheirogaleidae. Here, a reconstruction of the molecular phylogeny of extant Malagasy lemurs based on the comparison of cytochrome-b mitochondrial DNA sequences is presented. With the Polymerase Chain Reaction (PCR) and direct sequencing of amplified DNA fragments, both the phylogenetic range and resolving power of comparative analysis can be extended. These techniques make it possible to gather sequence data useful for evaluating the pattern of molecular evolution, offering opportunities for phylogenetic purposes. A 290-bp fragment of the cytochrome-b gene has been amplified and sequenced from the following species: Tupaia glis, Galago alleni, Daubentonia madagascariensis, Indri indri, Varecia variegata, Eulemur fulvus, Eulemur coronatus, Eulemur rubriventer, Eulemur mongoz, Eulemur macaco, Lemur catta, and Hapalemur griseus griseus. The phylogenetic trees obtained show the relationships among the Eulemur species and confirm the karyological and hrDNA results of a separated clade for L. catta/Hapalemur. The separation of Varecia variegata from the other genera of the family Lemuridae is discussed.

12.
The selection of an optimal model for data analysis is an important component of model-based molecular phylogenetic studies. Owing to the large number of Markov models that can be used for data analysis, model selection is a combinatorial problem that cannot be solved by performing an exhaustive search of all possible models. Currently, model selection is based on a small subset of the available Markov models, namely those that assume the evolutionary process to be globally stationary, reversible, and homogeneous. This forces the optimal model to be time reversible even though the actual data may not satisfy these assumptions. This problem can be alleviated by including more complex models during the model selection. We present a novel heuristic that evaluates a small fraction of these complex models and identifies the optimal model.

13.
Summary. A measure of sequence similarity, d_t, that does not require prior sequence alignment gave correct results for a variety of computer-generated model sequences, with and without gaps, for all degrees of substitution, s. The measure is the squared Euclidean distance between vectors of counts of t-tuplets of characters in the two sequences. In models without gaps and without Needleman-Wunsch alignment, the average d_t was very closely equal to twice the average conventional mismatch count, m. In each of these models, one of the conditions of the Jukes-Cantor model was violated in turn: (1) both descendant lineages receive the same number of substitutions, (2) all sites are equally likely to be substituted, (3) all different replacement characters are equally likely to be chosen, and (4) all original characters are equally likely to be substituted. In Jukes-Cantor models with gaps, Needleman-Wunsch alignment was necessarily performed, a procedure that generally produced incorrect values of m. For these models the average d_t was found to be very closely equal to twice the average m estimated from the known value of s using the inverted Jukes-Cantor formula.
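The alignment-free measure described above, the squared Euclidean distance between t-tuplet count vectors, can be sketched in a few lines. This is an illustration under assumptions rather than the author's implementation: the function names `tuple_counts` and `d_t` and the toy sequences are hypothetical.

```python
from collections import Counter

def tuple_counts(seq, t):
    """Count every overlapping t-tuplet (substring of length t) in seq."""
    return Counter(seq[i:i + t] for i in range(len(seq) - t + 1))

def d_t(a, b, t=1):
    """Squared Euclidean distance between the t-tuplet count vectors of a and b."""
    ca, cb = tuple_counts(a, t), tuple_counts(b, t)
    # Counter returns 0 for tuplets absent from one of the sequences.
    return sum((ca[k] - cb[k]) ** 2 for k in set(ca) | set(cb))

# Toy example: one substitution (T -> A) between two ungapped sequences.
print(d_t("ACGT", "ACGA", t=1))
```

For this toy pair with t = 1, a single mismatch changes two counts by one each, so the distance is twice the mismatch count, in line with the relationship the abstract reports on average.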

14.
Close and distant relationships among 31 strains of Orientia tsutsugamushi (20, two, one, and eight strains isolated in Japan, Korea, China, and southeast Asia, respectively) were clarified using phylogenetic analyses based on homologies of the 56-kDa type-specific antigen genes. Isolates from Japan, Korea, and China fell into eight separate clusters in the phylogenetic tree, designated the JG (Japanese Gilliam type), JP-1 and JP-2 (Japanese Karp 1 and 2 types), Kato, Kawasaki, Kuroki, Shimokoshi, and LX-1 types. All isolates originating in southeast Asia, including the prototype Gilliam and Karp strains isolated in Burma and New Guinea, respectively, were located far from the isolates from Japan, Korea, and China in the phylogenetic tree, indicating that the strains of O. tsutsugamushi distributed in northeastern and southeastern Asia are of different types.

15.
16.

Background

The increasing abundance of neuromorphological data provides both the opportunity and the challenge to compare massive numbers of neurons from a wide diversity of sources efficiently and effectively. We implemented a modified global alignment algorithm representing axonal and dendritic bifurcations as strings of characters. Sequence alignment quantifies neuronal similarity by identifying branch-level correspondences between trees.

Results

The space generated from pairwise similarities is capable of classifying neuronal arbor types as well as, or better than, traditional topological metrics. Unsupervised cluster analysis produces groups that significantly correspond with known cell classes for axons, dendrites, and pyramidal apical dendrites. Furthermore, the distinguishing consensus topology generated by multiple sequence alignment of a group of neurons reveals their shared branching blueprint. Interestingly, the axons of dendritic-targeting interneurons in the rodent cortex associate with pyramidal axons but stand apart from the (more topologically symmetric) axons of perisomatic-targeting interneurons.

Conclusions

Global pairwise and multiple sequence alignment of neurite topologies enables detailed comparison of neurites and identification of conserved topological features in alignment-defined clusters. The methods presented also provide a framework for incorporation of additional branch-level morphological features. Moreover, comparison of multiple alignment with motif analysis shows that the two techniques provide complementary information respectively revealing global and local features.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0605-1) contains supplementary material, which is available to authorized users.
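The abstract above encodes neurite branching as character strings and compares them by global alignment. As a rough illustration of plain (unmodified) Needleman-Wunsch scoring on such strings, here is a minimal sketch. The scoring values, the function name `global_alignment_score`, and the example strings are assumptions; the paper's modified algorithm and its bifurcation encoding are not reproduced here.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]

# Hypothetical branch-type strings for two small trees.
print(global_alignment_score("ACGT", "AGT"))
```

Keeping only two rows of the dynamic-programming table is enough when just the score is needed; recovering the branch-level correspondences themselves would require storing the full table and tracing back.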

17.
Prevailing methods of measuring diet breadth of phytophagous insects are not consistent between studies and generally rely on counts of a variety of higher plant taxa (e.g. genera, families, orders). Results derived from them can be inconsistent if different taxonomic levels are used between studies. In any case, such indices do not include information from the whole branching structure of the host plant phylogeny, and do not address the fact that higher taxa are not necessarily phylogenetically equivalent. Here we present novel phylogeny-based methods which address these shortcomings. Although a previously proposed index (the Phylogenetic Diversity index) may be employed, it cannot be used to measure diets of strictly monophagous insects (i.e. those which utilise a single host species). We therefore introduce a modification of this index (the Root Phylogenetic Diversity index) which may be applied to all diets. In addition, we propose a Clade Dispersion index as a branch-length-independent measure of the degree to which hosts are scattered across the host phylogeny. We describe how these indices could be employed in studies of insect diet breadth and discuss potential problems which may be encountered in their use. Received: 16 November 1998 / Accepted: 10 February 1999

18.
Summary. We present the ideas, and their motivation, at the basis of a simple model of nucleic acid evolution: the stationary Markov process, or Markov clock. After a brief review of its relevant mathematical properties, the Markov clock is applied to nucleotide sequences from mitochondrial and nuclear genes of different species. Particular emphasis is given to the necessity of carrying out a correct statistical analysis, which allows us to check quantitatively the applicability of our model. We find evidence that the Markov clock ticks in many different processes, and that its limitations can be understood in terms of a simple idea that we call the base-drift hypothesis. This hypothesis correlates the deviations from the stationarity of the Markov process with the evolutionary distance d_AB(P) of two species A and B, relative to the process P. We conclude by discussing the implications of our findings for future work.

19.
Roth JR  Kofoid E  Roth FP  Berg OG  Seger J  Andersson DI 《Genetics》2003,163(4):1483-1496
In the lac adaptive mutation system of Cairns, selected mutant colonies but not unselected mutant types appear to arise from a nongrowing population of Escherichia coli. The general mutagenesis suffered by the selected mutants has been interpreted as support for the idea that E. coli possesses an evolved (and therefore beneficial) mechanism that increases the mutation rate in response to stress (the hypermutable state model, HSM). This mechanism is proposed to allow faster genetic adaptation to stressful conditions and to explain why mutations appear directed to useful sites. Analysis of the HSM reveals that it requires implausibly intense mutagenesis (10^5 times the unselected rate) and even then cannot account for the behavior of the Cairns system. The assumptions of the HSM predict that selected revertants will carry an average of eight deleterious null mutations and thus seem unlikely to be successful in long-term evolution. The experimentally observed 35-fold increase in the level of general mutagenesis cannot account for even one Lac+ revertant from a mutagenized subpopulation of 10^5 cells (the number proposed to enter the hypermutable state). We conclude that temporary general mutagenesis during stress is unlikely to provide a long-term selective advantage in this or any similar genetic system.

20.
In this paper, we introduce a probabilistic measure for computing the similarity between two biological sequences without alignment. The computation of the similarity measure is based on the Kullback-Leibler divergence of two constructed Markov models. We first validate the method by clustering nine chromosomes from three species. Second, we give the results of a similarity search based on our new method. Finally, we apply the measure to the construction of a phylogenetic tree of 48 HEV genome sequences. Our results indicate that the weighted relative entropy is an efficient and powerful alignment-free measure for the analysis of sequences at the genomic scale.
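To make the idea of a Kullback-Leibler divergence between two Markov models concrete, here is a minimal sketch. It is not the paper's measure: the first-order models, the add-one pseudocounts, the weighting of each context by its smoothed frequency, and the symmetrization are all assumptions chosen for illustration, as are the names `markov_model`, `weighted_kl`, and `distance`.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def markov_model(seq, pseudo=1.0):
    """Estimate smoothed context weights and first-order transition probabilities."""
    pairs = Counter(zip(seq, seq[1:]))   # counts of adjacent symbol pairs
    ctx = Counter(seq[:-1])              # counts of each symbol used as a context
    total = sum(ctx.values())
    k = len(ALPHABET)
    weights = {a: (ctx[a] + pseudo) / (total + pseudo * k) for a in ALPHABET}
    trans = {a: {b: (pairs[(a, b)] + pseudo) / (ctx[a] + pseudo * k)
                 for b in ALPHABET} for a in ALPHABET}
    return weights, trans

def weighted_kl(p, q):
    """Context-weighted KL divergence of p's transition rows from q's."""
    wp, tp = p
    _, tq = q
    return sum(wp[a] * tp[a][b] * math.log(tp[a][b] / tq[a][b])
               for a in ALPHABET for b in ALPHABET)

def distance(x, y):
    """Symmetrized divergence between the models fitted to sequences x and y."""
    mx, my = markov_model(x), markov_model(y)
    return 0.5 * (weighted_kl(mx, my) + weighted_kl(my, mx))

print(distance("ACGTACGTACGT", "AAAACCCCGGGG"))
```

The pseudocounts keep every transition probability positive, so the logarithm is always defined. A matrix of such pairwise values could then be passed to any distance-based tree-building method.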
