20 similar articles found (search time: 15 ms)
1.
WSE, a new sequence distance measure based on word frequencies  Total citations: 1, self-citations: 0, citations by others: 1
In this article, we present a new distance metric, the Weighted Sequence Entropy (WSE), based on the short-word composition of biological sequences. As a revision of the classical relative entropy (RE), our metric (1) is equivalent to RE for small k and (2) avoids the degeneracy that arises when some word types are absent from one sequence but present in the other. Experiments on 25 viruses, including SARS-CoVs, show that our method and RE give exactly the same phylogenetic tree when the word length k ≤ 3. When k > 3, our method still works and yields a convergent phylogenetic topology, whereas RE gives degenerate results.
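The degeneracy mentioned in this abstract can be illustrated with a small sketch. The abstract does not give the WSE formula, so the code below does not implement WSE. It only shows how plain relative entropy over k-word frequencies becomes infinite when a word is missing from one sequence, and uses the Jensen-Shannon divergence as a stand-in for a measure that stays finite. All function names and the toy sequences are assumptions made for illustration.

```python
import math
from collections import Counter


def kmer_freqs(seq, k):
    """Relative frequencies of the overlapping words of length k in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}


def relative_entropy(p, q):
    """Relative entropy sum_w p(w) * log2(p(w) / q(w)).

    Returns infinity when a word seen in p is absent from q,
    which is the degenerate case described in the abstract.
    """
    total = 0.0
    for word, pw in p.items():
        qw = q.get(word, 0.0)
        if qw == 0.0:
            return math.inf
        total += pw * math.log2(pw / qw)
    return total


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (a stand-in, NOT the WSE of the paper).

    Both inputs are compared with their mixture m, which is non-zero
    wherever p or q is non-zero, so the result is always finite.
    """
    words = set(p) | set(q)
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in words}
    return 0.5 * relative_entropy(p, m) + 0.5 * relative_entropy(q, m)


# Toy sequences (made up): the word "CG" occurs in a but never in b.
a = kmer_freqs("ACGTACGT", 2)
b = kmer_freqs("AAAACCCC", 2)
print(relative_entropy(a, b))  # degenerate
print(jensen_shannon(a, b))    # finite
```

Longer words make absent word types more likely, which is consistent with the abstract's remark that RE degenerates for larger k.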
2.
Hugo Naya Jorge I Urioste Yu-Mei Chang Mariana Rodrigues-Motta Roberto Kremer Daniel Gianola 《Genetics Selection Evolution》2008,40(4):379-394
Dark spots in the fleece area are often associated with dark fibres in wool, which limits its competitiveness with other textile fibres. Field data from a sheep experiment in Uruguay revealed an excess number of zeros for dark spots. We compared the performance of four Poisson and zero-inflated Poisson (ZIP) models under four simulation scenarios. All models performed reasonably well under the same scenario for which the data were simulated. The deviance information criterion favoured a Poisson model with a residual, while the ZIP model with a residual gave estimates closer to their true values under all simulation scenarios. Both Poisson and ZIP models with an error term at the regression level performed better than their counterparts without such an error. Field data from Corriedale sheep were analysed with Poisson and ZIP models with residuals. Parameter estimates were similar for both models. Although the posterior distribution of the sire variance was skewed due to the small number of rams in the dataset, the median of this variance suggested scope for genetic selection. The main environmental factor was the age of the sheep at shearing. In summary, age-related processes seem to drive the number of dark spots in this breed of sheep.
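As a brief illustration of what "zero-inflated" means here, the sketch below gives the textbook ZIP probability mass function: with probability pi the count is a structural zero, otherwise it follows a Poisson distribution with mean lam. The paper's models are hierarchical, with residuals and sire effects, and are not reproduced here. The names `lam` and `pi` and the example values are assumptions.

```python
import math


def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def zip_pmf(y, lam, pi):
    """P(Y = y) for a zero-inflated Poisson distribution.

    pi is the probability of a structural zero; the remaining
    mass (1 - pi) follows an ordinary Poisson(lam).
    """
    base = (1.0 - pi) * poisson_pmf(y, lam)
    return pi + base if y == 0 else base


# Illustrative values only: zero inflation raises P(Y = 0).
print(poisson_pmf(0, 2.0), zip_pmf(0, 2.0, 0.3))
```

Setting `pi` to 0 recovers the ordinary Poisson model, which is why the two families can be compared on the same data.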
3.
4.
Based on a five-letter model of the 20 amino acids, we propose a new 2-D graphical representation of protein sequences. We then transform the 2-D graphical representation into a numerical characterization that facilitates quantitative comparisons of protein sequences. As an application, we construct the phylogenetic tree of 56 coronavirus spike proteins. The resulting tree agrees well with the established taxonomic groups.
5.
An information-based sequence distance and its application to whole mitochondrial genome phylogeny  Total citations: 12, self-citations: 0, citations by others: 12
Li M Badger JH Chen X Kwong S Kearney P Zhang H 《Bioinformatics (Oxford, England)》2001,17(2):149-154
MOTIVATION: Traditional sequence distances require an alignment and therefore are not directly applicable to the problem of whole genome phylogeny, where events such as rearrangements make full-length alignments impossible. We present a sequence distance that works on unaligned sequences using the information-theoretical concept of Kolmogorov complexity, and a program to estimate this distance. RESULTS: We establish the mathematical foundations of our distance and illustrate its use by constructing a phylogeny of the Eutherian orders using complete unaligned mitochondrial genomes. This phylogeny is consistent with the commonly accepted one for the Eutherians. A second, larger mammalian dataset is also analyzed, yielding a phylogeny generally consistent with the commonly accepted one for the mammals. AVAILABILITY: The program to estimate our sequence distance is available at http://www.cs.cityu.edu.hk/~cssamk/gencomp/GenCompress1.htm. The distance matrices used to generate our phylogenies are available at http://www.math.uwaterloo.ca/~mli/distance.html.
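The idea of an alignment-free, compression-based distance can be sketched as follows. Kolmogorov complexity is not computable, so any practical estimate replaces it with the output size of a real compressor. The sketch uses the normalized compression distance, a related compression-based measure, rather than the exact formula of the paper, and uses Python's general-purpose `zlib` in place of the GenCompress program named in the abstract. Both substitutions are assumptions made for illustration.

```python
import zlib


def compressed_size(data):
    """Length in bytes of data after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


def ncd(x, y):
    """Normalized compression distance between two byte strings.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C is the compressed size. Values near 0 suggest that y adds
    little information beyond x; values near 1 suggest unrelated inputs.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Usage with two short made-up sequences; real inputs would be whole genomes.
print(ncd(b"ACGTTGCAACGTTGCA" * 20, b"TTTTGGGGCCCCAAAA" * 20))
```

A general-purpose compressor is only a rough proxy for a DNA-specific one, so distances from this sketch should not be expected to match those reported in the paper.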
6.
C. Bleidorn L. Vogt T. Bartolomaeus 《Journal of Zoological Systematics and Evolutionary Research》2003,41(3):186-195
The phylogenetic position of Annelida as well as its ingroup relationships are a matter of ongoing debate. A molecular phylogenetic study of sedentary polychaete relationships was conducted based on 70 sequences of 18S rRNA, including unpublished sequences of 18 polychaete species. The data set was analysed with maximum parsimony and maximum likelihood methods. Clade robustness was estimated by parsimony-bootstrapping and jackknifing, decay index, and clade support, as well as a posteriori probability tests using Bayesian inference. Irrespective of the applied method, some traditional sedentary polychaete taxa, such as Cirratulidae, Opheliidae, Orbiniidae, Siboglinidae and Spionidae, were recovered by our phylogenetic reconstruction. A close relationship between Orbiniidae and Questa received particularly strong support. Echiura appears to be a polychaete ingroup taxon which is closely related to Dasybranchus (Capitellidae). As in previous molecular analyses, no support was found for the monophyly of Annelida nor for that of Polychaeta. However, we suggest that an increase in taxon sampling may yield additional resolution in the reconstruction of polychaete ingroup phylogeny, although the difficulties in reconstructing the basal phylogenetic relationships within Annelida may be due to their rapid radiation.
7.
Summary: Partial sequences of 18S rRNA were obtained for 2 gymnosperms and 12 angiosperms from a wide range of families, and these were analyzed with 5 other published sequences to form a phylogenetic tree. Using 16 published sequences of the large subunit of rubisco (rbcL), also from a wide range of angiosperm families, another phylogenetic tree was derived and the two approaches were compared. Both phylogenetic trees gave good grouping within families, but in neither case was there resolution of the branching order of major taxa. Superficially the long rbcL sequences (whose base composition was homogeneous among all species) seemed very promising, but analysis showed that a large proportion of the variation did not affect the amino acid sequence. Although silent substitution contained some phylogenetic information, at the level required to order major taxa much of it was random and obfuscating. It was concluded that neither macromolecule alone was likely to yield a solution to the problem of angiosperm phylogeny and therefore that studies of both, at least, will be required. For this reason, a method was described for obtaining both DNA and RNA of good quality from the same preparation, which had been used successfully with a wide range of species, including many with pungent leaves.
8.
Partial sequences (1032 bp) of the nuclear-encoded large ribosomal RNA gene (LSU) were determined for 16 gelidialean species,
and analyzed separately and in combination with plastid rbcL and nuclear SSU gene sequences. The number of informative characters
and levels of sequence divergence among taxa are intermediate in LSU sequences as compared to that for rbcL and SSU. Analyses
of the separate LSU, and a combined LSU, SSU, and rbcL data sets have identified early-diverging lineages within the Gelidiales
including Gelidiella, Pterocladia, Pterocladiella, and a lineage including Gelidium and species classified in other genera.
The relationships among most gelidialean taxa are well-resolved and well-supported by analyses of the combined data; however,
the relationships of Ptilophora and Capreolia remain unclear. It is speculated that these two lineages have diverged from
a common ancestor over an evolutionarily short period of time.
This revised version was published online in June 2006 with corrections to the Cover Date.
9.
Prokaryote phylogeny meets taxonomy: An exhaustive comparison of composition vector trees with systematic bacteriology
We perform an exhaustive, taxon by taxon, comparison of the branchings in the composition vector trees (CVTrees) inferred
from 432 prokaryotic genomes available on 31 December 2006, with the bacteriologists’ taxonomy—primarily the latest online
Outline of Bergey’s Manual of Systematic Bacteriology. The CVTree phylogeny agrees very well with Bergey’s taxonomy in the majority of fine branchings and in overall structure.
At the same time most of the differences between the trees and the Manual have been known to biologists to some extent and may hint at taxonomic revisions. Instead of demonstrating the overwhelming
agreement, this paper puts emphasis on the biological implications of the differences.
10.
Qi-Guang Chen 《Biometrical journal. Biometrische Zeitschrift》1988,30(3):351-358
The Poisson regression model for the analysis of life table and follow-up data with covariates is presented. An example shows how this technique can be used to construct a parsimonious model that describes a set of survival data. All parameters in the model, as well as the hazard and survival functions, are estimated by maximum likelihood.
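The link between Poisson models and survival estimates can be shown in the simplest case, without covariates. If the hazard is assumed constant within each life-table interval and deaths are treated as Poisson counts, the maximum likelihood estimate of the hazard in an interval is deaths divided by person-time at risk, and the survival function follows from the cumulative hazard. The sketch below covers only this covariate-free case, not the regression model of the paper. The function names and the numbers are made up for illustration.

```python
import math


def hazard_mle(deaths, person_time):
    """Per-interval hazard estimates: deaths / person-time at risk.

    This is the maximum likelihood estimate when the hazard is constant
    within each interval and deaths are Poisson distributed.
    """
    return [d / t for d, t in zip(deaths, person_time)]


def survival_curve(hazards, widths):
    """Survival probability at the end of each interval.

    S(t) = exp(-cumulative hazard), with the cumulative hazard
    summed as hazard * interval width.
    """
    cumulative = 0.0
    survival = []
    for h, w in zip(hazards, widths):
        cumulative += h * w
        survival.append(math.exp(-cumulative))
    return survival


# Made-up life table: three one-year intervals.
hazards = hazard_mle([5, 3, 4], [100.0, 60.0, 40.0])
print(hazards)
print(survival_curve(hazards, [1.0, 1.0, 1.0]))
```

With covariates, the same likelihood is usually fitted as a Poisson regression with the logarithm of person-time as an offset, which requires a statistics library and is outside this sketch.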
11.
Yoshio Tateno 《Journal of molecular evolution》1990,30(1):85-93
Summary: A new method for molecular phylogeny construction is developed. The method, called the stepwise ancestral sequence method, estimates molecular phylogenetic trees and ancestral sequences simultaneously on the basis of parsimony and sequence homology. For simplicity, the emphasis is placed more on parsimony than on sequence homology in the present study, though both are certainly important. Because parsimony alone will sometimes generate several candidate trees, the method retains not one but five candidates, from which one can then single out the final tree by taking other criteria into account. The properties and performance of the method are then examined by simulating an evolving gene along a model phylogenetic tree. The estimated trees are found to lie in a narrow range of the parsimony criteria used in the present study. Thus, other criteria such as biological evidence and likelihood are necessary to single out the correct tree among them, with biological evidence taking precedence over any other criterion. The computer simulation also reveals that the method satisfactorily estimates both tree topology and ancestral sequences, at least for the evolutionary model used in the present study.
12.
Standard methods of phylogenetic reconstruction are based on models that assume homogeneity of nucleotide composition among taxa. However, this assumption is often violated in biological data sets. In this study, we examine possible effects of nucleotide heterogeneity among lineages on the phylogenetic reconstruction of a bacterial group that spans a wide range of genomic nucleotide contents: obligately endosymbiotic bacteria and free-living or commensal species in the gamma-Proteobacteria. We focus on AT-rich primary endosymbionts to better understand the origins of obligately intracellular lifestyles. Previous phylogenetic analyses of this bacterial group point to the importance of accounting for base compositional variation in estimating relationships, particularly between endosymbiotic and free-living taxa. Here, we develop an approach to compare susceptibility of various phylogenetic reconstruction methods to the effects of nucleotide heterogeneity. First, we identify candidate trees of gamma-Proteobacteria groEL and 16S rRNA using approaches that assume homogeneous and stationary base composition, including Bayesian, maximum likelihood, parsimony, and distance methods. We then create permutations of the resulting candidate trees by varying the placement of the AT-rich endosymbiont Buchnera. These permutations are evaluated under the nonhomogeneous and nonstationary maximum likelihood model of Galtier and Gouy, which allows equilibrium base content to vary among examined lineages. Our results show that commonly used phylogenetic methods produce incongruent trees of the Enterobacteriales, and that the placement of Buchnera is especially unstable. However, under a nonhomogeneous model, various groEL and 16S rRNA phylogenies that separate Buchnera from other AT-rich endosymbionts (Blochmannia and Wigglesworthia) have consistently and significantly higher likelihood scores. 
Blochmannia and Wigglesworthia appear to have evolved from secondary endosymbionts, and represent an origin of primary endosymbiosis that is independent from Buchnera. This application of a nonhomogeneous model offers a computationally feasible way to test specific phylogenetic hypotheses for taxa with heterogeneous and nonstationary base composition.
13.
Secondary structure analysis of 34 internal transcribed spacer 2 (ITS-2) sequences showed that the current model for the green algae Scenedesmus and Desmodesmus is not accurate. In particular, helix I of the currently used model showed considerable deviations from our new model. The newly proposed model is supported by many two-sided compensated base pair changes and fully compensated insertions in all four helices. Phylogenetic analysis by maximum parsimony based on the new alignment confirmed the recent division of the old genus Scenedesmus into the new genera Scenedesmus and Desmodesmus. However, the analysis was not able to show phylogenetic relationships within these two genera. Hence, the ITS-2 region alone is not suitable for clarifying the phylogeny of Scenedesmus and Desmodesmus and new regions have to be found for future sequence analyses.
14.
Myzostomids are minute, soft-bodied, marine worms associated with echinoderms since the Carboniferous. Due to their long history
as host-specific symbionts, they have acquired a highly derived body plan that obscures their phylogenetic affinities to other
metazoans. Because certain organs are serially arranged, a closer relationship between polychaetes and myzostomids has repeatedly
been discussed. We present here a review of the ultrastructure of myzostomids, together with the most recent analyses that concern
their phylogenetic position. The ultrastructure of the integument, digestive system, excretory system and nervous system is
summarized. Unpublished information on the gametogenesis and reproductive systems of myzostomids is also presented, with a view
to their reproductive process.
15.
Qingle Cai Xiaoju Qian Yongshan Lang Yadan Luo Jiaohui Xu Shengkai Pan Yuanyuan Hui Caiyun Gou Yue Cai Meirong Hao Jinyang Zhao Songbo Wang Zhaobao Wang Xinming Zhang Rongjun He Jinchao Liu Longhai Luo Yingrui Li Jun Wang 《Genome biology》2013,14(3):R29
Background
The mechanism of high-altitude adaptation has been studied in certain mammals. However, in avian species like the ground tit Pseudopodoces humilis, the adaptation mechanism remains unclear. The phylogeny of the ground tit is also controversial.
Results
Using next generation sequencing technology, we generated and assembled a draft genome sequence of the ground tit. The assembly contained 1.04 Gb of sequence that covered 95.4% of the whole genome and had higher N50 values, at the level of both scaffolds and contigs, than other sequenced avian genomes. About 1.7 million SNPs were detected, 16,998 protein-coding genes were predicted and 7% of the genome was identified as repeat sequences. Comparisons between the ground tit genome and other avian genomes revealed a conserved genome structure and confirmed the phylogeny of ground tit as not belonging to the Corvidae family. Gene family expansion and positively selected gene analysis revealed genes that were related to cardiac function. Our findings contribute to our understanding of the adaptation of this species to extreme environmental living conditions.
Conclusions
Our data and analysis contribute to the study of avian evolutionary history and provide new insights into the adaptation mechanisms to extreme conditions in animals.
16.
An essentially new method to relate a number of taxa on the basis of a predefined set of dichotomous properties (i.e. either present or not present) is described. The basic step of the analysis is the derivation of a sophisticated distance measure to describe the pairwise dissimilarities quantitatively on the basis of the individual properties. The presentation of the dissimilarity matrix by a tree-like structure is an obvious step implied by the distance measure and is related to the widely used method of successive joining of nearest neighbors with respect to the distances. The distance measure makes no use of stochastic or other mathematical models of evolutionary processes and can be interpreted best in terms of discrete information theory.
17.
Protein-protein interactions are fundamentally important in many biological processes, and there is a pressing need to understand their principles. Mutagenesis studies have found that only a small fraction of surface residues, known as hot spots, are responsible for the physical binding in protein complexes. However, revealing hot spots by mutagenesis experiments is usually time-consuming and expensive. To complement the experimental efforts, we propose a new computational approach to predict hot spots. Our method, Rough Set-based Multiple Criteria Linear Programming (RS-MCLP), integrates rough sets theory and multiple criteria linear programming to choose dominant features and computationally predict hot spots. Our approach is benchmarked on a dataset of 904 alanine-mutated residues, and the results show that our RS-MCLP method performs better than other methods, e.g., MCLP, Decision Tree, Bayes Net, and the existing HotSprint database. In addition, we reveal several biological insights based on our analysis. We find that four features (the change of accessible surface area, percentage of the change of accessible surface area, size of a residue, and atomic contacts) are critical in predicting hot spots. Furthermore, by analyzing the distribution of amino acids, we find that three residues (Tyr, Trp, and Phe) are abundant in hot spots.
18.
Omar Rota-Stabelli Lahcen Campbell Henner Brinkmann Gregory D. Edgecombe Stuart J. Longhorn Kevin J. Peterson Davide Pisani Hervé Philippe Maximilian J. Telford 《Proceedings. Biological sciences / The Royal Society》2011,278(1703):298-306
While a unique origin of the euarthropods is well established, relationships between the four euarthropod classes—chelicerates, myriapods, crustaceans and hexapods—are less clear. Unsolved questions include the position of myriapods, the monophyletic origin of chelicerates, and the validity of the close relationship of euarthropods to tardigrades and onychophorans. Morphology predicts that myriapods, insects and crustaceans form a monophyletic group, the Mandibulata, which has been contradicted by many molecular studies that support an alternative Myriochelata hypothesis (Myriapoda plus Chelicerata). Because of the conflicting insights from published molecular datasets, evidence from nuclear-coding genes needs corroboration from independent data to define the relationships among major nodes in the euarthropod tree. Here, we address this issue by analysing two independent molecular datasets: a phylogenomic dataset of 198 protein-coding genes including new sequences for myriapods, and novel microRNA complements sampled from all major arthropod lineages. Our phylogenomic analyses strongly support Mandibulata, and show that Myriochelata is a tree-reconstruction artefact caused by saturation and long-branch attraction. The analysis of the microRNA dataset corroborates the Mandibulata, showing that the microRNAs miR-965 and miR-282 are present and expressed in all mandibulate species sampled, but not in the chelicerates. Mandibulata is further supported by the phylogenetic analysis of a comprehensive morphological dataset covering living and fossil arthropods, and including recently proposed, putative apomorphies of Myriochelata. 
Our phylogenomic analyses also provide strong support for the inclusion of pycnogonids in a monophyletic Chelicerata, a paraphyletic Cycloneuralia, and a common origin of Arthropoda (tardigrades, onychophorans and arthropods), suggesting that previous phylogenies grouping tardigrades and nematodes may also have been subject to tree-reconstruction artefacts.
19.
20.
Crop productivity models and their application  Total citations: 12, self-citations: 1, citations by others: 12
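From the perspective of the agro-ecological environment, this paper describes the background against which crop productivity models emerged and discusses the four stages of their development: infancy, juvenile, youth and maturity. It describes the role of crop productivity models in protecting the agro-ecological environment in terms of scientific research, crop management and agricultural decision analysis. The main shortcomings of crop productivity models are then discussed: simple models adapt poorly across regions, while complex models also adapt poorly because their parameters are hard to obtain and because the formats of the basic data differ between study regions. The paper therefore proposes establishing a general, unified data format so that crop productivity models can be readily extended to different regions. Finally, to address the generally weak adaptability of crop productivity models, it examines the integration of crop productivity models with geographic information systems, reviews current work on making model interfaces more user-friendly, and suggests that building a general interface for crop productivity models should be the focus of future development.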