Malin is a software package for the analysis of eukaryotic gene structure evolution. It provides a graphical user interface for various tasks commonly used to infer the evolution of exon-intron structure in protein-coding orthologs. Implemented tasks include the identification of conserved homologous intron sites in protein alignments, as well as the estimation of ancestral intron content, lineage-specific intron losses and gains. Estimates are computed either with parsimony, or with a probabilistic model that incorporates rate variation across lineages and intron sites. Availability: Malin is available as a stand-alone Java application, as well as an application bundle for Mac OS X, at the website http://www.iro.umontreal.ca/~csuros/introns/malin/. The software is distributed under a BSD-style license.

Wildlife populations consist of individuals that contribute disproportionately to growth and viability. Understanding a population's spatial and temporal dynamics requires estimates of abundance and demographic rates that account for this heterogeneity. Estimating these quantities can be difficult, requiring years of intensive data collection. Often, this is accomplished through the capture and recapture of individual animals, which is generally only feasible at a limited number of locations. In contrast, N‐mixture models allow for the estimation of abundance, and spatial variation in abundance, from count data alone. We extend recently developed multistate, open population N‐mixture models, which can additionally estimate demographic rates based on an organism's life history characteristics. In our extension, we develop an approach to account for the case where not all individuals can be assigned to a state during sampling. Using only state‐specific count data, we show how our model can be used to estimate local population abundance, as well as density‐dependent recruitment rates and state‐specific survival. We apply our model to a population of black‐throated blue warblers (Setophaga caerulescens) that have been surveyed for 25 years on their breeding grounds at the Hubbard Brook Experimental Forest in New Hampshire, USA. The intensive data collection efforts allow us to compare our estimates to estimates derived from capture–recapture data. Our model performed well in estimating population abundance and density‐dependent rates of annual recruitment/immigration. Estimates of local carrying capacity and per capita recruitment of yearlings were consistent with those published in other studies. However, our model moderately underestimated annual survival probability of yearling and adult females and severely underestimated survival probabilities for males of both of these stages. The most accurate and precise estimates will necessarily require some amount of intensive data collection efforts (such as capture–recapture). Integrated population models that combine data from both intensive and extensive sources are likely to be the most efficient approach for estimating demographic rates at large spatial and temporal scales.
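
As a rough illustration of how an N-mixture model extracts abundance from counts alone, the sketch below evaluates the likelihood of the basic closed-population binomial N-mixture model. It is not the multistate, open-population extension described in the abstract. It assumes that abundance N at each site is Poisson with mean lam, that each repeated count is Binomial(N, p), and that the unobserved N can be summed out up to a truncation bound n_max. All function names and the example counts are hypothetical.

```python
from math import comb, exp, lgamma, log


def poisson_pmf(n, lam):
    """Poisson probability of abundance n given mean lam."""
    return exp(n * log(lam) - lam - lgamma(n + 1))


def binom_pmf(y, n, p):
    """Probability of counting y animals out of n present, detection p."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


def site_loglik(counts, lam, p, n_max=200):
    """Log-likelihood of repeated counts at one site.

    The latent abundance N is marginalised by summing over every value
    from the largest observed count up to n_max (an assumed bound).
    """
    total = 0.0
    for n in range(max(counts), n_max + 1):
        term = poisson_pmf(n, lam)
        for y in counts:
            term *= binom_pmf(y, n, p)
        total += term
    return log(total)


def loglik(sites, lam, p, n_max=200):
    """Sum of site log-likelihoods, assuming independent sites."""
    return sum(site_loglik(counts, lam, p, n_max) for counts in sites)


# Hypothetical data: two sites, three repeated counts each.
example_sites = [[5, 4, 6], [3, 5, 4]]
ll_plausible = loglik(example_sites, lam=10.0, p=0.5)
ll_implausible = loglik(example_sites, lam=1.0, p=0.5)
```

In practice the log-likelihood would be handed to a numerical optimiser to estimate lam and p; comparing two candidate parameter sets, as above, only shows that the counts carry information about abundance.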

Complementary developments in comparative genomics, protein structure determination and in-depth comparison of protein sequences and structures have provided a better understanding of the prevailing trends in the emergence and diversification of protein domains. The investigation of deep relationships among different classes of proteins involved in key cellular functions, such as nucleic acid polymerases and other nucleotide-dependent enzymes, indicates that a substantial set of diverse protein domains evolved within the primordial, ribozyme-dominated RNA world.

The genus Phrynosoma includes 13 species of North American lizards characterized by unique and highly derived morphologies and ecologies. Understanding interspecific relationships within this genus is essential for testing hypotheses about character evolution in this group. We analyzed mitochondrial ND4 and cytochrome b gene sequence data from all species of Phrynosoma in conjunction with a previously published dataset including 12S and 16S rRNA gene sequences and morphological characters. We used multiple phylogenetic methods and diagnostic tests for data combinability and taxonomic congruence to investigate the data in separate and combined analyses. Separate data partitions resulted in several well-supported lineages, but taxonomic congruence was lacking between topologies from separate and combined analyses. Partitioned Bremer support analyses also reveal conflict between data partitions in certain tree regions. When taxa associated with well-supported clades were removed from analyses, phylogenetic signal was lost. Combined, our results initially suggest conflict between data partitions, but further tests show the data are only appropriate for phylogenetic reconstruction of those parts of the topology that were well resolved. Nonetheless, our data analyses reveal five well-supported clades: (1) Phrynosoma ditmarsi and Phrynosoma hernandesi, (2) P. ditmarsi, P. hernandesi, and Phrynosoma douglasii, (3) P. ditmarsi, P. hernandesi, P. douglasii, and Phrynosoma orbiculare, (4) Phrynosoma mcallii and Phrynosoma platyrhinos, and (5) Phrynosoma braconnieri and Phrynosoma taurus.

SUMMARY: TREE-PUZZLE is a program package for quartet-based maximum-likelihood phylogenetic analysis (formerly PUZZLE, Strimmer and von Haeseler, Mol. Biol. Evol., 13, 964-969, 1996) that provides methods for reconstruction, comparison, and testing of trees and models on DNA as well as protein sequences. To reduce waiting time for larger datasets, the tree reconstruction part of the software has been parallelized using message passing that runs on clusters of workstations as well as parallel computers. AVAILABILITY: http://www.tree-puzzle.de. The program is written in ANSI C. TREE-PUZZLE can be run on UNIX, Windows and Mac systems, including Mac OS X. To run the parallel version of PUZZLE, a Message Passing Interface (MPI) library has to be installed on the system. Free MPI implementations are available on the Web (cf. http://www.lam-mpi.org/mpi/implementations/).

Protein engineers can alter the properties of enzymes by directing their evolution in vitro. Many methods to generate molecular diversity and to identify improved clones have been developed, but experimental evolution remains as much an art as a science. We previously used DNA shuffling (sexual recombination) and a histochemical screen to direct the evolution of Escherichia coli beta-glucuronidase (GUS) variants with improved beta-galactosidase (BGAL) activity. Here, we employ the same model evolutionary system to test the efficiencies of several other techniques: recursive random mutagenesis (asexual), combinatorial cassette mutagenesis (high-frequency recombination) and a versatile high-throughput microplate screen. GUS variants with altered specificity evolved in each trial, but different combinations of mutagenesis and screening techniques effected the fixation of different beneficial mutations. The new microplate screen identified a broader set of mutations than the previously employed X-gal colony screen. Recursive random mutagenesis produced essentially asexual populations, within which beneficial mutations drove each other into extinction (clonal interference); DNA shuffling and combinatorial cassette mutagenesis led instead to the accumulation of beneficial mutations within a single allele. These results explain why recombinational approaches generally increase the efficiency of laboratory evolution.

We present a new likelihood method for detecting constrained evolution at synonymous sites and other forms of nonneutral evolution in putative pseudogenes. The model is applicable whenever the DNA sequence is available from a protein-coding functional gene, a pseudogene derived from the protein-coding gene, and an orthologous functional copy of the gene. Two nested likelihood ratio tests are developed to test the hypotheses that (1) the putative pseudogene has equal rates of silent and replacement substitutions; and (2) the rate of synonymous substitution in the functional gene equals the rate of substitution in the pseudogene. The method is applied to a data set containing 74 human processed-pseudogene loci, 25 mouse processed-pseudogene loci, and 22 rat processed-pseudogene loci. Using the informatics resources of the Human Genome Project, we localized 67 of the human-pseudogene pairs in the genome and estimated the GC content of a large surrounding genomic region for each. We find that, for pseudogenes deposited in GC regions similar to those of their paralogs, the assumption of equal rates of silent and replacement site evolution in the pseudogene is upheld; in these cases, the rate of silent site evolution in the functional genes is approximately 70% of the rate of evolution in the pseudogene. On the other hand, for pseudogenes located in genomic regions of much lower GC than their functional gene, we see a sharp increase in the rate of silent site substitutions, leading to a large rate of rejection for the pseudogene equality likelihood ratio test.
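
The mechanics of a nested likelihood ratio test of the kind mentioned above can be sketched generically. The example below is not the authors' pseudogene model: it only shows how two maximised log-likelihoods are turned into a test statistic and a p-value, under the assumption that the null model fixes exactly one parameter of the alternative (one degree of freedom). The function name and the log-likelihood values are hypothetical.

```python
from math import erfc, sqrt


def lrt_one_df(loglik_null, loglik_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns the statistic 2 * (lnL_alt - lnL_null) and its p-value under
    a chi-square distribution with 1 degree of freedom, whose upper tail
    is erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (loglik_alt - loglik_null), 0.0)
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical maximised log-likelihoods for one locus:
# null = equal silent and replacement rates, alt = rates free to differ.
stat, p_value = lrt_one_df(loglik_null=-1234.60, loglik_alt=-1231.10)
reject_equal_rates = p_value < 0.05
```

A test with more than one constrained parameter would need the chi-square tail for the matching degrees of freedom, which the closed form used here does not cover.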

MOTIVATION: TipDate is a program that will use sequences that have been isolated at different dates to estimate their rate of molecular evolution. The program provides a maximum likelihood estimate of the rate and also the associated date of the most recent common ancestor of the sequences, under a model which assumes a constant rate of substitution (molecular clock) but which accommodates the dates of isolation. Confidence intervals for these parameters are also estimated. RESULTS: The approach was applied to a sample of 17 dengue virus serotype 4 sequences, isolated at dates ranging from 1956 to 1994. The rate of substitution for this serotype was estimated to be 7.91 × 10^-4 substitutions per site per year (95% confidence interval 6.07 × 10^-4 to 9.86 × 10^-4). This is compatible with a date of 1922 (95% confidence interval 1900-1936) for the most recent common ancestor of these sequences. AVAILABILITY: TipDate can be obtained by WWW from http://evolve.zoo.ox.ac.uk/software. The package includes the source code, manual and example files. Both UNIX and Apple Macintosh versions are available from the same site.
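
The idea that dated tips carry information about the substitution rate can be illustrated with a much simpler heuristic than the maximum likelihood method in TipDate: an ordinary least-squares regression of root-to-tip distance on isolation date. Under a strict clock the slope approximates the rate and the x-intercept approximates the date of the most recent common ancestor. This is a stand-in technique, not TipDate's algorithm, and the function name and data below are synthetic and hypothetical.

```python
def fit_clock(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date.

    Returns (rate, root_date): the slope in substitutions per site per
    year and the date at which the fitted distance reaches zero.
    """
    n = len(dates)
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    rate = sxy / sxx
    intercept = mean_d - rate * mean_t
    root_date = -intercept / rate
    return rate, root_date


# Synthetic, noise-free data generated with rate 8e-4 and root date 1920.
sample_dates = [1956, 1965, 1975, 1985, 1994]
sample_distances = [8e-4 * (year - 1920) for year in sample_dates]
rate, root_date = fit_clock(sample_dates, sample_distances)
```

The regression treats tips as independent and ignores shared branches, so it gives no valid confidence intervals; that limitation is one reason a likelihood model on the tree is preferable.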

We have developed a new tool, called fastDNAml, for constructing phylogenetic trees from DNA sequences. The program can be run on a wide variety of computers ranging from Unix workstations to massively parallel systems, and is available from the Ribosomal Database Project (RDP) by anonymous FTP. Our program uses a maximum likelihood approach and is based on version 3.3 of Felsenstein's dnaml program. Several enhancements, including algorithmic changes, significantly improve performance and reduce memory usage, making it feasible to construct even very large trees. Trees containing 40–100 taxa have been easily generated, and phylogenetic estimates are possible even when hundreds of sequences exist. We are currently using the tool to construct a phylogenetic tree based on 473 small subunit rRNA sequences from prokaryotes.

Summary: Methods of classical segregation analysis were applied to a sample of 129 sibships with one or more individuals affected by neurofibromatosis-1 (NF-1). The sample consists only of subjects with NF-1; all the probands had been referred for genetic counselling because of café-au-lait spots, and a diagnostic protocol was invariably applied. No deviation from the segregation ratio expected for a fully penetrant Mendelian dominant gene was observed. A maximum likelihood estimate of the proportion of sporadic cases was obtained, and the mutation rate was estimated to be 6.5 × 10^-5 per gamete per generation (95% CI 5.0–8.1 × 10^-5).

Stewart WC, Thompson EA. Biometrics 2006, 62(3):728-734
As a result of previous large, multipoint linkage studies there is a substantial amount of existing marker data. Due to the increased sample size, genetic maps estimated from these data could be more accurate than publicly available maps. However, current methods for map estimation are restricted to data sets containing pedigrees with a small number of individuals, or cannot make full use of marker data that are observed at several loci on members of large, extended pedigrees. In this article, a maximum likelihood (ML) method for map estimation that can make full use of the marker data in a large, multipoint linkage study is described. The method is applied to replicate sets of simulated marker data involving seven linked loci, and pedigree structures based on the real multipoint linkage study of Abkevich et al. (2003, American Journal of Human Genetics 73, 1271-1281). The variance of the ML estimate is accurately estimated, and tests of both simple and composite null hypotheses are performed. An efficient procedure for combining map estimates over data sets is also suggested.

We introduce a new approach to estimate the evolutionary distance between two sequences. This approach uses a tree with three leaves: two of them correspond to the studied sequences, whereas the third is chosen to handle long-distance estimation. The branch lengths of this tree are obtained by likelihood maximization and are then used to deduce the desired distance. This approach, called TripleML, improves the precision of evolutionary distance estimates, and thus the topological accuracy of distance-based methods. TripleML can be used with neighbor-joining-like (NJ-like) methods not only to compute the initial distance matrix but also to estimate new distances encountered during the agglomeration process. Computer simulations indicate that using TripleML significantly improves the topological accuracy of NJ, BioNJ, and Weighbor, while conserving a reasonable computation time. With randomly generated 24-taxon trees and realistic parameter values, combining NJ with TripleML reduces the number of wrongly inferred branches by about 11% (against 2.6% and 5.5% for BioNJ and Weighbor, respectively). Moreover, this combination requires only about 1.5 min to infer a phylogeny of 96 sequences composed of 1,200 nucleotides, as compared with 6.5 h for FastDNAml on the same machine (PC 466 MHz).

Comparative analysis of molecular sequence data is essential for reconstructing the evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven to make it easier to use for both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net.

The Thoracica includes the ordinary barnacles found along the sea shore and is the most diverse and well-studied superorder of Cirripedia. However, although the literature abounds with scenarios explaining the evolution of these barnacles, very few studies have attempted to test these hypotheses in a phylogenetic context. The few attempts at phylogenetic analyses have suffered from a lack of phylogenetic signal and small numbers of taxa. We collected DNA sequences from the nuclear 18S, 28S, and histone H3 genes and the mitochondrial 12S and 16S genes (4,871 bp total) and data for 37 adult and 53 larval morphological characters from 43 taxa representing all the extant thoracican suborders (except the monospecific Brachylepadomorpha). Four Rhizocephala (highly modified parasitic barnacles) taxa and a Rhizocephala + Acrothoracica (burrowing barnacles) hypothetical ancestor were used as the outgroup for the molecular and morphological analyses, respectively. We analyzed these data separately and combined using maximum likelihood (ML) under "hill-climbing" and genetic algorithm heuristic searches, maximum parsimony procedures, and Bayesian inference coupled with Markov chain Monte Carlo techniques under mixed and homogeneous models of nucleotide substitution. The resulting phylogenetic trees answered key questions in barnacle evolution. The four-plated Iblomorpha were shown as the most primitive thoracican, and the plateless Heteralepadomorpha were placed as the sister group of the Lepadomorpha. These relationships suggest for the first time in an invertebrate that exoskeleton biomineralization may have evolved from phosphatic to calcitic. Sessilia (nonpedunculate) barnacles were depicted as monophyletic and appear to have evolved from a stalked (pedunculate) multiplated (5+) scalpelloid-like ancestor rather than a five-plated lepadomorphan ancestor. The Balanomorpha (symmetric sessile barnacles) appear to have the following relationship: (Chthamaloidea(Coronuloidea(Tetraclitoidea, Balanoidea))). Thoracican divergence times were estimated under ML-based local clock, Bayesian, and penalized likelihood approaches using an 18S data set and three calibration points: Heteralepadomorpha = 530 million years ago (MYA), Scalpellomorpha = 340 MYA, and Verrucomorpha = 120 MYA. Estimated dates varied considerably within and between approaches depending on the calibration point. Highly parameterized local clock models that assume independent rates (r ≥ 15) for confamilial or congeneric species generated the most congruent estimates among calibrations and agreed more closely with the barnacle fossil record. Reasonable estimates were also obtained under the Bayesian procedure of Kishino et al. (2001, Mol. Biol. Evol. 18:352-361) but using multiple calibrations. Most of the dates estimated under the Bayesian procedure of Aris-Brosou and Yang (2002, Syst. Biol. 51:703-714) and the penalized likelihood method using single and/or multiple calibrations were inconsistent among calibrations and did not fit the fossil record.

Dominance hierarchies have been widely used for describing the outcome of competitive interactions in an animal group. We present a procedure for estimating the linear dominance hierarchy. The procedure uses the statistical method of paired comparisons, assuming weak stochastic transitivity to model interactions within a linear dominance hierarchy. The linear dominance hierarchy is estimated using a maximum likelihood ranking procedure. This method allows unequal numbers of encounters between pairs and does not require all pairs to have observed encounters. The method is illustrated by application to behavioural data from a group of 10 baboons (Papio cynocephalus anubis).
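
One way to read "maximum likelihood ranking under weak stochastic transitivity" is sketched below. The abstract does not spell out the authors' procedure, so this is an assumption: for a candidate linear order, each higher-ranked animal is required to beat each lower-ranked one with probability at least 1/2, the best such probability for a pair is max(0.5, observed win fraction), pairs with no encounters contribute nothing, and the order with the highest total log-likelihood is chosen by brute force. The function names and encounter counts are hypothetical, and exhaustive search is only practical for small groups.

```python
from itertools import permutations
from math import log


def pair_loglik(wins_hi, wins_lo):
    """Best binomial log-likelihood for one pair, given that the
    higher-ranked animal must win with probability >= 0.5."""
    n = wins_hi + wins_lo
    if n == 0:
        return 0.0  # unobserved pairs carry no information
    p = max(0.5, wins_hi / n)
    ll = 0.0
    if wins_hi:
        ll += wins_hi * log(p)
    if wins_lo:  # p < 1 whenever wins_lo > 0, so log(1 - p) is defined
        ll += wins_lo * log(1.0 - p)
    return ll


def ml_ranking(individuals, wins):
    """Exhaustive search for the linear order with the highest likelihood.

    wins[(a, b)] is the number of encounters in which a defeated b.
    Ties between orders are broken by enumeration order.
    """
    best_order, best_ll = None, float("-inf")
    for order in permutations(individuals):
        ll = 0.0
        for i, hi in enumerate(order):
            for lo in order[i + 1:]:
                ll += pair_loglik(wins.get((hi, lo), 0), wins.get((lo, hi), 0))
        if ll > best_ll:
            best_order, best_ll = order, ll
    return best_order, best_ll


# Hypothetical encounters; the pair (A, D) was never observed together.
encounters = {
    ("A", "B"): 5, ("B", "A"): 1,
    ("B", "C"): 4,
    ("A", "C"): 3,
    ("C", "D"): 6, ("D", "C"): 2,
    ("B", "D"): 2,
}
order, order_ll = ml_ranking(["A", "B", "C", "D"], encounters)
```

Because the number of orders grows factorially, a group of 10 animals as in the abstract would already need about 3.6 million evaluations, so a real implementation would likely use a smarter search.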

Most phylogeographic studies have used maximum likelihood or maximum parsimony to infer phylogeny and bootstrap analysis to evaluate support for trees. Recently, Bayesian methods using Markov chain Monte Carlo to search tree space and simultaneously estimate tree support have become popular due to their fast search speed and ability to create a posterior distribution of parameters of interest. Here, I present a study that utilizes Bayesian methods to infer phylogenetic relationships of the cornsnake (Elaphe guttata) complex using cytochrome b sequences. Examination of the posterior probability distributions confirms the existence of three geographic lineages. Additionally, there is no support for the monophyly of the subspecies of E. guttata. Results suggest the three geographic lineages partially conform to the ranges of previously defined subspecies, although Shimodaira-Hasegawa tests suggest that subspecies-constrained trees produce significantly poorer likelihood estimates than the most likely trees reflecting the evolution of three geographic assemblages. Based on molecular support, these three geographic assemblages are recognized as species using evolutionary species criteria: E. guttata, Elaphe slowinskii, and Elaphe emoryi [phylogeographic, maximum likelihood, maximum parsimony, bootstrap, Bayesian, Markov chain Monte Carlo, cornsnake, cytochrome b, geographic lineages, E. guttata, E. slowinskii, and E. emoryi].

Evolution of proteins is generally modeled as a Markov process acting on each site of the sequence. Replacement frequencies need to be estimated based on sequence alignments. Here we compare three approaches: first, the original method by Dayhoff, Schwartz, and Orcutt (1978) Atlas Protein Seq. Struc. 5:345-352; second, the resolvent method (RV) by Müller and Vingron (2000) J. Comput. Biol. 7(6):761-776; and third, a maximum likelihood approach (ML) developed in this paper. We evaluate the methods using a highly divergent and inhomogeneous set of sequence alignments as an input to the estimation procedure. ML is the method of choice for small sets of input data. Although the RV method is computationally much less demanding, it performs only slightly worse than ML. Therefore, it is perfectly appropriate for large-scale applications.
