Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Detecting Isolation by Distance Using Phylogenies of Genes   Total citations: 9, self-citations: 3, citations by others: 9
M. Slatkin  W. P. Maddison 《Genetics》1990,126(1):249-260
We introduce a method for analyzing phylogenies of genes sampled from a geographically structured population. A parsimony method can be used to compute s, the minimum number of migration events between pairs of populations sampled, and the value of s can be used to estimate the effective migration rate M, the value of Nm in an island model with local populations of size N and a migration rate m that would yield the same value of s. Extensive simulations show that there is a simple relationship between M and the geographic distance between pairs of samples in one- and two-dimensional models of isolation by distance. Both stepping-stone and lattice models were simulated. If two demes k steps apart are sampled, then s̄, the average value of s, is a function only of k/(Nm) in a one-dimensional model and is a function only of k/(Nm)^2 in a two-dimensional model. Furthermore, log(M) is approximately a linear function of log(k). In a one-dimensional model, the regression coefficient is approximately -1 and in a two-dimensional model the regression coefficient is approximately -0.5. Using data from several locations, the regression of log(M) on log(distance) may indicate whether there is isolation by distance in a population at equilibrium and may allow an estimate of the effective migration rate between adjacent sampling locations. Alternative methods for analyzing DNA sequence data from a geographically structured population are discussed. An application of our method to the data of R. L. Cann, M. Stoneking and A. C. Wilson on human mitochondrial DNA is presented.
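The parsimony count of migration events described in this abstract can be illustrated with a Fitch-style pass over a gene tree. The following is only a minimal sketch under stated assumptions: the tree is a rooted binary tree encoded as nested 2-tuples, each sampled gene is labeled with its sampling location, and the function name `min_migration_events` is hypothetical. The step from s to the effective migration rate M relies on the authors' simulations and is not reproduced here.

```python
def min_migration_events(tree, location):
    """Return s, the minimum number of location changes on a rooted binary tree.

    tree     -- nested 2-tuples with leaf names (strings) at the tips
    location -- dict mapping each leaf name to its sampling location
    (Both encodings are illustrative assumptions, not taken from the paper.)
    """
    def fitch(node):
        # A leaf carries exactly one possible state and costs nothing.
        if isinstance(node, str):
            return {location[node]}, 0
        left, right = node
        left_states, left_cost = fitch(left)
        right_states, right_cost = fitch(right)
        shared = left_states & right_states
        if shared:
            # The children agree on at least one location: no extra event.
            return shared, left_cost + right_cost
        # The children disagree: one migration event is required here.
        return left_states | right_states, left_cost + right_cost + 1

    return fitch(tree)[1]


# Two genes from location A nested among three genes from location B.
tree = ((("a1", "a2"), "b1"), ("b2", "b3"))
location = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "b3": "B"}
s = min_migration_events(tree, location)
```

In this example the two A genes form a single clade, so one migration event is enough to explain the tree. A tree in which A and B genes are interleaved would give a larger s, which corresponds to a higher inferred migration rate.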

2.
Morphological integration describes the degree to which sets of organismal traits covary with one another. Morphological covariation may be evaluated at various levels of biological organization, but when characterizing such patterns across species at the macroevolutionary level, phylogeny must be taken into account. We outline an analytical procedure based on the evolutionary covariance matrix that allows species-level patterns of morphological integration among structures defined by sets of traits to be evaluated while accounting for the phylogenetic relationships among taxa, providing a flexible and robust complement to related approaches based on phylogenetically independent contrasts. Using computer simulations under a Brownian motion model we show that statistical tests based on the approach display appropriate Type I error rates and high statistical power for detecting known levels of integration, and these trends remain consistent for simulations using different numbers of species, and for simulations that differ in the number of trait dimensions. Thus, our procedure provides a useful means of testing hypotheses of morphological integration in a phylogenetic context. We illustrate the utility of this approach by evaluating evolutionary patterns of morphological integration in head shape for a lineage of Plethodon salamanders, and find significant integration between cranial shape and mandible shape. Finally, computer code written in R for implementing the procedure is provided.

3.
4.
When aligning RNAs, it is important to consider both the secondary structure similarity and primary sequence similarity to find an accurate alignment. However, algorithms that can handle RNA secondary structures typically have high computational complexity that limits their utility. For this reason, there have been a number of attempts to find useful alignment constraints that can reduce the computations without sacrificing the alignment accuracy. In this paper, we propose a new method for finding effective alignment constraints for fast and accurate structural alignment of RNAs, including pseudoknots. In the proposed method, we use a profile-HMM to identify the “seed” regions that can be aligned with high confidence. We also estimate the position range of the aligned bases that are located outside the seed regions. The location of the seed regions and the estimated range of the alignment positions are then used to establish the sequence alignment constraints. We incorporated the proposed constraints into the profile context-sensitive HMM (profile-csHMM) based RNA structural alignment algorithm. Experiments indicate that the proposed method can make the alignment speed up to 11 times faster without degrading the accuracy of the RNA alignment.

5.
RNA molecules are important cellular components involved in many fundamental biological processes. Understanding the mechanisms behind their functions requires knowledge of their tertiary structures. Though computational RNA folding approaches exist, they often require manual manipulation and expert intuition; predicting global long-range tertiary contacts remains challenging. Here we develop a computational approach and associated program module (RNAJAG) to predict helical arrangements/topologies in RNA junctions. Our method has two components: junction topology prediction and graph modeling. First, junction topologies are determined by a data mining approach from a given secondary structure of the target RNAs; second, the predicted topology is used to construct a tree graph consistent with geometric preferences analyzed from solved RNAs. The predicted graphs, which model the helical arrangements of RNA junctions for a large set of 200 junctions using a cross validation procedure, yield fairly good representations compared to the helical configurations in native RNAs, and can be further used to develop all-atom models as we show for two examples. Because junctions are among the most complex structural elements in RNA, this work advances folding structure prediction methods of large RNAs. The RNAJAG module is available to academic users upon request.

6.
When phylogenetic trees constructed from morphological and molecular evidence disagree (i.e. are incongruent) it has been suggested that the differences are spurious or that the molecular results should be preferred a priori. Comparing trees can increase confidence (congruence), or demonstrate that at least one tree is incorrect (incongruence). Statistical analyses of 181 molecular and 49 morphological trees show that incongruence is greater between than within the morphological and molecular partitions, and this difference is significant for the molecular partition. Because the level of incongruence between a pair of trees gives a minimum bound on how much error is present in the two trees, our results indicate that the level of error may be underestimated by congruence within partitions. Thus comparisons between morphological and molecular trees are particularly useful for detecting this incongruence (spurious or otherwise). Molecular trees have higher average congruence than morphological trees, but the difference is not significant, and both within- and between-partition incongruence is much lower than expected by chance alone. Our results suggest that both molecular and morphological trees are, in general, useful approximations of a common underlying phylogeny and thus, when molecules and morphology clash, molecular phylogenies should not be considered more reliable a priori.

7.
Few sequence alignment methods have been designed specifically for integral membrane proteins, even though these important proteins have distinct evolutionary and structural properties that might affect their alignments. Existing approaches typically consider membrane-related information either by using membrane-specific substitution matrices or by assigning distinct penalties for gap creation in transmembrane and non-transmembrane regions. Here, we ask whether favoring matching of predicted transmembrane segments within a standard dynamic programming algorithm can improve the accuracy of pairwise membrane protein sequence alignments. We tested various strategies using a specifically designed program called AlignMe. An updated set of homologous membrane protein structures, called HOMEP2, was used as a reference for optimizing the gap penalties. The best of the membrane-protein optimized approaches were then tested on an independent reference set of membrane protein sequence alignments from the BAliBASE collection. When secondary structure (S) matching was combined with evolutionary information (using a position-specific substitution matrix (P)), in an approach we called AlignMePS, the resultant pairwise alignments were typically among the most accurate over a broad range of sequence similarities when compared to available methods. Matching transmembrane predictions (T), in addition to evolutionary information, and secondary-structure predictions, in an approach called AlignMePST, generally reduces the accuracy of the alignments of closely-related proteins in the BAliBASE set relative to AlignMePS, but may be useful in cases of extremely distantly related proteins for which sequence information is less informative. The open source AlignMe code is available at https://sourceforge.net/projects/alignme/, and at http://www.forrestlab.org, along with an online server and the HOMEP2 data set.

8.
In this paper, we introduce the Hosoya-Spectral indices and the Hosoya information content of a graph. The first measure combines structural information captured by partial Hosoya polynomials and graph spectra. The latter is a graph entropy measure which is based on blocks consisting of vertices with the same partial Hosoya polynomial. We evaluate the discrimination power of these quantities by interpreting numerical results.
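The entropy measure mentioned in this abstract can be made concrete with a small sketch. It assumes, since the abstract does not spell out the definitions, that the partial Hosoya polynomial of a vertex v counts the vertices at each distance k from v, and that the information content is the Shannon entropy (base 2) of the partition of vertices into blocks sharing the same polynomial. The function names and the adjacency-dict encoding are hypothetical.

```python
from collections import Counter, deque
from math import log2


def partial_hosoya(adj, v):
    """Coefficients (c1, c2, ...), where ck is the number of vertices at distance k from v."""
    dist = {v: 0}
    queue = deque([v])
    while queue:  # breadth-first search gives shortest-path distances
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    counts = Counter(d for d in dist.values() if d > 0)
    return tuple(counts[k] for k in range(1, max(counts, default=0) + 1))


def hosoya_information_content(adj):
    """Shannon entropy of the vertex partition induced by equal partial Hosoya polynomials.

    The normalization (block size divided by vertex count, log base 2) is an assumption.
    """
    n = len(adj)
    blocks = Counter(partial_hosoya(adj, v) for v in adj)
    return -sum((size / n) * log2(size / n) for size in blocks.values())


# Path on four vertices: the two end vertices and the two inner vertices form two blocks.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
# Cycle on four vertices: every vertex sees the same distance profile.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
path_content = hosoya_information_content(path)
cycle_content = hosoya_information_content(cycle)
```

Under these assumptions the path scores higher than the cycle, because a vertex-transitive graph such as a cycle puts all vertices in one block and carries no information in this sense.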

9.
Z Sun  W Tian 《PloS one》2012,7(8):e42887
Third-generation sequencing technologies produce sequence reads of 1000 bp or more that may contain high polymorphism information. However, most currently available sequence analysis tools are developed specifically for analyzing short sequence reads. While the traditional Smith-Waterman (SW) algorithm can be used to map long sequence reads, its naive implementation is computationally infeasible. We have developed a new Sequence mapping and Analyzing Program (SAP) that implements a modified version of SW to speed up the alignment process. In benchmarks with simulated and real exon sequencing data and real E. coli genome sequence data generated by third-generation sequencing technologies, SAP outperforms currently available tools for mapping short and long sequence reads in both speed and proportion of captured reads. In addition, it achieves high accuracy in detecting SNPs and InDels in the simulated data. SAP is available at https://github.com/davidsun/SAP.
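For readers unfamiliar with the baseline this abstract refers to, the following is a minimal sketch of the textbook Smith-Waterman local alignment with a linear gap penalty. It is the naive quadratic-time version, not the modified algorithm implemented in SAP, and the scoring values (match 2, mismatch -1, gap -2) are arbitrary assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming table; scores never drop below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


result = smith_waterman("TTACGTAA", "GGACGTCC")
```

The table has (len(a) + 1) × (len(b) + 1) cells, so time and memory grow with the product of the two lengths. That cost is why a direct application to long reads against a whole genome is impractical and why faster variants are of interest.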

10.
The brain's structural and functional systems, protein-protein interaction, and gene networks are examples of biological systems that share some features of complex networks, such as highly connected nodes, modularity, and small-world topology. Recent studies indicate that some pathologies present topological network alterations relative to norms seen in the general population. Therefore, methods to discriminate the processes that generate the different classes of networks (e.g., normal and disease) might be crucial for the diagnosis, prognosis, and treatment of the disease. It is known that several topological properties of a network (graph) can be described by the distribution of the spectrum of its adjacency matrix. Moreover, large networks generated by the same random process have the same spectrum distribution, allowing us to use it as a “fingerprint”. Based on this relationship, we introduce and propose the entropy of a graph spectrum to measure the “uncertainty” of a random graph and the Kullback-Leibler and Jensen-Shannon divergences between graph spectra to compare networks. We also introduce general methods for model selection and network model parameter estimation, as well as a statistical procedure to test the nullity of divergence between two classes of complex networks. Finally, we demonstrate the usefulness of the proposed methods by applying them to (1) protein-protein interaction networks of different species and (2) networks derived from children diagnosed with Attention Deficit Hyperactivity Disorder (ADHD) and typically developing children. We conclude that scale-free networks best describe all the protein-protein interactions. Also, we show that our proposed measures succeeded in the identification of topological changes in the network while other commonly used measures (number of edges, clustering coefficient, average path length) failed.
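The spectral entropy and divergences named in this abstract can be sketched on a toy example. The sketch below makes several simplifying assumptions that are not taken from the paper: the spectral density is approximated by a plain histogram rather than a smoothed estimate, natural logarithms are used, and the two graphs are chosen because their adjacency spectra are known in closed form (complete graph K_n: n-1 once and -1 repeated n-1 times; cycle C_n: 2cos(2πk/n)). All function names are hypothetical.

```python
from math import cos, log, pi


def histogram(values, lo, hi, bins):
    """Relative frequencies of values over equal-width bins on [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in values:
        idx = min(max(int((x - lo) / width), 0), bins - 1)  # clamp to valid bins
        counts[idx] += 1
    return [c / len(values) for c in counts]


def entropy(p):
    """Shannon entropy of a discrete distribution (natural log)."""
    return -sum(x * log(x) for x in p if x > 0)


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q); infinite if q is zero where p is not."""
    total = 0.0
    for a, b in zip(p, q):
        if a > 0:
            if b == 0:
                return float("inf")
            total += a * log(a / b)
    return total


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by log(2)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


n = 6
complete_spectrum = [n - 1] + [-1] * (n - 1)
cycle_spectrum = [2 * cos(2 * pi * k / n) for k in range(n)]

# Bin edges sit at half-integers so that integer eigenvalues fall mid-bin.
p = histogram(complete_spectrum, -2.5, 5.5, 8)
q = histogram(cycle_spectrum, -2.5, 5.5, 8)
divergence = js_divergence(p, q)
```

The Jensen-Shannon form is convenient for comparing networks because it stays finite even when one spectrum has mass where the other has none, whereas the Kullback-Leibler divergence becomes infinite in that case.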

11.
Characterizing and comparing the covariance or correlation structure of phenotypic traits lies at the heart of studies concerned with multivariate evolution. I describe an approach that represents the geometric structure of a correlation matrix as a type of proximity graph called a Correlation Proximity graph. Correlation Proximity graphs provide a compact representation of the geometric relationships inherent in correlation matrices, and these graphs have simple and intuitive properties. I demonstrate how this framework can be used to study patterns of phenotypic integration by employing this approach to compare phenotypic and additive genetic correlation matrices within and between species. I also outline a graph-based method for testing whether an inferred correlation proximity graph is one of a number of possible models that are consistent with a “soft” biological hypothesis.

12.
A recent analysis of sequence variations in ribosomal RNAs from 31 species of tetrahymenine ciliates groups them into 9 sets referred to as "ribosets." These species associations are not well correlated with the distributions of distinctive morphological characteristics. The phylogenetic structure suggests that modern "pyriform" tetrahymenines may be paraphyletic survivors of primitive design and that the morphologically distinctive forms may include examples of convergent evolution of derived forms. Alternatively, the common ancestor may have been a polymorphic species that has lost its plasticity in some derived lineages. In an attempt to test the ribosomal phylogeny, we here compare it with a phylogeny based on isozymic variation. The main features of the ribosomal and isozymic phylogenies are similar. The carnivorous (macrostome-forming) species are widely scattered in both, as are the bacteriophagous pyriform species. Isozymic and ribosomal analyses are optimally useful, however, in different contexts. Isozymic variations can distinguish species that are ribosomally identical. Ribosomal variations provide more secure evaluations of distant relationships.

13.
14.
ABSTRACT

The carotenoids constitute the most widespread class of pigments in nature. Most previous work has concentrated on the identification and characterization of their chemical and physical properties and bioavailability. In recent years, a significant amount of research has been conducted in an attempt to analyze the genes involved in the biosynthesis of carotenoids and their molecular regulation. However, it is important not to lose sight of the early evolution of carotenoid biosynthesis. One of the major obstacles in understanding the evolution of the respective enzymes and their patterns of selection is the lack of a well-supported phylogenetic analysis. In the present research, a major long-term objective was to provide a clearer picture of the evolutionary history of genes, together with an evaluation of the patterns of selection in algae. These phylogenies will be important in studies characterizing the evolution of algae. The gene sequences of the enzymes involved in the major steps of the carotenoid biosynthetic pathway in algae (cyanobacteria, rhodophyta, chlorophyta) have been analyzed. Phylogenetic relationships among protein-coding DNA sequences were reconstructed by neighbor-joining (NJ) analysis for the respective carotenoid biosynthetic pathway genes (crt) in algae. The analysis also contains an estimation of the rate of nonsynonymous nucleotide substitutions per nonsynonymous site (dN), the rate of synonymous nucleotide substitutions per synonymous site (dS), and the ratio of nonsynonymous to synonymous substitutions (dN/dS) for the test of selection patterns. The phylogenetic trees show that the taxa of some genera have a closer evolutionary relationship with other genera in some gene sequences, which suggests a common ancient origin and that lateral gene transfer has occurred among unrelated genera.
The dN values of crt genes in the early pathway are relatively low, while those of the following steps are slightly higher, and the dN values of crt genes in chlorophyta are higher than those in cyanobacteria. Most of the dN/dS values exceed 1. The phylogenetic analysis revealed that lateral gene transfer may have taken place across algal genomes, and the dN values suggest that most of the early crt genes are well conserved compared to the later crt genes. Furthermore, the dN values also revealed that the crt genes of chlorophyta are evolving faster than those of cyanobacteria. The amino acid changes mostly reflect adaptive evolution under positive diversifying selection.

15.
The explosion of bioinformatics technologies in the form of next generation sequencing (NGS) has facilitated a massive influx of genomics data in the form of short reads. Short read mapping is therefore a fundamental component of next generation sequencing pipelines which routinely match these short reads against reference genomes for contig assembly. However, such techniques have seldom been applied to microbial marker gene sequencing studies, which have mostly relied on novel heuristic approaches. We propose NINJA Is Not Just Another OTU-Picking Solution (NINJA-OPS, or NINJA for short), a fast and highly accurate novel method enabling reference-based marker gene matching (picking Operational Taxonomic Units, or OTUs). NINJA takes advantage of the Burrows-Wheeler (BW) alignment using an artificial reference chromosome composed of concatenated reference sequences, the “concatesome,” as the BW input. Other features include automatic support for paired-end reads with arbitrary insert sizes. NINJA is also free and open source and implements several pre-filtering methods that elicit substantial speedup when coupled with existing tools. We applied NINJA to several published microbiome studies, obtaining accuracy similar to or better than previous reference-based OTU-picking methods while achieving an order of magnitude or more speedup and using a fraction of the memory footprint. NINJA is a complete pipeline that takes a FASTA-formatted input file and outputs a QIIME-formatted taxonomy-annotated BIOM file for an entire MiSeq run of human gut microbiome 16S genes in under 10 minutes on a dual-core laptop.

16.
17.
ABSTRACT. The single name Pneumocystis carinii covers a heterogeneous group of specific fungal organisms that colonize a very wide range of mammalian hosts. In the present study, mitochondrial large subunit (mtLSU) and small subunit (mtSSU) rRNA sequences of P. carinii organisms from 24 different mammalian species were compared. The mammals were included in seven major groups: Primates (12 species), Rodents (5 species), Carnivores (3 species), Bats (1 species), Lagomorphs (1 species), Marsupials (1 species) and Ungulates (1 species). Direct sequencing of PCR products demonstrated that a specific mtSSU and mtLSU rRNA Pneumocystis sequence could be attributed to each mammalian species. No animal harbored P. carinii f. sp. hominis. Comparison of combined mtLSU and mtSSU aligned sequences confirmed cospeciation of P. carinii and corresponding mammalian hosts. P. carinii organisms isolated from mammals of the same zoological group systematically clustered together. Within each cluster, the genetic divergence between P. carinii organisms varied in terms of the phylogenetic divergence existing among the corresponding host species. However, the relative position of P. carinii groups (rodent, carnivore or primate-derived P. carinii) could not be clearly determined. Further resolution will require the integration of additional sequence data.

18.
Risk assessment is an essential prelude to the development of accident prevention strategies in any chemical or petrochemical industry. Many techniques and methodologies such as HAZOP, failure mode effect analysis, fault tree analysis, preliminary hazard analysis, quantitative risk assessment and probabilistic safety analysis are available to conduct qualitative, quantitative, and probabilistic risk assessment. However, these methodologies have several limitations: they require extensive data, the studies are lengthy, their results are not directly interpretable for decision making, simulation is often difficult, and they are applicable only at the operation or late design stage. Khan et al. (2001a) recently proposed a detailed methodology for risk assessment and safety evaluation. This methodology is simple, yet it is effective in safety and design-related decision making, and it has been applied successfully to many case studies. It is named SCAP, where S stands for safety, C and A stand for credible accident respectively, and P stands for probabilistic fault tree analysis. This paper recapitulates the SCAP methodology and demonstrates its application to a petrochemical plant.

19.
All cellular processes depend on the functionality of proteins. Although the functionality of a given protein is the direct consequence of its unique amino acid sequence, it is only realized by the folding of the polypeptide chain into a single defined three-dimensional arrangement or more commonly into an ensemble of interconverting conformations. Investigating the connection between protein conformation and its function is therefore essential for a complete understanding of how proteins are able to fulfill their great variety of tasks. One possibility to study conformational changes a protein undergoes while progressing through its functional cycle is hydrogen-1H/2H-exchange in combination with high-resolution mass spectrometry (HX-MS). HX-MS is a versatile and robust method that adds a new dimension to structural information obtained by e.g. crystallography. It is used to study protein folding and unfolding, binding of small molecule ligands, protein-protein interactions, conformational changes linked to enzyme catalysis, and allostery. In addition, HX-MS is often used when the amount of protein is very limited or crystallization of the protein is not feasible. Here we provide a general protocol for studying protein dynamics with HX-MS and describe as an example how to reveal the interaction interface of two proteins in a complex.

20.
Tests of a sample of 206 cladograms of mammals show that morphological data seem to predict phylogenies that match the known fossil record better than molecular trees. Three metrics that assess the rank order of branching points, the stratigraphic consistency of those nodes, and the ratio of ghost range to known range show a considerable diversity of values. Some published trees show excellent matching with fossil-record data; others show almost no correspondence whatsoever. Morphological trees are nearly twice as good as molecular trees in terms of matching of the rank orders of nodes and oldest fossils, while morphological trees are 10% better than molecular in terms of stratigraphic consistency of the nodes. The ratios of ghost range to known range are lower for molecular trees. Among the molecular trees, those based on gene data are considerably better than those based on protein sequences, at least in terms of the rank order of nodes and the stratigraphic consistency of nodes. Protein trees, however, were best of all in terms of minimizing the proportion of ghost range. These findings probably indicate real phenomena, but the match of molecular trees to the expectations of stratigraphy may improve as the study of molecular phylogeny matures.
