Found 20 similar articles; search time 0 ms
1.
2.
How best to summarize large and complex datasets is a problem that arises in many areas of science. We approach it from the point of view of seeking data summaries that minimize the average squared error of the posterior distribution for a parameter of interest under approximate Bayesian computation (ABC). In ABC, simulation under the model replaces computation of the likelihood, which is convenient for many complex models. Simulated and observed datasets are usually compared using summary statistics, typically in practice chosen on the basis of the investigator's intuition and established practice in the field. We propose two algorithms for automated choice of efficient data summaries. Firstly, we motivate minimisation of the estimated entropy of the posterior approximation as a heuristic for the selection of summary statistics. Secondly, we propose a two-stage procedure: the minimum-entropy algorithm is used to identify simulated datasets close to that observed, and these are each successively regarded as observed datasets for which the mean root integrated squared error of the ABC posterior approximation is minimized over sets of summary statistics. In a simulation study, we both singly and jointly inferred the scaled mutation and recombination parameters from a population sample of DNA sequences. The computationally-fast minimum entropy algorithm showed a modest improvement over existing methods while our two-stage procedure showed substantial and highly-significant further improvement for both univariate and bivariate inferences. We found that the optimal set of summary statistics was highly dataset specific, suggesting that more generally there may be no globally-optimal choice, which argues for a new selection for each dataset even if the model and target of inference are unchanged.
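As a rough illustration of the ABC step this abstract describes (simulation replaces the likelihood, and datasets are compared through a summary statistic), here is a minimal rejection-ABC sketch. It is not the authors' minimum-entropy or two-stage algorithm: the toy Normal model, the flat prior, the sample mean as the summary statistic, and all function names are assumptions chosen only for illustration.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy model (an assumption): n draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summary(data):
    """Candidate summary statistic (an assumption): the sample mean."""
    return statistics.fmean(data)


def abc_rejection(observed, n_sims, keep_fraction, rng):
    """Keep the prior draws whose simulated summary lies closest to the observed one."""
    s_obs = summary(observed)
    draws = []
    for _ in range(n_sims):
        theta = rng.uniform(-5.0, 5.0)  # flat prior, chosen for illustration
        s_sim = summary(simulate(theta, len(observed), rng))
        draws.append((abs(s_sim - s_obs), theta))
    draws.sort()  # smallest summary-statistic distance first
    n_keep = max(1, int(keep_fraction * n_sims))
    return [theta for _, theta in draws[:n_keep]]


rng = random.Random(1)
observed = simulate(2.0, 50, rng)  # pretend this is the real dataset
posterior = abc_rejection(observed, n_sims=5000, keep_fraction=0.02, rng=rng)
print(len(posterior), statistics.fmean(posterior))
```

The accepted `theta` values form the approximate posterior sample. Choosing among several candidate `summary` functions, which is the problem the abstract addresses, would wrap a loop around `abc_rejection` and score each resulting posterior.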
3.
4.
Blanc G, Ngwamidiba M, Ogata H, Fournier PE, Claverie JM, Raoult D 《Molecular biology and evolution》2005,22(10):2073-2083
The Rickettsia genus is a group of obligate intracellular parasitic alpha-proteobacteria that includes human pathogens responsible for the typhus disease and various types of spotted fevers. rOmpA and rOmpB are two members of the "surface cell antigen" (Sca) autotransporter (AT) protein family that may play key roles in the adhesion of the Rickettsia cells to the host tissue. These molecules are likely determinants for the pathogenicity of the Rickettsia and represent good candidates for vaccine development. We identified the 17 members of this family of outer-membrane proteins in nine fully sequenced Rickettsia genomes. The typical architecture of the Sca proteins is composed of an N-terminal signal peptide and a C-terminal AT domain that promote the export of the central passenger domain to the outside of the bacteria. A characteristic of this family is the frequent degradation of the genes, which results in different subsets of the sca genes being expressed among Rickettsia species. Here, we present a detailed analysis of their phylogenetic relationships and evolution. We provide strong evidence that rOmpA and rOmpB as well as three other members of the Sca protein family--Sca1, Sca2, and Sca4--have evolved under positive selection. The exclusive distribution of the predicted positively selected sites within the passenger domains of these proteins argues that these regions are involved in the interaction with the host and may be locked in "arms race" coevolutionary conflicts.
5.
Tennessen JA 《Journal of evolutionary biology》2005,18(6):1387-1394
An increasing number of studies in both vertebrates and invertebrates show that the evolution of antimicrobial peptides is driven by positive selection. Because these diverse molecules show potential for therapeutic applications, they are currently the targets of much structural and functional research, providing extensive background data for evolutionary studies. In this paper, patterns of molecular evolution in antimicrobial peptide genes are reviewed. Evidence for positive selection on antimicrobial peptides includes an excess of nonsynonymous nucleotide substitutions, an excess of charge-changing amino acid substitutions, nonneutral patterns of allelic variation, and functional assays in vivo and in vitro that show improved antimicrobial effects for derived sequence variants. Positive selection on antimicrobial peptides may be as common as, but perhaps weaker than, selection on the best-known example of adaptively evolving immunity genes, the major histocompatibility complex. Thus, antimicrobial peptides present a useful and underutilized model for the study of adaptive molecular evolution.
6.
Ford MJ 《Molecular biology and evolution》2001,18(4):639-647
Transferrins are iron-binding proteins that are involved in iron storage and resistance to bacterial disease. Previous work has shown that nonsynonymous-to-synonymous-site substitution ratios (d(n)/d(s) ratios) between transferrin genes from some salmonid species were significantly greater than 1.0, providing evidence for positive selection at the transferrin gene. The purpose of the current study was to put these earlier results in a broader evolutionary context by examining variation among 25 previously published transferrin sequences from fish, amphibians, and mammals. The results of the study show that evidence for positive selection at transferrin is limited to salmonids; d(n)/d(s) ratios estimated for nonsalmonid lineages were generally less than 1.0. Within the salmonids, approximately 13% of the transferrin codon sites are estimated to be subject to positive selection, with an estimated d(n)/d(s) ratio of approximately 7. The three-dimensional locations of some of the selected sites were inferred by comparing these sites to homologous sites in the bovine lactoferrin crystallographic structure. The selected sites generally fall on the outside of the molecule, within and near areas that are bound by transferrin-binding proteins from human pathogenic bacteria. The physical locations of sites estimated to be subject to positive selection support previous speculation that competition for iron from pathogenic bacteria could be the source of positive selection.
7.
It is not clear whether matK evolves under Darwinian selection. In this study, the gymnosperm Taxaceae, Cephalotaxaceae and Pinaceae were used to illustrate the physicochemical evolution, molecular adaptation and evolutionary dynamics of gene divergence in matKs. matK sequences were amplified from 27 Taxaceae and 12 Cephalotaxaceae species. matK sequences of 19 Pinaceae species were retrieved from GenBank. The phylogenetic tree was generated using conceptual-translated amino acid sequences. Selective influences were investigated using standard dN/dS ratio methods and more sensitive techniques investigating the amino acid property changes resulting from nonsynonymous replacements in a phylogenetic context. Analyses revealed the presence of positive selection in matKs (N-terminal region, RT domain and domain X) of Taxaceae and Pinaceae, and found positive destabilizing selection in N-terminal region and RT domain of Cephalotaxaceae matK. Moreover, various amino acid properties were found to be influenced by destabilizing positive selection. Amino acid sites relating to these properties and to different secondary structures were found and have the potential to affect group II intron maturase function. Despite the evolutionary constraint on the rapidly evolving matK, this protein evolves under positive selection in gymnosperm. Several regions of matK have experienced molecular adaptation which fine-tunes maturase performance.
8.
It is not clear whether matK evolves under Darwinian selection. In this study, 70 plant groups, representing 2,279 species at various evolutionary levels, were used to illustrate the molecular adaptation and evolutionary dynamics of gene divergence in matKs. Selective influences were investigated using standard dN/dS ratio methods. Analyses revealed the presence of positive selection in matKs of 32 plant groups. More positively selected sites were detected in the N-terminal region than in the RT domain and domain X of matK. Moreover, removing amino acid sites that are under positive selection has a significant effect on the bootstrap values of phylogenetic reconstruction. Our results suggest that the rapidly evolving matK evolves under positive selection in some lineages of land plants. Several regions of matK have experienced molecular adaptation, which fine-tunes maturase performance.
9.
Estimating recombination rates from single-nucleotide polymorphisms using summary statistics
We describe a novel method for jointly estimating crossing-over and gene-conversion rates from population genetic data using summary statistics. The performance of our method was tested on simulated data sets and compared with the composite-likelihood method of R. R. Hudson. For several realistic parameter values, the new method performed similarly to the composite-likelihood approach for estimating crossing-over rates and better when estimating gene-conversion rates. We used our method to analyze a human data set recently genotyped by Perlegen Sciences.
10.
To evaluate the relative importance of positive selection and neutral drift from the nucleotide base changes observed in the homologous alignment of genes, a theoretical equation of base changes is formulated by including both the influence of selection and the base substitutions due to mutations. Under the assumption that the average rate of base substitutions estimated from synonymous changes is the "true" mutation rate applicable at all positions, this method is applied to the vertebrate globin gene family, and evaluates the departures of base change rates from the "true" mutation rate at the first and second codon positions as a consequence of preferential selection for the conservation of important function. In addition to the strong effect of selection on the amino acid residues in the internal region mostly common to myoglobin and hemoglobin chains, the distinctive directions of selective parameter values are seen at sites on the globin surface, distinguishing the subunit contact residues of hemoglobins from the polar residues on the surface of myoglobins. Moreover, this effect of selection distinguishing between the myoglobin and hemoglobin chain genes becomes weaker in cold-blooded vertebrates, especially in fish, strongly suggesting the possibility that the clear distinction between these globins is a result of selection out of the changes regarded as neutral ones in an ancestor of vertebrates. Thus, the present method may also serve to investigate the homology of many other proteins from the aspect of molecular evolution, mainly focusing on the evolution of their biological functions.
Received: 2 January 1996 / Accepted: 20 February 1997
11.
Estimating sample averages and sample variability is important in analyzing neural spike trains data in computational neuroscience. Current approaches have focused on advancing the use of parametric or semiparametric probability models of the underlying stochastic process, where the probabilistic distribution is characterized at each time point with basic statistics such as mean and variance. To directly capture and analyze the average and variability in the observation space of the spike trains, we focus on a data-driven approach where statistics are defined and computed in a function space in which the spike trains are viewed as individual points. Based on the definition of a “Euclidean” metric, a recent paper introduced the notion of the mean of a set of spike trains and developed an efficient algorithm to compute it under some restrictive conditions. Here we extend this study by: (1) developing a novel algorithm for mean computation that is quite general, and (2) introducing a notion of covariance of a set of spike trains. Specifically, we estimate the covariance matrix using the geometry of the warping functions that map the mean spike train to each of the spike trains in the dataset. Results from simulations as well as a neural recording in primate motor cortex indicate that the proposed mean and covariance successfully capture the observed variability in spike trains. In addition, a “Gaussian-type” probability model (defined using the estimated mean and covariance) reasonably characterizes the distribution of the spike trains and achieves a desirable performance in the classification of the spike trains.
12.
Arindam Dutta, Joydeep Chakraborty, Tapan K. Dutta 《Biochemical and biophysical research communications》2013
Using different maximum-likelihood models of adaptive evolution, signatures of natural selective pressure, operating across the naphthalene family of dioxygenases, were examined. A lineage- and branch-site specific combined analysis revealed that purifying selection pressure dominated the evolutionary history of the enzyme family. Specifically, episodic positive Darwinian selection pressure, affecting only a few sites in a subset of lineages, was found to be responsible for the evolution of nitroarene dioxygenases (NArDO) from naphthalene dioxygenase (NDO). Site-specific analysis confirmed the absence of diversifying selection pressure at any particular site. Different sets of positively selected residues, obtained from branch-site specific analysis, were detected for the evolution of each NArDO. They were mainly located around the active site, the catalytic pocket and their adjacent regions, when mapped onto the 3D structure of the α-subunit of NDO. The present analysis enriches the current understanding of adaptive evolution and also broadens the scope for rational alteration of substrate specificity of enzyme by directed evolution.
13.
Pierron D, Opazo JC, Heiske M, Papper Z, Uddin M, Chand G, Wildman DE, Romero R, Goodman M, Grossman LI 《PloS one》2011,6(10):e26269
Cytochrome c (cyt c) participates in two crucial cellular processes, energy production and apoptosis, and unsurprisingly is a highly conserved protein. However, previous studies have reported for the primate lineage (i) loss of the paralogous testis isoform, (ii) an acceleration and then a deceleration of the amino acid replacement rate of the cyt c somatic isoform, and (iii) atypical biochemical behavior of human cyt c. To gain insight into the cause of these major evolutionary events, we have retraced the history of cyt c loci among primates. For testis cyt c, all primate sequences examined carry the same nonsense mutation, which suggests that silencing occurred before the primates diversified. For somatic cyt c, maximum parsimony, maximum likelihood, and Bayesian phylogenetic analyses yielded the same tree topology. The evolutionary analyses show that a fast accumulation of non-synonymous mutations (suggesting positive selection) occurred specifically on the anthropoid lineage root and then continued in parallel on the early catarrhini and platyrrhini stems. Analysis of evolutionary changes using the 3D structure suggests they are focused on the respiratory chain rather than on apoptosis or other cyt c functions. In agreement with previous biochemical studies, our results suggest that silencing of the cyt c testis isoform could be linked with the decrease of primate reproduction rate. Finally, the evolution of cyt c in the two sister anthropoid groups leads us to propose that somatic cyt c evolution may be related both to COX evolution and to the convergent brain and body mass enlargement in these two anthropoid clades.
14.
Distinguishing mechanisms for the evolution of co-operation (total citations: 1; self-citations: 0; citations by others: 1)
The existence of co-operation between species has been cast as a problem to the selfish-gene view of evolution: why does co-operation persist, when it would seem that individual selection should favor the unco-operative individual who exploits the co-operative tendencies of its partner and gives nothing in return? The recent literature has emphasized one type of model as underlying the evolution and stability of interspecific co-operation, which we term the "partner-fidelity" model, and which is typified by the game theory model known as the iterated Prisoner's Dilemma game. Under this mechanism, individuals are associated with the same partner(s) during an indefinite sequence of interactions. Individuals who at any time fail to co-operate with their partner can be penalized by those same partners in subsequent trials, hence the co-operation can be evolutionarily stable. Many examples of biological co-operation that have been offered appear to conform to this model. However, a few examples appear instead to fit a different and unrecognized mechanism, termed "partner-choice". Under partner-choice, individuals are associated for just one interaction, but an asymmetry enables one member to differentially reward co-operative vs. unco-operative partners in advance of any possible exploitation. Possible examples of co-operation maintained through partner-choice mechanisms are provided by the yucca/yucca moth system and the fig/fig wasp system.
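The partner-fidelity mechanism in this abstract, where a partner who defects is penalized by the same partner in later rounds, can be sketched with a small iterated Prisoner's Dilemma. The payoff values (5, 3, 1, 0), the two strategies, and the function names are illustrative assumptions and do not come from the paper.

```python
# Payoffs as (row player, column player); the conventional 5/3/1/0 values are assumed here.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Co-operate first, then copy the partner's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Exploit the partner in every round."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Repeated interaction with the same partner; returns total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_b)
        move_b = strategy_b(hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat, 10))    # (30, 30)
print(play(tit_for_tat, always_defect, 10))  # (9, 14)
```

Against a retaliating partner the defector earns 14 over ten rounds, less than the 30 it would earn by co-operating, which is the sense in which repeated association can stabilize co-operation. Partner-choice, with its single interaction, is not modeled here.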
15.
Kishimoto T, Iijima L, Tatsumi M, Ono N, Oyake A, Hashimoto T, Matsuo M, Okubo M, Suzuki S, Mori K, Kashiwagi A, Furusawa C, Ying BW, Yomo T 《PLoS genetics》2010,6(10):e1001164
It remains to be determined experimentally whether increasing fitness is related to positive selection, while stationary fitness is related to neutral evolution. Long-term laboratory evolution in Escherichia coli was performed under thermal stress in defined laboratory conditions. The complete cell growth data showed common continuous fitness recovery to every 2°C or 4°C stepwise temperature upshift, finally resulting in an evolved E. coli strain with an improved upper temperature limit as high as 45.9°C after 523 days of serial transfer, equivalent to 7,560 generations, in minimal medium. Two-phase fitness dynamics, a rapid growth recovery phase followed by a gradual increasing growth phase, were clearly observed at diverse temperatures throughout the entire evolutionary process. Whole-genome sequence analysis revealed the transition from positive to neutral in mutation fixation, accompanied by a considerable escalation of spontaneous substitution rate in the late fitness recovery phase. This suggests that continually increasing fitness did not always result in the reduction of genetic diversity due to the sequential takeovers by fit mutants, but caused the accumulation of a considerable number of mutations that facilitated neutral evolution.
16.
Errors in the inferred multiple sequence alignment may lead to false prediction of positive selection. Recently, methods for detecting unreliable alignment regions were developed and were shown to accurately identify incorrectly aligned regions. While removing unreliable alignment regions is expected to increase the accuracy of positive selection inference, such filtering may also significantly decrease the power of the test, as positively selected regions are fast evolving, and those same regions are often those that are difficult to align. Here, we used realistic simulations that mimic sequence evolution of HIV-1 genes to test the hypothesis that the performance of positive selection inference using codon models can be improved by removing unreliable alignment regions. Our study shows that the benefit of removing unreliable regions exceeds the loss of power due to the removal of some of the true positively selected sites.
17.
A simple method to distinguish hitchhiking and background selection is proposed. It is based on the observation that these models make different predictions about the average level of nucleotide diversity in regions of low recombination. The method is applied to data from Drosophila melanogaster and two highly selfing tomato species.
18.
19.
Neutral dynamics occur in evolution if all types are ‘effectively equal’ in their reproductive success, where the definition of ‘effectively equal’ depends on the population size and the details of mutations. Empirically observed neutral genetic evolution in extremely large clonal populations can only be explained under current models if selection is completely absent. Such models typically consider the case where population dynamics occurs on a different timescale to evolution. However, this assumption is invalid when mutations are not rare in a whole population. We show that this has important consequences for the occurrence of neutral evolution in clonal populations. In highly connected type spaces, neutral dynamics can occur for all population sizes despite significant selective differences, via the forming of effectively neutral networks connecting rare neutral types. Biological implications include an explanation for the high diversity of rare types that survive in large clonal populations, and a theoretical justification for the use of neutral null models.
20.
Bazykin GA, Kondrashov AS 《Proceedings. Biological sciences / The Royal Society》2012,279(1742):3409-3417
Slow evolution of conservative segments of coding and non-coding DNA is caused by the action of negative selection, which removes new mutations. However, the mode of selection that affects the few substitutions that do occur within such segments remains unclear. Here, we show that the fraction of allele replacements that were driven by positive selection, and the strength of this selection, is the highest within the conservative segments of Drosophila protein-coding genes. The McDonald-Kreitman test, applied to the data on variation in Drosophila melanogaster and in Drosophila simulans, indicates that within the most conservative protein segments, approximately 72 per cent (approx. 80%) of allele replacements were driven by positive selection, as opposed to only approximately 44 per cent (approx. 53%) at rapidly evolving segments. Data on multiple non-synonymous substitutions at a codon lead to the same conclusion and additionally indicate that positive selection driving allele replacements at conservative sites is the strongest, as it accelerates evolution by a factor of approximately 40, as opposed to a factor of approximately 5 at rapidly evolving sites. Thus, random drift plays only a minor role in the evolution of conservative DNA segments, and those relatively rare allele replacements that occur within such segments are mostly driven by substantial positive selection.
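For readers unfamiliar with how a McDonald-Kreitman comparison yields a "fraction of replacements driven by positive selection", a commonly used estimator is alpha = 1 - (Ds * Pn) / (Dn * Ps), where Dn and Ds are non-synonymous and synonymous fixed differences between species and Pn and Ps are the corresponding polymorphism counts within a species. The sketch below assumes this simple estimator; the abstract does not say which variant the authors used, and the counts are made up for illustration.

```python
def mk_alpha(dn, ds, pn, ps):
    """Estimated fraction of non-synonymous substitutions fixed by positive selection.

    dn, ds: non-synonymous / synonymous fixed differences between species
    pn, ps: non-synonymous / synonymous polymorphisms within a species
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when dn or ps is zero")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts, not taken from the paper.
alpha = mk_alpha(dn=40, ds=50, pn=10, ps=50)
print(alpha)  # 0.75
```

An alpha near zero is what neutral evolution plus purifying selection would produce; values approaching one indicate that most replacements were adaptive.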