Similar Literature
 20 similar articles found (search time: 31 ms)
1.
Understanding the mechanics of adaptive evolution requires not only knowing the quantitative genetic bases of the traits of interest but also obtaining accurate measures of the strengths and modes of selection acting on these traits. Most recent empirical studies of multivariate selection have employed multiple linear regression to obtain estimates of the strength of selection. We reconsider the motivation for this approach, paying special attention to the effects of nonnormal traits and fitness measures. We apply an alternative statistical method, logistic regression, to estimate the strength of selection on multiple phenotypic traits. First, we argue that the logistic regression model is more suitable than linear regression for analyzing data from selection studies with dichotomous fitness outcomes. Subsequently, we show that estimates of selection obtained from the logistic regression analyses can be transformed easily to values that directly plug into equations describing adaptive microevolutionary change. Finally, we apply this methodology to two published datasets to demonstrate its utility. Because most statistical packages now provide options to conduct logistic regression analyses, we suggest that this approach should be widely adopted as an analytical tool for empirical studies of multivariate selection.
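The abstract above does not spell out the transformation it refers to, so the following is only a minimal sketch under stated assumptions. It fits a logistic model of survival (0/1) on standardized traits by Newton-Raphson, then converts each logistic coefficient into the average slope of fitted relative fitness, i.e. the coefficient times the mean of p(1 - p), divided by mean fitted survival. The function names, the simulated data and the true coefficients (-0.5, 1.0, 0.0) are all hypothetical and chosen for illustration.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def fit_logistic(Z, w, iters=25):
    """Newton-Raphson fit of P(w=1 | z) = 1 / (1 + exp(-(a0 + a.z))).

    Returns [a0, a1, ..., ak] for k traits.
    """
    X = [[1.0] + list(z) for z in Z]
    k = len(X[0])
    coef = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, y in zip(X, w):
            p = 1.0 / (1.0 + math.exp(-sum(c * v for c, v in zip(coef, x))))
            for i in range(k):
                grad[i] += (y - p) * x[i]
                for j in range(k):
                    hess[i][j] += p * (1.0 - p) * x[i] * x[j]
        step = solve(hess, grad)
        coef = [c + s for c, s in zip(coef, step)]
    return coef


def average_gradients(coef, Z):
    """Average slope of fitted relative fitness with respect to each trait."""
    probs = []
    for z in Z:
        eta = coef[0] + sum(a * v for a, v in zip(coef[1:], z))
        probs.append(1.0 / (1.0 + math.exp(-eta)))
    mean_w = sum(probs) / len(probs)                           # mean fitted survival
    mean_slope = sum(p * (1.0 - p) for p in probs) / len(probs)
    return [a * mean_slope / mean_w for a in coef[1:]]


# Hypothetical data: two standardized traits, only the first affects survival.
random.seed(0)
Z = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(2000)]
w = [1 if random.random() < 1.0 / (1.0 + math.exp(-(-0.5 + 1.0 * z[0]))) else 0
     for z in Z]
coef = fit_logistic(Z, w)
beta = average_gradients(coef, Z)
print("logistic coefficients:", coef[1:])
print("approximate selection gradients:", beta)
```

The rescaling matters because a logistic coefficient is on the log-odds scale, whereas equations for microevolutionary change use the slope of relative fitness on the trait.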

2.
A popular approach to detecting positive selection is to estimate the parameters of a probabilistic model of codon evolution and perform inference based on its maximum likelihood parameter values. This approach has been evaluated intensively in a number of simulation studies and found to be robust when the available data set is large. However, uncertainties in the estimated parameter values can lead to errors in the inference, especially when the data set is small or there is insufficient divergence between the sequences. We introduce a Bayesian model comparison approach to infer whether the sequence as a whole contains sites at which the rate of nonsynonymous substitution is greater than the rate of synonymous substitution. We incorporated this probabilistic model comparison into a Bayesian approach to site-specific inference of positive selection. Using simulated sequences, we compared this approach to the commonly used empirical Bayes approach and investigated the effect of tree length on the performance of both methods. We found that the Bayesian approach outperforms the empirical Bayes method when the amount of sequence divergence is small and is less prone to false-positive inference when the sequences are saturated, while the results are indistinguishable for intermediate levels of sequence divergence.

3.
Detecting signals of selection from genomic data is a central problem in population genetics. Coupling the rich information in the ancestral recombination graph (ARG) with a powerful and scalable deep-learning framework, we developed a novel method to detect and quantify positive selection: Selection Inference using the Ancestral recombination graph (SIA). Built on a Long Short-Term Memory (LSTM) architecture, a particular type of Recurrent Neural Network (RNN), SIA can be trained to explicitly infer a full range of selection coefficients, as well as the allele frequency trajectory and time of selection onset. We benchmarked SIA extensively on simulations under a European human demographic model, and found that it performs as well as or better than some of the best available methods, including state-of-the-art machine-learning and ARG-based methods. In addition, we used SIA to estimate selection coefficients at several loci associated with human phenotypes of interest. SIA detected novel signals of selection particular to the European (CEU) population at the MC1R and ABCC11 loci. In addition, it recapitulated signals of selection at the LCT locus and several pigmentation-related genes. Finally, we reanalyzed polymorphism data of a collection of recently radiated southern capuchino seedeater taxa in the genus Sporophila to quantify the strength of selection and improved the power of our previous methods to detect partial soft sweeps. Overall, SIA uses deep learning to leverage the ARG and thereby provides new insight into how selective sweeps shape genomic diversity.

4.
Phylogenetic codon models are routinely used to characterize selective regimes in coding sequences. Their parametric design, however, is still a matter of debate, in particular concerning the question of how to account for differing nucleotide frequencies and substitution rates. This problem relates to the fact that nucleotide composition in protein-coding sequences is the result of the interactions between mutation and selection. In particular, because of the structure of the genetic code, the nucleotide composition differs between the three coding positions, with the third position showing a more extreme composition. Yet, phylogenetic codon models do not correctly capture this phenomenon and instead predict that the nucleotide composition should be the same for all three positions. Alternatively, some models allow for different nucleotide rates at the three positions, an approach conflating the effects of mutation and selection on nucleotide composition. In practice, it results in inaccurate estimation of the strength of selection. Conceptually, the problem comes from the fact that phylogenetic codon models do not correctly capture the fixation bias acting against the mutational pressure at the mutation–selection equilibrium. To address this problem and to more accurately identify mutation rates and selection strength, we present an improved codon modeling approach where the fixation rate is not seen as a scalar, but as a tensor. This approach gives an accurate representation of how mutation and selection oppose each other at equilibrium and yields a reliable estimate of the mutational process, while disentangling the mean fixation probabilities prevailing in different mutational directions.

5.
Goddard M. Genetica, 2009, 136(2): 245-257
Genomic selection refers to the use of dense markers covering the whole genome to estimate the breeding value of selection candidates for a quantitative trait. This paper considers prediction of breeding value based on a linear combination of the markers. In this case the best estimate of each marker’s effect is the expectation of the effect conditional on the data. To calculate this requires a prior distribution of marker effects. If the marker effects are normally distributed with constant variance, BLUP can be used to calculate the estimated effects of the markers and hence the estimated breeding value (EBV). In this case the model is equivalent to a conventional animal model in which the relationship matrix among the animals is estimated from the markers instead of the pedigree. The accuracy of the EBV can approach 1.0 but a very large amount of data is required. An alternative model was investigated in which only some markers have non-zero effects and these effects follow a reflected exponential distribution. In this case the expected effect of a marker is a non-linear function of the data such that apparently small effects are regressed back almost to zero and consequently these markers can be deleted from the model. The accuracy in this case is considerably higher than when marker effects are normally distributed. If genomic selection is practiced for several generations the response declines in a manner that can be predicted from the marker allele frequencies. Genomic selection is likely to lead to a more rapid decline in the selection response than phenotypic selection unless new markers are continually added to the prediction of breeding value. A method to find the optimum index to maximise long term selection response is derived. This index varies the weight given to a marker according to its frequency such that markers where the favourable allele has low frequency receive more weight in the index.
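The normal-prior case described above can be sketched as ridge regression on marker genotypes. This is a toy illustration and not the paper's own code. It assumes centered genotypes and phenotypes, no fixed effects, and a known variance ratio lam (residual variance divided by per-marker effect variance). The names snp_blup and breeding_values and all of the data are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def snp_blup(X, y, lam):
    """Marker effects solving (X'X + lam * I) beta = X'y.

    X: rows are animals, columns are centered marker genotypes.
    y: centered phenotypes.
    lam: assumed ratio of residual variance to marker-effect variance.
    """
    m = len(X[0])
    lhs = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(m)] for i in range(m)]
    rhs = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(m)]
    return solve(lhs, rhs)


def breeding_values(X, beta):
    """Estimated breeding value = linear combination of marker genotypes."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


# Hypothetical toy data: 5 animals, 3 markers coded -1/0/1.
X = [[1, 0, -1], [0, 1, 1], [-1, -1, 0], [1, 1, 0], [-1, 0, 1]]
y = [1.2, 0.4, -1.5, 1.1, -0.9]
effects = snp_blup(X, y, lam=1.0)
print("marker effects:", effects)
print("EBVs:", breeding_values(X, effects))
```

A larger lam shrinks all marker effects toward zero by a similar proportion, which is the behavior the abstract contrasts with the reflected-exponential prior, where small effects are regressed almost to zero.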

6.
Studies of sexual selection show that both female choice and male-male competition can influence the evolution and expression of male phenotypes. In this regard, it is important to determine the functional basis through which male traits influence variation in male reproductive success. In this study, we estimate the strength and type of sexual selection acting on adult males in a wild population of a lemur, Verreaux's sifaka (Propithecus verreauxi verreauxi). The data used in this study were collected at Beza Mahafaly Special Reserve, southwest Madagascar. We conducted paternity analyses on 70 males in order to estimate the distribution of reproductive success in this population. Paternity data were combined with morphometric data in order to determine which morphological traits covary with male fitness. Five morphological traits were defined in this analysis: body size, canine size, torso shape, arm shape, and leg shape. We utilized phenotypic selection models in order to determine the strength and type of selection acting directly on each trait. Our results show that directional selection acts on leg shape (a trait that is functionally related to locomotor performance), stabilizing selection acts on body mass and torso shape, and negative correlational selection acts on body mass and leg shape. We draw from biomechanical and kinematic studies of sifaka locomotion to provide a functional context for how these traits influence male mating competition within an arboreal environment. Verreaux's sifaka and many other gregarious lemurs are sexually monomorphic in body mass and canine size, despite a high frequency and intensity of male-male aggressive competition. Our results provide some insight into this paradox: in our population, there is no directional selection acting on body mass or canine size in males. The total pattern of selection implies that behaviors relating to locomotor performance are more important than behaviors relating to fighting ability during intrasexual contests.

7.
Size-related traits are common targets of natural selection, yet there is a relative paucity of data on selection among mammals, particularly from studies measuring lifetime reproductive success (LRS). We present the first phenotypic selection analysis using LRS on size-related traits in a large terrestrial carnivore, the spotted hyena, which displays a rare pattern of female-biased sexual size dimorphism (SSD). Using path analysis, we investigate the operation of selection to address hypotheses proposed to explain SSD in spotted hyenas. Ideal size measures are elusive, and allometric variation often obfuscates interpretation of size proxies. We adopt a novel approach integrating two common methods of assessing size, and demonstrate lifetime selection on size-related traits that scale hypoallometrically with overall body size. Our data support selection on hypoallometric traits in hyenas, but not on traits exhibiting isometric or hyperallometric scaling relationships, or on commonly used measures of overall body size. Our results represent the first estimate of lifetime selection on a large carnivore, and suggest a possible route for maintenance of female-biased SSD in spotted hyenas. Finally, our results highlight the importance of choosing appropriate measures when estimating animal body size, and suggest caution in interpreting selection on size-related traits as selection on size itself.

8.
The distribution of selection coefficients of new mutations is of key interest in population genetics. In this paper we explore how codon-based likelihood models can be used to estimate the distribution of selection coefficients of new amino acid replacement mutations from phylogenetic data. To obtain such estimates we assume that all mutations at the same site have the same selection coefficient. We first estimate the distribution of selection coefficients from two large viral data sets under the assumption that the viral population size is the same along all lineages of the phylogeny and that the selection coefficients vary among sites. We then implement several new models in which the lineages of the phylogeny may have different population sizes. We apply the new models to a data set consisting of the coding regions from eight primate mitochondrial genomes. The results suggest that there might be little power to determine the exact shape of the distribution of selection coefficients but that the normal and gamma distributions fit the data significantly better than the exponential distribution.

9.
Accounting for historical demographic features, such as the strength and timing of gene flow and divergence times between closely related lineages, is vital for many inferences in evolutionary biology. Approximate Bayesian computation (ABC) is one method commonly used to estimate demographic parameters. However, the DNA sequences used as input for this method, often microsatellites or RADseq loci, usually represent a small fraction of the genome. Whole genome sequencing (WGS) data, on the other hand, have been used less often with ABC, and questions remain about the potential benefit of, and how to best implement, this type of data; we used pseudo‐observed data sets to explore such questions. Specifically, we addressed the potential improvements in parameter estimation accuracy that could be associated with WGS data in multiple contexts; namely, we quantified the effects of (a) more data, (b) haplotype‐based summary statistics, and (c) locus length. Compared with a hypothetical RADseq data set with 2.5 Mbp of data, using a 1 Gbp data set consisting of 100 Kbp sequences led to substantial gains in the accuracy of parameter estimates, which was mostly due to haplotype statistics and increased data. We also quantified the effects of including (a) locus‐specific recombination rates, and (b) background selection information in ABC analyses. Importantly, assuming uniform recombination or ignoring background selection had a negative effect on accuracy in many cases. Software and results from this method validation study should be useful for future demographic history analyses.

10.
Natural selection operates via fitness components like mating success, fecundity, and longevity, which can be understood as intermediaries in the causal process linking traits to fitness. In particular, sexual selection occurs when traits influence mating or fertilization success, which, in turn, influences fitness. We show how to quantify both these steps in a single path analysis, leading to better estimates of the strength of sexual selection. Our model controls for confounding variables, such as body size or condition, when estimating the relationship between mating and reproductive success. Correspondingly, we define the Bateman gradient and the Jones index using partial rather than simple regressions, which better captures how they are commonly interpreted. The model can be applied both to purely phenotypic data and to quantitative genetic parameters estimated using information on relatedness. The phenotypic approach breaks down selection differentials into a sexually selected and a “remainder” component. The quantitative genetic approach decomposes the estimated evolutionary response to selection analogously. We apply our method to analyze sexual selection in male dusky pipefish, Syngnathus floridae, and in two simulated datasets. We highlight conceptual and statistical limitations of previous path‐based approaches, which can lead to substantial misestimation of sexual selection.

11.
The study of the mechanisms that maintain genetic variation has a long history in population genetics. We analyze a multilocus-multiallele model of frequency- and density-dependent selection in a large randomly mating population. The number of loci and the number of alleles per locus are arbitrary. The n loci are assumed to contribute additively to a quantitative character under stabilizing or directional selection as well as under frequency-dependent selection caused by intraspecific competition. We assume the strength of stabilizing selection to be weak, whereas the strength of frequency dependence may be arbitrary. Density-dependence is induced by population regulation. Our main result is a characterization of the equilibrium structure and its stability properties in terms of all parameters. It turns out that no equilibrium exists with more than two alleles segregating per locus. We give necessary and sufficient conditions on the strength of frequency dependence to ensure the maintenance of multilocus polymorphism. We also give explicit formulas for the number of polymorphic loci maintained at equilibrium. These results are based on the assumption that selection is sufficiently weak compared with recombination, so that linkage equilibrium can be assumed. If additionally the population size is assumed to be constant, we prove that the dynamics of the model form a generalized gradient system. For the model in its general form we are able to derive necessary and sufficient conditions for the stability of the monomorphic equilibria. Furthermore, we briefly analyze a special symmetric two-locus two-allele model for a constant population size but allowing for linkage disequilibrium. Finally, we analyze a single diallelic locus with dominance to illustrate the complications that can occur if the assumption of additivity is relaxed.

12.
13.
Understanding how selection operates on a set of phenotypic traits is central to evolutionary biology. Often, it requires estimating survival (or other fitness‐related life‐history traits) which can be difficult to obtain for natural populations because individuals cannot be exhaustively followed. To cope with this issue of imperfect detection, we advocate the use of mark‐recapture data and we provide a general framework for both the estimation of linear and nonlinear selection gradients and the visualization of fitness surfaces. To quantify the strength of selection, the standard second‐order polynomial regression method is integrated in mark‐recapture models. To visualize the form of selection, we use splines to display selection acting on multivariate phenotypes in the most flexible way. We employ Markov chain Monte Carlo sampling in a Bayesian framework to estimate model parameters, assessing trait relevance and calculating the optimal amount of smoothing. We illustrate our approach using data from a wild population of Common blackbirds (Turdus merula) to investigate survival in relation to morphological traits, and provide evidence for correlational selection using the new methodology. Overall, the framework we propose will help in exploring the full potential of mark‐recapture data to study natural selection.

14.
The interest in individualized medicines and upcoming or renewed regulatory requests to assess treatment effects in subgroups of confirmatory trials require statistical methods that account for selection uncertainty and selection bias after the search for meaningful subgroups has been performed. The challenge is to judge the strength of the apparent findings after mining the same data to discover them. In this paper, we describe a resampling approach that allows the subgroup-finding process to be replicated many times. The replicates are used to adjust the effect estimates for selection bias and to provide variance estimators that account for selection uncertainty. A simulation study provides some evidence of the performance of the method and an example from oncology illustrates its use.

15.
Jing Qin, Yu Shen. Biometrics, 2010, 66(2): 382-392
Summary: Length‐biased time‐to‐event data are commonly encountered in applications ranging from epidemiological cohort studies or cancer prevention trials to studies of labor economy. A longstanding statistical problem is how to assess the association of risk factors with survival in the target population given the observed length‐biased data. In this article, we demonstrate how to estimate these effects under the semiparametric Cox proportional hazards model. The structure of the Cox model is changed under length‐biased sampling in general. Although the existing partial likelihood approach for left‐truncated data can be used to estimate covariate effects, it may not be efficient for analyzing length‐biased data. We propose two estimating equation approaches for estimating the covariate coefficients under the Cox model. We use the modern stochastic process and martingale theory to develop the asymptotic properties of the estimators. We evaluate the empirical performance and efficiency of the two methods through extensive simulation studies. We use data from a dementia study to illustrate the proposed methodology, and demonstrate the computational algorithms for point estimates, which can be directly linked to the existing functions in S‐PLUS or R.

16.
Recent genome sequencing studies with large sample sizes in humans have discovered a vast quantity of low-frequency variants, providing an important source of information to analyze how selection is acting on human genetic variation. In order to estimate the strength of natural selection acting on low-frequency variants, we have developed a likelihood-based method that uses the lengths of pairwise identity-by-state between haplotypes carrying low-frequency variants. We show that in some nonequilibrium populations (such as those that have had recent population expansions) it is possible to distinguish between positive and negative selection acting on a set of variants. With our new framework, one can infer a fixed selection intensity acting on a set of variants at a particular frequency, or a distribution of selection coefficients for standing variants and new mutations. We show an application of our method to the UK10K phased haplotype dataset of individuals.

17.
For over a decade, experimental evolution has been combined with high-throughput sequencing techniques. In so-called Evolve-and-Resequence (E&R) experiments, populations are kept in the laboratory under controlled experimental conditions where their genomes are sampled and allele frequencies monitored. However, identifying signatures of adaptation in E&R datasets is far from trivial, and it is still necessary to develop more efficient and statistically sound methods for detecting selection in genome-wide data. Here, we present Bait-ER – a fully Bayesian approach based on the Moran model of allele evolution to estimate selection coefficients from E&R experiments. The model has overlapping generations, a feature that describes several experimental designs found in the literature. We tested our method under several different demographic and experimental conditions to assess its accuracy and precision, and it performs well in most scenarios. Nevertheless, some care must be taken when analysing trajectories where drift largely dominates and starting frequencies are low. We compare our method with other available software and report that ours has generally high accuracy even for trajectories whose complexity goes beyond a classical sweep model. Furthermore, our approach avoids the computational burden of simulating an empirical null distribution, outperforming available software in terms of computational time and facilitating its use on genome-wide data. We implemented and released our method in a new open-source software package that can be accessed at https://doi.org/10.5281/zenodo.7351736.
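For readers unfamiliar with the Moran model mentioned above, the following is a minimal forward-simulation sketch. It is not Bait-ER's implementation and does no Bayesian inference. It assumes a haploid population of fixed size N in which allele A has relative fitness 1 + s, one individual reproduces per step (chosen in proportion to fitness) and one individual dies (chosen uniformly). The function names and parameter values are hypothetical.

```python
import random


def fixation_probability(N, s, i0):
    """Classical Moran fixation probability for i0 copies of an allele
    with relative fitness 1 + s in a population of size N."""
    if s == 0:
        return i0 / N
    r = 1.0 + s
    return (1.0 - r ** (-i0)) / (1.0 - r ** (-N))


def simulate_moran(N, s, i0, max_steps, rng):
    """Return the trajectory of allele counts until loss, fixation or max_steps."""
    i = i0
    traj = [i]
    for _ in range(max_steps):
        if i == 0 or i == N:
            break
        total = i * (1.0 + s) + (N - i)
        parent_is_A = rng.random() < i * (1.0 + s) / total  # birth, by fitness
        dead_is_A = rng.random() < i / N                    # death, uniform
        if parent_is_A and not dead_is_A:
            i += 1
        elif dead_is_A and not parent_is_A:
            i -= 1
        traj.append(i)
    return traj


def simulated_fixation_rate(N, s, i0, runs, rng):
    """Fraction of replicate trajectories that end in fixation."""
    fixed = 0
    for _ in range(runs):
        if simulate_moran(N, s, i0, 100000, rng)[-1] == N:
            fixed += 1
    return fixed / runs


rng = random.Random(1)
print("theory:", fixation_probability(10, 0.5, 1))
print("simulation:", simulated_fixation_rate(10, 0.5, 1, 2000, rng))
```

Because only one individual is replaced per step, generations overlap, which is the feature the abstract highlights. When drift dominates (small s, low starting count), replicate trajectories vary widely, which is consistent with the caution the authors raise for such cases.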

18.
19.
The weak selection approximation of population genetics has made possible the analysis of social evolution under a considerable variety of biological scenarios. Despite its extensive usage, the accuracy of weak selection in predicting the emergence of altruism under limited dispersal when selection intensity increases remains unclear. Here, we derive the condition for the spread of an altruistic mutant in the infinite island model of dispersal under a Moran reproductive process and arbitrary strength of selection. The simplicity of the model allows us to compare weak and strong selection regimes analytically. Our results demonstrate that the weak selection approximation is robust to moderate increases in selection intensity and therefore provides a good approximation to understand the invasion of altruism in spatially structured populations. In particular, we find that the weak selection approximation is excellent even if selection is very strong, when either migration is much stronger than selection or when patches are large. Importantly, we emphasize that the weak selection approximation provides the ideal condition for the invasion of altruism, and increasing selection intensity will impede the emergence of altruism. We argue that this should also hold for more complicated life cycles and for culturally transmitted altruism. Using the weak selection approximation is therefore unlikely to miss out on any demographic scenarios that lead to the evolution of altruism under limited dispersal.

20.
HIV patients are treated by administration of combinations of antiretroviral drugs. The very large number of such combinations makes the manual search for an effective therapy practically impossible, especially in advanced stages of the disease. Therapy selection can be supported by statistical methods that predict the outcomes of candidate therapies. However, these methods are based on clinical data sets that have highly unbalanced therapy representation. This paper presents a novel approach that considers each drug belonging to a target combination therapy as a separate task in a multi-task hierarchical Bayes setting. The drug-specific models take into account information on all therapies containing the drug, not just the target therapy. In this way, we can circumvent the problem of data sparseness pertaining to some target therapies. The computational validation shows that compared to the most commonly used approach that provides therapy information in the form of input features, our model has significantly higher predictive power for therapies with very few training samples and is at least as powerful for abundant therapies.
