Similar articles
1.
The sample frequency spectrum of a segregating site is the probability distribution of a sample of alleles from a genetic locus, conditional on observing the sample to be polymorphic. This distribution is widely used in population genetic inferences, including statistical tests of neutrality in which a skew in the observed frequency spectrum across independent sites is taken as a signature of departure from neutral evolution. Theoretical aspects of the frequency spectrum have been well studied and several interesting results are available, but they are usually under the assumption that a site has undergone at most one mutation event in the history of the sample. Here, we extend previous theoretical results by allowing for at most two mutation events per site, under a general finite allele model in which the mutation rate is independent of current allelic state but the transition matrix is otherwise completely arbitrary. Our results apply to both nested and nonnested mutations. Only the former has been addressed previously, whereas here we show it is the latter that is more likely to be observed except for very small sample sizes. Further, for any mutation transition matrix, we obtain the joint sample frequency spectrum of the two mutant alleles at a triallelic site, and derive a closed-form formula for the expected age of the younger of the two mutations given their frequencies in the population. Several large-scale resequencing projects for various species are presently under way and the resulting data will include some triallelic polymorphisms. The theoretical results described in this paper should prove useful in population genomic analyses of such data.  相似文献   
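For orientation, the single-mutation baseline that the abstract above extends can be written down directly. The sketch below assumes the standard neutral, constant-size, infinite-sites model, under which a polymorphic site in a sample of n carries i derived copies with probability proportional to 1/i. It does not implement the paper's two-mutation or triallelic formulas, and the function name `neutral_sfs` is a hypothetical choice made for illustration.

```python
from fractions import Fraction


def neutral_sfs(n):
    """Return P(derived count = i | site is polymorphic) for i = 1..n-1.

    Assumes at most one mutation per site under the standard neutral model,
    so each class i has weight 1/i, normalised by the harmonic number H_{n-1}.
    """
    if n < 2:
        raise ValueError("need a sample of at least two alleles")
    weights = [Fraction(1, i) for i in range(1, n)]
    total = sum(weights)  # H_{n-1}
    return [w / total for w in weights]


if __name__ == "__main__":
    for i, p in enumerate(neutral_sfs(4), start=1):
        print(i, p)
```

Exact fractions are used so that the normalisation can be checked without rounding error.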

2.
Most SNPs in the human genome are biallelic; however, there are some sites that are triallelic. We show here that there are approximately twice as many triallelic sites as we would expect by chance. This excess does not appear to be caused by natural selection or mutational hotspots. Instead we propose that a new mutation can induce another mutation either within the same individual or subsequently during recombination. We provide evidence for this model by showing that the rarer two alleles at triallelic sites tend to cluster on phylogenetic trees of human haplotypes. However, we find no association between the density of triallelic sites and the rate of recombination, which leads us to suggest that triallelic sites might be generated by the simultaneous production of two new mutations within the same individual on the same genetic background. Under this model we estimate that simultaneous mutation contributes ∼3% of all distinct SNPs. We also show that there is a twofold excess of adjacent SNPs. Approximately half of these seem to be generated simultaneously since they have identical minor allele frequencies. We estimate that the mutation of adjacent nucleotides accounts for a little less than 1% of all SNPs.
Although the density of biallelic SNPs in the human genome is reasonably low, there are some sites that have three (triallelic sites) or even four nucleotides segregating in the human population. We show here that there are approximately twice as many triallelic sites as we would expect by chance. There are at least three mutational mechanisms that could potentially generate such an excess of triallelic sites. First, some sites may be hypermutable, and if the mutation rate of at least two pathways (e.g., C → T and C → A) is elevated at such sites, then there will be an excess of triallelic sites. The mutation rate of a site is known to depend upon the adjacent nucleotides, the best known example being the CpG dinucleotide (Coulondre et al. 1978; Bird 1980) at which the frequency of both transition and transversion mutations is elevated. However, other adjacent nucleotides also influence the mutation rate (Blake et al. 1992; Zhao et al. 2003; Hwang and Green 2004). Furthermore, we have recently shown that there is variation in the mutation rate that does not depend upon the identity of the adjacent nucleotides or any specific context (Hodgkinson et al. 2009).
Second, it is possible that two of the alleles at a triallelic site are generated simultaneously within a single individual. Point mutations are generally assumed to involve the production of a single new allele per mutation event at a rate that is governed by the effects mentioned above. However, it is not difficult to imagine mechanisms that might induce mutations on both strands of the DNA duplex; for example, the presence of a base mismatch may itself be unstable, so we might go from a G-C base pair to a G-A, which then may mutate to C-A; if DNA replication reads through this mismatch, the G allele will have mutated to both C and T. Alternatively, the mutation may occur across both strands of the duplex at the same time, possibly as a result of a chemical or radiation event. Third, in a similar manner, we might imagine a single SNP inducing subsequent mutations if base mismatches are formed during recombination in heteroduplex DNA.
Here we attempt to identify the cause of the excess of triallelic sites by analyzing sequence data around triallelic sites.

3.
Current procedures for inferring population history generally assume complete neutrality—that is, they neglect both direct selection and the effects of selection on linked sites. We here examine how the presence of direct purifying selection and background selection may bias demographic inference by evaluating two commonly-used methods (MSMC and fastsimcoal2), specifically studying how the underlying shape of the distribution of fitness effects and the fraction of directly selected sites interact with demographic parameter estimation. The results show that, even after masking functional genomic regions, background selection may cause the mis-inference of population growth under models of both constant population size and decline. This effect is amplified as the strength of purifying selection and the density of directly selected sites increases, as indicated by the distortion of the site frequency spectrum and levels of nucleotide diversity at linked neutral sites. We also show how simulated changes in background selection effects caused by population size changes can be predicted analytically. We propose a potential method for correcting for the mis-inference of population growth caused by selection. By treating the distribution of fitness effect as a nuisance parameter and averaging across all potential realizations, we demonstrate that even directly selected sites can be used to infer demographic histories with reasonable accuracy.  相似文献   

4.
The nucleotide composition of the genome is a balance between the origin and fixation rates of different mutations. For example, it is well-known that transitions occur more frequently than transversions, particularly at CpG sites. Differences in fixation rates of mutation types are less explored. Specifically, recombination-associated GC-biased gene conversion (gBGC) may differentially impact GC-changing mutations, due to differences in their genomic distributions and efficiency of mismatch repair mechanisms. Given that recombination evolves rapidly across species, we explore gBGC of different mutation types across human populations and great ape species. We report a stronger correlation between segregating GC frequency and recombination for transitions than for transversions. Notably, CpG transitions are most strongly affected by gBGC in humans and chimpanzees. We show that the overall strength of gBGC is generally correlated with effective population sizes in humans, with some notable exceptions, such as a stronger effect of gBGC on non-CpG transitions in populations of European descent. Furthermore, species of the Gorilla and Pongo genus have a greatly reduced gBGC effect on CpG sites. We also study the dependence of gBGC dynamics on flanking nucleotides and show that some mutation types evolve in opposition to the gBGC expectation, likely due to the hypermutability of specific nucleotide contexts. Our results highlight the importance of different gBGC dynamics experienced by GC-changing mutations and their impact on nucleotide composition evolution.  相似文献   

5.
6.
The allele frequency spectrum has attracted considerable interest for the simultaneous inference of the demographic and adaptive history of populations. In a recent study, Evans et al. (2007) developed a forward diffusion equation describing the allele frequency spectrum, when the population is subject to size changes, selection and mutation. From the diffusion equation, the authors derived a system of ordinary differential equations (ODEs) for the moments in a Wright–Fisher diffusion with varying population size and constant selection. Here, we present an explicit solution for this system of ODEs with variable population size, but without selection, and apply this result to derive the expected spectrum of a sample for time-varying population size. We use this forward-in-time-solution of the allele frequency spectrum to obtain the backward-in-time-solution previously derived via coalescent theory by Griffiths and Tavaré (1998). Finally, we discuss the applicability of the theoretical results to the analysis of nucleotide polymorphism data.  相似文献   

7.
Understanding how assemblages of species responded to past climate change is a central goal of comparative phylogeography and comparative population genomics, an endeavour that has increasing potential to integrate with community ecology. New sequencing technology now provides the potential to perform complex demographic inference at unprecedented resolution across assemblages of nonmodel species. To this end, we introduce the aggregate site frequency spectrum (aSFS), an expansion of the site frequency spectrum to use single nucleotide polymorphism (SNP) data sets collected from multiple, co‐distributed species for assemblage‐level demographic inference. We describe how the aSFS is constructed over an arbitrary number of independent population samples and then demonstrate how the aSFS can differentiate various multispecies demographic histories under a wide range of sampling configurations while allowing effective population sizes and expansion magnitudes to vary independently. We subsequently couple the aSFS with a hierarchical approximate Bayesian computation (hABC) framework to estimate degree of temporal synchronicity in expansion times across taxa, including an empirical demonstration with a data set consisting of five populations of the threespine stickleback (Gasterosteus aculeatus). Corroborating what is generally understood about the recent postglacial origins of these populations, the joint aSFS/hABC analysis strongly suggests that the stickleback data are most consistent with synchronous expansion after the Last Glacial Maximum (posterior probability = 0.99). The aSFS will have general application for multilevel statistical frameworks to test models involving assemblages and/or communities, and as large‐scale SNP data from nonmodel species become routine, the aSFS expands the potential for powerful next‐generation comparative population genomic inference.  相似文献   

8.
Identifying adaptively important loci in recently bottlenecked populations – be it natural selection acting on a population following the colonization of novel habitats in the wild, or artificial selection during the domestication of a breed – remains a major challenge. Here we report the results of a simulation study examining the performance of available population-genetic tools for identifying genomic regions under selection. To illustrate our findings, we examined the interplay between selection and demography in two species of Peromyscus mice, for which we have independent evidence of selection acting on phenotype as well as functional evidence identifying the underlying genotype. With this unusual information, we tested whether population-genetic-based approaches could have been utilized to identify the adaptive locus. Contrary to published claims, we conclude that the use of the background site frequency spectrum as a null model is largely ineffective in bottlenecked populations. Results are quantified both for site frequency spectrum and linkage disequilibrium-based predictions, and are found to hold true across a large parameter space that encompasses many species and populations currently under study. These results suggest that the genomic footprint left by selection on both new and standing variation in strongly bottlenecked populations will be difficult, if not impossible, to find using current approaches.  相似文献   

9.
The constant removal of deleterious mutations by natural selection causes a reduction in neutral diversity and efficacy of selection at genetically linked sites (a process called Background Selection, BGS). Population genetic studies, however, often ignore BGS effects when investigating demographic events or the presence of other types of selection. To obtain a more realistic evolutionary expectation that incorporates the unavoidable consequences of deleterious mutations, we generated high-resolution landscapes of variation across the Drosophila melanogaster genome under a BGS scenario independent of polymorphism data. We find that BGS plays a significant role in shaping levels of variation across the entire genome, including long introns and intergenic regions distant from annotated genes. We also find that a very large percentage of the observed variation in diversity across autosomes can be explained by BGS alone, up to 70% across individual chromosome arms at 100-kb scale, thus indicating that BGS predictions can be used as baseline to infer additional types of selection and demographic events. This approach allows detecting several outlier regions with signal of recent adaptive events and selective sweeps. The use of a BGS baseline, however, is particularly appropriate to investigate the presence of balancing selection and our study exposes numerous genomic regions with the predicted signature of higher polymorphism than expected when a BGS context is taken into account. Importantly, we show that these conclusions are robust to the mutation and selection parameters of the BGS model. Finally, analyses of protein evolution together with previous comparisons of genetic maps between Drosophila species, suggest temporally variable recombination landscapes and, thus, local BGS effects that may differ between extant and past phases. 
Because genome-wide BGS and temporal changes in linkage effects can skew approaches to estimate demographic and selective events, future analyses should incorporate BGS predictions and capture local recombination variation across genomes and along lineages.

10.
The rate at which new mutations arise in the genome is a key factor in the evolution and adaptation of species. Here we describe the rate and spectrum of spontaneous mutations for the fission yeast Schizosaccharomyces pombe, a key model organism with many similarities to higher eukaryotes. We undertook an ∼1700-generation mutation accumulation (MA) experiment with a haploid S. pombe, generating 422 single-base substitutions and 119 insertion-deletion mutations (indels) across the 96 replicates. This equates to a base-substitution mutation rate of 2.00 × 10^-10 mutations per site per generation, similar to that reported for the distantly related budding yeast Saccharomyces cerevisiae. However, these two yeast species differ dramatically in their spectrum of base substitutions, the types of indels (S. pombe is more prone to insertions), and the pattern of selection required to counteract a strong AT-biased mutation rate. Overall, our results indicate that GC-biased gene conversion does not play a major role in shaping the nucleotide composition of the S. pombe genome and suggest that the mechanisms of DNA maintenance may have diverged significantly between fission and budding yeasts. Unexpectedly, CpG sites appear to be excessively liable to mutation in both species despite the likely absence of DNA methylation.
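The rate quoted in the abstract above follows from simple arithmetic on the reported counts. The sketch below assumes the usual mutation-accumulation estimator, substitutions divided by (lines × generations × sites surveyed); the abstract does not state the number of sites, so the code only back-calculates the value implied by the reported figures. The function names are hypothetical, and the result is a rough check because the generation count is given as approximate.

```python
def per_site_rate(n_mutations, n_lines, n_generations, n_sites):
    """Mutations per site per generation in a haploid MA experiment."""
    return n_mutations / (n_lines * n_generations * n_sites)


def implied_sites(n_mutations, n_lines, n_generations, rate):
    """Number of surveyed sites implied by a reported per-site rate."""
    return n_mutations / (n_lines * n_generations * rate)


# Figures taken from the abstract: 422 substitutions, 96 lines,
# about 1700 generations, 2.00e-10 substitutions per site per generation.
sites = implied_sites(422, 96, 1700, 2.00e-10)

if __name__ == "__main__":
    print(f"implied surveyed sites: {sites:.3e}")
    print(f"round-trip rate: {per_site_rate(422, 96, 1700, sites):.2e}")
```

The implied value is on the order of ten million sites per line, which is a consistency check on the abstract's numbers rather than a figure reported by the authors.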

11.
In metapopulations in which habitat patches vary in quality and occupancy it can be complicated to calculate the net time-averaged contribution to reproduction of particular populations. Surprisingly, few indices have been proposed for this purpose. We combined occupancy, abundance, frequency of occurrence, and reproductive success to determine the net value of different sites through time and applied this method to a bird of conservation concern. The Tricolored Blackbird (Agelaius tricolor) has experienced large population declines, is the most colonial songbird in North America, is largely confined to California, and breeds itinerantly in multiple habitat types. It has had chronically low reproductive success in recent years. Although young produced per nest have previously been compared across habitats, no study has simultaneously considered site occupancy and reproductive success. Combining occupancy, abundance, frequency of occurrence, reproductive success and nest failure rate, we found that large colonies in grain fields fail frequently because of nest destruction due to harvest prior to fledging. Consequently, net time-averaged reproductive output is low compared to colonies in non-native Himalayan blackberry or thistles, and native stinging nettles. Cattail marshes have intermediate reproductive output, but their reproductive output might be improved by active management. Harvest of grain-field colonies necessitates either promoting delay of harvest or creating alternative, more secure nesting habitats. Stinging nettle and marsh colonies offer the main potential sources for restoration or native habitat creation. From 2005 to 2011, breeding site occupancy declined three times faster than new breeding colonies were formed, indicating a rapid decline in occupancy. Total abundance showed a similar decline. Causes of variation in the value for reproduction of nesting substrates and factors behind continuing population declines merit urgent investigation. The method we employ should be useful in other metapopulation studies for calculating time-averaged reproductive output for different sites.

12.
13.
Genetic and epigenetic alterations are required for carcinogenesis and the mutation burden across tumor types has been investigated. Here, we investigate epigenetic alterations with a novel measure of global DNA methylation dysregulation, the methylation dysregulation index (MDI), across 14 cancer types in The Cancer Genome Atlas (TCGA) database. DNA methylation data—obtained using Illumina HumanMethylation450 BeadChip—was accessed from TCGA. We calculated the MDI in 14 tumor types (n = 5,592 tumors), using adjacent normal tissues (n = 701) from each tumor site. Copy number alteration, and mutation burden were retrieved from cBioportal (n = 5,152). We tested the relation of subject MDI across tumors and with age, gender, tumor stage, estimated tumor purity, and copy number alterations for both overall MDI and genomic-context-specific MDI. We also investigated the top most dysregulated loci shared across tumor types. There was a broad range of extent in methylation dysregulation across tumor types (P < 2.2E-16). However, a consistent pattern of methylation dysregulation stratified by genomic context was observed across tumor types where the highest dysregulation occurred at non-CpG island regions. Considering other summary measures of somatic alteration, MDI was correlated with copy number alterations but not with mutation burden. Using the top dysregulated CpG sites in common across tumors, 4 classes of cancer types were observed, and the functional consequences of these alterations to gene expression were confirmed. This work identified the global DNA methylation dysregulation patterns across 14 cancer types showing a higher impact for the non-CpG island areas. The most dysregulated loci across cancer types identified common clusters across cancer types that may have implications for future treatment and prevention measures.  相似文献   

14.
Previous studies have shown that the pattern of single nucleotide polymorphism (SNP) in Arabidopsis (Arabidopsis thaliana) deviates from the distribution expected under a neutral model. Here, we test whether or not ancestral misinference could explain this deviation. We start by showing that there are significant and complex influences of context on mutation dynamics as inferred from SNP frequency, in Arabidopsis, and compare the results to observations about context dependency that have been made on a previous analysis of a maize (Zea mays) SNP dataset. The data concerning heterogeneity across sites are then used to make corrections for ancestral misinference in a context-dependent manner. Using Arabidopsis lyrata to infer the ancestral state for SNPs, we show that the resulting unfolded site frequency spectrum (SFS) in Arabidopsis is skewed toward sites with high frequency derived nucleotides. Sites are also partitioned into two general functional classes, second codon position and 4-fold degenerate sites. These two classes show different SFS; although both show an overrepresentation of high frequency derived sites, low frequency derived sites are vastly overrepresented at the second codon position, but significantly underrepresented at 4-fold degenerate sites. We find that these results are robust to corrections for ancestral misinference, even when context-dependent variation in mutation properties is taken into consideration. The data suggest that, in addition to purifying selection, complex demographic events and/or linked positive selection need to be invoked to explain the SFS, and they highlight the importance of sequence context in analyses of genome-wide variation.
Analyses of site frequency spectra (SFS) from single nucleotide polymorphism (SNP) datasets provide a powerful method for making inferences about selection (Akashi, 1999; Bustamante et al., 2001; Hernandez et al., 2007).
The allele frequency distribution expected under a neutral model (Tajima, 1989) can be applied to datasets for which an outgroup is available by unfolding the distribution using the assumption of parsimony. Deviation of this distribution from the neutral model provides insights about the role of selection or demographics; an overabundance of high frequency derived sites is frequently attributed to either recurrent positive selection (Bustamante et al., 2001; Caicedo et al., 2007), a population bottleneck (Caicedo et al., 2007), or hidden population substructure (Wakeley and Aliacar, 2001; Hernandez et al., 2007), whereas an excess of low frequency derived sites is commonly explained as a result of constraining selection or a recent population expansion (Slatkin and Hudson, 1991; Hernandez et al., 2007).
Arabidopsis (Arabidopsis thaliana) represents one of the most intensively studied model organisms for molecular population genetics, and several genome-scale patterns of nucleotide variation have been generated (Nordborg et al., 2005; Schmid et al., 2005; Borevitz et al., 2007; Clark et al., 2007). These studies have shown evidence for genome-wide departures from a standard neutral population genetic model assuming constant population size. One recurring pattern is that minor allele frequencies tend to be skewed such that there is an excess of rare variants across the genome (Nordborg et al., 2005; Schmid et al., 2005). This pattern has typically been interpreted as evidence for population expansion, although other aspects of the genome-wide data, including a high variance in diversity across loci (Nordborg et al., 2005), appear inconsistent with a simple model of population growth.
Furthermore, amino acid substitutions typically show a larger excess of rare variants (Foxe et al., 2008), suggestive of weak purifying selection across the genome.
One limitation with these analyses is that outgroup data have rarely been available, restricting the ability to infer the derived frequency spectrum and thus distinguish new low frequency mutations from high frequency derived variants. Instead, these analyses implicitly rely on the theoretical prediction that the probability that an allele is ancestral is equal to its frequency (Watterson and Guess, 1977). In principle, the polarized frequency spectrum should provide considerably more information on the genome-wide patterns of variation and more power to infer the direction and strength of selection (Sawyer and Hartl, 1992; Akashi, 1999). However, a potential difficulty with the use of an outgroup to infer the ancestral and derived states at a given site is that the outgroup state is typically taken as ancestral under a parsimony assumption. This means that parallel changes could result in a misinference of the ancestral state, and this would generally lead to a skew toward sites with a high frequency of the derived state and, therefore, a potential for generating a spurious signature of positive selection or demographic effect (Baudry and Depaulis, 2003; Hernandez et al., 2007). Furthermore, given differences in effective mutation rates across different classes of sites, there may be biased rates of ancestral misinference, which can also lead to problems when inferring the strength of selection on different types of substitution. Given this potentially confounding effect of ancestral misinference, methods have been proposed to correct the SFS (e.g. Baudry and Depaulis, 2003; Hernandez et al., 2007).
Any correction for ancestral misinference must be based on an adequate substitution model.
In the case of plant genomes, including the maize (Zea mays) nuclear genome, it is well established that relative mutation rates vary significantly across sites as a function of context or the composition of surrounding nucleotides (Morton, 1995, 2003; Morton et al., 2006; Moore and Stevens, 2008) and similar context dependency has been observed in other genomes (Blake et al., 1992; Hess et al., 1994; Krawczak et al., 1998; Zhao and Boerwinkle, 2002). One prominent feature of context dependency is the CpG effect, or an increased rate of transitions at CG dinucleotides as a result of the relatively rapid deamination of methylated cytosines at many such sites (Bulmer, 1986; Zhao and Boerwinkle, 2002; Morton et al., 2006). More complex patterns of context dependency have also been observed in nuclear DNA of maize, where it has been shown that transition and transversion rates are significantly influenced by local and regional composition, but in different manners, and that the rate of mutation of GC and AT base pairs are affected differently by context (Morton et al., 2006).
When complex context dependency exists, correcting for ancestral misinference would require that site context be taken into consideration (Hernandez et al., 2007). Therefore, we begin by analyzing heterogeneity across sites in Arabidopsis as a function of context. We find that mutation dynamics are influenced in a complex manner by both composition of flanking nucleotides and regional A+T content. These findings are compared to the context effects that have been observed in maize (Morton et al., 2006). We then analyze the unfolded SFS, with Arabidopsis lyrata as the outgroup, using the method of Baudry and Depaulis (2003) to account for ancestral misinference. To account for the influence of context on mutation dynamics, sites are partitioned by the number of flanking A/T base pairs because this was found to be a major contributing factor to context effects.
Sites were also partitioned by codon position and degeneracy to account, approximately, for functional effects. An SFS was then generated for sites within each of the separate partitions and each spectrum was corrected using mutation parameters for that partition.
We find that the excess of high frequency sites cannot be explained by ancestral misinference. In addition, second codon position sites show an excess of low frequency sites and 4-fold degenerate sites show a significant deficit of low frequency sites; both of these features remain after the correction. We suggest that complex demographic history and/or the action of positive selection have had a major effect on genome-wide patterns of variation, and we confirm the predominance of slightly deleterious amino acid polymorphisms in the Arabidopsis genome.
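The parsimony polarization described in this entry, in which the outgroup state is taken as ancestral, can be sketched in a few lines. The input layout (one string of sampled alleles plus one outgroup allele per site) and the name `unfolded_sfs` are assumptions made for illustration. The sketch applies no correction for ancestral misinference, so it is not the Baudry and Depaulis (2003) method used by the authors.

```python
from collections import Counter


def unfolded_sfs(sites, n):
    """Count sites by number of derived alleles, polarizing by parsimony.

    sites: iterable of (sample_alleles, outgroup_allele), where
           sample_alleles is a string of length n (one character per allele).
    Returns a list sfs of length n + 1; sfs[i] is the number of biallelic
    sites carrying i derived copies. Entries 0 and n stay empty because
    only polymorphic sites are counted.
    """
    sfs = [0] * (n + 1)
    for alleles, outgroup in sites:
        if len(alleles) != n:
            raise ValueError("every site must have exactly n sampled alleles")
        counts = Counter(alleles)
        # Skip monomorphic and multiallelic sites, and sites where the
        # outgroup allele is absent from the sample (cannot be polarized).
        if len(counts) != 2 or outgroup not in counts:
            continue
        derived = n - counts[outgroup]  # outgroup state is taken as ancestral
        sfs[derived] += 1
    return sfs


if __name__ == "__main__":
    example = [("AAAT", "A"), ("AATT", "T"), ("CCCG", "G"), ("AAAA", "A")]
    print(unfolded_sfs(example, 4))
```

A parallel change on the outgroup lineage would make a low-frequency derived allele appear here as a high-frequency one, which is the skew the entry attributes to ancestral misinference.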

15.
Etienne Rajon  Joanna Masel 《Genetics》2013,193(4):1209-1220
Cryptic genetic sequences have attenuated effects on phenotypes. In the classic view, relaxed selection allows cryptic genetic diversity to build up across individuals in a population, providing alleles that may later contribute to adaptation when co-opted—e.g., following a mutation increasing expression from a low, attenuated baseline. This view is described, for example, by the metaphor of the spread of a population across a neutral network in genotype space. As an alternative view, consider the fact that most phenotypic traits are affected by multiple sequences, including cryptic ones. Even in a strictly clonal population, the co-option of cryptic sequences at different loci may have different phenotypic effects and offer the population multiple adaptive possibilities. Here, we model the evolution of quantitative phenotypic characters encoded by cryptic sequences and compare the relative contributions of genetic diversity and of variation across sites to the phenotypic potential of a population. We show that most of the phenotypic variation accessible through co-option would exist even in populations with no polymorphism. This is made possible by a history of compensatory evolution, whereby the phenotypic effect of a cryptic mutation at one site was balanced by mutations elsewhere in the genome, leading to a diversity of cryptic effect sizes across sites rather than across individuals. Cryptic sequences might accelerate adaptation and facilitate large phenotypic changes even in the absence of genetic diversity, as traditionally defined in terms of alternative alleles.  相似文献   

16.
Ionising radiation induces clustered DNA damage sites which pose a severe challenge to the cell’s repair machinery, particularly base excision repair. To date, most studies have focussed on two-lesion clusters. We have designed synthetic oligonucleotides to give a variety of three-lesion clusters containing abasic sites and 8-oxo-7, 8-dihydroguanine to investigate if the hierarchy of lesion processing dictates whether the cluster is cytotoxic or mutagenic. Clusters containing two tandem 8-oxoG lesions opposing an AP site showed retardation of repair of the AP site with nuclear extract and an elevated mutation frequency after transformation into wild-type or mutY Escherichia coli. Clusters containing bistranded AP sites with a vicinal 8-oxoG form DSBs with nuclear extract, as confirmed in vivo by transformation into wild-type E. coli. Using ung1 E. coli, we propose that DSBs arise via lesion processing rather than stalled replication in cycling cells. This study provides evidence that it is not only the prompt formation of DSBs that has implications on cell survival but also the conversion of non-DSB clusters into DSBs during processing and attempted repair. The inaccurate repair of such clusters has biological significance due to the ultimate risk of tumourigenesis or as potential cytotoxic lesions in tumour cells.  相似文献   

17.
Using coalescent simulations, we study the impact of three different sampling schemes on patterns of neutral diversity in structured populations. Specifically, we are interested in two summary statistics based on the site frequency spectrum as a function of migration rate, demographic history of the entire substructured population (including timing and magnitude of specieswide expansions), and the sampling scheme. Using simulations implementing both finite-island and two-dimensional stepping-stone spatial structure, we demonstrate strong effects of the sampling scheme on Tajima's D (DT) and Fu and Li's D (DFL) statistics, particularly under specieswide (range) expansions. Pooled samples yield average DT and DFL values that are generally intermediate between those of local and scattered samples. Local samples (and to a lesser extent, pooled samples) are influenced by local, rapid coalescence events in the underlying coalescent process. These processes result in lower proportions of external branch lengths and hence lower proportions of singletons, explaining our finding that the sampling scheme affects DFL more than it does DT. Under specieswide expansion scenarios, these effects of spatial sampling may persist up to very high levels of gene flow (Nm > 25), implying that local samples cannot be regarded as being drawn from a panmictic population. Importantly, many data sets on humans, Drosophila, and plants contain signatures of specieswide expansions and effects of sampling scheme that are predicted by our simulation results. This suggests that validating the assumption of panmixia is crucial if robust demographic inferences are to be made from local or pooled samples. However, future studies should consider adopting a framework that explicitly accounts for the genealogical effects of population subdivision and empirical sampling schemes.  相似文献   
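Tajima's D, one of the two statistics examined in this entry, can be computed directly from an unfolded site frequency spectrum. The sketch below uses the standard constants from Tajima (1989). The input convention, where `sfs[i - 1]` holds the number of sites with i derived copies for i = 1..n-1, and the function name are assumptions made for illustration. Fu and Li's D is not implemented.

```python
import math


def tajimas_d(sfs):
    """Tajima's D from an unfolded SFS over classes i = 1..n-1."""
    n = len(sfs) + 1
    s = sum(sfs)  # number of segregating sites
    if n < 3 or s == 0:
        raise ValueError("need n >= 3 and at least one segregating site")

    # Mean pairwise differences: a site with i derived copies contributes
    # i * (n - i) differing pairs out of n * (n - 1) / 2 pairs.
    pi = sum(2 * i * (n - i) * x for i, x in enumerate(sfs, start=1)) / (n * (n - 1))

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    print(tajimas_d([10, 0, 0]))  # only singletons: D is negative
    print(tajimas_d([0, 10, 0]))  # only intermediate-frequency sites: D is positive
```

Because singletons pull D downward, a sampling scheme that lowers the proportion of singletons, as the entry reports for local samples, shifts the statistic upward.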

18.
Genotype-phenotype correlation in hypertrophic cardiomyopathy (HCM) has been challenging because of genetic and clinical heterogeneity. To determine the mutation profile of Chinese patients with HCM and to correlate genotypes with phenotypes, we performed a systematic mutation screening of the eight most commonly mutated genes encoding sarcomere proteins in 200 unrelated Chinese adult patients using direct DNA sequencing. A total of 98 mutations were identified in 102 mutation carriers. The frequencies of mutations in MYH7, MYBPC3, TNNT2 and TNNI3 were 26.0%, 18.0%, 4.0% and 3.5%, respectively. Among the 200 genotyped HCM patients, 83 harbored a single mutation, and 19 (9.5%) harbored multiple mutations. The number of mutations was positively correlated with the maximum wall thickness. We found that neither any particular gene nor any specific mutation was correlated with clinical phenotype. In summary, the frequency of multiple mutations was greater in Chinese HCM patients than in the Caucasian population. Multiple mutations in sarcomere protein genes may be a risk factor for increased left ventricular wall thickness.

19.
Feng Gao  Alon Keinan 《Genetics》2016,202(1):235-245
The site frequency spectrum (SFS) and other genetic summary statistics are at the heart of many population genetic studies. Previous studies have shown that human populations have undergone a recent epoch of fast growth in effective population size. These studies assumed that growth is exponential, and the ensuing models leave unexplained an excess of extremely rare variants. This suggests that human populations might have experienced recent growth at a speed faster than exponential. Recent studies have introduced a generalized growth model in which the growth speed can be faster or slower than exponential. However, only simulation approaches were available for obtaining summary statistics under such generalized models. In this study, we provide expressions to accurately and efficiently evaluate the SFS and other summary statistics under generalized models, which we further implement in publicly available software. Investigating the power to infer deviation of growth from exponential, we observed that adequate sample sizes facilitate accurate inference; e.g., a sample of 3000 individuals with the amount of data expected from exome sequencing allows observing and accurately estimating growth with speed deviating by ≥10% from that of exponential. Applying our inference framework to data from the NHLBI Exome Sequencing Project, we found that a model with a generalized growth epoch fits the observed SFS significantly better than the equivalent model with exponential growth (P-value = 3.85 × 10^-6). The estimated growth speed significantly deviates from exponential (P-value ≪ 10^-12), with the best-fit estimate being a growth speed 12% faster than exponential.

20.
鲁有望  王昆华 《遗传》2017,39(6):482-490
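Colorectal cancer (CRC) is one of the common lethal tumor types in China. Predicting the efficacy of anti-EGFR monoclonal antibody therapy from the somatic mutation profile has become a standard step in the treatment of metastatic colorectal cancer (mCRC). Because metastatic samples are difficult to obtain in clinical practice, the primary tumor has to be tested in their place. Genetic heterogeneity between primary tumors and their paired metastases can mean that sampling the primary lesion fails to represent the mutation profile of the metastasis. The extent of genetic heterogeneity between primary and paired metastatic CRC tumors remains controversial. This article reviews comparative studies of the genomic profiles of primary and paired metastatic CRC, and discusses the causes of genetic heterogeneity between primary and paired metastatic tumors as well as strategies for addressing it.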
