Similar articles
 20 similar articles found (search time: 296 ms)
1.
Ptak SE, Voelpel K, Przeworski M. Genetics 2004, 167(1):387-397
An ability to predict levels of linkage disequilibrium (LD) between linked markers would facilitate the design of association studies and help to distinguish between evolutionary models. Unfortunately, levels of LD depend crucially on the rate of recombination, a parameter that is difficult to measure. In humans, rates of genetic exchange between markers megabases apart can be estimated from a comparison of genetic and physical maps; these large-scale estimates can then be interpolated to predict LD at smaller ("local") scales. However, if there is extensive small-scale heterogeneity, as has been recently proposed, local rates of recombination could differ substantially from those averaged over much larger distances. We test this hypothesis by estimating local recombination rates indirectly from patterns of LD in 84 genomic regions surveyed by the SeattleSNPs project in a sample of individuals of European descent and of African-Americans. We find that LD-based estimates are significantly positively correlated with map-based estimates. This implies that large-scale, average rates are informative about local rates of recombination. Conversely, although LD-based estimates are based on a number of simplifying assumptions, it appears that they capture considerable information about the underlying recombination rate or at least about the ordering of regions by recombination rate. Using LD-based estimators, we also find evidence for homologous gene conversion in patterns of polymorphism. However, as we demonstrate by simulation, inferences about gene conversion are unreliable, even with extensive data from homogeneous regions of the genome, and are confounded by genotyping error.

2.
Despite the growing consensus on the importance of testing gene-gene interactions in genetic studies of complex diseases, the effect of gene-gene interactions has often been defined as a deviance from genetic additive effects, which is essentially treated as a residual term in genetic analysis and leads to low power in detecting the presence of interacting effects. To what extent the definition of gene-gene interaction at population level reflects the genes' biochemical or physiological interaction remains a mystery. In this article, we introduce a novel definition and a new measure of gene-gene interaction between two unlinked loci (or genes). We developed a general theory for studying linkage disequilibrium (LD) patterns in disease population under two-locus disease models. The properties of using the LD measure in a disease population as a function of the measure of gene-gene interaction between two unlinked loci were also investigated. We examined how interaction between two loci creates LD in a disease population and showed that the mathematical formulation of the new definition for gene-gene interaction between two loci was similar to that of the LD between two loci. This finding motivated us to develop an LD-based statistic to detect gene-gene interaction between two unlinked loci. The null distribution and type I error rates of the LD-based statistic for testing gene-gene interaction were validated using extensive simulation studies. We found that the new test statistic was more powerful than the traditional logistic regression under three two-locus disease models and demonstrated that the power of the test statistic depends on the measure of gene-gene interaction. We also investigated the impact of using tagging SNPs for testing interaction on the power to detect interaction between two unlinked loci. Finally, to evaluate the performance of our new method, we applied the LD-based statistic to two published data sets. Our results showed that the P values of the LD-based statistic were smaller than those obtained by other approaches, including logistic regression models.

3.
STRUCTURE is the most widely used clustering software for detecting population genetic structure. The latest version of this software (STRUCTURE 2.1) was recently enhanced to take into account linkage disequilibrium (LD) caused by admixture between populations. This version, however, still does not consider the effects of strong background LD caused by genetic drift, which may cause spurious results. The STRUCTURE authors have therefore suggested a rough threshold distance (1.0 cM) between two loci, below which the pair of loci should not be used. Because LD is sensitive to demographic events, the distance between loci is not always a good indicator of the strength of LD. In this study, we examine the link between genomic distance and the strength of the correlation between loci (r(LD)) in a free-ranging population of mouflon (Ovis aries), and we present an empirical test of the effect of r(LD) on the clustering results provided by the linkage model in STRUCTURE. We showed that a high r(LD) value increases the probability of detecting spurious clustering. We propose using r(LD) as an index for deciding whether or not to use a pair of loci in a clustering analysis.

4.
Linkage disequilibrium and the mapping of complex human traits. (Total citations: 30; self-citations: 0; citations by others: 30)
The potential value of haplotypes defined by several single nucleotide polymorphisms has attracted recent interest. With sufficient linkage disequilibrium (LD), haplotypes could be used in association studies to map common alleles that might influence the susceptibility to common diseases, as well as for reconstructing the evolution of the genome. It has been proposed that a globally useful resource need only be based on high frequency variants, identified from a few modest samples. Rapid progress has been made in quantifying the pattern of human LD and haplotypes defined by such common variants within and among populations. However, the quality and utility of the proposed LD-based resource could be seriously compromised if important sampling and analytical factors are overlooked in its design. The LD map should be based on adequately justified criteria defined by sound population genetic principles.

5.
Linkage disequilibrium (LD) in crops, established by domestication and early breeding, can be a valuable basis for mapping the genome. We undertook an assessment of LD in sugarcane (Saccharum spp.), characterized by one of the most complex crop genomes, with its high ploidy level (≥8) and chromosome number (>100) as well as its interspecific origin. Using AFLP markers, we surveyed 1,537 polymorphisms among 72 modern sugarcane cultivars. We exploited information from available genetic maps to determine a relevant statistical threshold that discriminates marker associations due to linkage from other associations. LD is very common among closely linked markers and steadily decreases within a 0-30 cM window. Many instances of linked markers cannot be recognized due to the confounding effect of polyploidy. However, LD within a sample of cultivars appears as efficient as linkage analysis within a controlled progeny in terms of assigning markers to cosegregation groups. Saturating the genome coverage remains a challenge, but applying LD-based mapping within breeding programs will considerably speed up the localization of genes controlling important traits by making use of phenotypic information produced in the course of selection.

6.
Analyses of high-density SNPs in genetic studies have the potential problems of prohibitive genotyping costs and inflated false discovery rates. Current methods select subsets of representative SNPs (tagSNPs) using information either on potential biologic functionality of the SNPs or on the underlying linkage disequilibrium (LD) structure, but not both. Combining the two types of information may lead to more effective tagSNP selection. The proposed method combines both functional and LD information using a weighted factor analysis (WFA) model. The WFA was applied to the dense SNP collection from 129 genes sequenced by the SeattleSNPs Program for Genomic Application. TagSNPs selected by WFA were compared with those selected by an LD-based method. WFA allowed prioritization of SNPs that would otherwise share equivalent ranking due to underlying LD structure alone. Furthermore, WFA consistently included SNPs not selected by function or by LD alone. A literature review of a subset of genes revealed that SNPs selected by WFA were more likely represented in published reports.

7.
Linkage disequilibrium (LD) is the nonrandom association of alleles at two markers. Patterns of LD have biological implications as well as practical ones when designing association studies or conservation programs aimed at identifying the genetic basis of fitness differences within and among populations. However, the temporal dynamics of LD in wild populations has received little empirical attention. In this study, we examined the overall extent of LD, the effect of sample size on the accuracy and precision of LD estimates, and the temporal dynamics of LD in two populations of bighorn sheep (Ovis canadensis) with different demographic histories. Using over 200 microsatellite loci, we assessed two metrics of multi-allelic LD, D′ and χ′². We found that both populations exhibited high levels of LD, although the extent was much shorter in a native population than one that was founded via translocation, experienced a prolonged bottleneck post founding, followed by recent admixture. In addition, we observed significant variation in LD in relation to the sample size used, with small sample sizes leading to depressed estimates of the extent of LD but inflated estimates of background levels of LD. In contrast, there was not much variation in LD among yearly cross-sections within either population once sample size was accounted for. Lack of pronounced interannual variability suggests that researchers may not have to worry about interannual variation when estimating LD in a population and can instead focus on obtaining the largest sample size possible.
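The definition of LD as a nonrandom association of alleles can be made concrete with a small calculation. The sketch below is not the multi-allelic D′ or χ′² used in the study above; it assumes the simpler textbook case of two biallelic loci, and the function name `ld_stats` and its arguments are hypothetical choices for illustration.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of allele A and allele B
    (Illustrative biallelic case only; an assumption, not the study's estimator.)
    """
    # D is the excess of the observed haplotype frequency over the
    # frequency expected if the two alleles were independent.
    d = p_ab - p_a * p_b

    # D' rescales D by the largest magnitude it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    # Alleles always co-occur: complete LD.
    print(ld_stats(0.5, 0.5, 0.5))
    # Haplotype frequency equals the product of allele frequencies: no LD.
    print(ld_stats(0.25, 0.5, 0.5))
```

With real data the haplotype and allele frequencies are themselves estimates, which is one reason small samples can distort LD values as the abstract describes.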

8.
Mason Liang, Rasmus Nielsen. Genetics 2014, 197(3):953-967
The distribution of admixture tract lengths has received considerable attention, in part because it can be used to infer the timing of past gene flow events between populations. It is commonly assumed that these lengths can be modeled as independently and identically distributed (iid) exponential random variables. This assumption is fundamental for many popular methods that analyze admixture using hidden Markov models. We compare the expected distribution of admixture tract lengths under a number of population-genetic models to the distribution predicted by the Wright–Fisher model with recombination. We show that under the latter model, the assumption of iid exponential tract lengths does not hold for recent or for ancient admixture events and that relying on this assumption can lead to false positives when inferring the number of admixture events. To further investigate the tract-length distribution, we develop a dyadic interval-based stochastic process for generating admixture tracts. This representation is useful for analyzing admixture tract-length distributions for populations with recent admixture, a scenario in which existing models perform poorly.
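To illustrate the iid exponential assumption that the abstract above examines (and finds wanting), here is a minimal sketch that draws tract lengths from a single exponential distribution and recovers the rate by maximum likelihood. It is not the authors' dyadic interval process; the function names, the rate value, and the sample size are hypothetical.

```python
import random


def simulate_tracts(rate, n, seed=0):
    """Draw n tract lengths as iid exponential variables with the given rate."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.expovariate(rate) for _ in range(n)]


def estimate_rate(tract_lengths):
    """Maximum-likelihood estimate of an exponential rate: 1 / sample mean."""
    return len(tract_lengths) / sum(tract_lengths)


if __name__ == "__main__":
    tracts = simulate_tracts(rate=10.0, n=10_000)  # hypothetical rate
    print(estimate_rate(tracts))  # should be close to 10 under this toy model
```

Methods that rely on this assumption effectively read the timing of admixture off such a rate estimate, which is why departures from the exponential form matter.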

9.
Model-based methods for genetic clustering of individuals, such as those implemented in STRUCTURE or ADMIXTURE, allow the user to infer individual ancestries and study population structure. The underlying model makes several assumptions about the demographic history that shaped the analysed genetic data. One assumption is that all individuals are a result of K homogeneous ancestral populations that are all well represented in the data, while another assumption is that no drift happened after the admixture event. The histories of many real-world populations do not conform to that model, and in that case taking the inferred admixture proportions at face value might be misleading. We propose a method to evaluate the fit of admixture models based on estimating the correlation of the residual difference between the true genotypes and the genotypes predicted by the model. When the model assumptions are not violated, the residuals from a pair of individuals are not correlated. In the case of a poorly fitting admixture model, individuals with similar demographic histories have a positive correlation of their residuals. Using simulated and real data, we show how the method is able to detect a bad fit of inferred admixture proportions due to using an insufficient number of clusters K or to demographic histories that deviate significantly from the admixture model assumptions, such as admixture from ghost populations, drift after admixture events and nondiscrete ancestral populations. We have implemented the method as open-source software that can be applied to both unphased genotypes and low-depth sequencing data.

10.
Measures of genetic parental distances (GPD) based on microsatellite loci (D² and IR) have been suggested to be better correlated with fitness than individual heterozygosity (H), as they contain information about past events of inbreeding or admixture. We investigated whether GPD increased with increasing genetic divergence between parental populations in Drosophila buzzatii and whether the measures indicate past events of admixture. Further, we evaluated the relationship between GPD, fitness and fluctuating asymmetry (FA) of size and shape. We investigated three populations of Drosophila buzzatii, from Argentina, Europe and Australia. From these populations two intraspecific hybridisation lines were made: one between the Argentinean and European populations, which have been separated for 200 years, and one between the populations from Argentina and Australia, which have been separated for 80 years. By doing this we obtained hybrid progeny having different levels of GPD. We found that D² and H can be used as indicators of admixture when comparing hybrid individuals with their parentals. IR was not informative. Our results do not exclude the presence of genetic fitness correlations (GFC) over individuals with a broad fitness range from populations in equilibrium, but we doubt the presence of GFC using GPD measures in admixed populations. Shape FA could be a relevant measure for fitness, but only when comparing populations, not at the individual level.

11.
Genome-wide association studies (GWASs) are critically dependent on detailed knowledge of the pattern of linkage disequilibrium (LD) in the human genome. GWASs generate lists of variants, usually SNPs, ranked according to the significance of their association to a trait. Downstream analyses generally focus on the gene or genes that are physically closest to these SNPs and ignore their LD profile with other SNPs. We have developed a flexible R package (LDsnpR) that efficiently assigns SNPs to genes on the basis of both their physical position and their pairwise LD with other SNPs. We used the positional-binning and LD-based-binning approaches to investigate whether including these "LD-based" SNPs would affect the interpretation of three published GWASs on bipolar affective disorder (BP) and of the imputed versions of two of these GWASs. We show how including LD can be important for interpreting and comparing GWASs. In the published, unimputed GWASs, LD-based binning effectively "recovered" 6.1%-8.3% of Ensembl-defined genes. It altered the ranks of the genes and resulted in nonnegligible differences between the lists of the top 2,000 genes emerging from the two binning approaches. It also improved the overall gene-based concordance between independent BP studies. In the imputed datasets, although the increases in coverage (>0.4%) and rank changes were more modest, even greater concordance between the studies was observed, attesting to the potential of LD-based binning on imputed data as well. Thus, ignoring LD can result in the misinterpretation of the GWAS findings and have an impact on subsequent genetic and functional studies.

12.
Genetic data have been widely used to reconstruct the demographic history of populations, including the estimation of migration rates, divergence times and relative admixture contribution from different populations. Recently, increasing interest has been given to the ability of genetic data to distinguish alternative models. One of the issues that has plagued this kind of inference is that ancestral shared polymorphism is often difficult to separate from admixture or gene flow. Here, we applied an approximate Bayesian computation (ABC) approach to select the model that best fits microsatellite data among alternative splitting and admixture models. We performed a simulation study and showed that with reasonably large data sets (20 loci) it is possible to identify with a high level of accuracy the model that generated the data. This suggests that it is possible to distinguish genetic patterns due to past admixture events from those due to shared polymorphism (population split without admixture). We then apply this approach to microsatellite data from an endangered and endemic Iberian freshwater fish species, in which a clustering analysis suggested that one of the populations could be admixed. In contrast, our results suggest that the observed genetic patterns are better explained by a population split model without admixture.

13.
In response to climate changes that have occurred during Pleistocene glacial cycles, taxa associated with steppe vegetation might have followed a pattern of historical evolution in which isolation and fragmentation of populations occurred during the short interglacials and expansion events occurred during the long glacial periods, in contrast to the pattern described for temperate species. Here, we use molecular genetic data to evaluate this idea in a steppe bird with Palaearctic distribution, the little bustard (Tetrax tetrax). Overall, extremely low genetic diversity and differentiation were observed among eight little bustard populations distributed in Spain and France. Mismatch distribution analyses showed that most little bustard populations expanded during cooling periods previous to, and just after, the last interglacial period (127,000-111,000 years before present), when steppe habitats were widespread across Europe. Coalescent-based methods suggested that glacial expansions have resulted in substantial admixture in Western Europe due to the existence of different interglacial refugia. Our results are consistent with a model of evolution and genetic consequences of Pleistocene cycles with low between-population genetic differentiation as a result of short-term isolation periods during interglacials and long-term exchange during glacial periods.

14.
Understanding the genetic background of invading species can be crucial information clarifying why they become invasive. Intraspecific genetic admixture among lineages separated in the native ranges may promote the rate and extent of an invasion by substantially increasing standing genetic variation. Here, we examined the genetic relationships among threespine stickleback that recently colonized Switzerland. This invasion results from several distinct genetic lineages that colonized multiple locations and have since undergone range expansions, where they coexist and admix in parts of their range. Using 17 microsatellites genotyped for 634 individuals collected from 17 Swiss and two non-Swiss European sites, we reconstruct the invasion of stickleback and investigate the potential and extent of admixture and hybridization among the colonizing lineages from a population genetic perspective. Specifically, we test for an increase in standing genetic variation in populations where multiple lineages coexist. We find strong evidence of massive hybridization early on, followed by what appears to be recent increased genetic isolation and the formation of several new genetically distinguishable populations, consistent with a hybrid 'superswarm'. This massive hybridization and population formation event(s) occurred over approximately 140 years and likely fuelled the successful invasion of a diverse range of habitats. The implications are that multiple colonizations coupled with hybridization can lead to the formation of new stable genetic populations potentially kick-starting speciation and adaptive radiation over a very short timescale.

15.
The extent of X-chromosome linkage disequilibrium (LD) was studied in a southern Brazilian population and in a pool of samples from Amerindian populations. For this purpose, 11 microsatellites, located mostly in an Xq region comprising ~86 Mb, were investigated. The lower Amerindian gene diversity, associated with significant differences between the populations studied, indicated population structure as the main cause of the higher LD values in the Amerindian pool. On the other hand, the LD levels of the non-Amerindian Brazilian sample, although less extensive than those of the Amerindians, were probably determined by admixture events. Our results indicated that different demographic histories have significant effects on LD levels of human populations, and provide a first approach to the X-chromosome ancestry of Amerindian and non-Amerindian Brazilian populations, being valuable for future mapping and population genetic studies. Am J Phys Anthropol 2009. © 2009 Wiley-Liss, Inc.

16.
There has been much recent excitement about the use of genetics to elucidate ancestral history and demography. Whole genome data from humans and other species are revealing complex stories of divergence and admixture that were left undiscovered by previous smaller data sets. A central challenge is to estimate the timing of past admixture and divergence events, for example the time at which Neanderthals exchanged genetic material with humans and the time at which modern humans left Africa. Here, we present a method for using sequence data to jointly estimate the timing and magnitude of past admixture events, along with population divergence times and changes in effective population size. We infer demography from a collection of pairwise sequence alignments by summarizing their length distribution of tracts of identity by state (IBS) and maximizing an analytic composite likelihood derived from a Markovian coalescent approximation. Recent gene flow between populations leaves behind long tracts of identity by descent (IBD), and these tracts give our method power by influencing the distribution of shared IBS tracts. In simulated data, we accurately infer the timing and strength of admixture events, population size changes, and divergence times over a variety of ancient and recent time scales. Using the same technique, we analyze deeply sequenced trio parents from the 1000 Genomes project. The data show evidence of extensive gene flow between Africa and Europe after the time of divergence as well as substructure and gene flow among ancestral hominids. In particular, we infer that recent African-European gene flow and ancient ghost admixture into Europe are both necessary to explain the spectrum of IBS sharing in the trios, rejecting simpler models that contain less population structure.

17.
Maximum-likelihood estimation of admixture proportions from genetic data (Total citations: 9; self-citations: 0; citations by others: 9)
Wang J. Genetics 2003, 164(2):747-765
For an admixed population, an important question is how much genetic contribution comes from each parental population. Several methods have been developed to estimate such admixture proportions, using data on genetic markers sampled from parental and admixed populations. In this study, I propose a likelihood method to estimate jointly the admixture proportions, the genetic drift that occurred to the admixed population and each parental population during the period between the hybridization and sampling events, and the genetic drift in each ancestral population within the interval between their split and hybridization. The results from extensive simulations using various combinations of relevant parameter values show that in general much more accurate and precise estimates of admixture proportions are obtained from the likelihood method than from previous methods. The likelihood method also yields reasonable estimates of genetic drift that occurred to each population, which translate into relative effective sizes (N_e) or absolute average N_e's if the times when the relevant events (such as population split, admixture, and sampling) occurred are known. The proposed likelihood method also has features such as a relatively low computational requirement compared with previous ones and flexibility with respect to admixture models and marker types. In particular, it allows for missing data from a contributing parental population. The method is applied to a human data set and a wolflike canids data set, and the results obtained are discussed in comparison with those from other estimators and from previous studies.

18.
While genome-wide association studies (GWAS) have primarily examined populations of European ancestry, more recent studies often involve additional populations, including admixed populations such as African Americans and Latinos. In admixed populations, linkage disequilibrium (LD) exists both at a fine scale in ancestral populations and at a coarse scale (admixture-LD) due to chromosomal segments of distinct ancestry. Disease association statistics in admixed populations have previously considered SNP association (LD mapping) or admixture association (mapping by admixture-LD), but not both. Here, we introduce a new statistical framework for combining SNP and admixture association in case-control studies, as well as methods for local ancestry-aware imputation. We illustrate the gain in statistical power achieved by these methods by analyzing data of 6,209 unrelated African Americans from the CARe project genotyped on the Affymetrix 6.0 chip, in conjunction with both simulated and real phenotypes, as well as by analyzing the FGFR2 locus using breast cancer GWAS data from 5,761 African-American women. We show that, at typed SNPs, our method yields an 8% increase in statistical power for finding disease risk loci compared to the power achieved by standard methods in case-control studies. At imputed SNPs, we observe an 11% increase in statistical power for mapping disease loci when our local ancestry-aware imputation framework and the new scoring statistic are jointly employed. Finally, we show that our method increases statistical power in regions harboring the causal SNP in the case when the causal SNP is untyped and cannot be imputed. Our methods and our publicly available software are broadly applicable to GWAS in admixed populations.

19.
Genetic admixture between captive-bred and wild individuals has been demonstrated to affect many individual traits, although little is known about its potential influence on dispersal, an important trait governing the eco-evolutionary dynamics of populations. Here, we quantified and described the spatial distribution of genetic admixture in a brown trout (Salmo trutta) population from a small watershed that was stocked until 1999, and then tested whether or not individual dispersal parameters were related to admixture between wild and captive-bred fish. We genotyped 715 fish at 17 microsatellite loci sampled from both the mainstream and all populated tributaries, as well as 48 fish from the hatchery used to stock the study area. First, we used Bayesian clustering to infer local genetic structure and to quantify genetic admixture. We inferred first generation migrants to identify dispersal events and test which features (genetic admixture, sex and body length) affected dispersal parameters (i.e. probability to disperse, distance of dispersal and direction of the dispersal event). We identified two genetic clusters in the river basin, corresponding to wild fish on the one hand and to fish derived from the captive strain on the other hand, allowing us to define an individual gradient of admixture. Individuals with a strong assignment to the captive strain occurred almost exclusively in some tributaries, and were more likely to disperse towards a tributary than towards a site of the mainstream. Furthermore, dispersal probability increased as the probability of assignment to the captive strain increased, and individuals with an intermediate level of admixture exhibited the lowest dispersal distances. These findings show that various dispersal parameters may be biased by admixture with captive-bred genotypes, and that management policies should take into account the differential spread of captive-bred individuals in wild populations.

20.
Following up on our previous study, we conducted a genome-wide analysis of admixture for two Uyghur population samples (HGDP-UG and PanAsia-UG), collected from the northern and southern regions of Xinjiang in China, respectively. Both HGDP-UG and PanAsia-UG showed a substantial admixture of East-Asian (EAS) and European (EUR) ancestries, with an empirical estimation of ancestry contribution of 53:47 (EAS:EUR) and 48:52 for HGDP-UG and PanAsia-UG, respectively. The effective admixture time under a model with a single pulse of admixture was estimated as 110 generations and 129 generations for HGDP-UG and PanAsia-UG, respectively; that is, the admixture events occurred about 2200 and 2580 years ago, assuming an average of 20 yr per generation. Despite Uyghurs' earlier history compared to other admixed populations, admixture mapping holds promise for this population because of its large size and its mixture of ancestry from different continents. We screened multiple databases and identified a genome-wide single-nucleotide polymorphism panel that can distinguish EAS and EUR ancestry of chromosomal segments in Uyghurs. The panel contains 8150 ancestry-informative markers (AIMs) showing large frequency differences between EAS and EUR populations (FST > 0.25, mean FST = 0.43) but small frequency differences (7999 AIMs validated) within both populations (FST < 0.05, mean FST < 0.01). We evaluated the effectiveness of this admixture map for localizing disease genes in two Uyghur populations. To our knowledge, our map constitutes the first practical resource for admixture mapping in Uyghurs, and it will enable studies of diseases showing differences in genetic risk between EUR and EAS populations.
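The arithmetic behind the admixture dates and the AIM selection thresholds quoted above can be sketched as follows. The generation counts, the 20-year generation time, and the FST cutoffs (0.25 between groups, 0.05 within) come from the abstract; the function names, the tuple layout, and the toy marker values are hypothetical and only illustrate the filtering logic, not the authors' pipeline.

```python
def generations_to_years(generations, years_per_generation=20):
    """Convert an admixture time in generations to calendar years."""
    return generations * years_per_generation


def select_aims(markers, min_between=0.25, max_within=0.05):
    """Keep markers with a large between-group FST and small within-group FSTs.

    `markers` maps a marker name to a tuple
    (fst_between_eas_eur, fst_within_eas, fst_within_eur).
    This layout is an assumption made for illustration.
    """
    return [
        name
        for name, (between, within_eas, within_eur) in markers.items()
        if between > min_between
        and within_eas < max_within
        and within_eur < max_within
    ]


if __name__ == "__main__":
    # 110 and 129 generations at 20 years each, as in the abstract.
    print(generations_to_years(110), generations_to_years(129))

    # Toy values, not real data.
    toy = {
        "snp1": (0.43, 0.01, 0.02),  # passes both criteria
        "snp2": (0.10, 0.01, 0.01),  # too little between-group difference
        "snp3": (0.50, 0.08, 0.01),  # too much variation within one group
    }
    print(select_aims(toy))
```

Changing `years_per_generation` shows how sensitive the calendar dates are to the assumed generation time.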
