Similar literature
 20 similar articles found; search took 15 ms
1.
There is considerable debate about the methodologies used to estimate VNTR (Variable Number of Tandem Repeats) multi-locus genotype frequencies or odds of inclusion in forensic cases. To compare two of the methods in use, allele frequency distributions among six populations were compared and the effect of population heterogeneity on VNTR multi-locus genotype frequency estimation was examined. Genotype frequencies estimated from single population data were one or two orders of magnitude smaller than those estimated by picking the highest allele frequency in a group of subpopulations to estimate genotype frequencies using a ceiling principle. The average change does not appear to be very sensitive to the set of subpopulations used; four-locus frequencies still give inclusion odds of one in a million or less. We think that use of the ceiling principle solves both the statistical problem engendered by subpopulation heterogeneity and the legal problem of assuming that the perpetrator and suspect belong to the same subpopulation. The counterintuitive fact of human genetic polymorphism is that it is easier to identify an individual than it is to identify the subpopulation, ethnic group or race to which that individual belongs.
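
As a rough illustration of the two estimation approaches compared in this abstract, the sketch below computes a multi-locus genotype frequency with the product rule, once from a single reference subpopulation and once with a ceiling principle that uses the highest frequency of each allele across subpopulations. The locus names, allele labels and frequencies are invented for illustration, and Hardy-Weinberg proportions within loci and independence across loci are assumed; none of this is taken from the paper's data or method details.

```python
from math import prod

# Hypothetical allele frequencies (not from the paper):
# locus -> allele -> frequency in each of three subpopulations.
FREQS = {
    "locus1": {"a": [0.10, 0.05, 0.20], "b": [0.02, 0.08, 0.04]},
    "locus2": {"c": [0.15, 0.30, 0.10]},
}


def locus_frequency(allele1, allele2, p, q):
    """Hardy-Weinberg genotype frequency: p^2 if homozygous, else 2pq."""
    return p * p if allele1 == allele2 else 2 * p * q


def profile_frequency(profile, freqs, pick):
    """Product rule across loci.

    profile maps locus -> (allele1, allele2); pick reduces the list of
    subpopulation frequencies for one allele to the single value used.
    """
    per_locus = []
    for locus, (a1, a2) in profile.items():
        p = pick(freqs[locus][a1])
        q = pick(freqs[locus][a2])
        per_locus.append(locus_frequency(a1, a2, p, q))
    return prod(per_locus)


profile = {"locus1": ("a", "b"), "locus2": ("c", "c")}
single = profile_frequency(profile, FREQS, lambda fs: fs[0])  # first subpopulation only
ceiling = profile_frequency(profile, FREQS, max)              # highest frequency per allele
print(f"single population: {single:.2e}, ceiling: {ceiling:.2e}")
```

Because the ceiling estimate takes the maximum frequency of every allele, it can never be smaller than the estimate from any one subpopulation, which is why it is regarded as the conservative choice.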

2.
Gao H, Williamson S, Bustamante CD. Genetics 2007, 176(3): 1635-1651
Nonrandom mating induces correlations in allelic states within and among loci that can be exploited to understand the genetic structure of natural populations (Wright 1965). For many species, it is of considerable interest to quantify the contribution of two forms of nonrandom mating to patterns of standing genetic variation: inbreeding (mating among relatives) and population substructure (limited dispersal of gametes). Here, we extend the popular Bayesian clustering approach STRUCTURE (Pritchard et al. 2000) for simultaneous inference of inbreeding or selfing rates and population-of-origin classification using multilocus genetic markers. This is accomplished by eliminating the assumption of Hardy-Weinberg equilibrium within clusters and, instead, calculating expected genotype frequencies on the basis of inbreeding or selfing rates. We demonstrate the need for such an extension by showing that selfing leads to spurious signals of population substructure using the standard STRUCTURE algorithm with a bias toward spurious signals of admixture. We gauge the performance of our method using extensive coalescent simulations and demonstrate that our approach can correct for this bias. We also apply our approach to understanding the population structure of the wild relative of domesticated rice, Oryza rufipogon, an important partially selfing grass species. Using a sample of n = 16 individuals sequenced at 111 random loci, we find strong evidence for existence of two subpopulations, which correlates well with geographic location of sampling, and estimate selfing rates for both groups that are consistent with estimates from experimental data (s ≈ 0.48-0.70).
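
To make concrete what "calculating expected genotype frequencies on the basis of inbreeding or selfing rates" can look like, here is a minimal sketch for one biallelic locus. It assumes the textbook mixed-mating equilibrium relation F = s / (2 - s) between the selfing rate s and the inbreeding coefficient F, and the standard inbreeding-adjusted genotype proportions; the function names are hypothetical, and the paper's own parametrization may differ.

```python
def inbreeding_from_selfing(s):
    """Equilibrium inbreeding coefficient F for selfing rate s (assumed relation)."""
    return s / (2.0 - s)


def genotype_frequencies(p, F):
    """Expected genotype frequencies at a biallelic locus with allele frequency p.

    F = 0 recovers Hardy-Weinberg proportions; F = 1 removes all heterozygotes.
    """
    q = 1.0 - p
    return {
        "AA": p * p * (1.0 - F) + p * F,
        "Aa": 2.0 * p * q * (1.0 - F),
        "aa": q * q * (1.0 - F) + q * F,
    }


# Selfing rates spanning the range quoted in the abstract, plus s = 0 for comparison.
for s in (0.0, 0.48, 0.70):
    F = inbreeding_from_selfing(s)
    freqs = genotype_frequencies(0.3, F)
    print(f"s={s:.2f} F={F:.3f} het={freqs['Aa']:.3f}")
```

The heterozygote deficit grows with s, which is the signal a Hardy-Weinberg-based clustering model would tend to misread as substructure or admixture.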

3.
Using RAPD markers and one morphological marker, we studied the among- and within-population structure in a selfing annual plant species, Medicago truncatula Gaertn. About 200 individuals, sampled from four populations subdivided into three subpopulations each, were scored for 22 markers. It was found that the within-population variance component accounted for 55% of the total variance, while the among-population variance component accounted for 45%. Eighteen percent of the total variance was due to within-population structure (i.e., among subpopulations). Thus, 37% of the total variance was within subpopulations. Using a multilocus approach, it was found that no multilocus genotype was common to two populations. Two of the four studied populations were composed of few (≤6) multilocus genotypes, whereas the other two had many (≥15) multilocus genotypes. In the most polymorphic population (37 genotypes), only one genotype was found to be common to two subpopulations. Resampling experiments show that, depending on the population, three to 16 polymorphic loci were necessary and sufficient to score all multilocus genotypes in the population. When these data are compared to published results, it appears that on some occasions, the number of genotypes per population of selfing species might be larger than would be expected from the sole consideration of effective population size. The large within-subpopulation genetic variance observed in some populations could be explained by either small neighborhood sizes within subpopulations, or by outcrossing following migration through seed and/or pollen.

4.
Zang Y, Zhang H, Yang Y, Zheng G. Human Heredity 2007, 63(3-4): 187-195
The population-based case-control design is a powerful approach for detecting susceptibility markers of a complex disease. However, this approach may lead to spurious association when there is population substructure: population stratification (PS) or cryptic relatedness (CR). Two simple approaches to correct for the population substructure are genomic control (GC) and delta centralization (DC). GC uses the variance inflation factor to correct for the variance distortion of a test statistic, and the DC centralizes the non-central chi-square distribution of the test statistic. Both GC and DC have been studied for case-control association studies mainly under a specific genetic model (e.g. recessive, additive or dominant), under which an optimal trend test is available. The genetic model is usually unknown for many complex diseases. In this situation, we study the performance of three robust tests based on the GC and DC corrections in the presence of the population substructure. Our results show that, when the genetic model is unknown, the DC- (or GC-) corrected maximum and Pearson's association test are robust and have good control of Type I error and high power relative to the optimal trend tests in the presence of PS (or CR).
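
The genomic control correction mentioned in this abstract can be sketched as follows. This is the commonly used median-based form, in which the inflation factor is estimated from 1-df chi-square statistics at presumed null markers and each test statistic is divided by it; the statistics below are invented, the function names are hypothetical, and the robust tests studied in the paper are not reproduced here.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(null_stats):
    """Median-based variance inflation factor, floored at 1 (no deflation)."""
    return max(1.0, median(null_stats) / CHI2_1DF_MEDIAN)


def gc_correct(stat, lam):
    """Genomic-control-corrected 1-df chi-square statistic."""
    return stat / lam


# Hypothetical 1-df statistics observed at markers assumed to be null.
null_stats = [0.20, 0.50, 0.91, 1.30, 2.00]
lam = inflation_factor(null_stats)
candidate = 7.5  # hypothetical statistic at the candidate marker
print(f"lambda={lam:.3f}, corrected statistic={gc_correct(candidate, lam):.3f}")
```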

5.
We propose a novel latent-class approach to detect and account for population stratification in a case-control study of association between a candidate gene and a disease. In our approach, population substructure is detected and accounted for using data on additional loci that are in linkage equilibrium within subpopulations but have alleles that vary in frequency between subpopulations. We have tested our approach using simulated data based on allele frequencies in 12 short tandem repeat (STR) loci in four populations in Argentina.

6.
There has been considerable debate in the literature concerning bias in case-control association mapping studies due to population stratification. In this paper, we perform a theoretical analysis of the effects of population stratification by measuring the inflation in the test's type I error (or false-positive rate). Using a model of stratified sampling, we derive an exact expression for the type I error as a function of population parameters and sample size. We give necessary and sufficient conditions for the bias to vanish when there is no statistical association between disease and marker genotype in each of the subpopulations making up the total population. We also investigate the variation of bias with increasing subpopulations and show, both theoretically and by using simulations, that the bias can sometimes be quite substantial even with a very large number of subpopulations. In a companion simulation-based paper (Heiman et al., Part I, this issue), we have focused on the CRR (confounding risk ratio) and its relationship to the type I error in the case of two subpopulations, and have also quantified the magnitude of the type I error that can occur with relatively low CRR values.

7.
There is considerable ethno-linguistic and genetic variation among human populations in Asia, although tracing the origins of this diversity is complicated by migration events. Thailand is at the center of Mainland Southeast Asia (MSEA), a region within Asia that has not been extensively studied. Genetic substructure may exist in the Thai population, since waves of migration from southern China throughout its recent history may have contributed to substantial gene flow. Autosomal SNP data were collated for 438,503 markers from 992 Thai individuals. Using the available self-reported regional origin, four Thai subpopulations genetically distinct from each other and from other Asian populations were resolved by Neighbor-Joining analysis using a 41,569 marker subset. Using an independent Principal Components-based unsupervised clustering approach, four major MSEA subpopulations were resolved in which regional bias was apparent. A major ancestry component was common to these MSEA subpopulations and distinguishes them from other Asian subpopulations. On the other hand, these MSEA subpopulations were admixed with other ancestries, in particular one shared with Chinese. Subpopulation clustering using only Thai individuals and the complete marker set resolved four subpopulations, which are distributed differently across Thailand. A Sino-Thai subpopulation was concentrated in the Central region of Thailand, although this constituted a minority in an otherwise diverse region. Among the most highly differentiated markers which distinguish the Thai subpopulations, several map to regions known to affect phenotypic traits such as skin pigmentation and susceptibility to common diseases. The subpopulation patterns elucidated have important implications for evolutionary and medical genetics. The subpopulation structure within Thailand may reflect the contributions of different migrants throughout the history of MSEA. The information will also be important for genetic association studies to account for population-structure confounding effects.

8.
Kitada S, Kitakado T, Kishino H. Genetics 2007, 177(2): 861-873
Populations often have very complex hierarchical structure. Therefore, it is crucial in genetic monitoring and conservation biology to have a reliable estimate of the pattern of population subdivision. F(ST)'s for pairs of sampled localities or subpopulations are crucial statistics for the exploratory analysis of population structures, such as cluster analysis and multidimensional scaling. However, the estimation of F(ST) is not precise enough to reliably estimate the population structure and the extent of heterogeneity. This article proposes an empirical Bayes procedure to estimate locus-specific pairwise F(ST)'s. The posterior mean of the pairwise F(ST) can be interpreted as a shrinkage estimator, which reduces the variance of conventional estimators largely at the expense of a small bias. The global F(ST) of a population generally varies among loci in the genome. Our maximum-likelihood estimates of global F(ST)'s can be used as sufficient statistics to estimate the distribution of F(ST) in the genome. We demonstrate the efficacy and robustness of our model by simulation and by an analysis of the microsatellite allele frequencies of the Pacific herring. The heterogeneity of the global F(ST) in the genome is discussed on the basis of the estimated distribution of the global F(ST) for the herring and examples of human single nucleotide polymorphisms (SNPs).

9.
I. Bonnin, J. M. Prosperi, I. Olivieri. Genetics 1996, 143(4): 1795-1805
Two populations of the selfing annual Medicago truncatula Gaertn. (Leguminosae), each subdivided into three subpopulations, were studied for both metric traits (quantitative characters) and genetic markers (random amplified polymorphic DNA and one morphological, single-locus marker). Hierarchical analyses of variance components show that (1) populations are more differentiated for quantitative characters than for marker loci, (2) the contribution of both within- and among-subpopulation components of variance to overall genetic variance of these characters is reduced as compared to markers, and (3) at the population level, within-population structure is slightly but not significantly larger for markers than for quantitative traits. Under the hypothesis that most markers are neutral, such comparisons may be used to make hypotheses about the strength and heterogeneity of natural selection in the face of genetic drift and gene flow. We thus suggest that in these populations, quantitative characters are under strong divergent selection among populations, and that gene flow is restricted among populations and subpopulations.

10.
Microhabitat heterogeneity can lead to fine‐scale local adaptation when gene flow is restricted, which may be important for the maintenance of genetic variation within populations. This study tested whether microhabitat heterogeneity was associated with trait differences in a population of Arabidopsis lyrata and studied its impact on the genetic variance–covariance (G) matrix. Maternal seed families were collected from dune tops and bottoms, two microhabitats known to vary significantly in water availability. In a common garden experiment, replicate individuals per family were raised under wet and dry conditions, and physiological, morphological and life‐history traits were assessed. Plants from the two microenvironments differed in their response to treatment in two performance components, in stomata density and most strongly in flowering time. Under wet conditions, plants originating from dune bottoms flowered 4 weeks earlier than those from dune tops. Only one of three G‐matrix comparisons revealed that habitat heterogeneity and evolutionary potential were positively linked. The number of independent trait dimensions was larger in the entire population than within subpopulations separated by microhabitat under wet conditions. However, the size of the G‐matrix was no larger in the entire population than within subpopulations separated by microhabitat, and trait correlation structure between microhabitats and treatments was not significantly different. These results indicate that fine‐scale habitat heterogeneity likely led to local adaptation, which weakly affected levels of across‐trait genetic variation.

11.
One of the most pressing issues in spatial genetics concerns sampling. Traditionally, substructure and gene flow are estimated for individuals sampled within discrete populations. Because many species may be continuously distributed across a landscape without discrete boundaries, understanding sampling issues becomes paramount. Given large-scale, geographically broad conservation efforts, researchers are looking for guidance as to the trade-offs between sampling more individuals within a population versus few individuals scattered across more populations. Here, we conducted simulations that address these issues. We first established two archetypical patterns of dispersion: (1) individuals within discrete populations, and (2) continuously distributed individuals with limited dispersal. We used genotypes generated from a spatially-explicit, individual-based program and simulated genetic structure in individuals from nine different population sizes across a landscape that either had barriers to movement (defining discrete populations) or isolation-by-distance patterns (defining continuously distributed individuals). Then, given each pattern of dispersion, we allocated samples across four different sampling strategies for each of the nine population sizes in various configurations for sampling more individuals within a population versus fewer individuals scattered across more populations. We assessed the population genetic substructure with both the population-based metric, F_ST, and an individual-based metric, D_PS, regardless of the true pattern of dispersion to allow us to better understand the effect of incorrectly matching the metric and the distribution (e.g., F_ST with continuously distributed individuals, and vice versa). We show that sampling many subpopulations (or sampling areas), thus sampling fewer individuals per subpopulation, overestimates measures of population subdivision with the population-based metric for both patterns of dispersion. In contrast, using the individual-based metric gives the opposite results: sampling too few subpopulations, and many individuals per subpopulation, produces an underestimate of the strength of isolation-by-distance. By comparing all results, we were able to suggest a strong predictive model of a chosen genetic structure metric for elucidating the sampling design trade-offs given each pattern of dispersion and configuration on the landscape.

12.
ABSTRACT: BACKGROUND: Trait variances among genotype groups at a locus are expected to differ in the presence of an interaction between this locus and another locus or environment. A simple maximum test on variance heterogeneity can thus be used to identify potentially interacting single nucleotide polymorphisms (SNPs). RESULTS: We propose a multiple contrast test for variance heterogeneity that compares the mean of Levene residuals for each genotype group with their average as an alternative to a global Levene test. We applied this test to a Bogalusa Heart Study dataset to screen for potentially interacting SNPs across the whole genome that influence a number of quantitative traits. A user-friendly implementation of this method is available in the R statistical software package multcomp. CONCLUSIONS: We show that the proposed multiple contrast test of model-specific variance heterogeneity can be used to test for potential interactions between SNPs and unknown alleles, loci or covariates and provide valuable additional information compared with traditional tests. Although the test is statistically valid for severely unbalanced designs, care is needed in interpreting the results at loci with low allele frequencies.

13.
1. We examined the ecological genetics of the invasive cladoceran Daphnia lumholtzi in a reservoir (Lake Texoma) in the southern USA. This species originates from the Old World subtropics and has spread across North America since the late 1980s after its inadvertent introduction to a reservoir in northeastern Texas. 2. The population genetic structure of D. lumholtzi was examined seasonally on 22 dates over a 3‐year period along a natural temperature and salinity gradient. 3. A two‐allele polymorphism at the phosphoglucose isomerase (PGI) locus was detected, while other loci tested showed no variation. Significant temporal heterogeneity existed in genotype frequencies with a major shift between summer and autumn. 4. Results of a multiple linear regression analysis revealed that a significant amount of variation was explained by day length, temperature and conductivity. Additionally, significant spatial heterogeneity of genotype frequencies between lake stations was observed, but was restricted to the summer. 5. Clonal isolates were used in controlled laboratory temperature and salinity tolerance experiments. Results suggest that salinity and temperature tolerance differed between PGI genotypes. 6. Genotype × environment interactions may play a significant role in the micro‐evolutionary dynamics of this invasive species and may have facilitated its rapid expansion across the North American continent.

14.
This report explores how the heterogeneity of variances affects randomization tests used to evaluate differences in the asymptotic population growth rate, λ. The probability of Type I error was calculated in four scenarios for populations with identical λ but different variance of λ: (1) Populations have different projection matrices: the same λ may be obtained from different sets of vital rates, which leaves room for different variances of λ. (2) Populations have identical projection matrices but reproductive schemes differ and fecundity in one of the populations has a larger associated variance. The two other scenarios evaluate a sampling artifact as responsible for heterogeneity of variances. The same population is sampled twice, (3) with the same sampling design, or (4) with different sampling effort for different stages. Randomization tests were done with increasing differences in sample size between the two populations. This implies additional differences in the variance of λ. The probability of Type I error stays at the nominal significance level (α = .05) in Scenario 3 and with identical sample sizes in the others. Tests were too liberal, or too conservative, under a combination of variance heterogeneity and different sample sizes. Increased differences in sample size exacerbated the difference between observed Type I error and the nominal significance level. Type I error increases or decreases depending on which population has the larger sample size: the one with the smallest or the one with the largest variance. However, sample size on its own is not responsible for changes in Type I error.

15.
Several years ago it was reported that rare HRAS1 VNTR alleles occurred more frequently in U.S. Caucasian cancer patients than in unaffected controls. Such an association, in theory, could be caused by undetected population heterogeneity. Also, in a study clearly relevant to this issue, it was recently reported that significant deviations from Hardy-Weinberg equilibrium exist at this locus in a sample of U.S. Caucasians. These considerations motivate our population genetic analysis of the HRAS1 locus. From published studies of the HRAS1 VNTR locus, which classified alleles into types, we found only small differences in the allele frequency distributions of samples from various European nations, although there were larger differences among ethnic groups (African American, Caucasian, and Oriental). In an analysis of variation of rare-allele frequencies among samples from four European nations, most of the variance was attributable to molecular methodology, and very little of the variance was accounted for by nationality. In addition, we showed that mixture of European subpopulations should result in only minor deviations from expected genotype proportions in a Caucasian database and demonstrated that there was no significant deviation from Hardy-Weinberg equilibrium in our HRAS1 data.
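
The claim that mixing subpopulations with similar allele frequencies produces only minor deviations from expected genotype proportions can be illustrated with a small calculation. The sketch below is a simplification to a single allele class at one locus (the HRAS1 VNTR itself is multi-allelic), with invented frequencies and mixing weights; it assumes each subpopulation is itself in Hardy-Weinberg equilibrium and compares the heterozygote frequency in the pooled sample with the value expected from the pooled allele frequency.

```python
def mixture_heterozygosity(allele_freqs, weights):
    """Return (observed, expected) heterozygote frequency in a pooled sample.

    allele_freqs: frequency of one allele class in each subpopulation.
    weights: share of the pooled sample from each subpopulation (sums to 1).
    """
    pooled_p = sum(w * p for p, w in zip(allele_freqs, weights))
    observed = sum(w * 2.0 * p * (1.0 - p) for p, w in zip(allele_freqs, weights))
    expected = 2.0 * pooled_p * (1.0 - pooled_p)
    return observed, expected


# Hypothetical rare-allele frequencies in two similar subpopulations, mixed equally.
obs, exp = mixture_heterozygosity([0.10, 0.12], [0.5, 0.5])
print(f"observed={obs:.4f} expected={exp:.4f} deficit={exp - obs:.4f}")
```

The heterozygote deficit equals twice the variance in allele frequency among the subpopulations, so it stays small whenever the subpopulations differ only slightly.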

16.
It is well-known that population substructure may lead to confounding in case–control association studies. Here, we examined genetic structure in a large racially and ethnically diverse sample consisting of five ethnic groups of the Multiethnic Cohort study (African Americans, Japanese Americans, Latinos, European Americans and Native Hawaiians) using 2,509 SNPs distributed across the genome. Principal component analysis on 6,213 study participants, 18 Native Americans and 11 HapMap III populations revealed four important principal components (PCs): the first two separated Asians, Europeans and Africans, and the third and fourth corresponded to Native American and Native Hawaiian (Polynesian) ancestry, respectively. Individual ethnic composition derived from self-reported parental information matched well to genetic ancestry for Japanese and European Americans. STRUCTURE-estimated individual ancestral proportions for African Americans and Latinos are consistent with previous reports. We quantified the East Asian (mean 27%), European (mean 27%) and Polynesian (mean 46%) ancestral proportions for the first time, to our knowledge, for Native Hawaiians. Simulations based on realistic settings of case–control studies nested in the Multiethnic Cohort found that the effect of population stratification was modest and readily corrected by adjusting for race/ethnicity or by adjusting for top PCs derived from all SNPs or from ancestry informative markers; the power of these approaches was similar when averaged across causal variants simulated based on allele frequencies of the 2,509 genotyped markers. The bias may be large in case-only analysis of gene by gene interactions but it can be corrected by top PCs derived from all SNPs.

17.
Gomez-Raya L. Genetics 2001, 157(3): 1357-1367
A maximum-likelihood method to estimate the recombination fraction and its sampling variance using informative and noninformative half-sib offspring is derived. Estimates of the recombination fraction are biased up to 20 cM when noninformative offspring are discarded. In certain scenarios, the sampling variance can be increased or reduced up to fivefold due to the bias in estimating the recombination fraction and the LOD score can be reduced up to 5 units when discarding noninformative offspring. Comparison of the estimates of recombination fraction, map distance, and LOD score when constructing a genetic map with 251 two-point linkage analyses and six families of Norwegian cattle was carried out to evaluate the implications of discarding noninformative offspring in practical situations. The average discrepancies in absolute value (average difference when using and neglecting noninformative offspring) were 0.0146, 1.64 cM, and 2.61 for the recombination fraction, map distance, and the LOD score, respectively. A method for simultaneous estimation of allele frequencies in the dam population and a transmission disequilibrium parameter is proposed. This method might account for the bias in estimating allele frequencies in the dam population when the half-sib offspring is selected for production traits.

18.
A meta-analysis was performed with the aim of re-evaluating the role of the peroxisome proliferator activated receptor alpha (PPARA) gene intron 7 G/C polymorphism (rs4253778) in athletes’ high ability in endurance sports. Design: A meta-analysis of case control studies assessing the association between the G/C polymorphisms of the PPARA gene and endurance sports was conducted. The Cochrane Review Manager software was used to compare the genotype and allele frequencies between endurance athletes and controls to determine whether a genetic variant is more common in athletes than in the general population. Five studies, encompassing 760 endurance athletes and 1792 controls, fulfilled our inclusion criteria. The pooled odds ratio (and confidence intervals, CIs) for the G allele compared to the C allele was 1.65 (95% CI 1.39-1.96). The pooled OR for the GG genotype compared to the GC genotype was 1.79 (95% CI 1.44-2.22), and for the GG genotype compared to the CC genotype 2.37 (95% CI 1.40-3.99). There was no evidence of heterogeneity (I² = 0%) or of publication bias. Athletes with high ability in endurance sports had a higher frequency of the GG genotype and G allele.

19.
A population sample from people of diverse ethnic origins living in New Zealand serves as a database to test methods for inference of population subdivision. The initial null hypothesis, that the population sample is homogeneous across ethnic groups, is easily rejected by likelihood ratio tests. Beyond this, methods for quantifying subdivision can be based on the probability of drawing alleles identical by descent (F_ST), probabilities of matching multiple locus genotypes, and occurrence of unique alleles. Population genetic theory makes quantitative predictions about the relation between F_ST, population sizes, and rates of migration and mutation. Some VNTR loci have mutation rates of 10^-2 per generation, but, contrary to theory, we find no consistent association between the degree of population subdivision and mutation rate. Quantification of population substructure also allows us to relate the magnitudes of genetic distances between ethnic groups in New Zealand to the colonization history of the country. The data suggest that the closest relatives to the Maori are Polynesians, and that no severe genetic bottleneck occurred when the Maori colonized New Zealand. One of the central points of contention regarding the application of VNTR loci in forensics is the appropriate means for estimating match probabilities. Simulations were performed to test the merits of the product rule in the face of subpopulation heterogeneity. Population heterogeneity results in large differences in estimates of multilocus genotype frequencies depending on which subpopulation is used for reference allele frequencies, but, of greater importance for forensic purposes, no five-locus genotype had an expected frequency greater than 10^-6. Although this implies that a match with an innocent individual is unlikely, in a large urban area such chance matches are going to occur.
Editor's comments: A side-benefit of the collection of DNA data from human populations is the light it may shed on human evolution. The authors discuss the colonization history of New Zealand in the light of such data. From a forensic viewpoint, too much should not be made of the differences between the major ethnic groups within New Zealand, as the forensic community in that country maintains separate databases for Caucasian, Maori and Pacific Islander (Buckleton et al., 1987). It will be of interest in the future to examine subdivision within these groups, as opposed to within the country as a whole. The authors' comments on testing for independence will need to be read along with the findings of Zaykin et al. and Maiste and Weir in this volume. The authors had not seen the Budowle et al. (1994) rebuttal to the paper of Krane et al. (1992).
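
The remark in this abstract that chance matches are unlikely for any one innocent person yet bound to occur in a large urban area follows from simple arithmetic, sketched below. The genotype frequency of one in a million is the upper bound quoted in the abstract; the population size of one million and the assumption that individuals are unrelated and match independently are illustrative assumptions, not figures from the paper.

```python
from math import expm1, log1p


def expected_matches(freq, population):
    """Expected number of people who share the profile by chance."""
    return freq * population


def prob_at_least_one_match(freq, population):
    """1 - (1 - freq)^population, computed stably for very small freq."""
    return -expm1(population * log1p(-freq))


freq = 1e-6            # upper bound on genotype frequency quoted in the abstract
population = 1_000_000  # hypothetical size of a large urban area
print(f"expected chance matches: {expected_matches(freq, population):.2f}")
print(f"P(at least one): {prob_at_least_one_match(freq, population):.3f}")
```

For a rarer profile the probability drops roughly in proportion to the frequency, which is why the bound on the multilocus frequency matters more than which reference subpopulation is used.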

20.
Golubeva NA, Zhigachev AI. Genetika 2007, 43(8): 1079-1083
The population of domestic cats from the city of Armavir has been examined. A high frequency of gene O was revealed in the population. Differences among three subpopulations estimated using two genetic distances showed heterogeneity of the Armavir cat population. The extreme samples showed highly significant differences (P < 0.01; χ²[6] = 24.67), likely explained by the structural features of the synanthropic population and human-driven frequency-dependent selection operating in it. The feline population of Armavir underwent significant changes during the past two decades. The d(ij) coefficient in it was 0.093; D(p) = 0.05. The frequencies of genes orange and Long-hair have increased in the general population. The frequency of gene dilution has decreased. These changes may have occurred because of genetic exchange with purebred domestic cats that have become more popular as pets in recent years.
