Similar literature
Found 20 similar articles (search time: 15 ms)
1.
The strategy of bulk DNA sampling has been a valuable method for studying large numbers of individuals through genetic markers. The application of this strategy for discrimination among germplasm sources was analyzed through information theory, considering the case of polymorphic alleles scored binarily for their presence or absence in DNA pools. We defined the informativeness of a set of marker loci in bulks as the mutual information between genotype and population identity, composed of two terms: diversity and noise. The diversity term is the entropy of bulk genotypes, whereas the noise term is measured through the conditional entropy of bulk genotypes given germplasm sources. Thus, optimizing marker information implies increasing diversity and reducing noise. Simple formulas were devised to estimate marker information per allele from a set of estimated allele frequencies across populations. As an example, they allowed optimization of bulk size for SSR genotyping in maize, from allele frequencies estimated in a sample of 56 maize populations. It was found that a sample of 30 plants from a random mating population is adequate for maize germplasm SSR characterization. We analyzed the use of divided bulks to overcome the allele dilution problem in DNA pools, and concluded that samples of 30 plants divided into three bulks of 10 plants are efficient to characterize maize germplasm sources through SSR with a good control of the dilution problem. We estimated the informativeness of 30 SSR loci from the estimated allele frequencies in maize populations, and found a wide variation of marker informativeness, which positively correlated with the number of alleles per locus.
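The diversity-minus-noise decomposition described in this abstract can be illustrated with a minimal sketch. It is not the authors' formula set: it assumes a single allele scored as present or absent, populations that are equally likely a priori, and a bulk of n diploid plants treated as 2n independent gene copies under random mating, with no dilution loss. The function names are hypothetical.

```python
import math


def binary_entropy(q):
    """Entropy in bits of a present/absent variable with P(present) = q."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -(q * math.log2(q) + (1.0 - q) * math.log2(1.0 - q))


def presence_probability(p, n_plants):
    """Chance that an allele of frequency p shows up in a bulk of n diploid plants.

    Assumption: random mating, so the bulk holds 2n independent gene copies.
    """
    return 1.0 - (1.0 - p) ** (2 * n_plants)


def allele_information(freqs, n_plants):
    """Mutual information (bits) between bulk presence/absence and population.

    freqs holds the allele frequency in each population (equal priors assumed).
    """
    q = [presence_probability(p, n_plants) for p in freqs]
    diversity = binary_entropy(sum(q) / len(q))         # H(bulk genotype)
    noise = sum(binary_entropy(x) for x in q) / len(q)  # H(bulk genotype | population)
    return diversity - noise


if __name__ == "__main__":
    # Hypothetical allele frequencies in three populations, for two bulk sizes.
    for n in (5, 30):
        print(n, round(allele_information([0.02, 0.10, 0.60], n), 3))
```

Under these assumptions an allele fixed in one population and absent from another carries one full bit, while an allele with identical frequencies everywhere carries none, which matches the abstract's point that information rises with diversity and falls with noise.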

2.
We obtained fresh dung samples from 202 (133 mother-offspring pairs) savannah elephants (Loxodonta africana) in Samburu, Kenya, and genotyped them at 20 microsatellite loci to assess genotyping success and errors. A total of 98.6% consensus genotypes was successfully obtained, with allelic dropout and false allele rates at 1.6% (n = 46) and 0.9% (n = 37) of heterozygous and total consensus genotypes, respectively, and an overall genotyping error rate of 2.5% based on repeat typing. Mendelian analysis revealed consistent inheritance in all but 38 allelic pairs from mother-offspring, giving an average mismatch error rate of 2.06%, a possible result of null alleles, mutations, genotyping errors, or inaccuracy in maternity assignment. We detected no evidence for large allele dropout, stuttering, or scoring error in the dataset and significant Hardy-Weinberg deviations at only two loci due to heterozygosity deficiency. Across loci, null allele frequencies were low (range: 0.000-0.042) and below the 0.20 threshold that would significantly bias individual-based studies. The high genotyping success and low errors observed in this study demonstrate reliability of the method employed and underscore the application of simple pedigrees in noninvasive studies. Since none of the sires were included in this study, the error rates presented are just estimates.

3.
Selective genotyping of one or both phenotypic extremes of a population can be used to detect linkage between markers and quantitative trait loci (QTL) in situations in which full-population genotyping is too costly or not feasible, or where the objective is to rapidly screen large numbers of potential donors for useful alleles with large effects. Data may be subjected to 'trait-based' analysis, in which marker allele frequencies are compared between classes of progeny defined based on trait values, or to 'marker-based' analysis, in which trait means are compared between progeny classes defined based on marker genotypes. Here, bidirectional and unidirectional selective genotyping were simulated, using population sizes and selection intensities relevant to cereal breeding. Control of Type I error was usually adequate with marker-based analysis of variance or trait-based testing using the normal approximation of the binomial distribution. Bidirectional selective genotyping was more powerful than unidirectional. Trait-based analysis and marker-based analysis of variance were about equally powerful. With genotyping of the best 30 out of 500 lines (6%), a QTL explaining 15% of the phenotypic variance could be detected with a power of 0.8 when tests were conducted at a marker 10 cM from the QTL. With bidirectional selective genotyping, QTL with smaller effects and (or) QTL farther from the nearest marker could be detected. Similar QTL detection approaches were applied to data from a population of 436 recombinant inbred rice lines segregating for a large-effect QTL affecting grain yield under drought stress. That QTL was reliably detected by genotyping as few as 20 selected lines (4.5%). In experimental populations, selective genotyping can reduce costs of QTL detection, allowing larger numbers of potential donors to be screened for useful alleles with effects across different backgrounds. 
In plant breeding programs, selective genotyping can make it possible to detect QTL using even a limited number of progeny that have been retained after selection.

4.
DNA degradation, low DNA concentrations and primer-site mutations may result in the incorrect assignment of microsatellite genotypes, potentially biasing population genetic analyses. MICRO-CHECKER is Windows®-based software that tests the genotyping of microsatellites from diploid populations. The program aids identification of genotyping errors due to nonamplified alleles (null alleles), short allele dominance (large allele dropout) and the scoring of stutter peaks, and also detects typographic errors. MICRO-CHECKER estimates the frequency of null alleles and, importantly, can adjust the allele and genotype frequencies of the amplified alleles, permitting their use in further population genetic analysis. MICRO-CHECKER can be freely downloaded from http://www.microchecker.hull.ac.uk/ .

5.
Population size information is critical for managing endangered or harvested populations. Population size can now be estimated from non-invasive genetic sampling. However, pitfalls remain such as genotyping errors (allele dropout and false alleles at microsatellite loci). To evaluate the feasibility of non-invasive sampling (e.g., for population size estimation), a pilot study is required. Here, we present a pilot study consisting of (i) a genetic step to test loci amplification and to estimate allele frequencies and genotyping error rates when using faecal DNA, and (ii) a simulation step to quantify and minimise the effects of errors on estimates of population size. The pilot study was conducted on a population of red deer in a fenced natural area of 5440 ha, in France. Twelve microsatellite loci were tested for amplification and genotyping errors. The genotyping error rates for microsatellite loci were 0–0.83 (mean=0.2) for allele dropout rates and 0–0.14 (mean=0.02) for false allele rates, comparable to rates encountered in other non-invasive studies. Simulation results suggest we must conduct 6 PCR amplifications per sample (per locus) to achieve approximately 97% correct genotypes. The 3% error rate appears to have little influence on the accuracy and precision of population size estimation. This paper illustrates the importance of conducting a pilot study (including genotyping and simulations) when using non-invasive sampling to study threatened or managed populations.

6.
Five polymorphic microsatellite loci were identified in the black scallop Mimachlamys varia after construction of a genomic library enriched for (GT)n. To examine the transmission pattern of microsatellite alleles, several families were created and genotypes scored for three loci. The expected Mendelian ratios were found in 12 of 14 segregations examined. Unexpected segregations may be explained by a genotyping error (allelic dropout), given that when a specific allele was treated as dominant, the phenotypic ratios conformed to Mendelian expectations. The five loci were also examined in two samples from the Spanish coast. The two localities displayed similar mean values for the number of alleles per locus (7.2-8.4), allelic richness (7.2-7.9), and observed (0.389-0.484) and expected heterozygosity (0.545-0.618). Significant Hardy-Weinberg deviations were observed at three loci, with heterozygote deficiency occurring in all cases. Global multilocus θ value and allele frequencies at one locus revealed significant differentiation between the two localities.

7.
Allelic dropout is a commonly observed source of missing data in microsatellite genotypes, in which one or both allelic copies at a locus fail to be amplified by the polymerase chain reaction. Especially for samples with poor DNA quality, this problem causes a downward bias in estimates of observed heterozygosity and an upward bias in estimates of inbreeding, owing to mistaken classifications of heterozygotes as homozygotes when one of the two copies drops out. One general approach for avoiding allelic dropout involves repeated genotyping of homozygous loci to minimize the effects of experimental error. Existing computational alternatives often require replicate genotyping as well. These approaches, however, are costly and are suitable only when enough DNA is available for repeated genotyping. In this study, we propose a maximum-likelihood approach together with an expectation-maximization algorithm to jointly estimate allelic dropout rates and allele frequencies when only one set of nonreplicated genotypes is available. Our method considers estimates of allelic dropout caused by both sample-specific factors and locus-specific factors, and it allows for deviation from Hardy–Weinberg equilibrium owing to inbreeding. Using the estimated parameters, we correct the bias in the estimation of observed heterozygosity through the use of multiple imputations of alleles in cases where dropout might have occurred. With simulated data, we show that our method can (1) effectively reproduce patterns of missing data and heterozygosity observed in real data; (2) correctly estimate model parameters, including sample-specific dropout rates, locus-specific dropout rates, and the inbreeding coefficient; and (3) successfully correct the downward bias in estimating the observed heterozygosity. We find that our method is fairly robust to violations of model assumptions caused by population structure and by genotyping errors from sources other than allelic dropout. 
Because the data sets imputed under our model can be investigated in additional subsequent analyses, our method will be useful for preparing data for applications in diverse contexts in population genetics and molecular ecology.
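The downward bias in observed heterozygosity that this abstract attributes to allelic dropout can be made concrete with a small worked sketch. This is not the maximum-likelihood/EM method of the paper. It assumes one dropout rate d shared by all samples and loci, with each of the two allelic copies failing independently, so that a heterozygote is read correctly with probability (1 - d)^2 and any genotype goes missing with probability d^2. The function names are hypothetical.

```python
def observed_heterozygosity(true_het, dropout):
    """Expected heterozygosity among non-missing genotypes.

    Assumption: each allelic copy drops out independently with rate `dropout`,
    so a heterozygote is seen as such only if both copies amplify, and a
    genotype is missing only if both copies fail.
    """
    seen_as_het = true_het * (1.0 - dropout) ** 2
    not_missing = 1.0 - dropout ** 2
    return seen_as_het / not_missing


def corrected_heterozygosity(obs_het, dropout):
    """Invert the bias above when the dropout rate is taken as known."""
    return obs_het * (1.0 + dropout) / (1.0 - dropout)


if __name__ == "__main__":
    # Hypothetical values: true heterozygosity 0.8 at several dropout rates.
    for d in (0.0, 0.1, 0.3):
        print(d, round(observed_heterozygosity(0.8, d), 3))
```

Under this simplified model the observed value shrinks by the factor (1 - d) / (1 + d), which is why even moderate dropout inflates apparent homozygosity and inbreeding estimates.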

8.
Methods to infer parent numbers from offspring genotypes either determine the minimum number of parents required to explain alleles and multilocus genotypes detected in the offspring or use models to incorporate information on population allele frequencies and allele segregation. Disparate results by different approaches suggest that one or perhaps all methods are subject to bias. Here, we investigate the performance of minimum parent number estimates, maximum likelihood, and Bayesian analyses (programs COLONY and PARENTAGE) with respect to marker information content in simulated data sets without knowledge of parental genotypes. Offspring families of different sizes were assumed to share one parent and to be sired by 1 or 5 additional parents. All methods committed large errors in terms of underestimation (minimum value) and overestimation (COLONY), or both (PARENTAGE) of parent numbers, unless the data were highly informative, and their relative performances depended on full-sib group sizes and sire numbers. Increasing the number of markers with low gene diversity (He ≤ 0.68) yielded only slow improvement of the results, but all 3 methods performed well with 5-7 markers of He = 0.84. We emphasize the importance of high marker polymorphism for inferring parent numbers and individual parent contributions, as well as for the detection of monogamous reproduction.

9.
Rannala B, Qiu WG, Dykhuizen DE. Genetics, 2000, 155(2): 499-508
Recent breakthroughs in molecular technology, most significantly the polymerase chain reaction (PCR) and in situ hybridization, have allowed the detection of genetic variation in bacterial communities without prior cultivation. These methods often produce data in the form of the presence or absence of alleles or genotypes, however, rather than counts of alleles. Using relative allele frequencies from presence-absence data as estimates of population allele frequencies tends to underestimate the frequencies of common alleles and overestimate those of rare ones, potentially biasing the results of a test of neutrality in favor of balancing selection. In this study, a maximum-likelihood estimator (MLE) of bacterial allele frequencies designed for use with presence-absence data is derived using an explicit stochastic model of the host infection (or bacterial sampling) process. The performance of the MLE is evaluated using computer simulation and a method is presented for evaluating the fit of estimated allele frequencies to the neutral infinite alleles model (IAM). The methods are applied to estimate allele frequencies at two outer surface protein loci (ospA and ospC) of the Lyme disease spirochete, Borrelia burgdorferi, infecting local populations of deer ticks (Ixodes scapularis) and to test the fit to a neutral IAM.

10.
Aggregate, or explosive, breeding is widespread among vertebrates and likely increases the probability of multiple paternity. We assessed paternity in seven field-collected clutches of the explosively breeding spotted salamander (Ambystoma maculatum) using 10 microsatellite loci to determine the frequency of multiple paternity and the number of males contributing to a female's clutch. Using the Minimum Method of allele counts, multiple paternity was evident in 70% of these egg masses. Simple allele counts underestimate the number of contributing males because this method cannot distinguish multiple fathers with common or similar alleles. Therefore, we used computer simulations to estimate from the offspring genotypes the most likely number of contributing fathers given the distributions of allele frequencies in this population. We determined that two to eight males may contribute to A. maculatum clutches; therefore, multiple paternity is a common strategy in this aggregate breeding species. In aggregate mating systems competition for mates can be intense, thus differential reproductive success (reproductive skew) among males contributing to a female's clutch could be a probable outcome. We use our data to evaluate the potential effect of reproductive skew on estimates of the number of contributing males. We simulated varying scenarios of differential male reproductive success, ranging from equal contribution to high reproductive skew among contributing sires in multiply sired clutches. Our data suggest that even intermediate levels of reproductive skew decrease confidence substantially in estimates of the number of contributing sires when parental genotypes are unknown.
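The allele-counting logic behind a minimum-sire estimate can be sketched as follows. This is an illustrative reading, not necessarily the exact procedure used in the study: it assumes a single diploid mother of unknown genotype who can account for at most two alleles per locus, diploid fathers who can each add at most two more, and no genotyping error or mutation. The function name and the data layout are hypothetical.

```python
import math


def min_fathers(clutch):
    """Lower bound on the number of sires for one clutch.

    clutch: list of offspring, each a list of (allele1, allele2) tuples,
            one tuple per locus, with loci in the same order for every offspring.
    Assumption: one diploid mother (genotype unknown) and diploid fathers.
    """
    n_loci = len(clutch[0])
    best = 1  # at least one father is always required
    for locus in range(n_loci):
        alleles = {a for child in clutch for a in child[locus]}
        # The mother explains up to 2 alleles; each father explains up to 2 more.
        needed = math.ceil((len(alleles) - 2) / 2)
        best = max(best, needed)
    return best


if __name__ == "__main__":
    # Hypothetical clutch of three offspring typed at two loci.
    clutch = [
        [(101, 105), (200, 204)],
        [(103, 107), (200, 202)],
        [(101, 109), (202, 204)],
    ]
    print(min_fathers(clutch))
```

Because two fathers who share alleles are counted as one, this bound can only underestimate the true number of sires, which is the limitation the abstract addresses with simulations.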

11.
Studies on Persea americana have been addressed in different ways with biochemical and molecular techniques. Microsatellites are able to detect multiple alleles for particular loci and are therefore a useful tool to study genealogical relationships, population structures and genetic mapping. Ninety-six samples from 49 cultivars including three horticultural groups and hybrids were collected from the avocado germplasm bank at INIA-CENIAP (Venezuela). A modified DNA extraction protocol was performed. Forty microsatellites were selected from previous references, PCR amplifications were performed, and presence/absence, size, and number of alleles were evaluated on polyacrylamide gels. Attributes for polymorphic alleles were analyzed with POPGENE, and genetic diversity was calculated by effective sample size, number of alleles per locus (Na), effective number of alleles (Ne), Shannon information index (In), observed heterozygosity (Ho), expected heterozygosity (He), Wright’s fixation index (Fis), and allele frequencies. Only 14 primers amplified successfully, and primer AVT106 was monomorphic. Unique genotypes for each sample were obtained. Nine loci showed allele patterns that can be useful for taxonomic identification of cultivars or varieties. Comparing values of Fis with Ho and He, we found a direct relationship where low-heterozygosity alleles identified in the population may affect the expected level. Allele frequencies ranged from 0.5632 to 0.0105. For all loci, at least one rare allele was observed. With the available information from genetic analysis, an identification system was implemented for selected avocado cultivars maintained at the INIA-CENIAP Venezuelan germplasm bank on the basis of molecular data.

12.
Nuclear SSRs are notorious for having relatively high frequencies of null alleles, i.e. alleles that fail to amplify and are thus recessive and undetected in heterozygotes. In this paper, we compare two kinds of approaches for estimating null allele frequencies at seven nuclear microsatellite markers in three French Fagus sylvatica populations: (1) maximum likelihood methods that compare observed and expected homozygote frequencies in the population under the assumption of Hardy-Weinberg equilibrium and (2) direct null allele frequency estimates from progeny where parent genotypes are known. We show that null allele frequencies are high in F. sylvatica (7.0% on average with the population method, 5.1% with the progeny method), and that estimates are consistent between the two approaches, especially when the number of sampled maternal half-sib progeny arrays is large. With null allele frequencies ranging between 5% and 8% on average across loci, population genetic parameters such as genetic differentiation (FST) may be mostly unbiased. However, using markers with such average prevalence of null alleles (up to 15% for some loci) can be seriously misleading in fine scale population studies and parentage analysis.
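The population-based idea in this abstract, inferring null alleles from a homozygote excess under Hardy-Weinberg equilibrium, can be illustrated with a simpler moment estimator than the maximum likelihood methods the authors used. The sketch below swaps in Brookfield's first estimator, r = (He - Ho) / (1 + He), which assumes that every apparent homozygote excess is due to nulls and that no null homozygotes occur in the sample. The function name and the input layout are hypothetical.

```python
from collections import Counter


def null_allele_frequency(genotypes):
    """Brookfield-style estimate of null allele frequency at one locus.

    genotypes: list of (allele1, allele2) tuples as scored from the gel,
               so a heterozygote carrying a null appears as a homozygote.
    Returns 0.0 when there is no heterozygote deficit.
    """
    n = len(genotypes)
    counts = Counter(a for g in genotypes for a in g)
    total = 2 * n
    he = 1.0 - sum((c / total) ** 2 for c in counts.values())  # expected heterozygosity
    ho = sum(1 for a, b in genotypes if a != b) / n            # observed heterozygosity
    return max(0.0, (he - ho) / (1.0 + he))


if __name__ == "__main__":
    # Hypothetical scores: an excess of apparent homozygotes at a two-allele locus.
    sample = [(150, 150)] * 6 + [(154, 154)] * 6 + [(150, 154)] * 8
    print(round(null_allele_frequency(sample), 3))
```

A heterozygote deficit can also come from inbreeding or population substructure, so a value from this kind of estimator is best read alongside the progeny-based check that the abstract describes.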

13.
Populations of Plasmodium falciparum show striking differences in linkage disequilibrium, population differentiation and diversity, but only fragmentary data exist on the genetic structure of Plasmodium vivax. We genotyped nine tandem repeat loci bearing 2-8 bp motifs from 345 P. vivax infections collected from three Asian countries and from five locations in Colombia. We observed 9-37 alleles per locus and high diversity (He=0.72-0.79, mean=0.75) in all countries. Numbers of multiple clone infections varied considerably: these were rare in Colombia and India, but > 60% of isolates carried multiple alleles in at least one locus in Thailand and Laos. However, only one or two of the nine loci show >1 allele in many samples, suggesting that mutation within infections may result in overestimation of true multiple carriage rates. Identical nine-locus genotypes were frequently found in Colombian populations, contributing to strong linkage disequilibrium. These identical genotypes were strongly clustered in time, consistent with epidemic transmission of clones and subsequent breakdown of allelic associations, suggesting high rates of inbreeding and low effective recombination rates in this country. In contrast, identical genotypes were rare and loci were randomly associated in all three Asian populations, consistent with higher rates of outcrossing and recombination. We observed low but significant differentiation between different Asian countries (standardized FST = 0.13-0.45). In comparison, we see greater differentiation between collection locations within Colombia (standardized FST = 0.4-0.7), and strong differentiation between continents (standardized FST = 0.48-0.79). The observed heterogeneity in multiple clone carriage rates, linkage disequilibrium and population differentiation are similar in some, but not all, respects to those observed in P. falciparum, and have important implications for the design of association mapping studies, and interpretation of P. vivax epidemiology.

14.
ABSTRACT: BACKGROUND: Trait variances among genotype groups at a locus are expected to differ in the presence of an interaction between this locus and another locus or environment. A simple maximum test on variance heterogeneity can thus be used to identify potentially interacting single nucleotide polymorphisms (SNPs). RESULTS: We propose a multiple contrast test for variance heterogeneity that compares the mean of Levene residuals for each genotype group with their average as an alternative to a global Levene test. We applied this test to a Bogalusa Heart Study dataset to screen for potentially interacting SNPs across the whole genome that influence a number of quantitative traits. A user-friendly implementation of this method is available in the R statistical software package multcomp. CONCLUSIONS: We show that the proposed multiple contrast test of model-specific variance heterogeneity can be used to test for potential interactions between SNPs and unknown alleles, loci or covariates and provide valuable additional information compared with traditional tests. Although the test is statistically valid for severely unbalanced designs, care is needed in interpreting the results at loci with low allele frequencies.
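The quantities that such a contrast test works on can be sketched in a few lines. The example computes only the point estimates, not the multiplicity-adjusted test that the abstract attributes to the R package multcomp. It assumes Levene residuals taken as absolute deviations from the group median (the Brown-Forsythe variant) and an unweighted average of the group means as the reference. Both choices, and the function name, are assumptions made for illustration.

```python
from statistics import mean, median


def levene_contrasts(trait_by_genotype):
    """Mean Levene residual of each genotype group minus the average over groups.

    trait_by_genotype: dict mapping a genotype label to a list of trait values.
    Assumptions: residuals are |y - group median|; the reference is the
    unweighted mean of the per-group residual means.
    """
    residual_means = {}
    for genotype, values in trait_by_genotype.items():
        center = median(values)
        residual_means[genotype] = mean([abs(v - center) for v in values])
    grand = mean(list(residual_means.values()))
    return {g: m - grand for g, m in residual_means.items()}


if __name__ == "__main__":
    # Hypothetical trait values for the three genotype groups of one SNP.
    data = {
        "AA": [4.9, 5.1, 5.0, 5.2],
        "AB": [4.0, 6.1, 5.2, 4.6],
        "BB": [5.0, 5.0, 5.1, 4.9],
    }
    for genotype, contrast in levene_contrasts(data).items():
        print(genotype, round(contrast, 3))
```

A group whose contrast is clearly positive shows more trait spread than the others, which is the signature the abstract links to a possible interaction with another locus or an environmental factor.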

15.
Various spatial autocorrelation statistics have been widely used both in theoretical population genetics and to study the spatial distribution of diploid genotypes in many plant and animal populations. However, previous simulation studies have considered only diallelic loci. In this paper, we use a large number of space-time simulations to characterize for the first time the parametric and statistical values of Moran's I-statistics for converted individual genotypes as well as for join-count statistics. A wide range of levels of dispersal and numbers of alleles and allele frequencies are modelled and the results reveal the different general effects of each of these factors on these statistics. We also examine the range of appropriate sampling designs and sizes for which predicted values can be interpolated for specific sampling schemes for any given population genetic field survey. Numbers of alleles and allele frequencies each affect some statistics but not others. The results indicate generally low standard deviations. The results also develop precise and efficient methods of estimating gene dispersal, based on the various autocorrelation measures of standing spatial patterns of genetic variation within populations. The results also extend these methods to loci with multiple alleles, typical of those studied through modern molecular methods.
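Moran's I for converted genotypes, as mentioned in this abstract, can be computed with the standard formula once each individual is given a numeric score for one allele. The sketch below assumes the common conversion of 1.0, 0.5, or 0.0 for carrying two, one, or zero copies of the allele, and binary weights that join pairs of individuals whose separation falls within one distance class. The function name and the distance-class arguments are hypothetical.

```python
import math


def morans_i(values, coords, d_min, d_max):
    """Moran's I for one distance class.

    values: converted genotype scores (e.g. 1.0, 0.5, 0.0 for one allele).
    coords: (x, y) position of each individual, in the same order as values.
    Pairs with d_min < distance <= d_max get weight 1, all others weight 0.
    """
    n = len(values)
    xbar = sum(values) / n
    dev = [v - xbar for v in values]
    denom = sum(d * d for d in dev)
    num = 0.0
    w_sum = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if d_min < math.dist(coords[i], coords[j]) <= d_max:
                num += dev[i] * dev[j]
                w_sum += 1
    if w_sum == 0 or denom == 0.0:
        raise ValueError("no joined pairs in this distance class, or no variation")
    return (n / w_sum) * num / denom


if __name__ == "__main__":
    # Hypothetical transect: like genotypes clustered at either end.
    scores = [1.0, 1.0, 0.5, 0.0, 0.0]
    positions = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
    print(round(morans_i(scores, positions, 0.0, 1.0), 3))
```

Positive values at short distances indicate that neighbours tend to share alleles, the pattern expected under limited gene dispersal, while values near zero or below suggest no such clustering.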

16.
ABSTRACT Use of non-invasive sources of DNA, such as hair or scat, to obtain a genetic mark for population estimates is becoming commonplace. Unfortunately, with such marks, potentials for genotyping errors and for the shadow effect have resulted in use of many loci and amplification of each specimen many times at each locus, drastically increasing time and cost of obtaining a population estimate. We proposed a method, the Genotyping Uncertainty Added Variance Adjustment (GUAVA), which statistically adjusts for genotyping errors and the shadow effect, thereby allowing use of fewer loci and one amplification of each specimen per locus. Using allele frequencies and estimates of genotyping error rates, we determined, for each pair of specimens, the probability that the pair was obtained from the same individual, whether or not their observed genotypes match. Using these probabilities, we reconstructed possible capture history matrices and used this distribution to obtain a population estimate. With simulated data, we consistently found our estimates had lower bias and smaller variance than estimates based on single amplifications in which genotyping error was ignored and that were comparable to estimates based on data free of genotyping errors. We also demonstrated the method on a fecal DNA data set from a population of red wolves (Canis rufus). The GUAVA estimate based on only one amplification genotypes compares favorably to the estimate based on consensus genotypes. A program to conduct the analysis is available from the first author for UNIX or Windows platforms. Application of GUAVA may allow for increased accuracy in population estimates at reduced cost.

17.
Duplicated loci, for example those associated with major histocompatibility complex (MHC) genes, often have similar DNA sequences that can be coamplified with a pair of primers. This results in genotyping difficulties and inaccurate analyses. Here, we present a method to assign alleles to different loci in amplifications of duplicated loci. This method simultaneously considers several factors that may each affect correct allele assignment. These are the sharing of identical alleles among loci, null alleles, copy number variation, negative amplification, heterozygote excess or heterozygote deficiency, and linkage disequilibrium. The possible multilocus genotypes are extracted from the alleles for each individual and weighted to estimate the allele frequencies. The likelihood of an allele configuration is calculated and is optimized with a heuristic algorithm. Monte‐Carlo simulations and three empirical MHC data sets are used as examples to evaluate the efficacy of our method under different conditions. Our new software, mhc‐typer V1.1, is freely available at https://github.com/huangkang1987/mhc-typer .

18.
Case-control studies are used to map loci associated with a genetic disease. The usual case-control study tests for significant differences in frequencies of alleles at marker loci. In this paper, we consider the problem of comparing two or more marker loci simultaneously and testing for significant differences in haplotype rather than allele frequencies. We consider two situations. In the first, genotypes at marker loci are resolved into haplotypes by making use of biochemical methods or by genotyping family members. In the second, genotypes at marker loci are not resolved into haplotypes, but, by assuming random mating, haplotypes can be inferred using a likelihood method such as the expectation-maximization (EM) algorithm. We assume that a causative locus has two alleles with a multiplicative effect on the penetrance of a disease, with one allele increasing the penetrance by a factor pi. We find, for small values of pi-1 and large sample sizes, asymptotic results that predict the statistical power of a test for significant differences in haplotype frequencies between cases and a random sample of the population, both when haplotypes can be resolved and when haplotypes have to be inferred. The increase in power when haplotypes can be resolved can be expressed as a ratio R, which is the increase in sample size needed to achieve the same power when haplotypes are resolved over when they are not resolved. In general, R depends on the pattern of linkage disequilibrium between the causative allele and the marker haplotypes but is independent of the frequency of the causative allele and, to a first approximation, is independent of pi. For the special situation of two di-allelic marker loci, we obtain a simple expression for R and its upper bound.
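The EM inference of haplotypes from unresolved genotypes that this abstract mentions can be sketched for its special case of two di-allelic marker loci. The example is a textbook version written under stated assumptions, not the authors' code: random mating, genotypes coded as the number of copies (0, 1, or 2) of allele "1" at each locus, and only the double heterozygote treated as phase-ambiguous. The function name and the coding scheme are hypothetical.

```python
def em_haplotype_freqs(genotypes, n_iter=100):
    """Estimate two-locus haplotype frequencies by expectation-maximization.

    genotypes: list of (g1, g2), the count of allele "1" at locus 1 and locus 2.
    Returns a dict keyed by haplotype (a, b), with a and b in {0, 1}.
    """
    freqs = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
    n = len(genotypes)
    for _ in range(n_iter):
        counts = dict.fromkeys(freqs, 0.0)
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: split the double heterozygote between its two phases.
                cis = freqs[(1, 1)] * freqs[(0, 0)]
                trans = freqs[(1, 0)] * freqs[(0, 1)]
                total = cis + trans
                w = cis / total if total > 0 else 0.5
                counts[(1, 1)] += w
                counts[(0, 0)] += w
                counts[(1, 0)] += 1.0 - w
                counts[(0, 1)] += 1.0 - w
            else:
                # Phase is unambiguous: read the two haplotypes off directly.
                first = (1 if g1 >= 1 else 0, 1 if g2 >= 1 else 0)
                second = (1 if g1 == 2 else 0, 1 if g2 == 2 else 0)
                counts[first] += 1.0
                counts[second] += 1.0
        # M-step: haplotype frequencies are expected counts over 2n chromosomes.
        freqs = {h: c / (2 * n) for h, c in counts.items()}
    return freqs


if __name__ == "__main__":
    # Hypothetical sample with strong association between the two loci.
    sample = [(2, 2)] * 5 + [(0, 0)] * 5 + [(1, 1)] * 10
    print(em_haplotype_freqs(sample))
```

The fractional weights assigned to double heterozygotes carry less information than directly observed phase, which is the intuition behind the abstract's ratio R of sample sizes needed with and without resolved haplotypes.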

19.
Li WH. Genetics, 1978, 90(2): 349-382
Formulae are developed for the distribution of allele frequencies (the frequency spectrum), the mean number of alleles in a sample, and the mean and variance of heterozygosity under mutation pressure and under either genic or recessive selection. Numerical computations are carried out by using these formulae and Watterson's (1977) formula for the distribution of allele frequencies under overdominant selection. The following properties are observed: (1) The effect of selection on the distribution of allele frequencies is slight when 4Ns

20.
Collecting faeces is viewed as a potentially efficient way to sample elusive animals. Nonetheless, any biases in estimates of population composition associated with such sampling remain uncharacterized. The goal of this study was to compare estimates of genetic composition and sex ratio derived from Eurasian otter Lutra lutra spraints (faeces) with estimates derived from carcasses. Twenty per cent of 426 wild-collected spraints from SW England yielded composite genotypes for 7-9 microsatellites and the SRY gene. The expected number of incorrect spraint genotypes was negligible, given the proportions of allele dropout and false allele detection estimated using paired blood and spraint samples of three captive otters. Fifty-two different spraint genotypes were detected and compared with genotypes of 70 otter carcasses from the same area. Carcass and spraint genotypes did not differ significantly in mean number of alleles, mean unbiased heterozygosity or sex ratio, although statistical power to detect all but large differences in sex ratio was low. The genetic compositions of carcass and spraint genotypes were very similar according to confidence intervals of theta and two methods for assigning composite genotypes to groups. A distinct group of approximately 11 carcass and spraint genotypes was detected using the latter methods. The results suggest that spraints can yield unbiased estimates of population genetic composition and sex ratio.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  ICP registration: 京ICP备09084417号