Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Null alleles are alleles that for various reasons fail to amplify in a PCR assay. The presence of null alleles in microsatellite data is known to bias the genetic parameter estimates. Thus, efficient detection of null alleles is crucial, but the methods available for indirect null allele detection return inconsistent results. Here, our aim was to compare different methods for null allele detection, to explain their respective performance and to provide improvements. We applied several approaches to identify the ‘true’ null alleles based on the predictions made by five different methods, used either individually or in combination. First, we introduced simulated ‘true’ null alleles into 240 population data sets and applied the methods to measure their success in detecting the simulated null alleles. The single best‐performing method was ML‐NullFreq_frequency. Furthermore, we applied different noise reduction approaches to improve the results. For instance, by combining the results of several methods, we obtained more reliable results than using a single one. Rule‐based classification was applied to identify population properties linked to the false discovery rate. Rules obtained from the classifier described which population genetic estimates and loci characteristics were linked to the success of each method. We have shown that by simulating ‘true’ null alleles into a population data set, we may define a null allele frequency threshold, related to a desired true or false discovery rate. Moreover, using such simulated data sets, the expected null allele homozygote frequency may be estimated independently of the equilibrium state of the population.  相似文献   

2.
The advances made in statistical methods to detect selection from DNA sequence variation have resulted in an enormous increase in the number of studies reporting positive selection. However, a disadvantage of such statistical tests is that often no insight into the actual source of selection is obtained. Finer understanding of evolution can be obtained when those statistical tests are combined with field observations on allele frequencies. We assessed whether the metallothionein (mt) gene of Orchesella cincta (Collembola), which codes for a metal-binding protein, is subject to selection, by investigating alleles and allele frequencies among European metal-stressed and reference populations. Eight highly divergent alleles were resolved in Northwest Europe. At the nucleotide level, a total of 51 polymorphic sites (five of them implying amino-acid changes) were observed. Although statistical tests applied to the sequences alone showed no indication of selection, a G-test rejected the null hypothesis that alleles are homogeneously distributed over metal-stressed and reference populations. Analysis of molecular variance assigned a small, but significant amount of the total variance to differences between metal-stressed and non-stressed populations. In addition, it was shown that metal-stressed populations tend to be more genetically diversified at this locus than non-stressed ones. These results suggest that the mt gene and its surrounding DNA region are affected by environmental metal contamination. This study illustrates that, in addition to statistical tests, field observations on allele frequencies are needed to gain understanding of selection and adaptive evolution.

3.
Microsatellite loci are widely used in population genetic studies, but the presence of null alleles may lead to biased results. Here, we assessed five methods that indirectly detect null alleles and found large inconsistencies among them. Our analysis was based on 20 microsatellite loci genotyped in a natural population of Microtus oeconomus sampled during 8 years, together with 1200 simulated populations without null alleles, but experiencing bottlenecks of varying duration and intensity, and 120 simulated populations with known null alleles. In the natural population, 29% of positive results were consistent between the methods in pairwise comparisons, and in the simulated data set, this proportion was 14%. The positive results were also inconsistent between different years in the natural population. In the null‐allele‐free simulated data set, the number of false positives increased with increased bottleneck intensity and duration. We also found a low concordance in null allele detection between the original simulated populations and their 20% random subsets. In the populations simulated to include null alleles, between 22% and 42% of true null alleles remained undetected, which highlighted that detection errors are not restricted to false positives. None of the evaluated methods clearly outperformed the others when both false‐positive and false‐negative rates were considered. Accepting only the positive results consistent between at least two methods should considerably reduce the false‐positive rate, but this approach may increase the false‐negative rate. Our study demonstrates the need for novel null allele detection methods that could be reliably applied to natural populations.  相似文献   
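The consensus rule mentioned in this abstract (accept a positive only when at least two methods agree) can be illustrated with a minimal sketch. The method names and locus identifiers below are hypothetical placeholders, not data from the study, and the function is an illustration rather than the authors' implementation.

```python
from collections import Counter


def consensus_positives(calls_by_method, min_methods=2):
    """Return loci flagged as null-allele positives by at least `min_methods` methods.

    `calls_by_method` maps a method name to the set of loci it flagged.
    """
    counts = Counter()
    for loci in calls_by_method.values():
        counts.update(set(loci))  # each method votes at most once per locus
    return sorted(locus for locus, n in counts.items() if n >= min_methods)


# Hypothetical calls from three detection methods.
calls = {
    "method_A": {"loc01", "loc07", "loc12"},
    "method_B": {"loc07", "loc15"},
    "method_C": {"loc07", "loc12", "loc19"},
}
print(consensus_positives(calls))
```

Raising `min_methods` trades false positives for false negatives, which is the trade-off the abstract warns about.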

4.
Microsatellite null alleles are found to a varying degree across all taxa. They are problematic as they may inflate measures of genetic differentiation and create false homozygotes. Although there are several methods for correcting allele frequencies for null alleles and enabling estimation of F(ST), much less is known about how null alleles affect assignment testing. Data presented here, based on simulations, show that the percentage of correctly assigned individuals in model-based clustering and Bayesian assignment methods was slightly, though significantly, reduced in the presence of null alleles (frequency range from 0.000 to 0.913). The bias in assignment tests caused by null alleles leads to a slight reduction in the power to correctly assign individuals (0.2 and 1.0 percent units for STRUCTURE- and 2.4 percent units for GENECLASS-based assignment tests). Further, the presence of null alleles caused a small but significant overestimation of F(ST). Consequently, microsatellite loci affected by null alleles would probably not alter the overall outcome of assignment testing and could therefore be included in these types of studies. Nevertheless, loci prone to null alleles should be used with caution as they lower the power of assignment tests and alter the accuracy of F(ST), and loci less prone to null alleles should always be preferred.

5.
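Microsatellite markers have long been widely used in population genetics, conservation biology, and molecular ecology because they are easy to develop, have high mutation rates, and are relatively inexpensive. In recent years, the development and refinement of new techniques such as next-generation sequencing, multiplex PCR, and capillary electrophoresis have greatly improved the efficiency of developing and using microsatellite markers and reduced their cost. However, null alleles, which are common in microsatellite experiments, can bias research results and are one of the biggest drawbacks of microsatellite markers. Nevertheless, the detection of null alleles has long received insufficient attention from researchers. Based on a review of the domestic and international literature, and building on a reasonably thorough and comprehensive understanding of null alleles, this paper gives a comprehensive introduction to, and an in-depth comparison of, the main current methods for detecting null alleles. Finally, drawing on case studies, it summarizes effective approaches to detecting null alleles in plant microsatellite marker research.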

6.
Various tests of the hypothesis of selective neutrality based on gene frequency are now available. These tests take as null hypothesis the concept of “strict neutrality”: all new mutants are required to be selectively identical to each other. For evolutionary questions (as opposed to those of genetic polymorphism), however, a wider null hypothesis might be of interest. Since deleterious alleles have essentially no evolutionary importance, one might wish to test the null hypothesis that only neutral or deleterious mutations occur. The principal alternative to this hypothesis is that there exists heterotic selection of some form for some alleles tending to maintain a level of genetic polymorphism higher than that under neutrality. In this paper an assessment is made of the usefulness of a test of strict neutrality first proposed by this author (Ewens, 1972) as a test of the null hypothesis of “generalized neutrality,” i.e. that only neutral or deleterious alleles occur. At the same time some remarks will be made about estimation of the fundamental parameter θ defining these processes.

7.
Gene set enrichment tests (a.k.a. functional enrichment analysis) are among the most frequently used methods in computational biology. Despite this popularity, there are concerns that these methods are being applied incorrectly and the results of some peer-reviewed publications are unreliable. These problems include the use of inappropriate background gene lists, lack of false discovery rate correction and lack of methodological detail. To ascertain the frequency of these issues in the literature, we performed a screen of 186 open-access research articles describing functional enrichment results. We find that 95% of analyses using over-representation tests did not implement an appropriate background gene list or did not describe this in the methods. Failure to perform p-value correction for multiple tests was identified in 43% of analyses. Many studies lacked detail in the methods section about the tools and gene sets used. An extension of this survey showed that these problems are not associated with journal or article level bibliometrics. Using seven independent RNA-seq datasets, we show misuse of enrichment tools alters results substantially. In conclusion, most published functional enrichment studies suffered from one or more major flaws, highlighting the need for stronger standards for enrichment analysis.  相似文献   
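Two of the issues named in this abstract, the choice of background gene list and multiple-testing correction, can be made concrete with a short sketch. It assumes a one-sided hypergeometric over-representation test and the Benjamini-Hochberg procedure; the abstract does not say which specific tests or corrections the surveyed papers used, and the gene counts in the example are invented for illustration.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided over-representation p-value P(X >= k).

    N: size of the background gene list (e.g. all genes detected in the assay)
    K: background genes that belong to the gene set
    n: size of the selected (e.g. differentially expressed) gene list
    k: selected genes that belong to the gene set
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Same overlap, two different backgrounds: the p-value depends on N and K.
p_assay_background = hypergeom_enrichment_p(k=8, n=100, K=200, N=10000)
p_genome_background = hypergeom_enrichment_p(k=8, n=100, K=200, N=20000)
print(p_assay_background, p_genome_background)
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

The first two printed values differ only because the background size changed, which is why an unreported or inappropriate background makes an enrichment result hard to interpret.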

8.
Many of malaria's signs and symptoms are indistinguishable from those of other febrile diseases. Detection of the presence of Plasmodium parasites is essential, therefore, to guide case management. Improved diagnostic tools are required to enable targeted treatment of infected individuals. In addition, field-ready diagnostic tools for mass screening and surveillance that can detect asymptomatic infections of very low parasite densities are needed to monitor transmission reduction and ensure elimination. Antibody-based tests for infection and novel methods based on biomarkers need further development and validation, as do methods for the detection and treatment of Plasmodium vivax. Current rapid diagnostic tests targeting P. vivax are generally less effective than those targeting Plasmodium falciparum. Moreover, because current drugs for radical cure may cause serious side effects in patients with glucose-6-phosphate dehydrogenase (G6PD) deficiency, more information is needed on the distribution of G6PD-deficiency variants as well as tests to identify at-risk individuals. Finally, in an environment of very low or absent malaria transmission, sustaining interest in elimination and maintaining resources will become increasingly important. Thus, research is required into the context in which malaria diagnostic tests are used, into diagnostics for other febrile diseases, and into the integration of these tests into health systems.  相似文献   

9.
Testing for random mating of a population is important in population genetics, because deviations from randomness of mating may indicate inbreeding, population stratification, natural selection, or sampling bias. However, current methods use only observed numbers of genotypes and alleles, and do not take advantage of the fact that the advent of sequencing technology provides an opportunity to investigate this topic in unprecedented detail. To address this opportunity, a novel statistical test for random mating is required in population genomics studies for which large sequencing datasets are generally available. Here, we propose a Monte-Carlo-based-permutation test (MCP) as an approach to detect random mating. Computer simulations used to evaluate the performance of the permutation test indicate that its type I error is well controlled and that its statistical power is greater than that of the commonly used chi-square test (CHI). Our simulation study shows the power of our test is greater for datasets characterized by lower levels of migration between subpopulations. In addition, test power increases with increasing recombination rate, sample size, and divergence time of subpopulations. For populations exhibiting limited migration and having average levels of population divergence, the statistical power approaches 1 for sequences longer than 1Mbp and for samples of 400 individuals or more. Taken together, our results suggest that our permutation test is a valuable tool to detect random mating of populations, especially in population genomics studies.  相似文献   

10.
Meirmans PG 《Molecular ecology》2012,21(12):2839-2846
The genetic population structure of many species is characterised by a pattern of isolation by distance (IBD): due to limited dispersal, individuals that are geographically close tend to be genetically more similar than individuals that are far apart. Despite the ubiquity of IBD in nature, many commonly used statistical tests are based on a null model that is completely non-spatial, the Island model. Here, I argue that patterns of spatial autocorrelation deriving from IBD present a problem for such tests as they can severely bias their outcome. I use simulated data to illustrate this problem for two widely used types of tests: tests of hierarchical population structure and the detection of loci under selection. My results show that for both types of tests the presence of IBD can indeed lead to a large number of false positives. I therefore argue that all analyses in a study should take the spatial dependence in the data into account, unless it can be shown that there is no spatial autocorrelation in the allele frequency distribution that is under investigation. Thus, it is urgent to develop additional statistical approaches that are based on a spatially explicit null model instead of the non-spatial Island model.

11.
Microsatellite markers are important tools in population, conservation and forensic studies and are frequently used for species delineation, the detection of hybridization and introgression. Therefore, marker sets that amplify variable DNA regions in two species are required; however, cross-species amplification is often difficult, as genotyping errors such as null alleles may occur. To estimate the level of potential misidentifications based on genotyping errors, we compared the occurrence of parental alleles in laboratory and natural Daphnia hybrids (Daphnia longispina group). We tested a set of 12 microsatellite loci with regard to their suitability for unambiguous species and hybrid class identification using F(1) hybrids bred in the laboratory. Further, a large set of 44 natural populations of D. cucullata, D. galeata and D. longispina (1715 individuals) as well as their interspecific hybrids were genotyped to validate the discriminatory power of different marker combinations. Species delineation using microsatellite multilocus genotypes produced reliable results for all three studied species using assignment tests. Daphnia galeata × cucullata hybrid detection was limited due to three loci exhibiting D. cucullata-specific null alleles, which most likely are caused by differences in primer-binding sites of parental species. Overall, discriminatory power in hybrid detection was improved when a subset of markers was identified that amplifies equally well in both species.  相似文献   

12.
Observed variations in rates of taxonomic diversification have been attributed to a range of factors including biological innovations, ecosystem restructuring, and environmental changes. Before inferring causality of any particular factor, however, it is critical to demonstrate that the observed variation in diversity is significantly greater than that expected from natural stochastic processes. Relative tests that assess whether observed asymmetry in species richness between sister taxa in monophyletic pairs is greater than would be expected under a symmetric model have been used widely in studies of rate heterogeneity and are particularly useful for groups in which paleontological data are problematic. Although one such test introduced by Slowinski and Guyer a decade ago has been applied to a wide range of clades and evolutionary questions, the statistical behavior of the test has not been examined extensively, particularly when used with Fisher's procedure for combining probabilities to analyze data from multiple independent taxon pairs. Here, certain pragmatic difficulties with the Slowinski-Guyer test are described, further details of the development of a recently introduced likelihood-based relative rates test are presented, and standard simulation procedures are used to assess the behavior of the two tests in a range of situations to determine: (1) the accuracy of the tests' nominal Type I error rate; (2) the statistical power of the tests; (3) the sensitivity of the tests to inclusion of taxon pairs with few species; (4) the behavior of the tests with datasets comprised of few taxon pairs; and (5) the sensitivity of the tests to certain violations of the null model assumptions. 
Our results indicate that in most biologically plausible scenarios, the likelihood-based test has superior statistical properties in terms of both Type I error rate and power, and we found no scenario in which the Slowinski-Guyer test was distinctly superior, although the degree of the discrepancy varies among the different scenarios. The Slowinski-Guyer test tends to be much more conservative (i.e., very disinclined to reject the null hypothesis) in datasets with many small pairs. In most situations, the performance of both the likelihood-based test and particularly the Slowinski-Guyer test improve when pairs with few species are excluded from the computation, although this is balanced against a decline in the tests' power and accuracy as fewer pairs are included in the dataset. The performance of both tests is quite poor when they are applied to datasets in which the taxon sizes do not conform to the distribution implied by the usual null model. Thus, results of analyses of taxonomic rate heterogeneity using the Slowinski-Guyer test can be misleading because the test's ability to reject the null hypothesis (equal rates) when true is often inaccurate and its ability to reject the null hypothesis when the alternative (unequal rates) is true is poor, particularly when small taxon pairs are included. Although not always perfect, the likelihood-based test provides a more accurate and powerful alternative as a relative rates test.  相似文献   
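For readers unfamiliar with the procedure being evaluated, the following is a minimal sketch of the Slowinski-Guyer test combined across pairs with Fisher's method, as it is usually described: under the equal-rates null every split of n species between two sister clades is equally likely, so the one-tailed probability that the focal clade holds at least r of the n species is (n - r)/(n - 1). The species counts in the example are hypothetical, and the likelihood-based alternative discussed in the abstract is not reproduced here.

```python
from math import exp, log


def slowinski_guyer_p(n_focal, n_sister):
    """One-tailed P that the focal clade has at least `n_focal` species.

    Under the equal-rates null, all (n - 1) splits of n species between
    two sister clades are equally probable.
    """
    if n_focal < 1 or n_sister < 1:
        raise ValueError("each sister clade needs at least one species")
    n = n_focal + n_sister
    return (n - n_focal) / (n - 1)


def fisher_combined_p(pvalues):
    """Fisher's method: X = -2 * sum(ln p) is chi-square with 2k df.

    For even degrees of freedom the chi-square survival function has a
    closed form, so no statistics library is needed.
    """
    k = len(pvalues)
    half_x = -sum(log(p) for p in pvalues)
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_x / j
        total += term
    return exp(-half_x) * total


# Hypothetical sister-taxon pairs: (species in focal clade, species in sister clade).
pairs = [(9, 1), (15, 5), (3, 2)]
p_values = [slowinski_guyer_p(a, b) for a, b in pairs]
print(p_values)
print(fisher_combined_p(p_values))
```

Because each per-pair p-value is discrete (a small pair such as 3 versus 2 can only yield a few possible values), Fisher's assumption of continuous uniform p-values is not met exactly, which is consistent with the conservative behaviour the abstract reports for datasets with many small pairs.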

13.
Quantifying patterns of temporal trends in species assemblages is an important analytical challenge in community ecology. We describe methods of analysis that can be applied to a matrix of counts of individuals that is organized by species (rows) and time-ordered sampling periods (columns). We first developed a bootstrapping procedure to test the null hypothesis of random sampling from a stationary species abundance distribution with temporally varying sampling probabilities. This procedure can be modified to account for undetected species. We next developed a hierarchical model to estimate species-specific trends in abundance while accounting for species-specific probabilities of detection. We analysed two long-term datasets on stream fishes and grassland insects to demonstrate these methods. For both assemblages, the bootstrap test indicated that temporal trends in abundance were more heterogeneous than expected under the null model. We used the hierarchical model to estimate trends in abundance and identified sets of species in each assemblage that were steadily increasing, decreasing or remaining constant in abundance over more than a decade of standardized annual surveys. Our methods of analysis are broadly applicable to other ecological datasets, and they represent an advance over most existing procedures, which do not incorporate effects of incomplete sampling and imperfect detection.  相似文献   

14.
Simultaneous estimation of null alleles and inbreeding coefficients   Cited: 1 in total (self-citations: 0; citations by others: 1)
Although microsatellites are a very efficient tool for many population genetics applications, they may occasionally produce "null" alleles, which, when present in high proportion, may affect estimates of key parameters such as inbreeding and relatedness coefficients or measures of genetic differentiation. In order to account for the presence of null alleles, it is first necessary to estimate their frequency within studied populations. However, the commonly used null allele frequency estimators are not of general applicability because they can produce upwardly biased estimates when a population under study experiences some inbreeding. In such a case, 2 formerly described approaches, population inbreeding model and individual inbreeding model, can be applied for simultaneous estimation of null allele frequencies and of the inbreeding coefficient. In this study, we demonstrate the properties and utility of these 2 methods and show that they outperform the commonly used approaches in the estimation of null allele frequencies based on genotypic data. The methods are applied to empirical data from a natural population of European beech (Fagus sylvatica L.), and results are briefly discussed. The methods presented in this paper are implemented in the Windows-based user-friendly INEST computer program (available free of charge at http://genetyka.ukw.edu.pl/INEst10_setup.exe).  相似文献   
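To see why heterozygosity-based estimators can be biased upward under inbreeding, it helps to look at two widely cited ones, as they are usually stated: Chakraborty's estimator r = (He - Ho)/(He + Ho) and Brookfield's first estimator r = (He - Ho)/(1 + He). Treating these as the "commonly used" estimators the abstract refers to is an assumption; the sketch does not reproduce the population or individual inbreeding models implemented in INEST, and the heterozygosity values are invented.

```python
def chakraborty_null_freq(h_exp, h_obs):
    """Null allele frequency from the heterozygote deficit (Chakraborty-style)."""
    return (h_exp - h_obs) / (h_exp + h_obs)


def brookfield1_null_freq(h_exp, h_obs):
    """Null allele frequency from the heterozygote deficit (Brookfield estimator 1)."""
    return (h_exp - h_obs) / (1.0 + h_exp)


# Hypothetical locus: expected heterozygosity 0.80, observed 0.60.
h_exp, h_obs = 0.80, 0.60
print(round(chakraborty_null_freq(h_exp, h_obs), 4))
print(round(brookfield1_null_freq(h_exp, h_obs), 4))
```

Both formulas attribute the entire heterozygote deficit to null alleles, so any deficit that actually stems from inbreeding inflates the estimate. That is the motivation for estimating the two quantities jointly.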

15.
Short VNTR alleles that go undetected after conventional Southern blot hybridization may constitute an alternative explanation for the heterozygosity deficiency observed at some minisatellite loci. To examine this hypothesis, we have employed a screening procedure based on PCR amplification of those individuals classified as homozygotes in our databases for the loci D1S7, D7S21, and D12S11. The results obtained indicate that the frequency of these short alleles is related to the heterozygosity deficiency observed. For the most polymorphic locus, D1S7, approximately 60% of those individuals previously classified as homozygotes were in fact heterozygotes for a short allele. After the inclusion of these new alleles, the agreement between observed and expected heterozygosity, along with other statistical tests employed, provides additional evidence for lack of population substructuring. Comparisons of allele frequency distributions reveal greater differences between racial groups than between closely related populations.

16.
Klasen JR  Piepho HP  Stich B 《Heredity》2012,108(6):626-632
A major goal of today's biology is to understand the genetic basis of quantitative traits. This can be achieved by statistical methods that evaluate the association between molecular marker variation and phenotypic variation in different types of mapping populations. The objective of this work was to evaluate the statistical power of quantitative trait loci (QTL) detection of various multi-parental mating designs, as well as to assess the reasons for the observed differences. Our study was based on empirical data from 20 Arabidopsis thaliana accessions, which were selected to capture the maximum genetic diversity. The examined mating designs differed strongly with respect to the statistical power to detect QTL. We observed the highest power to detect QTL for the diallel cross with random mating design. The results of our study suggested that performing sibling mating within subpopulations of joint-linkage mapping populations has the potential to considerably increase the power for QTL detection. Our results, however, revealed that using designs in which more than two parental alleles segregate in each subpopulation increases the power even more.

17.
In allometry, researchers are commonly interested in estimating the slope of the major axis or standardized major axis (methods of bivariate line fitting related to principal components analysis). This study considers the robustness of two tests for a common slope amongst several axes. It is of particular interest to measure the robustness of these tests to slight violations of assumptions that may not be readily detected in sample datasets. Type I error is estimated in simulations of data generated with varying levels of nonnormality, heteroscedasticity and nonlinearity. The assumption failures introduced in simulations were difficult to detect in a moderately sized dataset, with an expert panel only able to correctly detect assumption violations 34-45% of the time. While the common slope tests were robust to nonnormal and heteroscedastic errors from the line, Type I error was inflated if the two variables were related in a slightly nonlinear fashion. Similar results were also observed for the linear regression case. The common slope tests were more liberal when the simulated data had greater nonlinearity, and this effect was more evident when the underlying distribution had longer tails than the normal. This result raises concerns for common slopes testing, as slight nonlinearities such as those in simulations are often undetectable in moderately sized datasets. Consequently, practitioners should take care in checking for nonlinearity and interpreting the results of a test for common slope. This work has implications for the robustness of inference in linear models in general.

18.
A method is described to discover if a gene carries one or more allelic mutations that confer risk for any specified common disease. The method does not depend upon genetic linkage of risk-conferring mutations to high frequency genetic markers such as single nucleotide polymorphisms. Instead, the sums of allelic mutation frequencies in case and control cohorts are determined and a statistical test is applied to discover if the difference in these sums is greater than would be expected by chance. A statistical model is presented that defines the ability of such tests to detect significant gene-disease relationships as a function of case and control cohort sizes and key confounding variables: zygosity and genicity, environmental risk factors, errors in diagnosis, limits to mutant detection, linkage of neutral and risk-conferring mutations, ethnic diversity in the general population and the expectation that among all exonic mutants in the human genome greater than 90% will be neutral with regard to any effect on disease risk. Means to test the null hypothesis for, and determine the statistical power of, each test are provided. For this "cohort allelic sums test" or "CAST", the statistical model and test are provided as an Excel program, CASTAT(c) at . Based on genetics, technology and statistics, a strategy of enumerating the mutant alleles carried in the exons and splice sites of the estimated approximately 25,000 human genes in case cohort samples of 10,000 persons for each of 100 common diseases is proposed and evaluated: A wide range of possible conditions of multi-allelic or mono-allelic and monogenic, multigenic or polygenic (including epistatic) risk are found to be detectable using the statistical criteria of 1 or 10 "false positive" gene associations approximately 25,000 gene-disease pair-wise trials and a statistical power of >0.8. 
Using estimates of the distribution of both neutral and gene-inactivating nondeleterious mutations in humans and the sensitivity of the test to multigenic or multicausal risk, it is estimated that about 80% of nullizygous, heterozygous and functionally dominant gene-common disease associations may be discovered. Limitations include relative insensitivity of CAST to about 60% of possible associations given homozygous (wild type) risk and, more rarely, other stochastic limits when the frequency of mutations in the case cohort approaches that of the control cohort and biases such as absence of genetic risk masked by risk derived from a shared cultural environment.  相似文献   

19.
Studies of genetics and ecology often require estimates of relatedness coefficients based on genetic marker data. However, with the presence of null alleles, an observed genotype can represent one of several possible true genotypes. This results in biased estimates of relatedness. As the numbers of marker loci are often limited, loci with null alleles cannot be abandoned without substantial loss of statistical power. Here, we show how loci with null alleles can be incorporated into six estimators of relatedness (two novel). We evaluate the performance of various estimators before and after correction for null alleles. If the frequency of a null allele is <0.1, some estimators can be used directly without adjustment; if it is >0.5, the potency of estimation is too low and such a locus should be excluded. We make available a software package entitled PolyRelatedness v1.6, which enables researchers to optimize these estimators to best fit a particular data set.  相似文献   
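The two thresholds quoted in this abstract can be expressed as a simple triage rule. The sketch below is only an illustration: the function name and return labels are hypothetical, the handling of the intermediate range (apply a null-allele correction) is an assumption drawn from the abstract's overall argument, and none of this reflects the PolyRelatedness interface.

```python
def null_allele_action(null_freq):
    """Suggest how to treat a locus given its estimated null allele frequency.

    Thresholds follow the abstract: below 0.1 some estimators can be used
    directly; above 0.5 the locus should be excluded. The intermediate
    recommendation is an assumption made for this sketch.
    """
    if not 0.0 <= null_freq <= 1.0:
        raise ValueError("null allele frequency must lie in [0, 1]")
    if null_freq < 0.1:
        return "use without adjustment"
    if null_freq > 0.5:
        return "exclude locus"
    return "apply null-allele correction"


# Hypothetical per-locus estimates.
estimates = {"locA": 0.03, "locB": 0.27, "locC": 0.62}
for locus, freq in estimates.items():
    print(locus, null_allele_action(freq))
```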

20.
Hirotsu C  Aoki S  Inada T  Kitao Y 《Biometrics》2001,57(3):769-778
The association analysis between the disease and genetic alleles is one of the simple methods for localizing the susceptibility locus in the genes. For revealing the association, several statistical tests have been proposed without discussing explicitly the alternative hypotheses. We therefore specify two types of alternative hypotheses (i.e., there is only one susceptibility allele in the locus, and there is an extension or shortening of alleles associated with the disease) and derive exact tests for the respective hypotheses. We also propose to combine these two tests when the prior knowledge is not sufficient to specify one of these two hypotheses. In particular, these ideas are extended to the haplotype analysis of three-way association between the disease and bivariate allele frequencies at two closely linked loci. As a by-product, a factorization of the probability distribution of the three-way cell frequencies under the null hypothesis of no three-way interaction is obtained.
