Similar literature
1.
Information on statistical power is critical when planning investigations and evaluating empirical data, but actual power estimates are rarely presented in population genetic studies. We used computer simulations to assess and evaluate power when testing for genetic differentiation at multiple loci through combining test statistics or P values obtained by four different statistical approaches, viz. Pearson's chi-square, the log-likelihood ratio G-test, Fisher's exact test, and an F(ST)-based permutation test. Factors considered in the comparisons include the number of samples, their size, and the number and type of genetic marker loci. It is shown that power for detecting divergence may be substantial for frequently used sample sizes and sets of markers, even at quite low levels of differentiation. The choice of statistical method may be critical, though. For multi-allelic loci such as microsatellites, combining exact P values using Fisher's method is robust and generally provides a high resolving power. In contrast, for few-allele loci (e.g. allozymes and single nucleotide polymorphisms) and when making pairwise sample comparisons, this approach may yield a remarkably low power. In such situations chi-square typically represents a better alternative. The G-test without Williams's correction frequently tends to provide an unduly high proportion of false significances, and results from this test should be interpreted with great care. Our results are not confined to population genetic analyses but are applicable to contingency testing in general.
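The combination step named in this abstract, Fisher's method, can be sketched in a few lines. The sketch is illustrative only and is not taken from the paper: the function name `fisher_combined_p` is hypothetical, the per-locus P values are assumed to be independent, and the chi-square tail probability uses the closed form that holds for even degrees of freedom.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent P values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis, where k is the
    number of P values (here, one per locus).
    """
    if not p_values:
        raise ValueError("need at least one P value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("P values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2

    # Chi-square survival function for even df = 2k:
    #   exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical per-locus exact-test P values for one sample comparison.
combined = fisher_combined_p([0.04, 0.20, 0.11, 0.35])
```

With a single P value the function returns that value unchanged, which is a quick sanity check on the closed form.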

2.
Zhu C  Zhang R 《Heredity》2007,98(6):401-410
The triple test cross (TTC) is an experimental design for detecting epistasis and estimating the components of genetic variance for quantitative traits. In this paper, we extend the analysis to include molecular information. The statistical power of the mating design was assessed under a model assuming that a finite number of loci affect the trait in question. Formulae are developed for the analysis with or without marker information relating to the recombination fraction between loci, the genetic properties of the quantitative trait controlled by the quantitative trait loci (QTL), the linkage phases of the parents and population size. Application of these formulae showed that the recombination fraction between genes and the magnitude and the types of epistasis have important interactions in their effects on power. The results demonstrate that the TTC may have increased power to detect epistasis when marker information is present. However, the simulation experiments show that the standard deviation of the estimated expected mean square was higher with one marker than with two, whereas the corresponding value without marker information was the lowest. In addition, we demonstrate that the relative position of QTL and markers and the number of markers can both affect the power of epistatic detection.

3.
Genetic studies of sexual isolation in Drosophila have generally failed to fully evaluate the effects of their sample size and recombination between markers on their conclusions. In this study we evaluate recombinational distances between markers in Drosophila pseudoobscura and D. persimilis, a species pair in which numerous genetic mapping studies have been performed. We conclude that, contrary to assertions, the inversions that distinguish these two species still allow for much recombination within most of their chromosome arms in F1 hybrid females. Such recombination may have caused previous mapping studies in these species to miss (or grossly underestimate) the effects of several genomic regions. We also evaluate the effects of sample size and recombination on genetic studies of sexual isolation in other Drosophila species groups. We conclude that some of these studies may have been heavily biased toward detecting only genes of large effect. Future studies of sexual isolation should be preceded by detailed statistical power analyses that determine the effects of recombination and sample size in the species pair being studied to avoid these complications.

4.
Dole J  Weber DF 《Genetics》2007,177(4):2309-2319
The genetic basis of variation in recombination in higher plants is polygenic and poorly understood, despite its theoretical and practical importance. Here a method of detecting quantitative trait loci (QTL) influencing recombination in recombinant inbred lines (RILs) is proposed that relies upon the fact that genotype data within RILs carry the signature of past recombination. Behavior of the segregational genetic variance in numbers of chromosomal crossovers (recombination) over generations is described for self-, full-sib-, and half-sib-generated RILs with no dominance in true crossovers. This genetic variance, which as a fraction of the total phenotypic variance contributes to the statistical power of the method, was asymptotically greatest with half sibbing, less with sibbing, and least with selfing. The statistical power to detect a recombination QTL declined with diminishing QTL effect, genome target size, and marker density. For reasonably tight marker linkage, power was greater with less intense inbreeding for later generations and vice versa for early generations. Generational optima for segregation variance and statistical power were found, whose onset and narrowness varied with marker density and mating design, being more pronounced for looser marker linkage. Application of this method to a maize RIL population derived from inbred lines Mo17 and B73 and developed by selfing suggested two putative QTL (LOD > 2.4) affecting certain chromosomes, and using a canonical transformation another putative QTL was detected. However, permutation tests failed to support their presence (experimentwise alpha = 0.05). Other populations with more statistical power and chosen specifically for recombination QTL segregation would be more effective.

5.
Testing for random mating of a population is important in population genetics, because deviations from randomness of mating may indicate inbreeding, population stratification, natural selection, or sampling bias. However, current methods use only observed numbers of genotypes and alleles, and do not take advantage of the fact that the advent of sequencing technology provides an opportunity to investigate this topic in unprecedented detail. To address this opportunity, a novel statistical test for random mating is required in population genomics studies for which large sequencing datasets are generally available. Here, we propose a Monte Carlo-based permutation test (MCP) as an approach to detect random mating. Computer simulations used to evaluate the performance of the permutation test indicate that its type I error is well controlled and that its statistical power is greater than that of the commonly used chi-square test (CHI). Our simulation study shows the power of our test is greater for datasets characterized by lower levels of migration between subpopulations. In addition, test power increases with increasing recombination rate, sample size, and divergence time of subpopulations. For populations exhibiting limited migration and having average levels of population divergence, the statistical power approaches 1 for sequences longer than 1 Mbp and for samples of 400 individuals or more. Taken together, our results suggest that our permutation test is a valuable tool to detect random mating of populations, especially in population genomics studies.

6.
Recent studies suggest that variation in complex disorders (e.g., schizophrenia) is explained by a large number of genetic variants with small effect size (Odds Ratio ≈ 1.05-1.1). The statistical power to detect these genetic variants in Genome Wide Association (GWA) studies with large numbers of cases and controls (~15,000) is still low. As it will be difficult to further increase sample size, we decided to explore an alternative method for analyzing GWA data in a study of schizophrenia, dramatically reducing the number of statistical tests. The underlying hypothesis was that at least some of the genetic variants related to a common outcome are collocated in segments of chromosomes at a wider scale than single genes. Our approach was therefore to study the association between relatively large segments of DNA and disease status. An association test was performed for each SNP and the number of nominally significant tests in a segment was counted. We then performed a permutation-based binomial test to determine whether this region contained significantly more nominally significant SNPs than expected under the null hypothesis of no association, taking linkage into account. Genome Wide Association data of three independent schizophrenia case/control cohorts with European ancestry (Dutch, German, and US) were analyzed using segments of DNA with variable length (2 to 32 Mbp). Using this approach we identified a region at chromosome 5q23.3-q31.3 (128-160 Mbp) that was significantly enriched with nominally associated SNPs in three independent case-control samples. We conclude that considering relatively wide segments of chromosomes may reveal reliable relationships between the genome and schizophrenia, suggesting novel methodological possibilities as well as raising theoretical questions.

7.
Klasen JR  Piepho HP  Stich B 《Heredity》2012,108(6):626-632
A major goal of today's biology is to understand the genetic basis of quantitative traits. This can be achieved by statistical methods that evaluate the association between molecular marker variation and phenotypic variation in different types of mapping populations. The objective of this work was to evaluate the statistical power of quantitative trait loci (QTL) detection of various multi-parental mating designs, as well as to assess the reasons for the observed differences. Our study was based on empirical data from 20 Arabidopsis thaliana accessions, which were selected to capture the maximum genetic diversity. The examined mating designs differed strongly with respect to the statistical power to detect QTL. We observed the highest power to detect QTL for the diallel cross with random mating design. The results of our study suggested that performing sibling mating within subpopulations of joint-linkage mapping populations has the potential to considerably increase the power for QTL detection. Our results, however, revealed that using designs in which more than two parental alleles segregate in each subpopulation increases the power even more.

8.
T. M. Barnes  Y. Kohara  A. Coulson  S. Hekimi 《Genetics》1995,141(1):159-179
The genetic map of each Caenorhabditis elegans chromosome has a central gene cluster (less pronounced on the X chromosome) that contains most of the mutationally defined genes. Many linkage group termini also have clusters, though involving fewer loci. We examine the factors shaping the genetic map by analyzing the rate of recombination and gene density across the genome using the positions of cloned genes and random cDNA clones from the physical map. Each chromosome has a central gene-dense region (more diffuse on the X) with discrete boundaries, flanked by gene-poor regions. Only autosomes have reduced rates of recombination in these gene-dense regions. Cluster boundaries appear discrete also by recombination rate, and the boundaries defined by recombination rate and gene density mostly, but not always, coincide. Terminal clusters have greater gene densities than the adjoining arm but similar recombination rates. Thus, unlike in other species, most exchange in C. elegans occurs in gene-poor regions. The recombination rate across each cluster is constant and similar; and cluster size and gene number per chromosome are independent of the physical size of chromosomes. We propose a model of how this genome organization arose.

9.
Genetic exchange between isolated populations, or introgression between species, serves as a key source of novel genetic material on which natural selection can act. While detecting historical gene flow from DNA sequence data is of much interest, many existing methods can be limited by requirements for deep population genomic sampling. In this paper, we develop a scalable genealogy-based method to detect candidate signatures of gene flow into a given population when the source of the alleles is unknown. Our method does not require sequenced samples from the source population, provided that the alleles have not reached fixation in the sampled recipient population. The method utilizes recent advances in algorithms for the efficient reconstruction of ancestral recombination graphs, which encode genealogical histories of DNA sequence data at each site, and is capable of detecting the signatures of gene flow whose footprints are of length up to single genes. Further, we employ a theoretical framework based on coalescent theory to test for statistical significance of certain recombination patterns consistent with gene flow from divergent sources. Implementing these methods for application to whole-genome sequences of environmental yeast isolates, we illustrate the power of our approach to highlight loci with unusual recombination histories. By developing innovative theory and methods to analyze signatures of gene flow from population sequence data, our work establishes a foundation for the continued study of introgression and its evolutionary relevance.

10.
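The EM algorithm is a general method for obtaining maximum likelihood estimates of parameters from incomplete data. This paper derives EM algorithms for estimating the genetic recombination fraction between two loci under three combinations of marker types (codominant-codominant, codominant-dominant, and dominant-dominant), together with a bootstrap method for obtaining the sampling variance of the recombination fraction, and extends them to estimation when some individuals have missing marker genotypes (no electrophoretic band detected). Extensive Monte Carlo simulations showed that: (1) when linkage is tight, sample size has little effect on the estimate of the recombination fraction, whereas when linkage is loose, a larger sample is needed to detect linkage and to estimate the recombination fraction with reasonable precision; (2) estimating the recombination fraction from all individuals, including those with missing markers, is more accurate than using only the individuals without missing markers, and can significantly increase the statistical power of linkage detection.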
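As an illustration of the kind of EM iteration this abstract describes, the sketch below estimates a recombination fraction for the codominant-codominant case. It is not the authors' algorithm. It assumes an F2 population from a coupling-phase F1, no missing genotypes, and counts already grouped by how many recombinant gametes each genotype class carries; the function name `em_recombination_f2` and its parameters are hypothetical. The only incomplete data are the double heterozygotes, which carry either zero or two recombinant gametes.

```python
def em_recombination_f2(n0, n1, n2, n_dh, r_init=0.25, tol=1e-10, max_iter=1000):
    """EM estimate of the recombination fraction r in an F2 (assumed design).

    n0   -- individuals known to carry 0 recombinant gametes (AABB, aabb)
    n1   -- individuals known to carry 1 (AABb, AaBB, Aabb, aaBb)
    n2   -- individuals known to carry 2 (AAbb, aaBB)
    n_dh -- double heterozygotes (AaBb), whose gametic phase is unobserved
    """
    total = n0 + n1 + n2 + n_dh
    if total <= 0:
        raise ValueError("need at least one individual")
    if not 0.0 < r_init <= 0.5:
        raise ValueError("r_init must lie in (0, 0.5]")

    r = r_init
    for _ in range(max_iter):
        # E-step: expected recombinant gametes per double heterozygote,
        # weighting the two-recombinant configuration by its posterior.
        expected_dh = 2.0 * r * r / (r * r + (1.0 - r) * (1.0 - r))
        # M-step: recombinant gametes divided by all 2N gametes.
        r_new = (n1 + 2.0 * n2 + n_dh * expected_dh) / (2.0 * total)
        if abs(r_new - r) < tol:
            return r_new
        r = r_new
    return r


# Hypothetical class counts for 1000 F2 individuals.
r_hat = em_recombination_f2(n0=320, n1=320, n2=20, n_dh=340)
```

A bootstrap variance of the kind the abstract mentions could be obtained by resampling individuals with replacement and rerunning this function on each resample.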

11.
P. J. Ward 《Genetics》1990,125(3):655-667
Recent developments have related quantitative trait expression to metabolic flux. The present paper investigates some implications of this for statistical aspects of polygenic inheritance. Expressions are derived for the within-sibship genetic mean and genetic variance of metabolic flux given a pair of parental, diploid, n-locus genotypes. These are exact and hold for arbitrary numbers of gene loci, arbitrary allelic values at each locus, and for arbitrary recombination fractions between adjacent gene loci. The within-sibship genetic variance is seen to be simply a measure of parental heterozygosity plus a measure of the degree of linkage coupling within the parental genotypes. Approximations are given for the within-sibship phenotypic mean and variance of metabolic flux. These results are applied to the problem of attaining adequate statistical power in a test of association between allozymic variation and inter-individual variation in metabolic flux. Simulations indicate that statistical power can be greatly increased by augmenting the data with predictions and observations on progeny statistics in relation to parental allozyme genotypes. Adequate power may thus be attainable at small sample sizes, and when allozymic variation is scored at only a small fraction of the total set of loci whose catalytic products determine the flux.

12.
Six samples containing extremely high concentrations of Pb, Zn, and Cd were obtained from the 5-10 cm and 25-30 cm layers of three tailing piles, with ages of about 10, 20, and more than 80 years, respectively. Forty-eight bacterial strains were then isolated from these samples, and their phylogenetic positions were determined by analysis of the partial sequence of the 16S rRNA gene (fragment length ranging from 474 to 708 bp). These isolates were members of the genus Arthrobacter, phylogenetically close to A. keyseri and A. ureafaciens, with sequence similarity ranging from 99.1% to 100%. Furthermore, genetic variation between subpopulations from different samples was revealed by analysis of their randomly amplified polymorphic DNA profiles. Nei's genetic distance showed that the greatest differentiation occurred between subpopulations A and C. Notably, the genetic distance both between subpopulations from the 5-10 cm and 25-30 cm layers of each tailing pile and between the same layers of different tailing piles increased with the age of the tailings. Moreover, correlation analysis showed that soluble Pb has a significantly negative relationship with Nei's gene diversity of the subpopulations. It was assumed that soluble Pb may be responsible for the reduced genetic diversity of the Arthrobacter population. Our data provide evidence that genetic differentiation of microbial populations was consistent with the changes in environmental factors, particularly heavy metals.

13.
A method is described to discover if a gene carries one or more allelic mutations that confer risk for any specified common disease. The method does not depend upon genetic linkage of risk-conferring mutations to high frequency genetic markers such as single nucleotide polymorphisms. Instead, the sums of allelic mutation frequencies in case and control cohorts are determined and a statistical test is applied to discover if the difference in these sums is greater than would be expected by chance. A statistical model is presented that defines the ability of such tests to detect significant gene-disease relationships as a function of case and control cohort sizes and key confounding variables: zygosity and genicity, environmental risk factors, errors in diagnosis, limits to mutant detection, linkage of neutral and risk-conferring mutations, ethnic diversity in the general population and the expectation that among all exonic mutants in the human genome greater than 90% will be neutral with regard to any effect on disease risk. Means to test the null hypothesis for, and determine the statistical power of, each test are provided. For this "cohort allelic sums test" or "CAST", the statistical model and test are provided as an Excel program, CASTAT(c) at . Based on genetics, technology and statistics, a strategy of enumerating the mutant alleles carried in the exons and splice sites of the estimated approximately 25,000 human genes in case cohort samples of 10,000 persons for each of 100 common diseases is proposed and evaluated: a wide range of possible conditions of multi-allelic or mono-allelic and monogenic, multigenic or polygenic (including epistatic) risk are found to be detectable using the statistical criteria of 1 or 10 "false positive" gene associations per approximately 25,000 gene-disease pair-wise trials and a statistical power of >0.8. Using estimates of the distribution of both neutral and gene-inactivating nondeleterious mutations in humans and the sensitivity of the test to multigenic or multicausal risk, it is estimated that about 80% of nullizygous, heterozygous and functionally dominant gene-common disease associations may be discovered. Limitations include relative insensitivity of CAST to about 60% of possible associations given homozygous (wild type) risk and, more rarely, other stochastic limits when the frequency of mutations in the case cohort approaches that of the control cohort, and biases such as absence of genetic risk masked by risk derived from a shared cultural environment.

14.
A standardized genetic differentiation measure
Interpretation of genetic differentiation values is often problematic because of their dependence on the level of genetic variation. For example, the maximum level of GST is less than the average within-population homozygosity so that for highly variable loci, even when no alleles are shared between subpopulations, GST may be low. To remedy this difficulty, a standardized measure of genetic differentiation is introduced here, one which has the same range, 0-1, for all levels of genetic variation. With this measure, the magnitude is the proportion of the maximum differentiation possible for the level of subpopulation homozygosity observed. This is particularly important for situations in which the mutation rate is of the same magnitude or higher than the rate of gene flow. The standardized measure allows comparison between loci with different levels of genetic variation, such as allozymes and microsatellite loci, or mtDNA and Y-chromosome genes, and for genetic differentiation for organisms with different effective population sizes.
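The standardization described here, dividing GST by the largest value it could take at the observed within-subpopulation variation, can be sketched as follows. The abstract does not state the formula, so the expression for the maximum is an assumption: it is derived for k equally weighted subpopulations that share no alleles. The function names and the inputs `h_s` (mean within-subpopulation heterozygosity) and `h_t` (total heterozygosity) are hypothetical.

```python
def gst(h_s, h_t):
    """Ordinary GST = (HT - HS) / HT."""
    if not 0.0 < h_t <= 1.0:
        raise ValueError("h_t must lie in (0, 1]")
    return (h_t - h_s) / h_t


def gst_max(h_s, k):
    """Largest GST attainable for a given HS with k subpopulations.

    Assumes equally weighted subpopulations sharing no alleles, in which
    case HT = (k - 1 + HS) / k.
    """
    if k < 2:
        raise ValueError("need at least two subpopulations")
    if not 0.0 <= h_s < 1.0:
        raise ValueError("h_s must lie in [0, 1)")
    return (k - 1) * (1.0 - h_s) / (k - 1 + h_s)


def standardized_gst(h_s, h_t, k):
    """GST expressed as a proportion of its maximum, ranging over 0-1."""
    return gst(h_s, h_t) / gst_max(h_s, k)


# Hypothetical highly variable locus: two subpopulations with no shared
# alleles, so HT = (1 + 0.9) / 2 = 0.95.
raw = gst(0.9, 0.95)
scaled = standardized_gst(0.9, 0.95, 2)
```

In this example the raw value is small even though the subpopulations share no alleles, while the standardized value reaches the top of its range, which is the contrast the abstract draws for highly variable loci.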

15.
To date, most genetic analyses of phenotypes have focused on analyzing single traits or analyzing each phenotype independently. However, joint epistasis analysis of multiple complementary traits will increase statistical power and improve our understanding of the complicated genetic structure of complex diseases. Despite their importance in uncovering the genetic structure of complex traits, the statistical methods for identifying epistasis in multiple phenotypes remain fundamentally unexplored. To fill this gap, we formulate a test for interaction between two genes in multiple quantitative trait analysis as a multiple functional regression (MFRG) in which the genotype functions (genetic variant profiles) are defined as a function of the genomic position of the genetic variants. We use large-scale simulations to calculate Type I error rates for testing interaction between two genes with multiple phenotypes and to compare the power with multivariate pairwise interaction analysis and single trait interaction analysis by a single variate functional regression model. To further evaluate performance, the MFRG for epistasis analysis is applied to five phenotypes of exome sequence data from the NHLBI’s Exome Sequencing Project (ESP) to detect pleiotropic epistasis. A total of 267 pairs of genes that formed a genetic interaction network showed significant evidence of epistasis influencing five traits. The results demonstrate that the joint interaction analysis of multiple phenotypes has a much higher power to detect interaction than the interaction analysis of a single trait and may open a new direction to fully uncovering the genetic structure of multiple phenotypes.

16.
Goss EM  Kreitman M  Bergelson J 《Genetics》2005,169(1):21-35
Species-level genetic diversity and recombination in bacterial pathogens of wild plant populations have been nearly unexplored. Pseudomonas viridiflava is a common natural bacterial pathogen of Arabidopsis thaliana, for which pathogen defense genes and mechanisms are becoming increasingly well known. The genetic variation contained within a worldwide sample of P. viridiflava collected from wild populations of A. thaliana was investigated using five genomic sequence fragments totaling 2.3 kb. Two distinct and deeply diverged clades were found within the P. viridiflava sample and in close proximity in multiple populations, each genetically diverse with synonymous variation as high as 9.3% in one of these clades. Within clades, there is evidence of frequent recombination within and between each sequenced locus and little geographic differentiation. Isolates from both clades were also found in a small sample of other herbaceous species in Midwest populations, indicating a possibly broad host range for P. viridiflava. The high levels of genetic variation and recombination together with a lack of geographic differentiation in this pathogen distinguish it from other bacterial plant pathogens for which intraspecific variation has been examined.

17.
Unpredictability during development of the optimum phenotype under future selection leads to a compromise reaction norm with a slope that is shallower than the slope of the optimum reaction norm. Unpredictability of selection can lead to an evolved curved reaction norm when genetic variation for curvature is available even if the optimum reaction norm is linear. This requires asymmetry in the frequency distribution of the habitats of selection; at small population size, stochasticity in the number of individuals per selection habitat is sufficient to generate such asymmetry. Unpredictability of selection in structured populations leads to local genetic differentiation of reaction norms. The mean habitat of a subpopulation is defined as the subpopulation's focal habitat. The evolved mean reaction norm of each subpopulation is anchored at the optimum genotypic value in its focal habitat. Linear reaction norms are parallel if the conditional distribution of adults around the focal habitats is the same for each subpopulation. Adult migration and absence of zygote dispersal represents the ultimate structured population, each habitat playing the role of focal habitat. Absence of zygote dispersal requires that the flow of individuals through the habitats is used instead of the habitats’ frequencies in the prediction of the evolved reaction norm. Adult migration in absence of zygote dispersal leads to an evolved pattern of locally differentiated reaction norms with optimum genotypic value anchored in the focal habitat and, for linear reaction norms, parallel slopes.

18.
Several Planktothrix strains, each producing a distinct oligopeptide profile, have been shown to coexist within Lake Steinsfjorden (Norway). Using nonribosomal peptide synthetase (NRPS) genes as markers, it has been shown that the Planktothrix community comprises distinct genetic variants displaying differences in bloom dynamics, suggesting a Planktothrix subpopulation structure. Here, we investigate the Planktothrix variants inhabiting four lakes in the southeast of Norway utilizing both NRPS and non-NRPS genes. Phylogenetic analyses showed similar topologies for both NRPS and non-NRPS genes, and the lakes appear to have similar structuring of Planktothrix genetic variants. The structure of distinct variants was also supported by very low genetic diversity within variants compared to the between-variant diversity. Incongruent topologies and split decomposition revealed recombination events between Planktothrix variants. In several strains the gene variants seem to be a result of recombination. Both NRPS and non-NRPS genes are dominated by purifying selection; however, sites subjected to positive selection were also detected. The presence of similar and well-separated Planktothrix variants with low internal genetic diversity indicates gene flow within Planktothrix populations. Further, the low genetic diversity found between lakes (similar range as within lakes) indicates gene flow also between Planktothrix populations and suggests recent, or recurrent, dispersals. Our data also indicate that recombination has resulted in new genetic variants. Stability within variants and the development of new variants are likely to be influenced by selection patterns and within-variant homologous recombination.

19.
Recombination varies greatly among species, as illustrated by the poor conservation of the recombination landscape between humans and chimpanzees. Thus, shorter evolutionary time frames are needed to understand the evolution of recombination. Here, we analyze its recent evolution in humans. We calculated the recombination rates between adjacent pairs of 636,933 common single-nucleotide polymorphism loci in 28 worldwide human populations and analyzed them in relation to genetic distances between populations. We found a strong and highly significant correlation between similarity in the recombination rates corrected for effective population size and genetic differentiation between populations. This correlation is observed at the genome-wide level, but also for each chromosome and when genetic distances and recombination similarities are calculated independently from different parts of the genome. Moreover, and more relevant, this relationship is robustly maintained when considering presence/absence of recombination hotspots. Simulations show that this correlation cannot be explained by biases in the inference of recombination rates caused by haplotype sharing among similar populations. This result indicates a rapid pace of evolution of recombination, within the time span of differentiation of modern humans.

20.
Previous genetic analyses of the Caulobacter crescentus chromosome have resulted in the construction of a linear genetic map. To establish the circularity of the C. crescentus chromosome, restriction fragments generated by digestion with AseI and SpeI were analyzed by pulsed-field gel electrophoresis and Southern hybridization. The size of each fragment was calculated and used to demonstrate that C. crescentus has a genome size of approximately 4,000 kilobases. In addition, both enzymes gave rise to large DNA fragments which contained genes from both ends of the genetic map. Thus, there is physical linkage between the genes at the ends of the genetic map and the chromosome is circular. Since this region of the chromosome appears to contain the replication terminus, we propose that recombination occurs at a high frequency in the vicinity of the terminus. This high frequency of recombination would prevent genetic linkage from being observed between genes on opposite sides of the terminus. Additional experiments using insertions which introduced new AseI and DraI restriction sites into the genome allowed us to calculate the physical distance between genes located in the vicinity of the replication terminus.
