Similar articles
1.
Analysis of allelic associations is an increasingly widely used approach to fine mapping of genes for various diseases. To interpret the results correctly, it is necessary to estimate the power of the statistical test used. The principles of association analysis and hypothesis testing are described, and analytically obtained estimates of the power of the transmission disequilibrium test (TDT), one of the most popular methods of analysis of allelic associations, are presented. These estimates are applicable to arbitrary models of inheritance formulated in terms of genotypic relative risk. The proposed method is illustrated by analysis of the associations of idiopathic scoliosis and aggrecan gene alleles.
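As a rough illustration of the test whose power this abstract discusses, the sketch below computes the usual McNemar-type TDT statistic from counts of heterozygous-parent transmissions. The function name `tdt` and its arguments are hypothetical, and the power estimates described in the abstract are not reproduced here.

```python
import math


def tdt(transmitted, untransmitted):
    """Return the TDT chi-square statistic (1 df) and its p-value.

    transmitted   -- times heterozygous parents passed the candidate allele on
    untransmitted -- times heterozygous parents passed the other allele on
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    stat = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

For example, `tdt(60, 40)` gives a statistic of 4.0, which corresponds to a p-value just under 0.05.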

2.
Yang RC 《Genetics》2002,161(1):435-445
While nonrandom associations between zygotes at different loci (zygotic associations) frequently occur in Hardy-Weinberg disequilibrium populations, statistical analysis of such associations has received little attention. In this article, we describe the joint distributions of zygotes at multiple loci, which are completely characterized by heterozygosities at individual loci and various multilocus zygotic associations. These zygotic associations are defined in the same fashion as the usual multilocus linkage (gametic) disequilibria on the basis of gametic and allelic frequencies. The estimation and test procedures are described with details being given for three loci. The sampling properties of the estimates are examined through Monte Carlo simulation. The estimates of three-locus associations are not free of bias due to the presence of two-locus associations and vice versa. The power of detecting the zygotic associations is small unless different loci are strongly associated and/or sample sizes are large (>100). The analysis of zygotic associations not only offers an effective means of packaging numerous genic disequilibria required for a complete characterization of multilocus structure, but also provides opportunities for making inference about evolutionary and demographic processes through a comparative assessment of zygotic association vs. gametic disequilibrium for the same set of loci in nonequilibrium populations.

3.
A method is described to discover if a gene carries one or more allelic mutations that confer risk for any specified common disease. The method does not depend upon genetic linkage of risk-conferring mutations to high frequency genetic markers such as single nucleotide polymorphisms. Instead, the sums of allelic mutation frequencies in case and control cohorts are determined and a statistical test is applied to discover if the difference in these sums is greater than would be expected by chance. A statistical model is presented that defines the ability of such tests to detect significant gene-disease relationships as a function of case and control cohort sizes and key confounding variables: zygosity and genicity, environmental risk factors, errors in diagnosis, limits to mutant detection, linkage of neutral and risk-conferring mutations, ethnic diversity in the general population and the expectation that among all exonic mutants in the human genome greater than 90% will be neutral with regard to any effect on disease risk. Means to test the null hypothesis for, and determine the statistical power of, each test are provided. For this "cohort allelic sums test" or "CAST", the statistical model and test are provided as an Excel program, CASTAT(c) at . Based on genetics, technology and statistics, a strategy of enumerating the mutant alleles carried in the exons and splice sites of the estimated approximately 25,000 human genes in case cohort samples of 10,000 persons for each of 100 common diseases is proposed and evaluated: A wide range of possible conditions of multi-allelic or mono-allelic and monogenic, multigenic or polygenic (including epistatic) risk are found to be detectable using the statistical criteria of 1 or 10 "false positive" gene associations per approximately 25,000 gene-disease pair-wise trials and a statistical power of >0.8. Using estimates of the distribution of both neutral and gene-inactivating nondeleterious mutations in humans and the sensitivity of the test to multigenic or multicausal risk, it is estimated that about 80% of nullizygous, heterozygous and functionally dominant gene-common disease associations may be discovered. Limitations include relative insensitivity of CAST to about 60% of possible associations given homozygous (wild type) risk and, more rarely, other stochastic limits when the frequency of mutations in the case cohort approaches that of the control cohort and biases such as absence of genetic risk masked by risk derived from a shared cultural environment.

4.
An extension to current maximum-likelihood variance-components procedures for mapping quantitative-trait loci in sib pairs that allows a simultaneous test of allelic association is proposed. The method involves modeling of the allelic means for a test of association, with simultaneous modeling of the sib-pair covariance structure for a test of linkage. By partitioning of the mean effect of a locus into between- and within-sibship components, the method controls for spurious associations due to population stratification and admixture. The power and efficacy of the method are illustrated through simulation of various models of both real and spurious association.

5.
Li Y  Li Y  Wu S  Han K  Wang Z  Hou W  Zeng Y  Wu R 《Genetics》2007,176(3):1811-1821
Analysis of population structure and organization with DNA-based markers can provide important information regarding the history and evolution of a species. Linkage disequilibrium (LD) analysis based on allelic associations between different loci is emerging as a viable tool to unravel the genetic basis of population differentiation. In this article, we derive the EM algorithm to obtain the maximum-likelihood estimates of the linkage disequilibria between dominant markers, to study the patterns of genetic diversity for a diploid species. The algorithm was expanded to estimate and test linkage disequilibria of different orders among three dominant markers and can be technically extended to manipulate an arbitrary number of dominant markers. The feasibility of the proposed algorithm is validated by an example of population genetic studies of hickory trees, native to southeastern China, using dominant random amplified polymorphic DNA markers. Extensive simulation studies were performed to investigate the statistical properties of this algorithm. The precision of the estimates of linkage disequilibrium between dominant markers was compared with that between codominant markers. Results from simulation studies suggest that three-locus LD analysis displays increased power of LD detection relative to two-locus LD analysis. This algorithm is useful for studying the pattern and amount of genetic variation within and among populations.

6.
Case‐control studies are primary study designs used in genetic association studies. Sasieni (Biometrics 1997, 53, 1253–1261) pointed out that the allelic chi‐square test used in genetic association studies is invalid when Hardy‐Weinberg equilibrium (HWE) is violated in a combined population. It is important to know how far the type I error rate deviates from the nominal level when HWE is violated. We examine bounds on the type I error rate of the allelic chi‐square test. We also investigate the power of the goodness‐of‐fit test for HWE, which can be used as a guideline for choosing between the allelic chi‐square test and the modified allelic chi‐square test, the latter of which was proposed for cases of violated HWE. In small samples, power is not large enough to detect Wright's inbreeding model with small values of the inbreeding coefficient. Therefore, when the null hypothesis of HWE is barely accepted, the modified test should be considered as an alternative method. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)
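The two standard statistics this abstract refers to can be sketched as follows, assuming a biallelic marker with genotype counts ordered (aa, Aa, AA). The function names are hypothetical, the modified allelic test is not implemented, and the formulas are the textbook Pearson forms rather than anything specific to the paper.

```python
def allelic_chi2(cases, controls):
    """Allelic chi-square (1 df) from genotype counts ordered (aa, Aa, AA)."""
    a = cases[1] + 2 * cases[2]        # A alleles among cases
    b = 2 * cases[0] + cases[1]        # a alleles among cases
    c = controls[1] + 2 * controls[2]  # A alleles among controls
    d = 2 * controls[0] + controls[1]  # a alleles among controls
    n = a + b + c + d
    # Pearson chi-square for a 2 x 2 table; assumes no zero margin.
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def hwe_chi2(genotypes):
    """Goodness-of-fit chi-square (1 df) for HWE from counts (aa, Aa, AA)."""
    n = sum(genotypes)
    p = (genotypes[1] + 2 * genotypes[2]) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * q * q, 2 * n * p * q, n * p * p)
    return sum((o - e) ** 2 / e for o, e in zip(genotypes, expected))
```

With made-up counts of (10, 20, 20) in cases and (20, 20, 10) in controls, `allelic_chi2` returns 8.0, while `hwe_chi2` on the pooled counts (30, 40, 30) returns 4.0, a pooled sample with a heterozygote deficit, which is the kind of situation the abstract warns about.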

7.
Luo ZW  Tao SH  Zeng ZB 《Genetics》2000,156(1):457-467
Three approaches are proposed in this study for detecting or estimating linkage disequilibrium between a polymorphic marker locus and a locus affecting quantitative genetic variation using samples from random mating populations. It is shown that the disequilibrium over a wide range of circumstances may be detected with a power of 80% by using phenotypic records and marker genotypes of a few hundred individuals. Comparison of ANOVA and regression methods in this article to the transmission disequilibrium test (TDT) shows that, given the genetic variance explained by the trait locus, the power of TDT depends on the trait allele frequency, whereas the power of ANOVA and regression analyses is relatively independent of the allelic frequency. The TDT method is more powerful when the trait allele frequency is low, but much less powerful when it is high. The likelihood analysis provides reliable estimation of the model parameters when the QTL variance is at least 10% of the phenotypic variance and a sample size of a few hundred is used. Potential use of these estimates in mapping the trait locus is also discussed.

8.
When analyzing the relationship between allelic variability and traits, a potential source of confounding is population admixture. An approach to adjusting for potential confounding due to population admixture when estimating the influence of allelic variability at a candidate gene is presented. The approach involves augmenting linear regression models with additional regressors. Family genotype data are used to define the regressors, and inclusion of the regressors ensures that, even in the presence of population admixture, the estimates of the regression coefficients that parameterize the influence of allelic variability on the trait are unbiased. The approach is illustrated through an analysis of the influence of apolipoprotein E genotype on plasma low density lipoprotein cholesterol concentrations.

9.
Johnson PC  Haydon DT 《Genetics》2007,175(2):827-842
The importance of quantifying and accounting for stochastic genotyping errors when analyzing microsatellite data is increasingly being recognized. This awareness is motivating the development of data analysis methods that not only take errors into consideration but also recognize the difference between two distinct classes of error, allelic dropout and false alleles. Currently methods to estimate rates of allelic dropout and false alleles depend upon the availability of error-free reference genotypes or reliable pedigree data, which are often not available. We have developed a maximum-likelihood-based method for estimating these error rates from a single replication of a sample of genotypes. Simulations show it to be both accurate and robust to modest violations of its underlying assumptions. We have applied the method to estimating error rates in two microsatellite data sets. It is implemented in a computer program, Pedant, which estimates allelic dropout and false allele error rates with 95% confidence regions from microsatellite genotype data and performs power analysis. Pedant is freely available at http://www.stats.gla.ac.uk/~paulj/pedant.html.

10.
We use evolutionary trees of haplotypes to study phenotypic associations by exhaustively examining all possible biallelic partitions of the tree, a technique we call tree scanning. If the first scan detects significant associations, additional rounds of tree scanning are used to partition the tree into three or more allelic classes. Two worked examples are presented. The first is a reanalysis of associations between haplotypes at the Alcohol Dehydrogenase locus in Drosophila melanogaster that was previously analyzed using a nested clade analysis, a more complicated technique for using haplotype trees to detect phenotypic associations. Tree scanning and the nested clade analysis yield the same inferences when permutation testing is used with both approaches. The second example is an analysis of associations between variation in various lipid traits and genetic variation at the Apolipoprotein E (APOE) gene in three human populations. Tree scanning successfully identified phenotypic associations expected from previous analyses. Tree scanning for the most part detected more associations and provided a better biological interpretative framework than single SNP analyses. We also show how prior information can be incorporated into the tree scan by starting with the traditional three electrophoretic alleles at APOE. Tree scanning detected genetically determined phenotypic heterogeneity within all three electrophoretic allelic classes. Overall, tree scanning is a simple, powerful, and flexible method for using haplotype trees to detect phenotype/genotype associations at candidate loci.

11.
There are many known examples of multiple semi-independent associations at individual loci; such associations might arise either because of true allelic heterogeneity or because of imperfect tagging of an unobserved causal variant. This phenomenon is of great importance in monogenic traits but has not yet been systematically investigated and quantified in complex-trait genome-wide association studies (GWASs). Here, we describe a multi-SNP association method that estimates the effect of loci harboring multiple association signals by using GWAS summary statistics. Applying the method to a large anthropometric GWAS meta-analysis (from the Genetic Investigation of Anthropometric Traits consortium study), we show that for height, body mass index (BMI), and waist-to-hip ratio (WHR), 3%, 2%, and 1%, respectively, of additional phenotypic variance can be explained on top of the previously reported 10% (height), 1.5% (BMI), and 1% (WHR). The method also permitted a substantial increase (by up to 50%) in the number of loci that replicate in a discovery-validation design. Specifically, we identified 74 loci at which the multi-SNP, a linear combination of SNPs, explains significantly more variance than does the best individual SNP. A detailed analysis of multi-SNPs shows that most of the additional variability explained is derived from SNPs that are not in linkage disequilibrium with the lead SNP, suggesting a major contribution of allelic heterogeneity to the missing heritability.

12.
Insights into latent class analysis of diagnostic test performance   (cited by: 2 in total, 0 self-citations, 2 by others)
Latent class analysis is used to assess diagnostic test accuracy when a gold standard assessment of disease is not available but results of multiple imperfect tests are. We consider the simplest setting, where 3 tests are observed and conditional independence (CI) is assumed. Closed-form expressions for maximum likelihood parameter estimates are derived. They show explicitly how observed 2- and 3-way associations between test results are used to infer disease prevalence and test true- and false-positive rates. Although interesting and reasonable under CI, the estimators clearly have no basis when it fails. Intuition for bias induced by conditional dependence follows from the analytic expressions. Further intuition derives from an Expectation Maximization (EM) approach to calculating the estimates. We discuss implications of our results and related work for settings where more than 3 tests are available. We conclude that careful justification of assumptions about the dependence between tests in diseased and nondiseased subjects is necessary in order to ensure unbiased estimates of prevalence and test operating characteristics and to provide these estimates clinical interpretations. Such justification must be based in part on a clear clinical definition of disease and biological knowledge about mechanisms giving rise to test results.
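The EM approach mentioned in this abstract can be sketched for the 3-test, conditional-independence setting as below. This is a generic two-class latent class EM written for illustration, not the authors' closed-form estimators; the function names, starting values and iteration count are all assumptions.

```python
from itertools import product


def class_likelihoods(pattern, prev, sens, fpr):
    """Joint probability of a result pattern with disease present / absent."""
    like_d, like_n = prev, 1.0 - prev
    for j, t in enumerate(pattern):
        like_d *= sens[j] if t else 1.0 - sens[j]
        like_n *= fpr[j] if t else 1.0 - fpr[j]
    return like_d, like_n


def expected_counts(n, prev, sens, fpr):
    """Expected frequency of each of the 8 patterns for n subjects."""
    return {p: n * sum(class_likelihoods(p, prev, sens, fpr))
            for p in product((0, 1), repeat=3)}


def lca_em(counts, n_iter=2000, prev=0.5,
           sens=(0.8, 0.8, 0.8), fpr=(0.2, 0.2, 0.2)):
    """EM estimates of prevalence, sensitivities and false-positive rates."""
    sens, fpr = list(sens), list(fpr)
    total = sum(counts.values())
    for _ in range(n_iter):
        # E-step: posterior probability of disease for each result pattern.
        post = {}
        for p in counts:
            like_d, like_n = class_likelihoods(p, prev, sens, fpr)
            post[p] = like_d / (like_d + like_n)
        # M-step: weighted proportions within each latent class.
        n_d = sum(counts[p] * post[p] for p in counts)
        n_n = total - n_d
        prev = n_d / total
        for j in range(3):
            sens[j] = sum(counts[p] * post[p] for p in counts if p[j]) / n_d
            fpr[j] = sum(counts[p] * (1.0 - post[p])
                         for p in counts if p[j]) / n_n
    return prev, sens, fpr
```

Starting with sensitivities above false-positive rates keeps the "diseased" label attached to the intended class; the mirror-image solution fits the data equally well, which is one reason a clinical definition of disease matters for interpretation.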

13.
Association studies are traditionally performed in the case-control framework. As a first step in the analysis process, comparing allele frequencies using Pearson's chi-square statistic is often invoked. However, such an approach assumes the independence of alleles under the hypothesis of no association, which may not always be the case. Consequently, this method introduces a bias that shifts the type I error rate away from its expected level. In this article we first propose an unbiased and exact test as an alternative to the biased allelic test. Because available data require thousands of such tests to be performed, we focused on fast execution. Since the biased allelic test is still widely used in the community, we illustrate its pitfalls in the context of genome-wide association studies and particularly in the case of low-level tests. Finally, we compare the unbiased and exact test with the Cochran-Armitage test for trend and show that it performs similarly in terms of power. The fast, unbiased and exact allelic test code is available in R, C++ and Perl at: http://stat.genopole.cnrs.fr/software/fueatest.
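The Cochran-Armitage trend test used as the comparator in this abstract can be sketched as follows, assuming genotype counts ordered by number of copies of the tested allele and additive scores (0, 1, 2). The function name is hypothetical, and the authors' exact test itself is not reproduced.

```python
def cochran_armitage_chi2(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend chi-square (1 df) for a 2 x 3 genotype table."""
    totals = [r + s for r, s in zip(cases, controls)]
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    sum_xr = sum(x * r for x, r in zip(scores, cases))        # scores in cases
    sum_xn = sum(x * t for x, t in zip(scores, totals))       # scores overall
    sum_x2n = sum(x * x * t for x, t in zip(scores, totals))  # squared scores
    numerator = n * (n * sum_xr - n_cases * sum_xn) ** 2
    denominator = n_cases * n_controls * (n * sum_x2n - sum_xn ** 2)
    return numerator / denominator
```

Unlike the allelic chi-square, this statistic treats individuals rather than alleles as the sampling unit, so it does not rely on alleles being independent within a person.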

14.
nessi is a computer program generating predictions about allelic and genotypic frequencies at the S-locus in sporophytic self-incompatibility systems under finite and infinite populations. For any pattern of dominance relationships among self-incompatibility alleles, nessi computes deterministic equilibrium frequencies and estimates distributions in samples from finite populations of the number of alleles at equilibrium, allelic and genotypic frequencies at equilibrium and allelic and genotypic frequency changes in a single generation. These predictions can be used to rigorously test the impact of negative frequency-dependent selection on diversity patterns in natural populations.

15.
We present two novel methods to infer mating patterns from genetic data. They differ from existing statistical methods of parentage inference in that they apply to populations that deviate from Hardy-Weinberg and linkage equilibrium, and so are suited for the study of assortative mating in hybrid zones. The core data set consists of genotypes at several loci for a number of full-sib clutches of unknown parentage. Our inference is based throughout on estimates of allelic associations within and across loci, such as heterozygote deficit and pairwise linkage disequilibrium. In the first method, the most likely parents of a given clutch are determined from the genotypic distribution of the associated adult population, given an explicit model of nonrandom mating. This leads to estimates of the strength of assortment. The second approach is based solely on the offspring genotypes and relies on the fact that a linear relation exists between associations among the offspring and those in the population of breeding pairs. We apply both methods to a sample from the hybrid zone between the fire-bellied toads Bombina bombina and B. variegata (Anura: Discoglossidae) in Croatia. Consistently, both approaches provide no evidence for a departure from random mating, despite adequate statistical power. Instead, B. variegata-like individuals among the adults contributed disproportionately to the offspring cohort, consistent with their preference for the type of breeding habitat in which this study was conducted.

16.
To avoid problems related to unknown population substructure, association studies may be conducted in founder populations. In such populations, however, the relatedness among individuals may be considerable. Neglecting such correlations among individuals can lead to seriously spurious associations. Here, we propose a method for case-control association studies of binary traits that is suitable for any set of related individuals, provided that their genealogy is known. Although we focus here on large inbred pedigrees, this method may also be used in outbred populations for case-control studies in which some individuals are relatives. We base inference on a quasi-likelihood score (QLS) function and construct a QLS test for allelic association. This approach can be used even when the pedigree structure is far too complex to use an exact-likelihood calculation. We also present an alternative approach to this test, in which we use the known genealogy to derive a correction factor for the case-control association χ² test. We perform analytical power calculations for each of the two tests by deriving their respective noncentrality parameters. The QLS test is more powerful than the corrected χ² test in every situation considered. Indeed, under certain regularity conditions, the QLS test is asymptotically the locally most powerful test in a general class of linear tests that includes the corrected χ² test. The two methods are used to test for associations between three asthma-associated phenotypes and 48 SNPs in 35 candidate genes in the Hutterites. We report a highly significant novel association (P = 2 × 10⁻⁶) between atopy and an amino acid polymorphism in the P-selectin gene, detected with the QLS test and also, but less significantly (P = .0014), with the transmission/disequilibrium test.

17.
Cluster analysis can be a useful tool for exploratory data analysis to uncover natural groupings in data, and initiate new ideas and hypotheses about such groupings. When applied to short-term assay results, it provides and improves estimates for the sensitivity and specificity of assays, provides indications of association between assays and, in turn, which assays can be substituted for one another in a battery, and allows a data base containing test results on chemicals of unknown carcinogenicity to be linked to a data base for which animal carcinogenicity data are available. Cluster analysis was applied to the Gene-Tox data base (which contains short-term test results on chemicals of both known and unknown carcinogenicity). The results on chemicals of known carcinogenicity were different from those obtained when the entire data base was analyzed. This suggests that the associations (and possibly the sensitivities and specificities) which are based on chemicals of known carcinogenicity may not be representative of the true measures. Cluster analysis applied to the total data base should be useful in improving these estimates. Many of the associations between the assays which were found through the use of cluster analysis could be 'validated' based on previous knowledge of the mechanistic basis of the various tests, but some of the associations were unsuspected. These associations may be a reflection of a non-ideal data base. As additional data become available and new clustering techniques for handling non-ideal data bases are developed, results from such analyses could play an increasing role in strengthening prediction schemes which utilize short-term test results to screen chemicals for carcinogenicity, such as the carcinogenicity prediction and battery selection (CPBS) method (Chankong et al., 1985).

18.
19.
Exact tests for association between alleles at arbitrary numbers of loci   (cited by: 21 in total, 0 self-citations, 21 by others)
Associations between allelic frequencies, within and between loci, can be tested for with an exact test. The probability of the set of multi-locus genotypes in a sample, conditional on the allelic counts, is calculated from multinomial theory under the hypothesis of no association. Alleles are then permuted and the conditional probability calculated for the permuted genotypic array. The proportion of arrays no more probable than the original sample provides the significance level for the test. An algorithm is provided for counting genotypes efficiently in the arrays, and the powers of the test presented for various kinds of association. The powers for the case when associations are generated by admixture of several populations suggest that exact tests are capable of detecting levels of association that would affect forensic calculations to a significant extent.
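The permutation procedure described in this abstract can be sketched for the simplest case, association between the two alleles within a single locus. The conditional probability used is the standard one for genotype counts given allele counts under no association; the function names, permutation count and seed are assumptions, and the paper's multi-locus extension and efficient genotype-counting algorithm are not reproduced.

```python
import math
import random
from collections import Counter


def log_prob_genotypes(genotypes):
    """Log-probability of the genotype counts given the allele counts."""
    n = len(genotypes)
    geno_counts = Counter(tuple(sorted(g)) for g in genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)
    n_het = sum(c for (a, b), c in geno_counts.items() if a != b)
    # P = n! * 2^H * prod(n_a!) / ((2n)! * prod(n_g!)), computed on log scale.
    logp = math.lgamma(n + 1) + n_het * math.log(2.0) - math.lgamma(2 * n + 1)
    logp += sum(math.lgamma(c + 1) for c in allele_counts.values())
    logp -= sum(math.lgamma(c + 1) for c in geno_counts.values())
    return logp


def permutation_exact_test(genotypes, n_perm=2000, seed=0):
    """Proportion of permuted arrays no more probable than the sample."""
    rng = random.Random(seed)
    observed = log_prob_genotypes(genotypes)
    alleles = [a for g in genotypes for a in g]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(alleles)  # permute alleles among individuals
        permuted = [(alleles[i], alleles[i + 1])
                    for i in range(0, len(alleles), 2)]
        # Small tolerance guards against floating-point ties.
        if log_prob_genotypes(permuted) <= observed + 1e-9:
            hits += 1
    return hits / n_perm
```

A sample of 20 `("A", "A")` and 20 `("a", "a")` genotypes with no heterozygotes yields a significance level near zero, whereas 10, 20 and 10 individuals of the three genotypes yields a value near one.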

20.
When multiple related families derived from inbred lines are jointly analysed to detect quantitative trait loci (QTLs), the analysis should estimate allelic effects as accurately as possible and estimate the probability that different parents carry alleles that are identical in state. Analyses exist that assume that all parents carry unique alleles or that all parents but one carry the same allele. In practice, many configurations are possible that group different parents according to their identity-in-state condition at a putative QTL allele. Here, we propose a variable model Bayesian analysis that selects among possible identity-in-state configurations and jointly estimates the allelic effects of identical-in-state parents. We contrast this analysis with a fixed model analysis that estimates unique allelic effects for all parents. We analyse two simulated mating designs: an experimental design in which three inbred parents were crossed to generate two families of 150 doubled haploid lines; and a breeding design in which 20 inbred parents were crossed to generate 60 families of 20 doubled haploid lines, with each parent contributing to six families. In all cases where some parents were simulated to carry alleles of identical effect (that is, they were identical in state), the variable analysis estimated allelic effects with lower mean-squared error than the fixed analysis. The variable analysis showed that, unless each family contains many individuals (more than 100), there is insufficient information in DNA-marker and phenotypic data to determine with high probability the QTL allelic number.
