Similar literature
 20 similar documents found.
1.
Genome-wide linkage analysis using microsatellite markers has been successful in the identification of numerous Mendelian and complex disease loci. The recent availability of high-density single-nucleotide polymorphism (SNP) maps provides a potentially more powerful option. Using the simulated and Collaborative Study on the Genetics of Alcoholism (COGA) datasets from the Genetic Analysis Workshop 14 (GAW14), we examined how altering the density of SNP marker sets affected the overall information content, the power to detect trait loci, and the number of false positive results. For the simulated data we used SNP maps with densities of 0.3 cM, 1 cM, 2 cM, and 3 cM. For the COGA data we combined the marker sets from Illumina and Affymetrix to create a map with an average density of 0.25 cM and then, using a sub-sample of these markers, created maps with densities of 0.3 cM, 0.6 cM, 1 cM, 2 cM, and 3 cM. For each marker set, multipoint linkage analysis using MERLIN was performed for both dominant and recessive traits derived from marker loci. Our results showed that information content increased with increased map density. For the homogeneous, completely penetrant traits we created, there was only a modest difference in ability to detect trait loci. Additionally, as map density increased there was only a slight increase in the number of false positive results when there was linkage disequilibrium (LD) between markers. The presence of LD between markers may have led to an increased number of false positive regions, but no clear relationship between regions of high LD and locations of false positive linkage signals was observed.

2.
Many investigators of complexly inherited familial traits bypass classical segregation analysis to perform model-free genome-wide linkage scans. Because model-based or parametric linkage analysis may be the most powerful means to localize genes when a model can be approximated, model-free statistics may result in a loss of power to detect linkage. We performed limited segregation analyses on the electrophysiological measurements that have been collected for the Collaborative Study on the Genetics of Alcoholism. The resulting models are used in whole-genome scans. Four genomic regions provided a model-based LOD > 2 and only 3 of these were detected (p < 0.05) by a model-free approach. We conclude that parametric methods, using even over-simplified models of complex phenotypes, may complement nonparametric methods and decrease false positives.

3.
Complex traits are often governed by more than one trait locus. The first step towards an adequate model for such diseases is a linkage analysis with two trait loci. Such an analysis can be expected to have higher power to detect linkage than a standard single-trait-locus linkage analysis. However, it is crucial to accurately specify the parameters of the two-locus model. Here, we recapitulate the general two-locus model with and without genomic imprinting. We relate heterogeneity, multiplicative, and additive two-locus models to biological or pathophysiological mechanisms, and give the corresponding averaged ("best-fitting") single-trait-locus models for each of the two loci. Furthermore, we derive the two-locus penetrances from the averaged single-locus models, under the assumption of one of the three model classes mentioned above. Using these formulae, if the best-fitting single-locus models are available, investigators may perform a two-trait-locus linkage analysis under a realistic model. This procedure will maximize the power to detect linkage for traits which are governed by two or more loci, and lead to more accurate estimates of the disease-locus positions.

4.
Chen L, Storey JD. Genetics, 2006, 173(4): 2371-2381
Linkage analysis involves performing significance tests at many loci located throughout the genome. Traditional criteria for declaring a linkage statistically significant have been formulated with the goal of controlling the rate at which any single false positive occurs, called the genomewise error rate (GWER). As complex traits have become the focus of linkage analysis, it is increasingly common to expect that a number of loci are truly linked to the trait. This is especially true in mapping quantitative trait loci (QTL), where sometimes dozens of QTL may exist. Therefore, alternatives to the strict goal of preventing any single false positive have recently been explored, such as the false discovery rate (FDR) criterion. Here, we characterize some of the challenges that arise when defining relaxed significance criteria that allow for at least one false positive linkage to occur. In particular, we show that the FDR suffers from several problems when applied to linkage analysis of a single trait. We therefore conclude that the general applicability of FDR for declaring significant linkages in the analysis of a single trait is dubious. Instead, we propose a significance criterion that is more relaxed than the traditional GWER, but does not appear to suffer from the problems of the FDR. A generalized version of the GWER is proposed, called GWERk, that allows one to provide a more liberal balance between true positives and false positives at no additional cost in computation or assumptions.
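
The abstract does not spell out how GWERk is defined or computed. One common way to read a criterion that tolerates k false positives is Pr(more than k false positive linkages) <= alpha, with k = 0 recovering the usual GWER. The Python sketch below estimates a threshold under that reading from genome scans simulated or permuted under no linkage. The function name `gwer_k_threshold`, the input layout, and the assumption that each statistic represents a separate region (so that counting exceedances counts false positive linkages) are all illustrative and are not taken from the paper.

```python
def gwer_k_threshold(null_scans, k, alpha=0.05):
    """Estimate a threshold t with Pr(more than k statistics exceed t) <= alpha.

    null_scans: list of genome scans generated under no linkage, each a list
                of test statistics (assumed here: one per independent region).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if not null_scans:
        raise ValueError("at least one null scan is required")
    ranked = []
    for scan in null_scans:
        if len(scan) <= k:
            raise ValueError("each scan needs more than k statistics")
        # More than k false positives occur exactly when the (k+1)-th
        # largest null statistic exceeds the threshold.
        ranked.append(sorted(scan, reverse=True)[k])
    ranked.sort()
    n = len(ranked)
    allowed = int(alpha * n)  # number of null scans permitted to exceed t
    return ranked[n - 1 - allowed]


# Toy null scans (made-up numbers, for illustration only).
null_scans = [[3.0, 2.0, 1.0], [2.5, 1.5, 0.5], [4.0, 3.5, 0.2], [1.0, 0.8, 0.1]]
strict = gwer_k_threshold(null_scans, k=0, alpha=0.25)   # ordinary GWER
relaxed = gwer_k_threshold(null_scans, k=1, alpha=0.25)  # tolerate one false positive
```

In this toy input the threshold drops when k goes from 0 to 1, which is the sense in which such a criterion is more liberal than the GWER while still bounding the number of false positives.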

5.
Mathematically-derived traits from two or more component traits, either by addition, subtraction, multiplication, or division, have been frequently used in genetics and breeding. When used in quantitative trait locus (QTL) mapping, derived traits sometimes show discrepancy with QTL identified for the component traits. We used three QTL distributions and three genetic effects models, and an actual maize mapping population, to investigate the efficiency of using derived traits in QTL mapping, and to understand the genetic and biological basis of derived-only QTL, i.e., QTL identified for a derived trait but not for any component trait. Results indicated that the detection power of the four putative QTL was consistently greater than 90% for component traits in simulated populations, each consisting of 200 recombinant inbred lines. Lower detection power and higher false discovery rate (FDR) were observed when derived traits were used. In an actual maize population, simulations were designed based on the observed QTL distributions and effects. When derived traits were used, QTL detected for both component and derived traits had comparable power, but those detected for component traits but not for derived traits had low detection power. The FDR from subtraction and division in the maize population were higher than the FDR from addition and multiplication. The use of derived traits increased the gene number, caused higher-order gene interactions than observed in component traits, and possibly complicated the linkage relationship between QTL as well. The increased complexity of the genetic architecture with derived traits may be responsible for the reduced detection power and the increased FDR. Derived-only QTL identified in practical genetic populations can be explained either as minor QTL that are not significant in QTL mapping of component traits, or as false positives.

6.
Many genetic traits have complex modes of inheritance; they may exhibit incomplete or age-dependent penetrance or fail to show any clear Mendelian inheritance pattern. As primary linkage maps for the human genome near completion, it is becoming increasingly possible to map these traits. Prior to undertaking a linkage study, it is important to consider whether the pedigrees available for the proposed study are likely to provide sufficient information to demonstrate linkage, assuming a linked marker is tested. In the current paper, we describe a computer simulation method to estimate the power of a proposed study to detect linkage for a complex genetic trait, given a hypothesized genetic model for the trait. Our method simulates trait locus genotypes consistent with observed trait phenotypes, in such a way that the probability to detect linkage can be estimated by sample statistics of the maximum lod score distribution. The method uses terms available when calculating the likelihood of the trait phenotypes for the pedigree and is applicable to any trait determined by one or a few genetic loci; individual-specific environmental effects can also be dealt with. Our method provides an objective answer to the question, Will these pedigrees provide sufficient information to map this complex genetic trait?

7.
Using the simulated data set from Genetic Analysis Workshop 13, we explored the advantages of using longitudinal data in genetic analyses. Weighted averages of the longitudinal data for each of seven quantitative phenotypes were computed and analyzed. Genome screen results were then compared for these longitudinal phenotypes and the results obtained using two cross-sectional designs: data collected near a single age (45 years) and data collected at a single time point. Significant linkage was obtained for nine regions (LOD scores ranging from 5.5 to 34.6) for six of the phenotypes. Using cross-sectional data, LOD scores were slightly lower for the same chromosomal regions, with two regions becoming nonsignificant and one additional region being identified. The magnitude of the LOD score was highly correlated with the heritability of each phenotype as well as the proportion of phenotypic variance due to that locus. There were no false-positive linkage results using the longitudinal data and three false-positive findings using the cross-sectional data. The three false-positive results appear to be due to the kurtosis in the trait distribution, even after removing extreme outliers. Our analyses demonstrated that the use of simple longitudinal phenotypes was a powerful means to detect genes of major to moderate effect on trait variability. In only one instance were the power and heritability of the trait increased by using data from one examination. Power to detect linkage can be improved by identifying the most heritable phenotype, ensuring normality of the trait distribution and maximizing the information utilized through novel longitudinal designs for genetic analysis.

8.
Several methods have been proposed for linkage analysis of complex traits with unknown mode of inheritance. These methods include the LOD score maximized over disease models (MMLS) and the "nonparametric" linkage (NPL) statistic. In previous work, we evaluated the increase of type I error when maximizing over two or more genetic models, and we compared the power of MMLS to detect linkage, in a number of complex modes of inheritance, with analysis assuming the true model. In the present study, we compare MMLS and NPL directly. We simulated 100 data sets with 20 families each, using 26 generating models: (1) 4 intermediate models (penetrance of heterozygote between that of the two homozygotes); (2) 6 two-locus additive models; and (3) 16 two-locus heterogeneity models (admixture alpha = 1.0, 0.7, 0.5, and 0.3; alpha = 1.0 replicates simple Mendelian models). For LOD scores, we assumed dominant and recessive inheritance with 50% penetrance. We took the higher of the two maximum LOD scores and subtracted 0.3 to correct for multiple tests (MMLS-C). We compared expected maximum LOD scores and power, using MMLS-C and NPL as well as the true model. Since NPL uses only the affected family members, we also performed an affecteds-only analysis using MMLS-C. The MMLS-C was both uniformly more powerful than NPL for most cases we examined, except when linkage information was low, and close to the results for the true model under locus heterogeneity. We still found better power for the MMLS-C compared with NPL in affecteds-only analysis. The results show that use of two simple modes of inheritance at a fixed penetrance can have more power than NPL when the trait mode of inheritance is complex and when there is heterogeneity in the data set.
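
The MMLS-C rule described above (take the higher of the two maximum LOD scores, then subtract 0.3) can be sketched in a few lines of Python. The function name `mmls_c`, the list inputs, and the example LOD values are hypothetical and serve only as illustration; the 0.3 correction is the value stated in the abstract.

```python
def mmls_c(dominant_lods, recessive_lods, correction=0.3):
    """Corrected maximized maximum LOD score (MMLS-C).

    dominant_lods / recessive_lods: LOD scores evaluated along the genome
    (or over recombination fractions) under each assumed inheritance model.
    """
    if not dominant_lods or not recessive_lods:
        raise ValueError("each model needs at least one LOD score")
    # Maximum LOD under each of the two simple models.
    max_dominant = max(dominant_lods)
    max_recessive = max(recessive_lods)
    # Keep the higher maximum and subtract the multiple-test correction.
    return max(max_dominant, max_recessive) - correction


# Made-up LOD curves under the two models, for illustration only.
dominant = [0.4, 1.9, 2.6, 1.1]
recessive = [0.2, 2.1, 3.2, 0.9]
score = mmls_c(dominant, recessive)  # higher maximum is 3.2, minus 0.3
```

The fixed subtraction acts as a simple penalty for having tested two models, so the corrected score can be compared against the threshold used for a single-model analysis.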

9.
Variance component modeling for linkage analysis of quantitative traits is a powerful tool for detecting and locating genes affecting a trait of interest, but the presence of genetic heterogeneity will decrease the power of a linkage study and may even give biased estimates of the location of the quantitative trait loci. Many complex diseases are believed to be influenced by multiple genes and therefore genetic heterogeneity is likely to be present for many real applications of linkage analysis. We consider a mixture of multivariate normals to model locus heterogeneity by allowing only a proportion of the sampled pedigrees to segregate trait-influencing allele(s) at a specific locus. However, for mixtures of normals the classical asymptotic distribution theory of the maximum likelihood estimates does not hold, so tests of linkage and/or heterogeneity are evaluated using resampling methods. It is shown that allowing for genetic heterogeneity leads to an increase in power to detect linkage. This increase is more prominent when the genetic effect of the locus is small or when the percentage of pedigrees not segregating trait-influencing allele(s) at the locus is high.

10.
A recent study by Cheung et al. demonstrates how to identify expression quantitative trait loci (eQTLs) underlying gene expression phenotypes through a combination of genome-wide linkage analysis and subsequent fine mapping or by genome-wide association (GWA) analysis. This study emphasizes the complexity of human traits, highlighting the challenges faced by investigators. In particular, insufficient linkage disequilibrium between the trait and marker variant, genetic heterogeneity, and the need to correct for multiple testing will all adversely affect the power to detect loci by association. These issues must be considered carefully if the GWA approach is to succeed in mapping complex phenotypes.

11.
Xiao J, Wang X, Hu Z, Tang Z, Xu C. Heredity, 2007, 98(6): 427-435
Segregation analysis is a method of detecting major genes for quantitative traits without using marker information. It serves as an important tool in helping investigators to plan further studies such as quantitative trait loci mapping or more sophisticated genomic analyses. However, current methods of segregation analysis for a single trait typically have low statistical power. We propose a multivariate segregation analysis (MSA) that takes advantage of the correlation structure of multiple quantitative traits to detect major genes. This method not only increases the statistical power, but allows dissection of the genetic architecture underlying the trait complex. In MSA the observed phenotypes of multiple correlated traits are fitted to a multivariate Gaussian mixture model. Model parameters are estimated under the maximum likelihood framework via the expectation-maximization algorithm. The presence of major genes is tested using likelihood ratio test statistics. Pleiotropy is distinguished from close linkage by comparing three possible models using the Bayesian information criterion. Two simulation experiments were performed based on the F2 mating design. In the first, the statistical properties of MSA under varying heritabilities and sample sizes were investigated and the results compared with those obtained from single-trait analysis. In the second simulation the efficacy of MSA in separating pleiotropy from close linkage was demonstrated. Finally, the new method was applied to real data and detected a major gene responsible for both plant height and tiller number in rice.

12.
We have compared the power of several allele-sharing statistics for "nonparametric" linkage analysis of X-linked traits in nuclear families and extended pedigrees. Our rationale was that, although several of these statistics have been implemented in popular software packages, there has been no formal evaluation of their relative power. Here, we evaluate the relative performance of five test statistics, including two new test statistics. We considered sibships of sizes two through four, four different extended pedigrees, 15 different genetic models (12 single-locus models and 3 two-locus models), and varying recombination fractions between the marker and the trait locus. We analytically estimated the sample sizes required for 80% power at a significance level of 0.001 and also used simulation methods to estimate power for a sample size of 10 families. We tried to identify statistics whose power was robust over a wide variety of models, with the idea that such statistics would be particularly useful for detection of X-linked loci associated with complex traits. We found that a commonly used statistic, S(all), generally performed well under various conditions and had close to the optimal sample sizes in most cases but that there were certain cases in which it performed quite poorly. Our two new statistics did not perform any better than those already in the literature. We also note that, under dominant and additive models, regardless of the statistic used, pedigrees with all-female siblings have very little power to detect X-linked loci.

13.

Background

We investigate the power of the heterogeneity LOD test to detect linkage when a trait is determined by several major genes, using Genetic Analysis Workshop 13 simulated data. We consider three traits: two disease-causing traits, (1) the rate of change in body mass index (BMI) and (2) the maximum BMI, and (3) the disease itself (hypertension). Of interest is the power of "HLOD2", the maximum heterogeneity LOD obtained upon maximizing over the two genetic models.

Results

Using the trait phenotype Obesity Slope, we observe that the power to detect the two markers closest to the two genes (S1, S2) at the 0.05 level using HLOD2 is 13% and 10%. The power of HLOD2 for the Max BMI phenotype is 12% and 9%. The corresponding values for the Hypertension phenotype are 8% and 6%.

Conclusion

The power to detect linkage to the slope genes is quite low. But the power using disease-related traits as a phenotype is greater than the power using the disease (hypertension) phenotype.

14.
Advancements in genotyping are rapidly decreasing marker costs and increasing marker density. This opens new possibilities for mapping quantitative trait loci (QTL), in particular by combining linkage disequilibrium information and linkage analysis (LDLA). In this study, we compared different approaches to detect QTL for four traits of agronomical importance in two large multi-parental datasets of maize (Zea mays L.) of 895 and 928 testcross progenies composed of 7 and 21 biparental families, respectively, and genotyped with 491 markers. We compared traditional linkage-based methods with two LDLA models relying on the dense genotyping of parental lines with 17,728 SNPs: one based on a clustering approach of parental line segments into ancestral alleles and one based on single marker information. The two LDLA models generally identified more QTL (60 and 52 QTL in total) than classical linkage models (49 and 44 QTL in total). However, they performed inconsistently over datasets and traits, suggesting that a compromise must be found between the reduction of allele number for increasing statistical power and the adequacy of the model to potentially complex allelic variation. For some QTL, the model exclusively based on linkage analysis, which assumed that each parental line carried a different QTL allele, was able to capture remaining variation not explained by LDLA models. These complementarities between models clearly suggest that the different QTL mapping approaches must be considered to capture the different levels of allelic variation at QTL involved in complex traits.

15.
BACKGROUND: There is increasing empirical evidence that whole-genome prediction (WGP) is a powerful tool for predicting line and hybrid performance in maize. However, there is a lack of knowledge about the sensitivity of WGP models towards the genetic architecture of the trait. Whereas previous studies exclusively focused on highly polygenic traits, important agronomic traits such as disease resistances, nutrifunctional or climate adaptational traits have a genetic architecture which is either much less complex or unknown. For such cases, information about model robustness and guidelines for model selection are lacking. Here, we compared five WGP models with different assumptions about the distribution of the underlying genetic effects. As contrasting model traits, we chose three highly polygenic agronomic traits and three metabolites each with a major QTL explaining 22 to 30% of the genetic variance in a panel of 289 diverse maize inbred lines genotyped with 56,110 SNPs. RESULTS: We found the five WGP models to be remarkably robust towards trait architecture, with the largest differences in prediction accuracies ranging between 0.05 and 0.14 for the same trait, most likely as the result of the high level of linkage disequilibrium prevailing in elite maize germplasm. Whereas RR-BLUP performed best for the agronomic traits, it was inferior to LASSO or elastic net for the three metabolites. We found the approach of genome partitioning of genetic variance, first applied in human genetics, useful in guiding the breeder on which model to choose if prior knowledge of the trait architecture is lacking. CONCLUSIONS: Our results suggest that in diverse germplasm of elite maize inbred lines with a high level of LD, WGP models differ only slightly in their accuracies, irrespective of the number and effects of QTL found in previous linkage or association mapping studies. However, small gains in prediction accuracies can be achieved if the WGP model is selected according to the genetic architecture of the trait. If the trait architecture is unknown, e.g. for novel traits which only recently received attention in breeding, we suggest inspecting the distribution of the genetic variance explained by each chromosome to guide model selection in WGP.

16.
Statistical methods for linkage analysis are well established for both binary and quantitative traits. However, numerous diseases including cancer and psychiatric disorders are rated on discrete ordinal scales. To analyze pedigree data with ordinal traits, we recently proposed a latent variable model which has higher power to detect linkage using ordinal traits than methods using the dichotomized traits. The challenge with the latent variable model is that the likelihood is usually very complicated, and as a result, the computation of the likelihood ratio statistic is too intensive for large pedigrees. In this paper, we derive a computationally efficient score statistic based on the identity-by-descent sharing information between relatives. Using simulation studies, we examined the asymptotic distribution of the test statistic and the power of our proposed test under various levels of heritability. We compared the computing time as well as power of the score test with the likelihood ratio test. We then applied our method to the Collaborative Study on the Genetics of Alcoholism and performed a genome scan to map susceptibility genes for alcohol dependence. We found a strong linkage signal on chromosome 4.

17.
BACKGROUND: Although many experiments have measurements on multiple traits, most studies performed the analysis of mapping of quantitative trait loci (QTL) for each trait separately using single trait analysis. Single trait analysis does not take advantage of possible genetic and environmental correlations between traits. In this paper, we propose a novel statistical method for multiple trait multiple interval mapping (MTMIM) of QTL for inbred line crosses. We also develop a novel score-based method for estimating genome-wide significance level of putative QTL effects suitable for the MTMIM model. The MTMIM method is implemented in the freely available and widely used Windows QTL Cartographer software. RESULTS: Throughout the paper, we provide compelling empirical evidence that: (1) the score-based threshold maintains proper type I error rate and tends to keep false discovery rate within an acceptable level; (2) the MTMIM method can deliver better parameter estimates and power than single trait multiple interval mapping method; (3) an analysis of Drosophila dataset illustrates how the MTMIM method can better extract information from datasets with measurements in multiple traits. CONCLUSIONS: The MTMIM method represents a convenient statistical framework to test hypotheses of pleiotropic QTL versus closely linked nonpleiotropic QTL, QTL by environment interaction, and to estimate the total genotypic variance-covariance matrix between traits and to decompose it in terms of QTL-specific variance-covariance matrices, thereby providing more details on the genetic architecture of complex traits.

18.
Scanning the genome for association between markers and complex diseases typically requires testing hundreds of thousands of genetic polymorphisms. Testing such a large number of hypotheses exacerbates the trade-off between power to detect meaningful associations and the chance of making false discoveries. Even before the full genome is scanned, investigators often favor certain regions on the basis of the results of prior investigations, such as previous linkage scans. The remaining regions of the genome are investigated simultaneously because genotyping is relatively inexpensive compared with the cost of recruiting participants for a genetic study and because prior evidence is rarely sufficient to rule out these regions as harboring genes with variation conferring liability (liability genes). However, the multiple testing inherent in broad genomic searches diminishes power to detect association, even for genes falling in regions of the genome favored a priori. Multiple testing problems of this nature are well suited for application of the false-discovery rate (FDR) principle, which can improve power. To enhance power further, a new FDR approach is proposed that involves weighting the hypotheses on the basis of prior data. We present a method for using linkage data to weight the association P values. Our investigations reveal that if the linkage study is informative, the procedure improves power considerably. Remarkably, the loss in power is small, even when the linkage study is uninformative. For a class of genetic models, we calculate the sample size required to obtain useful prior information from a linkage study. This inquiry reveals that, among genetic models that are seemingly equal in genetic information, some are much more promising than others for this mode of analysis.
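
As a rough illustration of weighting association P values with prior data, the Python sketch below applies a weighted Benjamini-Hochberg step-up procedure: each P value is divided by a weight (rescaled so the weights average 1), and the ordinary FDR rule is applied to the result. This is a generic weighted-FDR scheme and is only assumed to resemble the approach in the abstract; how weights would be derived from linkage scores is not specified there, so the weights, the function name `weighted_bh`, and the example numbers are hypothetical.

```python
def weighted_bh(p_values, weights, alpha=0.05):
    """Weighted Benjamini-Hochberg: return a rejection flag per hypothesis."""
    m = len(p_values)
    if m == 0 or m != len(weights):
        raise ValueError("p_values and weights must be non-empty and equal in length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    # Rescale weights to average 1 so the overall error budget is unchanged.
    mean_w = sum(weights) / m
    scaled = [p / (w / mean_w) for p, w in zip(p_values, weights)]
    # Step-up rule: find the largest rank whose scaled P value is under the line.
    order = sorted(range(m), key=lambda i: scaled[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if scaled[idx] <= alpha * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Made-up association P values; the third marker sits in a region
# favored by a (hypothetical) prior linkage scan, so it gets more weight.
p_values = [0.001, 0.02, 0.04, 0.5]
unweighted = weighted_bh(p_values, [1.0, 1.0, 1.0, 1.0])
weighted = weighted_bh(p_values, [0.5, 0.5, 2.5, 0.5])
```

In this toy example, up-weighting the third marker lets it pass the FDR rule, while the down-weighted second marker no longer does. This is the trade-off a weighting scheme makes: power is moved towards hypotheses favored by prior data.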

19.
To date, most genetic analyses of phenotypes have focused on analyzing single traits or analyzing each phenotype independently. However, joint epistasis analysis of multiple complementary traits will increase statistical power and improve our understanding of the complicated genetic structure of complex diseases. Despite their importance in uncovering the genetic structure of complex traits, statistical methods for identifying epistasis in multiple phenotypes remain fundamentally unexplored. To fill this gap, we formulate a test for interaction between two genes in multiple quantitative trait analysis as a multiple functional regression (MFRG) in which the genotype functions (genetic variant profiles) are defined as a function of the genomic position of the genetic variants. We use large-scale simulations to calculate Type I error rates for testing interaction between two genes with multiple phenotypes and to compare the power with multivariate pairwise interaction analysis and single trait interaction analysis by a single variate functional regression model. To further evaluate performance, the MFRG for epistasis analysis is applied to five phenotypes of exome sequence data from the NHLBI’s Exome Sequencing Project (ESP) to detect pleiotropic epistasis. A total of 267 pairs of genes that formed a genetic interaction network showed significant evidence of epistasis influencing five traits. The results demonstrate that the joint interaction analysis of multiple phenotypes has a much higher power to detect interaction than the interaction analysis of a single trait and may open a new direction to fully uncovering the genetic structure of multiple phenotypes.

20.
Identifying gene-gene interactions or gene-environment interactions in studies of human complex diseases remains a big challenge in genetic epidemiology. An additional challenge, often forgotten, is to account for important lower-order genetic effects. These may hamper the identification of genuine epistasis. If lower-order genetic effects contribute to the genetic variance of a trait, identified statistical interactions may simply be due to a signal boost of these effects. In this study, we restrict attention to quantitative traits and bi-allelic SNPs as genetic markers. Moreover, our interaction study focuses on 2-way SNP-SNP interactions. Via simulations, we assess the performance of different corrective measures for lower-order genetic effects in Model-Based Multifactor Dimensionality Reduction epistasis detection, using additive and co-dominant coding schemes. Performance is evaluated in terms of power and familywise error rate. Our simulations indicate that empirical power estimates are reduced with correction of lower-order effects, as are familywise error rates. Easy-to-use automatic SNP selection procedures, SNP selection based on "top" findings, or SNP selection based on a p-value criterion for interesting main effects result in reduced power but also almost zero false positive rates. Always accounting for main effects in the SNP-SNP pair under investigation during Model-Based Multifactor Dimensionality Reduction analysis adequately controls false positive epistasis findings. This is particularly true when adopting a co-dominant corrective coding scheme. In conclusion, automatic search procedures to identify lower-order effects to correct for during epistasis screening should be avoided. The same is true for procedures that adjust for lower-order effects prior to Model-Based Multifactor Dimensionality Reduction and involve using residuals as the new trait. We advocate using "on-the-fly" adjustment for lower-order effects when screening for SNP-SNP interactions using Model-Based Multifactor Dimensionality Reduction analysis.
