Similar literature
20 similar articles found.
1.
ABSTRACT: BACKGROUND: In the last years GWA studies have successfully identified common SNPs associated with complex diseases. However, most of the variants found this way account for only a small portion of the trait variance. This fact leads researchers to focus on rare-variant mapping with large scale sequencing, which can be facilitated by using linkage information. The question arises why linkage analysis often fails to identify genes when analyzing complex diseases. Using simulations we have investigated the power of parametric and nonparametric linkage statistics (KC-LOD, NPL, LOD and MOD scores), to detect the effect of genes responsible for complex diseases using different pedigree structures. RESULTS: As expected, a small number of pedigrees with less than three affected individuals has low power to map disease genes with modest effect. Interestingly, the power decreases when unaffected individuals are included in the analysis, irrespective of the true mode of inheritance. Furthermore, we found that the best performing statistic depends not only on the type of pedigrees but also on the true mode of inheritance. CONCLUSIONS: When applied in a sensible way linkage is an appropriate and robust technique to map genes for complex disease. Unlike association analysis, linkage analysis is not hampered by allelic heterogeneity. So, why does linkage analysis often fail with complex diseases? Evidently, when using an insufficient number of small pedigrees, one might miss a true genetic linkage when actually a real effect exists. Furthermore, we show that the test statistic has an important effect on the power to detect linkage as well. Therefore, a linkage analysis might fail if an inadequate test statistic is employed. We provide recommendations regarding the most favorable test statistics, in terms of power, for a given mode of inheritance and type of pedigrees under study, in order to reduce the probability to miss a true linkage.  相似文献   

2.
Previously, Szatkiewicz and colleagues evaluated the performance of a wide variety of statistics for quantitative-trait-locus linkage, using discordant sibling pairs. They found that the most powerful statistics, in general, were a score statistic and a "composite statistic." However, whereas these two statistics have equal power under ideal conditions, each has limitations that reduce its power in certain circumstances. The score statistic depends on estimates of trait parameters and can lose a lot of power if those estimates are incorrect. The composite statistic is not sensitive to trait-parameter estimates but does depend on arbitrary weights that must be chosen on the basis of the ascertainment scheme. In this report, we elucidate the algebraic relationship between the score and composite statistics and then use that relationship to suggest a new statistic that combines the best properties of both. We call our new statistic the "robust discordant pair" (RDP) statistic. We report simulation studies to show that the RDP statistic does, indeed, have all of the strengths and none of the weaknesses of the score and composite statistics.  相似文献   

3.
We have compared the power of a large number of allele-sharing statistics for "nonparametric" linkage analysis with affected sibships. Our rationale was that there is an extensive literature comparing statistics for sibling pairs but that there has not been much guidance on how to choose statistics for studies that include sibships of various sizes. We concentrated on statistics that can be described as assigning scores to each identity-by-descent-sharing configuration that a pedigree might take on (Whittemore and Halpern 1994). We considered sibships of sizes two through five, 27 different genetic models, and varying recombination fractions between the marker and the trait locus. We tried to identify statistics whose power was robust over a wide variety of models. We found that the statistic that is probably used most often in such studies-S(all)-performs quite well, although it is not necessarily the best. We also found several other statistics (such as the R criterion, S(robdom), and the Sobel-and-Lange statistic C) that perform well in most situations, a few (such as S(-#geno) and the Feingold-and-Siegmund version of S(pairs)) that have high power only in very special situations, and a few (such as S(-#geno), the N criterion, and the Sobel-and-Lange statistic B) that seem to have low power for the majority of the trait models. For the most part, the same statistics performed well for all sibship sizes. We also used our results to give some suggestions regarding how to weight sibships of different sizes, in forming an overall statistic.  相似文献   

4.
Here we present analytical studies to evaluate the relative efficiency of commonly used penetrance estimators using linkage designs. We investigated three different methods of estimating penetrance using sib pairs: maximum likelihood estimation (MLE) with trait information alone, MLE with both trait and marker information, and the MOD score approach. Modeling sib pairs with unknown phase, we evaluated the asymptotic relative efficiency between estimators under either random sampling or single ascertainment for an autosomal dominant or recessive disease. We then provide plots of the asymptotic relative efficiency, enabling researchers to easily determine regions where the MOD score or segregation alone performs with comparable efficiency relative to joint segregation and linkage.

5.
Zhang H, Wang X, Ye Y. Genetics, 2006, 172(1): 693-699
There is growing interest in genomewide association analysis using single-nucleotide polymorphisms (SNPs), because traditional linkage studies are not as powerful in identifying genes for common, complex diseases. Tests for linkage disequilibrium have been developed for binary and quantitative traits. However, since many human conditions and diseases are measured in an ordinal scale, methods need to be developed to investigate the association of genes and ordinal traits. Thus, in the current report we propose and derive a score test statistic that identifies genes that are associated with ordinal traits when gametic disequilibrium between a marker and trait loci exists. Through simulation, the performance of this new test is examined for both ordinal traits and quantitative traits. The proposed statistic not only accommodates and is more powerful for ordinal traits, but also has similar power to that of existing tests when the trait is quantitative. Therefore, our proposed statistic has the potential to serve as a unified approach to identifying genes that are associated with any trait, regardless of how the trait is measured. We further demonstrated the advantage of our test by revealing a significant association (P = 0.00067) between alcohol dependence and a SNP in the growth-associated protein 43.  相似文献   

6.
We analyze some aspects of scan statistics, which have been proposed to aid the detection of weak signals in genetic linkage analysis. We derive approximate expressions for the power of a test based on moving averages of the identity-by-descent allele-sharing proportions for pairs of relatives at several contiguous markers. We confirm these approximate formulae by simulation. The results show that when there is a single trait locus on a chromosome, the test based on the scan statistic is slightly less powerful than that based on the customary allele-sharing statistic. On the other hand, if two genes having a moderate effect on a trait lie close to each other on the same chromosome, scan statistics improve power to detect linkage.
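
The moving-average idea in the abstract above can be sketched as follows. This is only an illustration, not the authors' procedure: the function name `moving_average_scan`, the input format (one allele-sharing proportion per marker, in map order) and the choice to report the window with the highest mean are all assumptions.

```python
def moving_average_scan(sharing, window):
    """Scan contiguous markers with a moving average of IBD sharing.

    sharing: list of allele-sharing proportions, one per marker, in map order
             (hypothetical input format).
    window:  number of contiguous markers averaged together.
    Returns (start_index, mean_sharing) of the window with the highest mean.
    """
    if window < 1 or window > len(sharing):
        raise ValueError("window must be between 1 and the number of markers")

    # Running total of the current window, updated in O(1) per step.
    total = sum(sharing[:window])
    best_start, best_total = 0, total
    for start in range(1, len(sharing) - window + 1):
        total += sharing[start + window - 1] - sharing[start - 1]
        if total > best_total:
            best_start, best_total = start, total
    return best_start, best_total / window


# Example with made-up sharing proportions at six markers.
start, mean_sharing = moving_average_scan([0.5, 0.5, 0.6, 0.7, 0.6, 0.5], window=3)
```

Averaging over neighbouring markers pools weak signals that lie close together, which matches the abstract's finding that the scan helps mainly when two moderate-effect genes are nearby on the same chromosome.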

7.
We investigate strategies for detecting linkage of recessive and partially recessive traits, using sibling pairs and inbred individuals. We assume that a genomewide search is being conducted and that locus heterogeneity of the trait is likely. For sibling pairs, we evaluate the efficiency of different statistics under the assumption that one does not know the true degree of recessiveness of the trait. We recommend a sibling-pair statistic that is a linear compromise between two previously suggested statistics. We also compare the power of sibling pairs to that of more distant relatives, such as cousins. For inbred individuals, we evaluate the power of offspring of different types of matings and compare them to sibling pairs. Over a broad range of trait etiologies, sibling pairs are more powerful than inbred individuals, but for traits caused by very rare alleles, particularly in the case of heterogeneity, inbred individuals can be much more powerful. The models we develop can also be used to examine specific situations other than those we look at. We present this analysis in the idealized context of a dense set of highly polymorphic markers. In general, incorporation of real-world complexities makes inbred individuals, particularly offspring of distant relatives, look slightly less useful than our results imply.  相似文献   

8.
We present here four nonparametric statistics for linkage analysis that test whether pairs of affected relatives share marker alleles more often than expected. These statistics are based on simulating the null distribution of a given statistic conditional on the unaffecteds' marker genotypes. Each statistic uses a different measure of marker sharing: the SimAPM statistic uses the simulation-based affected-pedigree-member measure based on identity-by-state (IBS) sharing. The SimKIN (kinship) measure is 1.0 for identity-by-descent (IBD) sharing, 0.0 for no IBD status sharing, and the kinship coefficient when the IBD status is ambiguous. The simulation-based IBD (SimIBD) statistic uses a recursive algorithm to determine the probability of two affecteds sharing a specific allele IBD. The SimISO statistic is identical to SimIBD, except that it also measures marker similarity between unaffected pairs. We evaluated our statistics on data simulated under different two-locus disease models, comparing our results to those obtained with several other nonparametric statistics. Use of IBD information produces dramatic increases in power over the SimAPM method, which uses only IBS information. The power of our best statistic in most cases meets or exceeds the power of the other nonparametric statistics. Furthermore, our statistics perform comparisons between all affected relative pairs within general pedigrees and are not restricted to sib pairs or nuclear families.  相似文献   
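
As a minimal sketch of the SimKIN sharing measure described above, the scoring rule can be written as a small function. The name `simkin_score` and the encoding of IBD status as `True`, `False` or `None` are assumptions made for illustration; the abstract does not say how the scores are aggregated or how the null distribution is simulated, so neither is shown.

```python
def simkin_score(ibd_status, kinship_coefficient):
    """Score one pair of alleles under the SimKIN rule from the abstract.

    ibd_status: True  -> alleles are shared identical by descent (score 1.0)
                False -> alleles are not shared IBD (score 0.0)
                None  -> IBD status is ambiguous (score = kinship coefficient)
    kinship_coefficient: kinship coefficient of the two relatives,
                         used only when the status is ambiguous.
    """
    if ibd_status is None:
        return kinship_coefficient
    return 1.0 if ibd_status else 0.0


# Example: an ambiguous pair of first cousins (kinship coefficient 1/16).
score = simkin_score(None, 1 / 16)
```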

9.
We have compared the power of several allele-sharing statistics for "nonparametric" linkage analysis of X-linked traits in nuclear families and extended pedigrees. Our rationale was that, although several of these statistics have been implemented in popular software packages, there has been no formal evaluation of their relative power. Here, we evaluate the relative performance of five test statistics, including two new test statistics. We considered sibships of sizes two through four, four different extended pedigrees, 15 different genetic models (12 single-locus models and 3 two-locus models), and varying recombination fractions between the marker and the trait locus. We analytically estimated the sample sizes required for 80% power at a significance level of .001 and also used simulation methods to estimate power for a sample size of 10 families. We tried to identify statistics whose power was robust over a wide variety of models, with the idea that such statistics would be particularly useful for detection of X-linked loci associated with complex traits. We found that a commonly used statistic, S(all), generally performed well under various conditions and had close to the optimal sample sizes in most cases but that there were certain cases in which it performed quite poorly. Our two new statistics did not perform any better than those already in the literature. We also note that, under dominant and additive models, regardless of the statistic used, pedigrees with all-female siblings have very little power to detect X-linked loci.

10.
Sib-pair linkage analysis has been proposed for identifying genes that predispose to common diseases. We have shown that the presence of assortative mating and multiple disease-susceptibility loci (genetic heterogeneity) can increase the required sample size for affected-affected sib pairs several fold over the sample size required under random mating. We propose a new test statistic based on sib trios composed of either one unaffected and two affected siblings or one affected and two unaffected siblings. The sample-size requirements under assortative mating and multiple disease loci for these sib-trio statistics are much smaller, under most conditions, than the corresponding sample sizes for sib pairs. Study designs based on data from sib trios with one or two affected members are recommended whenever assortative mating and genetic heterogeneity are suspected.

11.
Extreme discordant sibling-pair (EDSP) designs have been shown in theory to be very powerful for mapping quantitative-trait loci (QTLs) in humans. However, their practical applicability has been somewhat limited by the need to phenotype very large populations to find enough pairs that are extremely discordant. In this paper, we demonstrate that there is also substantial power in pairs that are only moderately discordant, and that designs using moderately discordant pairs can yield a more practical balance between phenotyping and genotyping efforts. The power we demonstrate for moderately discordant pairs stems from a new statistical result. Statistical analysis in discordant-pair studies is generally done by testing for reduced identity by descent (IBD) sharing in the pairs. By contrast, the most commonly-used statistical methods for more standard QTL mapping are Haseman-Elston regression and variance-components analysis. Both of these use statistics that are functions of the trait values given IBD information for the pedigree. We show that IBD sharing statistics and "trait value given IBD" statistics contribute complementary rather than redundant information, and thus that statistics of the two types can be combined to form more powerful tests of linkage. We propose a simple composite statistic, and test it with simulation studies. The simulation results show that our composite statistic increases power only minimally for extremely discordant pairs. However, it boosts the power of moderately discordant pairs substantially and makes them a very practical alternative. Our composite statistic is straightforward to calculate with existing software; we give a practical example of its use by applying it to a Genetic Analysis Workshop (GAW) data set.  相似文献   

12.
Fan R, Floros J, Xiong M. Human Heredity, 2002, 53(3): 130-145
In this paper, we explore models and tests for association and linkage studies of a quantitative trait locus (QTL) linked to a multi-allele marker locus. Based on the difference between an offspring's conditional trait means of receiving and not receiving an allele from a parent at the marker locus, we propose three statistics T(m), T(m,row) and T(m,col) to test association or linkage disequilibrium between the marker locus and the QTL. These tests are composite tests, and use the offspring marginal sample means including offspring data of both homozygous and heterozygous parents. For the linkage study, we calculate the offspring's conditional trait mean given the allele transmission status of a heterozygous parent at the marker locus. Based on the difference between the conditional means of a transmitted and a nontransmitted allele from a heterozygous parent, we propose statistics T(parsi), T(satur), T(gen) and T(m,het) to perform composite tests of linkage between the marker locus and the quantitative trait locus in the presence of association. These tests only use the offspring data that are related to the heterozygous parents at the marker locus. T(parsi) is a parsimonious or allele-wise statistic, T(satur) and T(gen) are saturated or genotype-wise statistics, and T(m,het) compares the row and column sample means for offspring data of heterozygous parents. After comparing the powers and the sample sizes, we conclude that T(parsi) has higher power than those of the bi-allele tests, T(satur), T(gen), and T(m,het). If there is tight linkage between the marker and the trait locus, T(parsi) is powerful in detecting linkage between the marker and the trait locus in the presence of association. By investigating the goodness-of-fit of T(parsi), we find that T(satur) does not gain much power compared to that of T(parsi). Moreover, T(parsi) takes into account the pattern of the data that is consistent with linkage and linkage disequilibrium. As the number of alleles at the marker locus increases, T(parsi) is very conservative, and can be useful even for sparse data. To illustrate the usefulness and the power of the methods proposed in this paper, we analyze the chromosome 6 data of the Oxford asthma data, Genetic Analysis Workshop 12.

13.
The aim of this study was to determine whether identity-by-descent (IBD) information for affected sib pairs (ASPs) can be used to select a sample of cases for a genetic case-control study which will provide more power for detecting association with loci in a known linkage region. By modeling the expected frequency of the disease allele in ASPs showing IBD sharing of 0, 1, or 2 alleles, and considering additive, recessive, and dominant disease models, we show that cases selected from IBD 2 families are best for this purpose, followed by those selected from IBD 1 families; least useful are cases selected from IBD 0 families.

14.
Power and sample size calculations are critical parts of any research design for genetic association. We present a method that utilizes haplotype frequency information and average marker-marker linkage disequilibrium on SNPs typed in and around all genes on a chromosome. The test statistic used is the classic likelihood ratio test applied to haplotypes in case/control populations. Haplotype frequencies are computed through specification of genetic model parameters. Power is determined by computation of the test's non-centrality parameter. Power per gene is computed as a weighted average of the power assuming each haplotype is associated with the trait. We apply our method to genotype data from dense SNP maps across three entire chromosomes (6, 21, and 22) for three different human populations (African-American, Caucasian, Chinese), three different models of disease (additive, dominant, and multiplicative) and two trait allele frequencies (rare, common). We perform a regression analysis using these factors, average marker-marker disequilibrium, and the haplotype diversity across the gene region to determine which factors most significantly affect average power for a gene in our data. Also, as a 'proof of principle' calculation, we perform power and sample size calculations for all genes within 100 kb of the PSORS1 locus (chromosome 6) for a previously published association study of psoriasis. Results of our regression analysis indicate that four highly significant factors that determine average power to detect association are: disease model, average marker-marker disequilibrium, haplotype diversity, and the trait allele frequency. These findings may have important implications for the design of well-powered candidate gene association studies. Our power and sample size calculations for the PSORS1 gene appear consistent with published findings, namely that there is substantial power (>0.99) for most genes within 100 kb of the PSORS1 locus at the 0.01 significance level.

15.
Genome wide association studies have been usually analyzed in a univariate manner. The commonly used univariate tests have one degree of freedom and assume an additive mode of inheritance. The experiment-wise significance of these univariate statistics is obtained by adjusting for multiple testing. Next generation sequencing studies, which assay 10-20 million variants, are beginning to come online. For these studies, the strategy of additive univariate testing and multiple testing adjustment is likely to result in a loss of power due to (1) the substantial multiple testing burden and (2) the possibility of a non-additive causal mode of inheritance. To reduce the power loss we propose: a new method (1) to summarize in a single statistic the strength of the association signals coming from all not-very-rare variants in a linkage disequilibrium block and (2) to incorporate, in any linkage disequilibrium block statistic, the strength of the association signals under multiple modes of inheritance. The proposed linkage disequilibrium block test consists of the sum of squares of nominally significant univariate statistics. We compare the performance of this method to the performance of existing linkage disequilibrium block/gene-based methods. Simulations show that (1) extending methods to combine testing for multiple modes of inheritance leads to substantial power gains, especially for a recessive mode of inheritance, and (2) the proposed method has a good overall performance. Based on simulation results, we provide practical advice on choosing suitable methods for applied analyses.  相似文献   
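
The block statistic described above, a sum of squares of nominally significant univariate statistics, can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the univariate statistics are taken to be z-scores, "nominally significant" is taken to mean two-sided significance at a hypothetical level `nominal_alpha`, and the function name is invented. The abstract does not describe how the significance of the block statistic itself is assessed, so that step is omitted.

```python
from statistics import NormalDist


def ld_block_statistic(z_scores, nominal_alpha=0.05):
    """Sum the squares of the nominally significant z-scores in one LD block.

    z_scores:      univariate association statistics for the variants in the
                   block, assumed here to be standard-normal under the null.
    nominal_alpha: two-sided nominal significance level (an assumed default).
    """
    # Two-sided critical value, about 1.96 for nominal_alpha = 0.05.
    threshold = NormalDist().inv_cdf(1.0 - nominal_alpha / 2.0)
    return sum(z * z for z in z_scores if abs(z) > threshold)


# Example with made-up z-scores: only -2.5 and 3.0 pass the nominal threshold.
block_stat = ld_block_statistic([0.5, -2.5, 3.0, 1.0])
```

To cover several modes of inheritance, as the abstract proposes, the z-scores from additive, dominant and recessive tests of the same variants could be passed in together; whether the authors pool them exactly this way is not stated in the abstract.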

16.
Hereditary spastic paraplegia (HSP) is a degenerative disorder of the motor system, defined by progressive weakness and spasticity of the lower limbs. HSP may be inherited as an autosomal dominant (AD), autosomal recessive, or an X-linked trait. AD HSP is genetically heterogeneous, and three loci have been identified so far: SPG3 maps to chromosome 14q, SPG4 to 2p, and SPG4a to 15q. We have undertaken linkage analysis with 21 uncomplicated AD families to the three AD HSP loci. We report significant linkage for three of our families to the SPG4 locus and exclude several families by multipoint linkage. We used linkage information from several different research teams to evaluate the statistical probability of linkage to the SPG4 locus for uncomplicated AD HSP families and established the critical LOD-score value necessary for confirmation of linkage to the SPG4 locus from Bayesian statistics. In addition, we calculated the empirical P-values for the LOD scores obtained with all families with computer simulation methods. Power to detect significant linkage, as well as type I error probabilities, were evaluated. This combined analytical approach permitted conclusive linkage analyses on small to medium-size families, under the restrictions of genetic heterogeneity.  相似文献   

17.
Genetic linkage maps are often based on maximum-likelihood estimates of recombination fractions which are converted into map units by mapping functions. This paper presents a cost analysis of linkage analysis for a segregating F2 population with codominant or dominant molecular markers and a qualitative monogenic dominant–recessive trait. For illustration, a disease-resistance trait is considered, where the susceptible allele is recessive. Three sub-populations of the F2 can be used for linkage analysis [susceptible (= recessive) individuals, resistant (= dominant) individuals, complete F2]. While it is well-known that recessive individuals are more informative than dominant individuals, it is not obvious a priori, which of the three sub-populations should be preferred, when costs of phenotyping and genotyping are taken into consideration. A comparative economic analysis of alternative procedures of linkage detection based on these three sub-populations does exhibit a clear economic superiority of the sub-population of susceptible (= recessive) individuals, when costs of genotyping are high. This cost-effectiveness is due to the higher information content of this sub-population compared to the sub-population of dominant (= resistant) individuals and also compared to the complete F2. Our final conclusion/recommendation is as follows: If the cost to genotype an individual is sufficiently large compared with the cost to phenotype an individual, then linkage analysis and genetic mapping should be only based on susceptible (= recessive) individuals. Conversely, if the cost of phenotyping exceeds that for genotyping, it may be preferable to genotype all plants. The exact conditions under which a strategy is preferable are described in the paper.

18.
We propose a general likelihood-based approach to the linkage analysis of qualitative and quantitative traits using identity by descent (IBD) data from sib-pairs. We consider the likelihood of IBD data conditional on phenotypes and test the null hypothesis of no linkage between a marker locus and a gene influencing the trait using a score test in the recombination fraction theta between the two loci. This method unifies the linkage analysis of qualitative and quantitative traits into a single inferential framework, yielding a simple and intuitive test statistic. Conditioning on phenotypes avoids unrealistic random sampling assumptions and allows sib-pairs from differing ascertainment mechanisms to be incorporated into a single likelihood analysis. In particular, it allows the selection of sib-pairs based on their trait values and the analysis of only those pairs having the most informative phenotypes. The score test is based on the full likelihood, i.e. the likelihood based on all phenotype data rather than just differences of sib-pair phenotypes. Considering only phenotype differences, as in Haseman and Elston (1972) and Kruglyak and Lander (1995), may result in important losses in power. The linkage score test is derived under general genetic models for the trait, which may include multiple unlinked genes. Population genetic assumptions, such as random mating or linkage equilibrium at the trait loci, are not required. This score test is thus particularly promising for the analysis of complex human traits. The score statistic readily extends to accommodate incomplete IBD data at the test locus, by using the hidden Markov model implemented in the programs MAPMAKER/SIBS and GENEHUNTER (Kruglyak and Lander, 1995; Kruglyak et al., 1996). Preliminary simulation studies indicate that the linkage score test generally matches or outperforms the Haseman-Elston test, the largest gains in power being for selected samples of sib-pairs with extreme phenotypes.  相似文献   

19.
Genetic linkage studies are reported on two families with cleft lip +/- cleft palate. For the first family (LP01) the etiology of the clefting is unknown, and the linkage analyses were done assuming both autosomal dominant and autosomal recessive inheritance. Close linkage is rejected with the Duffy blood group under the dominant model and with four loci (Duffy, Kidd, and ABO blood groups and haptoglobin) under the recessive model. The second family (LP02) is a Mexican-American family segregating the van der Woude syndrome with lip pits. The linkage analyses for this autosomal dominant trait excluded close linkage with seven genetic markers, including three on chromosome one. The maximum lod scores were 0.6 with BF (chromosome 6) and 0.4 with the P blood group, which is not yet mapped.

20.
Several methods have been proposed for linkage analysis of complex traits with unknown mode of inheritance. These methods include the LOD score maximized over disease models (MMLS) and the "nonparametric" linkage (NPL) statistic. In previous work, we evaluated the increase of type I error when maximizing over two or more genetic models, and we compared the power of MMLS to detect linkage, in a number of complex modes of inheritance, with analysis assuming the true model. In the present study, we compare MMLS and NPL directly. We simulated 100 data sets with 20 families each, using 26 generating models: (1) 4 intermediate models (penetrance of heterozygote between that of the two homozygotes); (2) 6 two-locus additive models; and (3) 16 two-locus heterogeneity models (admixture alpha = 1.0, .7, .5, and .3; alpha = 1.0 replicates simple Mendelian models). For LOD scores, we assumed dominant and recessive inheritance with 50% penetrance. We took the higher of the two maximum LOD scores and subtracted 0.3 to correct for multiple tests (MMLS-C). We compared expected maximum LOD scores and power, using MMLS-C and NPL as well as the true model. Since NPL uses only the affected family members, we also performed an affecteds-only analysis using MMLS-C. The MMLS-C was both uniformly more powerful than NPL for most cases we examined, except when linkage information was low, and close to the results for the true model under locus heterogeneity. We still found better power for the MMLS-C compared with NPL in affecteds-only analysis. The results show that use of two simple modes of inheritance at a fixed penetrance can have more power than NPL when the trait mode of inheritance is complex and when there is heterogeneity in the data set.
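
The MMLS-C correction described above is simple enough to state as code. The sketch below assumes the two maximum LOD scores (one under the dominant model, one under the recessive model, each at 50% penetrance) have already been computed by a linkage program; the function and parameter names are hypothetical.

```python
def mmls_c(max_lod_dominant, max_lod_recessive, correction=0.3):
    """Corrected maximized LOD score (MMLS-C) as described in the abstract.

    Takes the higher of the two maximum LOD scores and subtracts 0.3 to
    correct for having tested two models.
    """
    return max(max_lod_dominant, max_lod_recessive) - correction


# Example with made-up LOD scores: the recessive analysis gives the higher score.
corrected = mmls_c(max_lod_dominant=2.1, max_lod_recessive=3.4)
```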
