Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Detecting the association between genetic markers and complex diseases can be a critical first step toward identification of the genetic basis of disease. Misleading associations can be avoided by choosing as controls the parents of diseased cases, but the availability of parents often limits this design to early-onset disease. Alternatively, sib controls offer a valid design. A general multivariate score statistic is presented, to detect the association between a multiallelic genetic marker locus and affection status; this general approach is applicable to designs that use parents as controls, sibs as controls, or even unrelated controls whose genotypes do not fit Hardy-Weinberg proportions or that pool any combination of these different designs. The benefit of this multivariate score statistic is that it will tend to be the most powerful method when multiple marker alleles are associated with affection status. To plan these types of studies, we present methods to compute sample size and power, allowing for varying sibship sizes, ascertainment criteria, and genetic models of risk. The results indicate that sib controls have less power than parental controls and that the power of sib controls can be increased by increasing either the number of affected sibs per sibship or the number of unaffected control sibs. The sample-size results indicate that the use of sib controls to test for associations, by use of either a single-marker locus or a genomewide screen, will be feasible for markers that have a dominant effect and for common alleles having a recessive effect. The results presented will be useful for investigators planning studies using sibs as controls.

2.
The association of a candidate gene with disease can be efficiently evaluated by a case-control study in which allele frequencies are compared for diseased cases and unaffected controls. However, when the distribution of genotypes in the population deviates from Hardy-Weinberg proportions, the frequency of genotypes--rather than alleles--should be compared by the Armitage test for trend. We present formulas for power and sample size for studies that use Armitage's trend test. The formulas make no assumptions about Hardy-Weinberg equilibrium, but do assume random ascertainment of cases and controls, all of whom are independent of one another. We demonstrate the accuracy of the formulas by simulations.
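As a rough illustration of the trend test named in this abstract, the sketch below computes the Cochran-Armitage statistic from genotype counts in cases and controls. The function name `armitage_trend_test`, the additive scores (0, 1, 2), and the example counts are assumptions made for illustration; the paper's own power and sample-size formulas are not reproduced here.

```python
import math

def armitage_trend_test(cases, controls, scores=(0.0, 1.0, 2.0)):
    """Cochran-Armitage trend test for a 2 x 3 table of genotype counts.

    cases, controls: counts for genotypes carrying 0, 1, 2 copies of the allele.
    scores: per-genotype scores (additive coding is an assumption here).
    Returns (chi-square with 1 df, two-sided p-value).
    """
    n_cases = sum(cases)
    n_controls = sum(controls)
    n = n_cases + n_controls
    col_totals = [r + s for r, s in zip(cases, controls)]

    # Score-weighted case count minus its expectation under no association.
    sum_wc = sum(w * c for w, c in zip(scores, col_totals))
    u = sum(w * r for w, r in zip(scores, cases)) - n_cases * sum_wc / n

    # Null variance of u, conditional on the table margins.
    sum_w2c = sum(w * w * c for w, c in zip(scores, col_totals))
    var_u = n_cases * n_controls * (n * sum_w2c - sum_wc ** 2) / n ** 3

    chi2 = u * u / var_u
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 df
    return chi2, p_value

# Hypothetical counts, chosen only to exercise the function.
chi2, p = armitage_trend_test(cases=(10, 20, 30), controls=(30, 20, 10))
print(f"trend chi-square = {chi2:.2f}, p = {p:.2e}")
```

The statistic works on genotype counts directly, which is why it does not need Hardy-Weinberg proportions to hold.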

3.
This paper is concerned with estimating parameters associated with HLA-linked diseases. We consider a single disease locus closely linked to HLA, allowing a disease and a normal allele. The parameters to be estimated are the penetrances of the genotypes at the disease locus, the population frequency of the disease allele, and the distance of the disease locus from HLA. The presently used method of estimation uses HLA-sharing information from affected sib-pairs. The method proposed here generalizes the previous approach, using data from all sibs (affected or unaffected) in a family of any size. It allows immediate generalizations to the use of information on parental affectedness status and population prevalence.

4.
Case-control disease-marker association studies are often used in the search for variants that predispose to complex diseases. One approach to increasing the power of these studies is to enrich the case sample for individuals likely to be affected because of genetic factors. In this article, we compare three case-selection strategies that use allele-sharing information with the standard strategy that selects a single individual from each family at random. In affected sibship samples, we show that, by carefully selecting sibships and/or individuals on the basis of allele sharing, we can increase the frequency of disease-associated alleles in the case sample. When these cases are compared with unrelated controls, the difference in the frequency of the disease-associated allele is therefore also increased. We find that, by choosing the affected sib who shows the most evidence for pairwise allele sharing with the other affected sibs in families, the test statistic is increased by >20%, on average, for additive models with modest genotype relative risks. In addition, we find that the per-genotype information associated with the allele sharing-based strategies is increased compared with that associated with random selection of a sib for genotyping. Even though we select sibs on the basis of a nonparametric statistic, the additional gain for selection based on the unknown underlying mode of inheritance is minimal. We show that these properties hold even when the power to detect linkage to a region in the entire sample is negligible. This approach can be extended to more-general pedigree structures and quantitative traits.

5.
In two previous articles, we have considered sample sizes required to detect linkage for mapping quantitative-trait loci in humans, using extreme discordant sib pairs. Here, we examine further the use of extreme concordant sib pairs but consider the effect of parents' phenotypes. Sample sizes necessary to obtain a power of 80% with concordant sib pairs at a significance level of .0001 are given, stratified by parental phenotypes. When there is no residual correlation between sibs, the parental phenotypes have little impact on the sample sizes. When residual correlations between sibs exist, we show, however, that power can be considerably reduced by including extreme sib pairs when the parents also have similarly extreme values. Thus, we recommend the exclusion of such pairs from linkage studies. This recommendation reduces the required sample sizes by 3- to 28-fold. The degree of saving in the required sample sizes varies among different models and allele frequencies. The reduction is most dramatic (a 28-fold reduction) for a rare recessive gene.

6.
Family-based tests of association are now often used when trying to fine-map a disease susceptibility locus. Recently, several tests of linkage and association have been proposed that use nuclear families with multiple affected and unaffected sibs rather than just case-parent triads. In this paper we propose a test that generalizes these previous tests. Formulae are derived to calculate the power of the test for a randomly mating population. These power calculations are used to determine conditions under which it is advantageous to include unaffected sibs in the analysis.

7.
Deng HW, Chen WM, Recker RR. Human Genetics, 2002, 110(5): 451-461
The transmission disequilibrium test (TDT) has been employed to map disease susceptibility loci (DSL), while being immune to the problem of population admixture. The customary TDT test (TDT(D)) was developed for affected child(ren) and their parents and was most often applied to case-parent trios. Recently, the TDT has been extended to the situations when (1) parents are not available but affected and nonaffected sibs from each family are available, (2) unrelated control-parent trios are available for combined analyses with case-parent trios (TDT(DC)), and (3) large pedigrees. For many diseases, affected children in the case-parent trios enlisted into the TDT(D) have unaffected sibs who can be recruited. We present an extension of the TDT by effectively incorporating one unaffected sib of each of the affected children in the case-parent trios into a single analysis (TDT(DS), where DS denotes discordant sib pairs). We have developed a general analytical method for computing the statistical power of the TDT(DS) under any genetic model, the accuracy of which is validated by computer simulations. We compare the power of the TDT(D), TDT(DC), and TDT(DS) under a range of parameter space and genetic models. We find that the TDT(DS) is generally more powerful than the TDT(DC) and TDT(D), particularly when the disease is prevalent (>30%) in the population. The relative power of the TDT(D) and the TDT(DS) largely depends upon the allele frequencies and genetic effects at the DSL, whereas the recombination rate, the degree of linkage disequilibrium, and the marker allele frequencies have little effect. Importantly, the TDT(DS) not only may be more powerful, it also has the advantage of being able to test for segregation distortion that may yield false linkage/association in the TDT(D).
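For orientation, the customary case-parent TDT mentioned in this abstract is usually computed as a McNemar-type chi-square on transmissions from heterozygous parents. The sketch below shows only that basic statistic; it is not the TDT(DS) or TDT(DC) extension, and the function name `tdt_statistic` and the example counts are assumptions for illustration.

```python
import math

def tdt_statistic(transmitted, not_transmitted):
    """McNemar-type TDT chi-square (1 df).

    transmitted: times heterozygous parents passed the candidate allele
                 to an affected child.
    not_transmitted: times they passed the other allele instead.
    Returns (chi-square, two-sided p-value).
    """
    b, c = transmitted, not_transmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    chi2 = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 df
    return chi2, p_value

# Hypothetical counts: 60 transmissions versus 40 non-transmissions.
chi2, p = tdt_statistic(60, 40)
print(f"TDT chi-square = {chi2:.1f}, p = {p:.4f}")
```

Only heterozygous parents contribute, because a homozygous parent transmits the same allele regardless of linkage or association.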

8.
Segregation analysis, employing nuclear families, is the most frequently used method to evaluate the mode of inheritance of a trait. To our knowledge, there exists no tabular information regarding the sample sizes required of individuals and families needed to perform a significance test of a specific segregation ratio for a predetermined power and significance level. To fill this gap, we have developed sample-size tables based on the asymptotic variance of the maximum likelihood estimate of the segregation ratio and on the normal approximation for two-sided hypothesis testing. Assuming homogeneous sibship size, minimum sample sizes were determined for testing the null hypothesis for the segregation ratio of 1/4 or 1/2 vs. alternative values of .05-.80, for the significance level of .05 and power of .8, for ascertainment probabilities of nearly 0 to 1.0, and sibship sizes 2-7. The results of these calculations indicate a complex interaction of the null and the alternate hypotheses, ascertainment probability, and sibship size in determining the sample size required for simple segregation analysis. The accompanying tables should aid in the appropriate design and cost assessment of future genetic epidemiologic studies.

9.
This paper is concerned with efficient strategies for gene mapping using pedigrees containing small numbers of affecteds and identity-by-descent data from closely spaced markers throughout the genome. Particular attention is paid to additive traits involving phenocopies and/or locus heterogeneity. For a sample of pedigrees containing a particular configuration of affecteds, e.g., pairs of siblings together with a first cousin, we use a likelihood analysis to find 1-df statistics that are very efficient over a broad range of penetrances and allele frequencies. We identify configurations of affecteds that are particularly powerful for detecting linkage, and we show how pedigrees containing different numbers and configurations of affecteds can be efficiently combined in an overall test statistic.

10.
Linkage disequilibrium has been used to help in the identification of genes predisposing to certain qualitative diseases. Although several linkage-disequilibrium tests have been developed for localization of genes influencing quantitative traits, these tests have not been thoroughly compared with one another. In this report we compare, under a variety of conditions, several different linkage-disequilibrium tests for identification of loci affecting quantitative traits. These tests use either single individuals or parent-child trios. When we compared tests with equal samples, we found that the truncated measured allele (TMA) test was the most powerful. The trait allele frequencies, the stringency of sample ascertainment, the number of marker alleles, and the linked genetic variance affected the power, but the presence of polygenes did not. When there were more than two trait alleles at a locus in the population, power to detect disequilibrium was greatly diminished. The presence of unlinked disequilibrium (D'*) increased the false-positive error rates of disequilibrium tests involving single individuals but did not affect the error rates of tests using family trios. The increase in error rates was affected by the stringency of selection, the trait allele frequency, and the linked genetic variance but not by polygenic factors. In an equilibrium population, the TMA test is most powerful, but, when adjusted for the presence of admixture, Allison test 3 becomes the most powerful whenever D'*>.15.

11.
OBJECTIVES: Confidence intervals for genotype relative risks, for allele frequencies and for the attributable risk in the case-parent trio design for candidate-gene studies are proposed which can be easily calculated from the observed familial genotype frequencies. METHODS: Likelihood theory and the delta method were used to derive point estimates and confidence intervals. We used Monte Carlo simulations to show the validity of the formulae for a variety of given modes of inheritance and allele frequencies and illustrated their usefulness by applying them to real data. RESULTS: Generally these formulae were found to be valid for 'sufficiently large' sample sizes. For smaller sample sizes the estimators for genotype relative risks tended to be conservative whereas the estimator for attributable risk was found to be anti-conservative for moderate to high allele frequencies. CONCLUSIONS: Since the proposed formulae provide quantitative information on the individual and epidemiological relevance of a genetic variant they might be a useful addition to the traditional statistical significance level of TDT results.

12.
Using large-sample theory, we present a unified approach to power calculations for family-based association tests. Currently available methods for power calculations are restricted to special designs or require approximations or simulations. Our analytical approach to power calculations is broadly applicable in many settings. We discuss power calculations for two scenarios that have high practical relevance and in which power previously could only be assessed by simulation studies or by approximations: (1) studies using both affected and unaffected offspring and (2) studies with missing parental information. When the population prevalence is high, it can be worthwhile to genotype unaffected offspring. For many scenarios, high power can be achieved with reasonable sample sizes, even when no parental information is available.

13.
The transmission/disequilibrium test (TDT) and the affected sib pair test (ASP) both test for the association of a marker allele with a condition. Here, we present methods for calculating the probability of detecting the association (power) for a study examining a fixed number of families for suitability for the study and for calculating the number of such families to be examined. Both calculations use a genetic model for the association. The model considered posits a bi-allelic marker locus that is linked to a bi-allelic disease locus with a possibly nonzero recombination fraction between the loci. The penetrance of the disease is an increasing function of the number of disease alleles. The TDT tests whether the transmission by a heterozygous parent of a particular allele at a marker locus to an affected offspring occurs with probability greater than 0.5. The ASP tests whether transmission of the same allele to two affected sibs occurs with probability greater than 0.5. In either case, evidence that the probability is greater than 0.5 is evidence for association between the marker and the disease. Study inclusion criteria (IC) can greatly affect the necessary sample size of a TDT or ASP study. The IC we consider require a randomly selected parent, at least one parent, or both parents to be heterozygous. They also allow a specified minimum number of affected offspring to be required (TDT only). We use elementary probability calculations rather than complex mathematical manipulations or asymptotic methods (large sample size approximations) to compute power and requisite sample size for a proposed study. The advantages of these methods are simplicity and generality.
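To give a feel for a non-asymptotic power calculation of the kind described here, the sketch below computes exact binomial power for a one-sided test that a transmission probability exceeds 0.5. It is a deliberate simplification and not the paper's method: it assumes `n` independent transmissions from heterozygous parents with a known transmission probability `tau`, and it ignores the genetic model and the inclusion criteria. All function names are hypothetical.

```python
from math import comb

def upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p); returns 0 when k > n."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))

def critical_count(n, alpha):
    """Smallest k with P(X >= k | p = 0.5) <= alpha; n + 1 if none exists."""
    for k in range(n + 1):
        if upper_tail(n, k, 0.5) <= alpha:
            return k
    return n + 1

def exact_power(n, tau, alpha=0.05):
    """Power of the one-sided exact test of H0: p = 0.5 when the true p is tau."""
    return upper_tail(n, critical_count(n, alpha), tau)

# Assumed transmission probability of 0.6; sample sizes are illustrative.
# Suitable for n up to a few hundred; larger n would need log-space arithmetic.
for n in (50, 100, 200):
    print(n, round(exact_power(n, tau=0.6), 3))
```

Because the test is discrete, the attained significance level is at most `alpha` and usually somewhat below it, so power does not rise perfectly smoothly with `n`.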

14.
Using exact expected likelihoods, we have computed the average number of phase-unknown nuclear families needed to detect linkage and heterogeneity. We have examined the case of both dominant and recessive inheritance with reduced penetrance and phenocopies. Most of our calculations have been carried out under the assumption that 50% of families are linked to a marker locus. We have varied both the number of offspring per family and the sampling scheme. We have also investigated the increased power when the disease locus is midway between two marker loci 10 cM apart. For recessive inheritance, both linkage and heterogeneity can be detected in clinically feasible sample sizes. For dominant inheritance, linkage can be detected but heterogeneity cannot be detected unless larger sibships (four offspring) are sampled or two linked markers are available. As expected, if penetrance is reduced, sampling families with all sibs affected is most efficient. Our results provide a basis for estimating the amount of resources needed to find genes for complex disorders under conditions of heterogeneity.

15.
Murphy A, Weiss ST, Lange C. PLoS Genetics, 2008, 4(9): e1000197
For genome-wide association studies in family-based designs, we propose a powerful two-stage testing strategy that can be applied in situations in which parent-offspring trio data are available and all offspring are affected with the trait or disease under study. In the first step of the testing strategy, we construct estimators of genetic effect size in the completely ascertained sample of affected offspring and their parents that are statistically independent of the family-based association/transmission disequilibrium tests (FBATs/TDTs) that are calculated in the second step of the testing strategy. For each marker, the genetic effect is estimated (without requiring an estimate of the SNP allele frequency) and the conditional power of the corresponding FBAT/TDT is computed. Based on the power estimates, a weighted Bonferroni procedure assigns an individually adjusted significance level to each SNP. In the second stage, the SNPs are tested with the FBAT/TDT statistic at the individually adjusted significance levels. Using simulation studies for scenarios with up to 1,000,000 SNPs, varying allele frequencies and genetic effect sizes, the power of the strategy is compared with standard methodology (e.g., FBATs/TDTs with Bonferroni correction). In all considered situations, the proposed testing strategy demonstrates substantial power increases over the standard approach, even when the true genetic model is unknown and must be selected based on the conditional power estimates. The practical relevance of our methodology is illustrated by an application to a genome-wide association study for childhood asthma, in which we detect two markers meeting genome-wide significance that would not have been detected using standard methodology.
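The general idea of a weighted Bonferroni procedure can be sketched as follows: each SNP receives its own significance level, and the levels sum to the overall alpha. The weighting rule used below (levels proportional to the first-stage power estimates) is an assumption chosen for simplicity and is not necessarily the scheme used in the paper; the function name and the numbers are hypothetical.

```python
def weighted_bonferroni(p_values, power_estimates, alpha=0.05):
    """Assign per-SNP significance levels proportional to estimated power.

    The per-SNP levels sum to alpha, so the family-wise error rate is
    controlled as long as the weights are independent of the second-stage
    test statistics (the abstract states that the first-stage estimators
    have this property).
    Returns (per-SNP levels, list of rejection decisions).
    """
    total = sum(power_estimates)
    if total <= 0:
        raise ValueError("power estimates must have a positive sum")
    levels = [alpha * w / total for w in power_estimates]
    rejected = [p <= level for p, level in zip(p_values, levels)]
    return levels, rejected

# Hypothetical second-stage p-values and first-stage power estimates.
p_values = [0.02, 0.03, 0.2, 0.6]
power_estimates = [0.8, 0.1, 0.05, 0.05]
levels, rejected = weighted_bonferroni(p_values, power_estimates)
print(levels, rejected)
```

In this toy input the first SNP is rejected at its individual level, whereas an unweighted Bonferroni threshold of 0.05 / 4 = 0.0125 would reject nothing.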

16.
Elsewhere we have proposed the use of extreme discordant sib pairs (EDSPs) for mapping quantitative trait loci in humans. Here we present sample sizes necessary to achieve a given level of power with this study design, as well as the number of sibs that need to be screened to obtain the required sample. Further, we present simple formulas for adjusting sample sizes to account for variable significance levels and power, as well as the density and informativeness of linkage markers in a multipoint sib-pair analysis. We conclude that with EDSPs, the most powerful study design, the smallest genetic effect detectable with a realistic sample size is approximately 10% of the variance of the trait.

17.
Risch and Zhang (1995; Science 268: 1584-9) reported a simple sample size and power calculation approach for the Haseman-Elston method and based their computations on the null hypothesis of no genetic effect. We argue that the more reasonable null hypothesis is that of no recombination. For this null hypothesis, we provide a general approach for sample size and power calculations within the Haseman-Elston framework. We demonstrate the validity of our approach in a Monte-Carlo simulation study and illustrate the differences using data from published segregation analyses on body weight and heritability estimates on carotid artery atherosclerotic lesions.

18.
Assessing the role of HLA-linked and unlinked determinants of disease.   Cited by: 39 (self-citations: 17; citations by others: 22)
The relationship between increased risk in relatives over population prevalence (lambda_R = K_R/K) and probability of sharing zero marker alleles identical by descent (ibd) at a linked locus (such as HLA) by an affected relative pair is examined. For a model assuming a single disease-susceptibility locus or group of loci tightly linked to a marker locus, the relationship is remarkably simple and general. Namely, if phi_R is the prior probability for the relative pair to share zero marker alleles identical by descent, then P(sharing 0 markers | both relatives are affected) is just phi_R/lambda_R. Alternatively, lambda_AR, the increased risk over population prevalence to a relative R due to a disease locus tightly linked to marker locus A, equals the prior probability that the relative pair share zero A alleles ibd divided by the posterior probability that they share zero alleles ibd, given that they are both affected. For example, for affected sib pairs, P(sharing 0 markers | both sibs are affected) = .25/lambda_S. This formula holds true for any number of alleles at the disease locus and for their frequencies, penetrances, and population prevalence. Similar formulas are derived for sharing one and two markers. Application of these formulas to several well-studied HLA-associated diseases yields the following results: For multiple sclerosis, insulin-dependent diabetes mellitus, and coeliac disease, a single-locus model of disease susceptibility is rejected, implying the existence of additional unlinked familial determinants. For all three diseases, the effect of the HLA-linked locus on familiality is minor: for multiple sclerosis, it accounts for only a 2.5-fold increased risk to sibs over the population prevalence, compared to an observed value of 20; for coeliac disease, it accounts for approximately a 5.25-fold increased risk to sibs, while the observed value is on the order of 60; for insulin-dependent diabetes mellitus, it accounts for a 3.42-fold increased risk in sibs, while the observed value is 15. In all cases, the secondary determinants must be outside the HLA region. For tuberculoid leprosy, an unlinked familial determinant is also implicated (increased risk to sibs due to HLA = 1.49; observed value = 2.38). For hemochromatosis and Hodgkin's disease, there is little evidence for HLA-unlinked familial determinants. With this formula, it is also possible to examine the hypothesis of pleiotropy versus linkage disequilibrium by comparing lambda_AS with the increased risk to sibs due to the associated allele(s).(ABSTRACT TRUNCATED AT 400 WORDS)
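The zero-sharing relation stated in this abstract is simple enough to evaluate directly. The sketch below applies it to the HLA-attributable sib risk ratios quoted above; the function names are hypothetical, and the formulas for sharing one and two alleles, which the abstract mentions but does not state, are not included.

```python
def prob_share_zero_ibd(lambda_r, prior_zero=0.25):
    """P(share 0 alleles ibd | both relatives affected) = prior / lambda_R.

    prior_zero defaults to 0.25, the prior for a sib pair.
    """
    if lambda_r <= 0:
        raise ValueError("risk ratio must be positive")
    return prior_zero / lambda_r

def lambda_from_sharing(posterior_zero, prior_zero=0.25):
    """Invert the relation: locus-specific risk ratio from observed zero-sharing."""
    if posterior_zero <= 0:
        raise ValueError("sharing probability must be positive")
    return prior_zero / posterior_zero

# HLA-attributable risk ratios to sibs, as quoted in the abstract.
hla_lambda_s = {
    "multiple sclerosis": 2.5,
    "coeliac disease": 5.25,
    "insulin-dependent diabetes mellitus": 3.42,
    "tuberculoid leprosy": 1.49,
}
for disease, lam in hla_lambda_s.items():
    print(f"{disease}: P(0 ibd | both affected) = {prob_share_zero_ibd(lam):.3f}")
```

A risk ratio of 1 returns the prior of 0.25, which is the no-linkage expectation for sib pairs.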

19.
Power and sample size calculations are critical parts of any research design for genetic association. We present a method that utilizes haplotype frequency information and average marker-marker linkage disequilibrium on SNPs typed in and around all genes on a chromosome. The test statistic used is the classic likelihood ratio test applied to haplotypes in case/control populations. Haplotype frequencies are computed through specification of genetic model parameters. Power is determined by computation of the test's non-centrality parameter. Power per gene is computed as a weighted average of the power assuming each haplotype is associated with the trait. We apply our method to genotype data from dense SNP maps across three entire chromosomes (6, 21, and 22) for three different human populations (African-American, Caucasian, Chinese), three different models of disease (additive, dominant, and multiplicative) and two trait allele frequencies (rare, common). We perform a regression analysis using these factors, average marker-marker disequilibrium, and the haplotype diversity across the gene region to determine which factors most significantly affect average power for a gene in our data. Also, as a 'proof of principle' calculation, we perform power and sample size calculations for all genes within 100 kb of the PSORS1 locus (chromosome 6) for a previously published association study of psoriasis. Results of our regression analysis indicate that four highly significant factors that determine average power to detect association are: disease model, average marker-marker disequilibrium, haplotype diversity, and the trait allele frequency. These findings may have important implications for the design of well-powered candidate gene association studies. Our power and sample size calculations for the PSORS1 gene appear consistent with published findings, namely that there is substantial power (>0.99) for most genes within 100 kb of the PSORS1 locus at the 0.01 significance level.

20.
Family-based association methods have been developed primarily for autosomal markers. The X-linked sibling transmission/disequilibrium test (XS-TDT) and the reconstruction-combined TDT for X-chromosome markers (XRC-TDT) are the first association-based methods for testing markers on the X chromosome in family data sets. These are valid tests of association in family triads or discordant sib pairs but are not theoretically valid in multiplex families when linkage is present. Recently, XPDT and XMCPDT, modified versions of the pedigree disequilibrium test (PDT), were proposed. Like the PDT, XPDT compares genotype transmissions from parents to affected offspring or genotypes of discordant siblings; however, the XPDT can have low power if there are many missing parental genotypes. XMCPDT uses a Monte Carlo sampling approach to infer missing parental genotypes on the basis of true or estimated population allele frequencies. Although the XMCPDT was shown to be more powerful than the XPDT, variability in the statistic due to the use of an estimate of allele frequency is not properly accounted for. Here, we present a novel family-based test of association, X-APL, a modification of the test for association in the presence of linkage (APL) test. Like the APL, X-APL can use singleton or multiplex families and properly infers missing parental genotypes in linkage regions by considering identity-by-descent parameters for affected siblings. Sampling variability of parameter estimates is accounted for through a bootstrap procedure. X-APL can test individual marker loci or X-chromosome haplotypes. To allow for different penetrances in males and females, separate sex-specific tests are provided. Using simulated data, we demonstrated validity and showed that the X-APL is more powerful than alternative tests. To show its utility and to discuss interpretation in real-data analysis, we also applied the X-APL to candidate-gene data in a sample of families with Parkinson disease.
