Similar literature
20 similar documents found.
1.
In studies of complex diseases, a common paradigm is to conduct association analysis at markers in regions identified by linkage analysis, to attempt to narrow the region of interest. Family-based tests for association based on parental transmissions to affected offspring are often used in fine-mapping studies. However, for diseases with late onset, parental genotypes are often missing. Without parental genotypes, family-based tests either compare allele frequencies in affected individuals with those in their unaffected siblings or use siblings to infer missing parental genotypes. An example of the latter approach is the score test implemented in the computer program TRANSMIT. The inference of missing parental genotypes in TRANSMIT assumes that transmissions from parents to affected siblings are independent, which is appropriate when there is no linkage. However, using computer simulations, we show that, when the marker and disease locus are linked and the data set consists of families with multiple affected siblings, this assumption leads to a bias in the score statistic under the null hypothesis of no association between the marker and disease alleles. This bias leads to an inflated type I error rate for the score test in regions of linkage. We present a novel test for association in the presence of linkage (APL) that correctly infers missing parental genotypes in regions of linkage by estimating identity-by-descent parameters, to adjust for correlation between parental transmissions to affected siblings. In simulated data, we demonstrate the validity of the APL test under the null hypothesis of no association and show that the test can be more powerful than the pedigree disequilibrium test and family-based association test. As an example, we compare the performance of the tests in a candidate-gene study in families with Parkinson disease.

2.
We consider the effect of informative missingness on association tests that use parental genotypes as controls and that allow for missing parental data. Parental data can be informatively missing when the probability of a parent being available for study is related to that parent's genotype; when this occurs, the distribution of genotypes among observed parents is not representative of the distribution of genotypes among the missing parents. Many previously proposed procedures that allow for missing parental data assume that these distributions are the same. We propose association tests that behave well when parental data are informatively missing, under the assumption that, for a given trio of paternal, maternal, and affected offspring genotypes, the genotypes of the parents and the sex of the missing parents, but not the genotype of the affected offspring, can affect parental missingness. (This same assumption is required for validity of an analysis that ignores incomplete parent-offspring trios.) We use simulations to compare our approach with previously proposed procedures, and we show that if even small amounts of informative missingness are not taken into account, they can have large, deleterious effects on the performance of tests.

3.
The transmission disequilibrium test (TDT) customarily uses affected children and their parents (often case-parent trios, TDT(D)). Control-parent trios are necessary to guard against spurious significant results due to segregation distortion but are not generally utilized in the identification of disease susceptibility loci (DSL). Controls are often easy to recruit, and the TDT can easily be extended to include control-parent trios in analyses with unrelated case-parent trios. We present an extension of the TDT (TDT(DC)) that incorporates unrelated cases and controls and their parents into a single analysis. We develop a simple and accurate analytical method for computing the statistical power of various TDTs (e.g. the TDT(D), the TDT(DC), and the TDT(C), which employs control-parent trios only) under any genetic model. We investigated the power of these TDTs, and particularly compared the relative power of the TDT(D) and TDT(DC). We found that the TDT(DC) is almost always more powerful than the TDT(C) and TDT(D). The relative power of the TDT(DC) and TDT(D) depends largely upon a number of parameters identified in the study. This study provides a basis for efficient use of control-parent trios in DSL identification.

4.
Deng HW, Chen WM, Recker RR. Human Genetics. 2002;110(5):451-461
The transmission disequilibrium test (TDT) has been employed to map disease susceptibility loci (DSL), while being immune to the problem of population admixture. The customary TDT (TDT(D)) was developed for affected child(ren) and their parents and was most often applied to case-parent trios. Recently, the TDT has been extended to situations in which (1) parents are not available but affected and unaffected sibs from each family are available, (2) unrelated control-parent trios are available for combined analyses with case-parent trios (TDT(DC)), and (3) large pedigrees are available. For many diseases, affected children in the case-parent trios enlisted into the TDT(D) have unaffected sibs who can be recruited. We present an extension of the TDT by effectively incorporating one unaffected sib of each of the affected children in the case-parent trios into a single analysis (TDT(DS), where DS denotes discordant sib pairs). We have developed a general analytical method for computing the statistical power of the TDT(DS) under any genetic model, the accuracy of which is validated by computer simulations. We compare the power of the TDT(D), TDT(DC), and TDT(DS) over a range of parameter values and genetic models. We find that the TDT(DS) is generally more powerful than the TDT(DC) and TDT(D), particularly when the disease is prevalent (>30%) in the population. The relative power of the TDT(D) and the TDT(DS) largely depends upon the allele frequencies and genetic effects at the DSL, whereas the recombination rate, the degree of linkage disequilibrium, and the marker allele frequencies have little effect. Importantly, the TDT(DS) not only may be more powerful, it also has the advantage of being able to test for segregation distortion that may yield false linkage/association in the TDT(D).
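
For orientation, the customary case-parent TDT that these extensions build on is usually computed as a McNemar-type statistic on transmissions from heterozygous parents. The sketch below is a minimal illustration of that standard form only, not the TDT(DS) or TDT(DC) of the abstract; the function names and the counts `b` and `c` are hypothetical.

```python
import math


def tdt_statistic(b: int, c: int) -> float:
    """McNemar-type TDT statistic.

    b: number of heterozygous parents transmitting the candidate allele
       to an affected child (hypothetical input).
    c: number of heterozygous parents transmitting the other allele.
    """
    if b + c == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    return (b - c) ** 2 / (b + c)


def chi2_1df_pvalue(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 df.

    For 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


if __name__ == "__main__":
    stat = tdt_statistic(60, 40)
    print(stat, chi2_1df_pvalue(stat))
```

Only heterozygous parents contribute, which is why the statistic is insensitive to population admixture: each parent serves as their own matched control.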

5.
Researchers conducting family-based association studies have a wide variety of transmission/disequilibrium (TD)-based methods to choose from, but few guidelines exist in the selection of a particular method to apply to available data. Using a simulation study design, we compared the power and type I error of eight popular TD-based methods under different family structures, frequencies of missing parental data, genetic models, and population stratifications. No method was uniformly most powerful under all conditions, but type I error was appropriate for nearly every test statistic under all conditions. Power varied widely across methods, with a 46.5% difference in power observed between the most powerful and the least powerful method when 50% of families consisted of an affected sib pair and one parent genotyped under an additive genetic model and a 35.2% difference when 50% of families consisted of a single affection-discordant sibling pair without parental genotypes available under an additive genetic model. Methods were generally robust to population stratification, although some slightly less so than others. The choice of a TD-based test statistic should be dependent on the predominant family structure ascertained, the frequency of missing parental genotypes, and the assumed genetic model.

6.
Yu Z. Human Heredity. 2011;71(3):171-179
The case-parents design has been widely used to detect genetic associations as it can prevent spurious association that could occur in population-based designs. When examining the effect of an individual genetic locus on a disease, logistic regressions developed by conditioning on parental genotypes provide complete protection from spurious association caused by population stratification. However, when testing gene-gene interactions, it is unknown whether conditional logistic regressions are still robust. Here we evaluate the robustness and efficiency of several gene-gene interaction tests that are derived from conditional logistic regressions. We found that in the presence of SNP genotype correlation due to population stratification or linkage disequilibrium, tests with incorrectly specified main-genetic-effect models can lead to inflated type I error rates. We also found that a test with fully flexible main genetic effects always maintains correct test size and that its robustness can be achieved with negligible sacrifice of power. When testing gene-gene interactions is the focus, the test allowing fully flexible main effects is recommended.

7.
There is great interest in detecting associations between human traits and rare genetic variation. To address the low power implicit in single-locus tests of rare genetic variants, many rare-variant association approaches attempt to accumulate information across a gene, often by taking linear combinations of single-locus contributions to a statistic. Using the right linear combination is key: an optimal test will up-weight true causal variants, down-weight neutral variants, and correctly assign the direction of effect for causal variants. Here, we propose a procedure that exploits data from population controls to estimate the linear combination to be used in a case-parent trio rare-variant association test. Specifically, we estimate the linear combination by comparing population control allele frequencies with allele frequencies in the parents of affected offspring. These estimates are then used to construct a rare-variant transmission disequilibrium test (rvTDT) in the case-parent data. Because the rvTDT is conditional on the parents' data, using parental data in estimating the linear combination does not affect the validity or asymptotic distribution of the rvTDT. By using simulation, we show that our new population-control-based rvTDT can dramatically improve power over rvTDTs that do not use population control information across a wide variety of genetic architectures. It also remains valid under population stratification. We apply the approach to a cohort of epileptic encephalopathy (EE) trios and find that dominant (or additive) inherited rare variants are unlikely to play a substantial role within EE genes previously identified through de novo mutation studies.

8.
Yang Q, Xu X, Laird N. Genetics. 2003;164(1):399-406
While a variety of methods have been developed to deal with incomplete parental genotype information in family-based association tests, sampling design issues with incomplete parental genotype data still have not received much attention. In this article, we present simulation studies with four genetic models and various sampling designs and evaluate power in family-based association studies. Efficiency depends heavily on disease prevalence. With rare diseases, sampling affecteds and their parents is preferred, and three sibs are required to achieve comparable power if parents are unavailable. With more common diseases, sampling affecteds and two sibs will generally be more efficient than trios. When parents are unavailable, siblings need not be phenotyped if the disease is rare, but a loss of power will result with common diseases. Finally, for a class of complex traits where other genetic and environmental factors also cause phenotypic correlation among siblings, little efficiency is lost for rare diseases, but a substantial loss of efficiency occurs for common diseases.

9.
To analyze incomplete families, the following statistical tests can be used: LRAT (a simple likelihood-based association test), TRANSMIT, SIBASSOC/STDT, and RCTDT. We compared these four tests, for the diallelic case, on simulated data sets. The comparisons focused on the power to detect linkage and association when different familial structures, resistance to population stratification, resistance to misclassification of the disease status of the healthy sib, and the effect of nonpaternity were considered. The simulations lead to the following conclusions. The type I errors of TRANSMIT, SIBASSOC/STDT, and RCTDT were not affected by population stratification. LRAT showed bias under strong population stratification. High nonpaternity rates can lead to inflated type I errors, highlighting the importance of identification of half sibs. Under different homogeneous models, the power of TRANSMIT was very similar to that of LRAT, and, similarly, no difference in power was observed between SIBASSOC/STDT and RCTDT. Under various recessive and additive models, TRANSMIT was slightly more powerful than SIBASSOC/STDT when monoparental families with one affected and one unaffected sib were analyzed. Under various dominant models, SIBASSOC/STDT was slightly more powerful than TRANSMIT. Misclassification of the disease status of healthy sibs, as well as the discarding of incomplete families, resulted in a consistent loss of power.

10.
The Haplotype Relative Risk (HRR) was first proposed [Falk et al., Ann Hum Genet 1987] to test for Linkage Disequilibrium (LD) between a marker and a putative disease locus using case-parent trios. Spurious association does not appear in such family-based studies under population admixture. In this paper, we extend the HRR to accommodate incomplete trios via the Expectation-Maximization (EM) algorithm [Dempster et al., J R Stat Soc Ser B, 1977]. In addition to triads and dyads (parent-offspring pairs), the EM-HRR easily incorporates individuals with no parental genotype information available, who are excluded from the one-parent Transmission/Disequilibrium Test (1-TDT) [Sun et al., Am J Epidemiol 1999]. Due to the data structure of the EM-HRR, transmitted alleles are always available regardless of the number of missing parental genotypes. As a result of having a larger sample size, computer simulations reveal that the EM-HRR is more powerful in detecting LD than the 1-TDT in a population under Hardy-Weinberg Equilibrium (HWE). If admixture is not extreme, the EM-HRR remains more powerful. When a large degree of admixture exists, the EM-HRR performs better than the 1-TDT when the association is strong, though not as well when the association is weak. We illustrate the proposed method with an application to the Framingham Heart Study.

11.
Linkage disequilibrium arising from the recent admixture of genetically distinct populations can be potentially useful in mapping genes for complex diseases. McKeigue has proposed a method that conditions on parental admixture to detect linkage. We show that this method tests for linkage only under specific assumptions, such as equal admixture in the parental generation and admixture that occurs in a single generation. In practice, these assumptions are unlikely to hold for natural populations, resulting in an inflation of the type I error rate when testing for linkage by this method. In this article, we generalize McKeigue's approach of testing for linkage to allow two different admixture models: (1) intermixture admixture and (2) continuous gene flow. We calculate the sample size required for a genomewide search by this method under different disease models: multiplicative, additive, recessive, and dominant. Our results show that the sample size required to obtain 90% power to detect a putative mutant allele at a genomewide significance level of 5% can usually be achieved in practice if informative markers are available at a density of 2 cM.

12.
Sebastiani P, Abad MM, Alpargu G, Ramoni MF. Genetics. 2004;168(4):2329-2337
Several solutions have been proposed to extend the transmission disequilibrium test (TDT) to include cases with missing parental genotype. However, completion of the missing parental genotype may bias the test if the underlying missing data mechanism is informative. Furthermore, all these solutions resolve the problem of missing parental genotype, while offspring with missing genotypes are typically ignored. We propose here an extension to the TDT, called robust TDT (rTDT), able to handle incomplete genotypes on both parents and children and that does not rest on any assumption about the missing data mechanism. rTDT returns minimum and maximum values of TDT that are consistent with all the possible completions of the missing data. We also show that, in some situations, rTDT can achieve both greater power and greater significance than the popular TDT analysis of incomplete data. rTDT is applied to a database of markers of susceptibility to Crohn's disease and it shows that only 2 of the 11 markers originally associated with the phenotype do not depend on assumptions about the missing data mechanism.
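
The idea of reporting a minimum and a maximum statistic over all completions of the missing data can be illustrated with a deliberately simplified sketch. It is not the authors' rTDT algorithm: it assumes, hypothetically, that each of `m` incomplete parent-child transmissions could turn out to be a transmission of the candidate allele, a non-transmission, or uninformative, and it simply enumerates every such completion of the McNemar-type TDT. All names are illustrative.

```python
def tdt(b: int, c: int) -> float:
    """McNemar-type TDT; defined as 0.0 here when nothing is informative."""
    return (b - c) ** 2 / (b + c) if b + c > 0 else 0.0


def tdt_bounds(b: int, c: int, m: int) -> tuple[float, float]:
    """Minimum and maximum TDT over all completions of m incomplete items.

    b, c: observed transmissions / non-transmissions of the candidate allele.
    m:    incomplete transmissions (assumption: each may count toward b,
          toward c, or be uninformative and count toward neither).
    """
    values = [
        tdt(b + k, c + j)
        for k in range(m + 1)          # k incomplete items completed as "transmitted"
        for j in range(m + 1 - k)      # j completed as "not transmitted"
    ]
    return min(values), max(values)


if __name__ == "__main__":
    low, high = tdt_bounds(10, 5, 3)
    print(low, high)
```

If even the minimum exceeds the chosen critical value, the conclusion does not depend on how the data went missing; if only the maximum does, the association rests on assumptions about the missing data mechanism.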

13.
We propose a new method for family-based tests of association and linkage called transmission/disequilibrium tests incorporating unaffected offspring (TDTU). This new approach, constructed based on transmission/disequilibrium tests for quantitative traits (QTDT), provides a natural extension of the transmission/disequilibrium test (TDT) to utilize transmission information from heterozygous parents to their unaffected offspring as well as the affected offspring from ascertained nuclear families. TDTU can be used in various study designs and can accommodate all types of independent nuclear families with at least one affected offspring. When the study sample contains only case-parent trios, the TDTU is equivalent to the TDT. The informative-transmission disequilibrium test (i-TDT) and the generalized disequilibrium test (GDT) are two other methods that can use information from both unaffected and affected offspring. In contrast to the i-TDT and GDT, the test statistic of the TDTU is simpler and more explicit, and can be implemented more easily. Through computer simulations, we demonstrate that the power of the TDTU is slightly higher than that of the i-TDT and GDT. All three methods are more powerful than a method that uses affected offspring only, suggesting that unaffected siblings also provide information about linkage and association.

14.
OBJECTIVE: In affected sib pair studies without genotyped parents the effect of genotyping error is generally to reduce the type I error rate and power of tests for linkage. The effect of genotyping error when parents have been genotyped is unknown. We investigated the type I error rate of the single-point Mean test for studies in which genotypes of both parents are available. METHODS: Datasets were simulated assuming no linkage and one of five models for genotyping error. In each dataset, Mendelian-inconsistent families were either excluded or regenotyped, and then the Mean test applied. RESULTS: We found that genotyping errors lead to an inflated type I error rate when inconsistent families are excluded. Depending on the genotyping-error model assumed, regenotyping inconsistent families has one of several effects. It may produce the same type I error rate as if inconsistent families are excluded; it may reduce the type I error, but still leave an anti-conservative test; or it may give a conservative test. Departures of the type I error rate from its nominal level increase with both the genotyping error rate and sample size. CONCLUSION: We recommend that markers with high error rates either be excluded from the analysis or be regenotyped in all families.

15.
Family-based association methods have been developed primarily for autosomal markers. The X-linked sibling transmission/disequilibrium test (XS-TDT) and the reconstruction-combined TDT for X-chromosome markers (XRC-TDT) are the first association-based methods for testing markers on the X chromosome in family data sets. These are valid tests of association in family triads or discordant sib pairs but are not theoretically valid in multiplex families when linkage is present. Recently, XPDT and XMCPDT, modified versions of the pedigree disequilibrium test (PDT), were proposed. Like the PDT, XPDT compares genotype transmissions from parents to affected offspring or genotypes of discordant siblings; however, the XPDT can have low power if there are many missing parental genotypes. XMCPDT uses a Monte Carlo sampling approach to infer missing parental genotypes on the basis of true or estimated population allele frequencies. Although the XMCPDT was shown to be more powerful than the XPDT, variability in the statistic due to the use of an estimate of allele frequency is not properly accounted for. Here, we present a novel family-based test of association, X-APL, a modification of the test for association in the presence of linkage (APL). Like the APL, X-APL can use singleton or multiplex families and properly infers missing parental genotypes in linkage regions by considering identity-by-descent parameters for affected siblings. Sampling variability of parameter estimates is accounted for through a bootstrap procedure. X-APL can test individual marker loci or X-chromosome haplotypes. To allow for different penetrances in males and females, separate sex-specific tests are provided. Using simulated data, we demonstrated validity and showed that the X-APL is more powerful than alternative tests. To show its utility and to discuss interpretation in real-data analysis, we also applied the X-APL to candidate-gene data in a sample of families with Parkinson disease.

16.
Various family-based association methods have recently been proposed that allow testing for linkage in the presence of linkage disequilibrium between a marker and a disease even if there is only incomplete parental-genotype information. For some families, it may be possible to reconstruct missing parental genotypes from the genotypes of their offspring. Treating such a reconstructed family as if parental genotypes have been typed, however, can introduce bias. The reconstruction-combined transmission/disequilibrium test (RC-TDT) and its X-chromosomal counterpart, XRC-TDT, employ parental-genotype reconstruction and correct for the biases involved in this reconstruction without relying on population marker allele frequencies. For the two tests, exact P values can be obtained by numerically calculating the convolution of the null distributions corresponding to the families in the sample.

17.
The present study assesses the effects of genotyping errors on the type I error rate of a particular transmission/disequilibrium test (TDT(std)), which assumes that data are errorless, and introduces a new transmission/disequilibrium test (TDT(ae)) that allows for random genotyping errors. We evaluate the type I error rate and power of the TDT(ae) under a variety of simulations and perform a power comparison between the TDT(std) and the TDT(ae), for errorless data. Both the TDT(std) and the TDT(ae) statistics are computed as two times a log-likelihood difference, and both are asymptotically distributed as chi-square with 1 df. Genotype data for trios are simulated under a null hypothesis and under an alternative (power) hypothesis. For each simulation, errors are introduced randomly via a computer algorithm with different probabilities (called "allelic error rates"). The TDT(std) statistic is computed on all trios that show Mendelian consistency, whereas the TDT(ae) statistic is computed on all trios. The results indicate that TDT(std) shows a significant increase in type I error when applied to data in which inconsistent trios are removed. This type I error increases both with an increase in sample size and with an increase in the allelic error rates. TDT(ae) always maintains correct type I error rates for the simulations considered. Factors affecting the power of the TDT(ae) are discussed. Finally, the power of TDT(std) is at least that of TDT(ae) for simulations with errorless data. Because data are rarely error free, we recommend that researchers use methods, such as the TDT(ae), that allow for errors in genotype data.
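
A statistic of the form "two times a log-likelihood difference" can be illustrated with the simplest possible likelihood. The sketch below assumes a plain binomial model for transmissions from heterozygous parents (null transmission probability 0.5, alternative estimated as `b / n`); this is an assumption made for illustration, not the likelihood used for the TDT(std) or TDT(ae) in the abstract, and the function name is hypothetical.

```python
import math


def lr_tdt(b: int, c: int) -> float:
    """Likelihood-ratio statistic 2 * (logL_alt - logL_null) for a binomial model.

    b: transmissions of the candidate allele from heterozygous parents.
    c: non-transmissions. Under the null each transmission has probability 0.5.
    """
    n = b + c
    if n == 0:
        raise ValueError("no informative transmissions")

    def xlogx_ratio(k: int) -> float:
        # k * log(k / n), with the convention 0 * log(0) = 0
        return k * math.log(k / n) if k > 0 else 0.0

    log_l_alt = xlogx_ratio(b) + xlogx_ratio(c)   # maximized at p_hat = b / n
    log_l_null = n * math.log(0.5)                # p fixed at 0.5
    return 2.0 * (log_l_alt - log_l_null)


if __name__ == "__main__":
    # Compared against a chi-square distribution with 1 df under the null.
    print(lr_tdt(60, 40))
```

For moderate counts this value is close to the McNemar-type statistic (b - c)^2 / (b + c), which is the score-test counterpart of the same binomial model.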

18.
Summary: Case-parent trio studies concerned with children affected by a disease and their parents aim to detect single nucleotide polymorphisms (SNPs) showing a preferential transmission of alleles from the parents to their affected offspring. A popular statistical test for detecting such SNPs associated with disease in this study design is the genotypic transmission/disequilibrium test (gTDT) based on a conditional logistic regression model, which usually needs to be fitted by an iterative procedure. In this article, we derive exact closed-form solutions for the parameter estimates of the conditional logistic regression models when testing for an additive, a dominant, or a recessive effect of a SNP, and show that such analytic parameter estimates also exist when considering gene-environment interactions with binary environmental variables. Because the genetic model underlying the association between a SNP and a disease is typically unknown, it might further be beneficial to use the maximum over the gTDT statistics for the possible effects of a SNP as test statistic. We therefore propose a procedure enabling a fast computation of the test statistic and the permutation-based p-value of this MAX gTDT. All these methods are applied to whole-genome scans of the case-parent trios from the International Cleft Consortium. These applications show our procedures dramatically reduce the required computing time compared to the conventional iterative methods allowing, for example, the analysis of hundreds of thousands of SNPs in a few minutes instead of several hours.

19.
In the context of parentage assignment using genomic markers, key issues are genotyping errors and an absence of parent genotypes because of sampling, traceability or genotyping problems. Most likelihood-based parentage assignment software programs require a priori estimates of genotyping errors and the proportion of missing parents to set up meaningful assignment decision rules. We present here the R package APIS, which can assign offspring to their parents without any prior information other than the offspring and parental genotypes, and a user-defined, acceptable error rate among assigned offspring. Assignment decision rules use the distributions of average Mendelian transmission probabilities, which enable estimates of the proportion of offspring with missing parental genotypes. APIS has been compared to other software (CERVUS, VITASSIGN), on a real European seabass (Dicentrarchus labrax) single nucleotide polymorphism data set. The type I error rate (false positives) was lower with APIS than with other software, especially when parental genotypes were missing, but the true positive rate was also lower, except when the theoretical exclusion power reached 0.99999. In general, APIS provided assignments that satisfied the user-set acceptable error rate of 1% or 5%, even when tested on simulated data with high genotyping error rates (1% or 3%) and up to 50% missing sires. Because it uses the observed distribution of Mendelian transmission probabilities, APIS is best suited to assigning parentage when numerous offspring (>200) are genotyped. We have demonstrated that APIS is easy-to-use and reliable software for parentage assignment, even when up to 50% of sires are missing.

20.
Because of the need for fine mapping of disease loci and the availability of dense single-nucleotide-polymorphism markers, many forms of association tests have been developed. Most of them are applicable only to triads, whereas some are amenable to nuclear families (sibships). Although there are a number of methods that can deal with extended families (e.g., the pedigree disequilibrium test [PDT]), most of them cannot accommodate incomplete data. Furthermore, despite a large body of literature on association mapping, only a very limited number of publications are applicable to X-chromosomal markers. In this report, we first extend the PDT to markers on the X chromosome for testing linkage disequilibrium in the presence of linkage. This method is applicable to any pedigree structure and is termed "X-chromosomal pedigree disequilibrium test" (XPDT). We then further extend the XPDT to accommodate pedigrees with missing genotypes in some of the individuals, especially founders. Monte Carlo (MC) samples of the missing genotypes are generated and used to calculate the XMCPDT (X-chromosomal MC PDT) statistic, which is defined as the conditional expectation of the XPDT statistic given the incomplete (observed) data. This MC version of the XPDT remains a valid test for association under linkage with the assumption that the pedigrees and their associated affection patterns are drawn randomly from a population of pedigrees with at least one affected offspring. This set of methods was compared with existing approaches through simulation, and substantial power gains were observed in all settings considered, with type I error rates closely tracking their nominal values.
