Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
In studies of complex diseases, a common paradigm is to conduct association analysis at markers in regions identified by linkage analysis, to attempt to narrow the region of interest. Family-based tests for association based on parental transmissions to affected offspring are often used in fine-mapping studies. However, for diseases with late onset, parental genotypes are often missing. Without parental genotypes, family-based tests either compare allele frequencies in affected individuals with those in their unaffected siblings or use siblings to infer missing parental genotypes. An example of the latter approach is the score test implemented in the computer program TRANSMIT. The inference of missing parental genotypes in TRANSMIT assumes that transmissions from parents to affected siblings are independent, which is appropriate when there is no linkage. However, using computer simulations, we show that, when the marker and disease locus are linked and the data set consists of families with multiple affected siblings, this assumption leads to a bias in the score statistic under the null hypothesis of no association between the marker and disease alleles. This bias leads to an inflated type I error rate for the score test in regions of linkage. We present a novel test for association in the presence of linkage (APL) that correctly infers missing parental genotypes in regions of linkage by estimating identity-by-descent parameters, to adjust for correlation between parental transmissions to affected siblings. In simulated data, we demonstrate the validity of the APL test under the null hypothesis of no association and show that the test can be more powerful than the pedigree disequilibrium test and family-based association test. As an example, we compare the performance of the tests in a candidate-gene study in families with Parkinson disease.

2.
Genomewide linkage studies are tending toward the use of single-nucleotide polymorphisms (SNPs) as the markers of choice. However, linkage disequilibrium (LD) between tightly linked SNPs violates the fundamental assumption of linkage equilibrium (LE) between markers that underlies most multipoint calculation algorithms currently available, and this leads to inflated affected-relative-pair allele-sharing statistics when founders' multilocus genotypes are unknown. In this study, we investigate the impact that the degree of LD, marker allele frequency, and association type have on estimating the probabilities of sharing alleles identical by descent in multipoint calculations and hence on type I error rates of different sib-pair linkage approaches that assume LE. We show that marker-marker LD does not inflate type I error rates of affected sib pair (ASP) statistics in the whole parameter space, and that, in any case, discordant sib pairs (DSPs) can be used to control for marker-marker LD in ASPs. We advocate the ASP/DSP design with appropriate sib-pair statistics that test the difference in allele sharing between ASPs and DSPs.

3.
Mapping disease genes: family-based association studies.   Cited by: 19 (self-citations: 9, citations by others: 10)
With recent rapid advances in mapping of the human genome, including highly polymorphic and closely linked markers, studies of marker associations with disease are increasingly relevant for mapping disease genes. The use of nuclear-family data in association studies was initially developed to avoid possible ethnic mismatching between patients and randomly ascertained controls. The parental marker alleles not transmitted to an affected child or never transmitted to an affected sib pair form the so-called AFBAC (affected family-based controls) population. In this paper, the theoretical foundation of the AFBAC method is proved for any single-locus model of disease and for any nuclear family-based ascertainment scheme. In a random-mating population, when there is a marker association with disease, the AFBAC population provides an unbiased estimate of the overall population (control) marker alleles when the recombination fraction (theta) between the marker and disease genes is sufficiently small that it can be taken as zero (theta = 0). With population stratification, only marker associations present in the subpopulations will be detected with family-based analyses. Of more importance, however, is the fact that, when theta is not equal to 0, differences between transmitted parental (patient) marker allele frequencies and non- or never-transmitted parental marker allele frequencies (implying a marker association with disease) can only be observed for marker genes linked to a disease gene (theta < 1/2). Thus, associations of unlinked marker loci with disease at the population level, caused by population stratification, migration, or admixture, are eliminated. This validates the use of family-based association tests as an appropriate strategy for mapping disease genes.

4.
Detecting the association between genetic markers and complex diseases can be a critical first step toward identification of the genetic basis of disease. Misleading associations can be avoided by choosing as controls the parents of diseased cases, but the availability of parents often limits this design to early-onset disease. Alternatively, sib controls offer a valid design. A general multivariate score statistic is presented, to detect the association between a multiallelic genetic marker locus and affection status; this general approach is applicable to designs that use parents as controls, sibs as controls, or even unrelated controls whose genotypes do not fit Hardy-Weinberg proportions or that pool any combination of these different designs. The benefit of this multivariate score statistic is that it will tend to be the most powerful method when multiple marker alleles are associated with affection status. To plan these types of studies, we present methods to compute sample size and power, allowing for varying sibship sizes, ascertainment criteria, and genetic models of risk. The results indicate that sib controls have less power than parental controls and that the power of sib controls can be increased by increasing either the number of affected sibs per sibship or the number of unaffected control sibs. The sample-size results indicate that the use of sib controls to test for associations, by use of either a single-marker locus or a genomewide screen, will be feasible for markers that have a dominant effect and for common alleles having a recessive effect. The results presented will be useful for investigators planning studies using sibs as controls.

5.
Family-based association methods have been developed primarily for autosomal markers. The X-linked sibling transmission/disequilibrium test (XS-TDT) and the reconstruction-combined TDT for X-chromosome markers (XRC-TDT) are the first association-based methods for testing markers on the X chromosome in family data sets. These are valid tests of association in family triads or discordant sib pairs but are not theoretically valid in multiplex families when linkage is present. Recently, XPDT and XMCPDT, modified versions of the pedigree disequilibrium test (PDT), were proposed. Like the PDT, XPDT compares genotype transmissions from parents to affected offspring or genotypes of discordant siblings; however, the XPDT can have low power if there are many missing parental genotypes. XMCPDT uses a Monte Carlo sampling approach to infer missing parental genotypes on the basis of true or estimated population allele frequencies. Although the XMCPDT was shown to be more powerful than the XPDT, variability in the statistic due to the use of an estimate of allele frequency is not properly accounted for. Here, we present a novel family-based test of association, X-APL, a modification of the test for association in the presence of linkage (APL). Like the APL, X-APL can use singleton or multiplex families and properly infers missing parental genotypes in linkage regions by considering identity-by-descent parameters for affected siblings. Sampling variability of parameter estimates is accounted for through a bootstrap procedure. X-APL can test individual marker loci or X-chromosome haplotypes. To allow for different penetrances in males and females, separate sex-specific tests are provided. Using simulated data, we demonstrated validity and showed that the X-APL is more powerful than alternative tests. To show its utility and to discuss interpretation in real-data analysis, we also applied the X-APL to candidate-gene data in a sample of families with Parkinson disease.

6.
The transmission/disequilibrium test (TDT) and the affected sib pair test (ASP) both test for the association of a marker allele with some condition. Here, we present methods for calculating the probability of detecting the association (power) for a study examining a fixed number of families for suitability for the study and for calculating the number of such families to be examined. Both calculations use a genetic model for the association. The model considered posits a bi-allelic marker locus that is linked to a bi-allelic disease locus with a possibly nonzero recombination fraction between the loci. The penetrance of the disease is an increasing function of the number of disease alleles. The TDT tests whether the transmission by a heterozygous parent of a particular allele at a marker locus to an affected offspring occurs with probability greater than 0.5. The ASP tests whether transmission of the same allele to two affected sibs occurs with probability greater than 0.5. In either case, evidence that the probability is greater than 0.5 is evidence for association between the marker and the disease. Study inclusion criteria (IC) can greatly affect the necessary sample size of a TDT or ASP study. The IC we consider include requiring a randomly selected parent, at least one parent, or both parents to be heterozygous. They also allow a specified minimum number of affected offspring to be required (TDT only). We use elementary probability calculations rather than complex mathematical manipulations or asymptotic methods (large sample size approximations) to compute power and requisite sample size for a proposed study. The advantages of these methods are simplicity and generality.
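The idea of computing power from elementary probability rather than asymptotics can be illustrated with an exact binomial calculation. The sketch below is not the method of the abstract above; it is a simplified illustration that assumes a fixed number `n` of informative transmissions from heterozygous parents, each passing on the marker allele independently with probability `tau`, and a one-sided exact binomial test of tau = 0.5. The function names and parameters are hypothetical.

```python
from math import comb


def binom_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_count(n, alpha):
    """Smallest k with P(X >= k | p = 0.5) <= alpha; returns n + 1 if none exists."""
    for k in range(n + 1):
        if binom_upper_tail(n, k, 0.5) <= alpha:
            return k
    return n + 1


def tdt_exact_power(n, tau, alpha=0.05):
    """Power of the one-sided exact test of H0: tau = 0.5 against tau > 0.5.

    n   : number of informative transmissions (assumed fixed and independent)
    tau : true probability that a heterozygous parent transmits the allele
    """
    k = critical_count(n, alpha)
    if k > n:
        # Too few transmissions to reject at this level, whatever the data.
        return 0.0
    return binom_upper_tail(n, k, tau)
```

Calling `tdt_exact_power(n, tau)` over a range of `n` gives a simple way to find the smallest number of informative transmissions that reaches a target power under these assumptions; it ignores the inclusion criteria and genetic model that the abstract accounts for.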

7.
We present a class of likelihood-based score statistics that accommodate genotypes of both unrelated individuals and families, thereby combining the advantages of case-control and family-based designs. The likelihood extends the one proposed by Schaid and colleagues (Schaid and Sommer 1993, 1994; Schaid 1996; Schaid and Li 1997) to arbitrary family structures with arbitrary patterns of missing data and to dense sets of multiple markers. The score statistic comprises two component test statistics. The first component statistic, the nonfounder statistic, evaluates disequilibrium in the transmission of marker alleles from parents to offspring. This statistic, when applied to nuclear families, generalizes the transmission/disequilibrium test to arbitrary numbers of affected and unaffected siblings, with or without typed parents. The second component statistic, the founder statistic, compares observed or inferred marker genotypes in the family founders with those of controls or those of some reference population. The founder statistic generalizes the statistics commonly used for case-control data. The strengths of the approach include both the ability to assess, by comparison of nonfounder and founder statistics, the potential bias resulting from population stratification and the ability to accommodate arbitrary family structures, thus eliminating the need for many different ad hoc tests. A limitation of the approach is the potential power loss and/or bias resulting from inappropriate assumptions on the distribution of founder genotypes. The systematic likelihood-based framework provided here should be useful in the evaluation of both the relative merits of case-control and various family-based designs and the relative merits of different tests applied to the same design. It should also be useful for genotype-disease association studies done with the use of a dense set of multiple markers.

8.
A population association has consistently been observed between insulin-dependent diabetes mellitus (IDDM) and the "class 1" alleles of the region of tandem-repeat DNA (5' flanking polymorphism [5'FP]) adjacent to the insulin gene on chromosome 11p. This finding suggests that the insulin gene region contains a gene or genes contributing to IDDM susceptibility. However, several studies that have sought to show linkage with IDDM by testing for cosegregation in affected sib pairs have failed to find evidence for linkage. As means for identifying genes for complex diseases, both the association and the affected-sib-pairs approaches have limitations. It is well known that population association between a disease and a genetic marker can arise as an artifact of population structure, even in the absence of linkage. On the other hand, linkage studies with modest numbers of affected sib pairs may fail to detect linkage, especially if there is linkage heterogeneity. We consider an alternative method to test for linkage with a genetic marker when population association has been found. Using data from families with at least one affected child, we evaluate the transmission of the associated marker allele from a heterozygous parent to an affected offspring. This approach has been used by several investigators, but the statistical properties of the method as a test for linkage have not been investigated. In the present paper we describe the statistical basis for this "transmission test for linkage disequilibrium" (transmission/disequilibrium test [TDT]). We then show the relationship of this test to tests of cosegregation that are based on the proportion of haplotypes or genes identical by descent in affected sibs. The TDT provides strong evidence for linkage between the 5'FP and susceptibility to IDDM. The conclusions from this analysis apply in general to the study of disease associations, where genetic markers are usually closely linked to candidate genes. When a disease is found to be associated with such a marker, the TDT may detect linkage even when haplotype-sharing tests do not.
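The transmission comparison described in this abstract is usually summarized as a McNemar-type statistic. As a minimal sketch: if heterozygous parents transmit the associated allele to an affected child `b` times and the other allele `c` times, the statistic is (b - c)^2 / (b + c), referred to a chi-square distribution with 1 degree of freedom. The counts used below are illustrative only and are not data from the study.

```python
from math import erfc, sqrt


def tdt_statistic(b, c):
    """McNemar-type TDT statistic from counts among heterozygous parents.

    b : times the candidate allele was transmitted to an affected child
    c : times the other allele was transmitted instead
    """
    if b + c == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    return (b - c) ** 2 / (b + c)


def chi2_1df_pvalue(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return erfc(sqrt(x / 2.0))


# Illustrative counts only, not taken from any study.
stat = tdt_statistic(b=60, c=40)
print(stat, chi2_1df_pvalue(stat))
```

Only heterozygous parents contribute, because a homozygous parent transmits the same allele regardless of linkage and so carries no information about preferential transmission.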

9.
10.
To test for linkage between a trait and a marker, one can consider identical marker alleles in related individuals, for instance, sibs. For recessive diseases, it has been shown that some information may be gained from the identity by descent (IBD) of the two alleles of an affected inbred individual at the marker locus. The aim of this paper is to extend the sib-pair method of linkage analysis to the situation of sib pairs sampled from consanguineous populations. This extension takes maximum advantage of the information provided by both the IBD pattern between sibs and allelic identity within each sib of the pair. This is possible through the use of the condensed identity coefficients. Here, we propose a new test of linkage based on a chi-square statistic. We compare the performance of this test with that of the classical chi-square test based on the distribution of sib pairs sharing 0, 1, or 2 alleles IBD. For sib pairs from first-cousin matings, the proposed test can better detect the role of a disease-susceptibility (DS) locus. Its power is shown to be greater than that of the classical test, especially for models where the DS allele may be common and incompletely penetrant; that is to say for situations that may be encountered in multifactorial diseases. A study of the impact of inbreeding on the expected proportions of sib pairs sharing 0, 1, or 2 alleles IBD is also performed here. Ignoring inbreeding, when in fact inbreeding exists, increases the rate of type I errors in tests of linkage.
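The classical test mentioned in this abstract can be sketched as a goodness-of-fit comparison. The example assumes outbred sib pairs, for which the null (no linkage) probabilities of sharing 0, 1, or 2 alleles IBD are 1/4, 1/2, and 1/4, and it assumes IBD status is known without ambiguity. It does not implement the new test based on condensed identity coefficients, and under inbreeding these expected proportions no longer hold, which is the point the abstract makes about inflated type I error.

```python
from math import exp


def ibd_sharing_chi2(n0, n1, n2):
    """Chi-square goodness-of-fit test of sib-pair IBD counts against 1/4, 1/2, 1/4.

    n0, n1, n2 : numbers of affected sib pairs sharing 0, 1, 2 alleles IBD
    Returns (statistic, p_value) with 2 degrees of freedom.
    """
    total = n0 + n1 + n2
    if total == 0:
        raise ValueError("no sib pairs supplied")
    observed = (n0, n1, n2)
    expected = (0.25 * total, 0.5 * total, 0.25 * total)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 2 degrees of freedom the chi-square upper tail is exp(-x / 2).
    return stat, exp(-stat / 2.0)
```

For example, `ibd_sharing_chi2(10, 50, 40)` compares 100 pairs against expected counts of 25, 50, and 25.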

11.
Refining genomic regions which have been identified by linkage analysis to contain a disease susceptibility locus has proven to be a challenging task. Detecting association between the disease and a genetic marker can significantly narrow down the candidate region. Since an adequate sample of families is already available from the genome scan, family-based association tests may be used to search for association. The use of haplotypes consisting of tightly linked markers can be more powerful for detecting association than the use of individual markers. An extension of the transmission/disequilibrium test to allow the simultaneous analysis of more than one marker locus is complicated by ambiguity of phase in some families of the sample. The present paper shows that a recently proposed method for the analysis of nuclear families with a single affected child can be viewed as a special application of a more general principle. This observation justifies several modifications, potentially increasing the power, as well as an extension of the method to allow the analysis of general nuclear families. Finally, the problem of missing parental genotypes is discussed.

12.
Transmission-disequilibrium tests for quantitative traits.   Cited by: 9 (self-citations: 3, citations by others: 6)
The transmission-disequilibrium test (TDT) of Spielman et al. is a family-based linkage-disequilibrium test that offers a powerful way to test for linkage between alleles and phenotypes that is either causal (i.e., the marker locus is the disease/trait allele) or due to linkage disequilibrium. The TDT is equivalent to a randomized experiment and, therefore, is resistant to confounding. When the marker is extremely close to the disease locus or is the disease locus itself, tests such as the TDT can be far more powerful than conventional linkage tests. To date, the TDT and most other family-based association tests have been applied only to dichotomous traits. This paper develops five TDT-type tests for use with quantitative traits. These tests accommodate either unselected sampling or sampling based on selection of phenotypically extreme offspring. Power calculations are provided and show that, when a candidate gene is available (1) these TDT-type tests are at least an order of magnitude more efficient than two common sib-pair tests of linkage; (2) extreme sampling results in substantial increases in power; and (3) if the most extreme 20% of the phenotypic distribution is selectively sampled, across a wide variety of plausible genetic models, quantitative-trait loci explaining as little as 5% of the phenotypic variation can be detected at the .0001 alpha level with <300 observations.

13.
Linkage mapping of complex diseases is often followed by association studies between phenotypes and marker genotypes through use of case-control or family-based designs. Given fixed genotyping resources, it is important to know which study designs are the most efficient. To address this problem, we extended the likelihood-based method of Li et al., which assesses whether there is linkage disequilibrium between a disease locus and a SNP, to accommodate sibships of arbitrary size and disease-phenotype configuration. A key advantage of our method is the ability to combine data from different family structures. We consider scenarios for which genotypes are available for unrelated cases, affected sib pairs (ASPs), or only one sibling per ASP. We construct designs that use cases only and others that use unaffected siblings or unrelated unaffected individuals as controls. Different combinations of cases and controls result in seven study designs. We compare the efficiency of these designs when the number of individuals to be genotyped is fixed. Our results suggest that (1) when the disease is influenced by a single gene, the one sibling per ASP-control design is the most efficient, followed by the ASP-control design, and familial cases contribute more association information than singleton cases; (2) when the disease is influenced by multiple genes, familial cases provide more association information than singleton cases, unless the effect of the locus being tested is much smaller than at least one other untested disease locus; and (3) the case-control design can be useful for detecting genes with small effect in the presence of genes with much larger effect. Our findings will be helpful for researchers designing and analyzing complex disease-association studies and will facilitate genotyping resource allocation.

14.
We present here four nonparametric statistics for linkage analysis that test whether pairs of affected relatives share marker alleles more often than expected. These statistics are based on simulating the null distribution of a given statistic conditional on the unaffecteds' marker genotypes. Each statistic uses a different measure of marker sharing: the SimAPM statistic uses the simulation-based affected-pedigree-member measure based on identity-by-state (IBS) sharing. The SimKIN (kinship) measure is 1.0 for identity-by-descent (IBD) sharing, 0.0 for no IBD status sharing, and the kinship coefficient when the IBD status is ambiguous. The simulation-based IBD (SimIBD) statistic uses a recursive algorithm to determine the probability of two affecteds sharing a specific allele IBD. The SimISO statistic is identical to SimIBD, except that it also measures marker similarity between unaffected pairs. We evaluated our statistics on data simulated under different two-locus disease models, comparing our results to those obtained with several other nonparametric statistics. Use of IBD information produces dramatic increases in power over the SimAPM method, which uses only IBS information. The power of our best statistic in most cases meets or exceeds the power of the other nonparametric statistics. Furthermore, our statistics perform comparisons between all affected relative pairs within general pedigrees and are not restricted to sib pairs or nuclear families.

15.
Wang T, Elston RC. Human Heredity, 2005, 60(3):134-142
The lack of replication of model-free linkage analyses performed on complex diseases raises questions about the robustness of these methods to various biases. The confounding effect of population stratification on a genetic association study has long been recognized in the genetic epidemiology community. Because the estimation of the number of alleles shared identical by descent (IBD) does not depend on the marker allele frequency when founders of families are observed, model-free linkage analysis is usually thought to be robust to population stratification. However, for common complex diseases, the genotypes of founders are often unobserved and therefore population stratification has the potential to impair model-free linkage analysis. Here, we demonstrate that, when some or all of the founder genotypes are missing, population stratification can introduce deleterious effects on various model-free linkage methods or designs. For an affected sib pair design, it can cause excess false-positive discoveries even when the trait distribution is homogeneous among subpopulations. After incorporating a control group of discordant sib pairs or for a quantitative trait, two circumstances must be met for population stratification to be a confounder: the distributions for both the marker and the trait must be heterogeneous among subpopulations. When this occurs, the bias can result in either a liberal, and hence invalid, test or a conservative test. Bias can be eliminated or alleviated by inclusion of founders' or other family members' genotype data. When this is not possible, new methods need to be developed to be robust to population stratification.

16.
The sibship disequilibrium test (SDT) is designed to detect both linkage in the presence of association and association in the presence of linkage (linkage disequilibrium). The test does not require parental data but requires discordant sibships with at least one affected and one unaffected sibling. The SDT has many desirable properties: it uses all the siblings in the sibship; it remains valid if there are misclassifications of the affection status; it does not detect spurious associations due to population stratification; asymptotically it has a chi-square distribution under the null hypothesis; and exact P values can be easily computed for a biallelic marker. We show how to extend the SDT to markers with multiple alleles and how to combine families with parents and data from discordant sibships. We discuss the power of the test by presenting sample-size calculations involving a complex disease model, and we present formulas for the asymptotic relative efficiency (which is approximately the ratio of sample sizes) between SDT and the transmission/disequilibrium test (TDT) for special family structures. For sib pairs, we compare the SDT to a test proposed both by Curtis and, independently, by Spielman and Ewens. We show that, for discordant sib pairs, the SDT has good power for testing linkage disequilibrium relative both to Curtis's tests and to the TDT using trios comprising an affected sib and its parents. With additional sibs, we show that the SDT can be more powerful than the TDT for testing linkage disequilibrium, especially for disease prevalence >.3.
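For a biallelic marker, the SDT is commonly described as a sign test across sibships, and the sketch below follows that description rather than any code from the paper. It assumes each sibship is given as two lists holding the number of copies (0, 1, or 2) of the candidate allele carried by each affected and each unaffected sib. A sibship counts toward `b` if the affected sibs carry more copies on average, toward `c` if fewer, and is ignored if the averages are equal. The data layout and function names are hypothetical.

```python
from math import comb


def sdt_counts(sibships):
    """Count sibships whose affected sibs carry more (b) or fewer (c) copies on average.

    sibships : iterable of (affected_counts, unaffected_counts), each a non-empty
               list of per-sib allele counts (0, 1, or 2)
    """
    b = c = 0
    for affected, unaffected in sibships:
        if not affected or not unaffected:
            raise ValueError("each sibship needs affected and unaffected sibs")
        # Integer cross-multiplication compares the two means without rounding.
        d = sum(affected) * len(unaffected) - sum(unaffected) * len(affected)
        if d > 0:
            b += 1
        elif d < 0:
            c += 1
    return b, c


def sdt_statistic(b, c):
    """Sign-test statistic, asymptotically chi-square with 1 df under the null."""
    if b + c == 0:
        raise ValueError("no informative sibships")
    return (b - c) ** 2 / (b + c)


def sdt_exact_pvalue(b, c):
    """Two-sided exact P value from the Binomial(b + c, 0.5) distribution."""
    n = b + c
    k = max(b, c)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    return min(1.0, 2.0 * tail)
```

Because only the sign of the within-sibship difference is used, every sibling contributes to the comparison while the null distribution stays a simple binomial, which is what makes exact P values easy to compute.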

17.
Case-control studies compare marker-allele distributions in affected and unaffected individuals, and significant results suggest linkage but may simply reflect population structure. For markers with m alleles (m ≥ 2), a McNemar-like statistic, I, estimates the level of population association between marker and disease loci. To test for linkage after significant case-control tests, within-family tests are performed. These operate on the contingency table, with (i, j)th element equal to the number of parents that transmit marker allele M_i and do not transmit marker allele M_j to an affected offspring. The dimension of the table is the number of alleles at the marker locus. Three test statistics have recently been proposed in the literature: Tc compares symmetric pairs of cells (i, j) and (j, i), Tm compares row and column totals for the same marker allele, and a likelihood ratio statistic Tl uses all the cells in the table. In addition, we consider a new statistic, Tmhet, that uses only the heterozygous parents and is approximately chi-square with (m - 1) df. We use a Monte Carlo test to guarantee valid tests and to demonstrate the inferiority of Tc and the equality of Tm and Tl in terms of power. The power of the Tmhet test is close but not always equal to the power of the Tm test. We also show that under the alternative hypothesis of linkage, Tm is approximately noncentral chi-square with (m - 1) df and noncentrality parameter 2N_T(1 - 2theta)^2 I*, when data on single affecteds in N_T families are used. If the disease has a low population frequency, then I* is estimated using the case-control statistic I. This offers a basis for choosing sample size, or choosing a marker system.
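Two of the table-based statistics can be sketched from the transmission table alone. The abstract gives no formulas, so the forms below are assumptions based on how these statistics are commonly written: Tc as a sum over symmetric cell pairs of (n_ij - n_ji)^2 / (n_ij + n_ji), and Tmhet as ((m - 1) / m) times the sum over alleles of (row_i - col_i)^2 / (row_i + col_i - 2 n_ii). Here `table[i][j]` is the number of parents who transmit allele i and do not transmit allele j; the function names are hypothetical.

```python
def _check_square(table):
    m = len(table)
    if m < 2 or any(len(row) != m for row in table):
        raise ValueError("table must be square with at least 2 alleles")
    return m


def t_c(table):
    """Symmetry statistic comparing cells (i, j) and (j, i)."""
    m = _check_square(table)
    stat = 0.0
    for i in range(m):
        for j in range(i + 1, m):
            pair = table[i][j] + table[j][i]
            if pair > 0:  # empty cell pairs carry no information
                stat += (table[i][j] - table[j][i]) ** 2 / pair
    return stat


def t_mhet(table):
    """Marginal statistic using heterozygous parents only (assumed form)."""
    m = _check_square(table)
    stat = 0.0
    for i in range(m):
        row = sum(table[i])                       # allele i transmitted
        col = sum(table[r][i] for r in range(m))  # allele i not transmitted
        het = row + col - 2 * table[i][i]         # drop homozygous parents
        if het > 0:
            stat += (row - col) ** 2 / het
    return (m - 1) / m * stat
```

With two alleles both functions reduce to the familiar (b - c)^2 / (b + c) form of the TDT, which is a useful sanity check on the assumed formulas.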

18.
An individual's disease risk is determined by the compounded action of both common variants, inherited from remote ancestors, that segregated within the population and rare variants, inherited from recent ancestors, that segregated mainly within pedigrees. Next-generation sequencing (NGS) technologies generate high-dimensional data that allow a nearly complete evaluation of genetic variation. Despite their promise, NGS technologies also suffer from remarkable limitations: high error rates, enrichment of rare variants, and a large proportion of missing values, as well as the fact that most current analytical methods are designed for population-based association studies. To meet the analytical challenges raised by NGS, we propose a general framework for sequence-based association studies that can use various types of family and unrelated-individual data sampled from any population structure and a universal procedure that can transform any population-based association test statistic for use in family-based association tests. We develop family-based functional principal-component analysis (FPCA) with or without smoothing, a generalized T^2, combined multivariate and collapsing (CMC) method, and single-marker association test statistics. Through intensive simulations, we demonstrate that the family-based smoothed FPCA (SFPCA) has the correct type I error rates and much more power to detect association of (1) common variants, (2) rare variants, (3) both common and rare variants, and (4) variants with opposite directions of effect than other population-based or family-based association analysis methods. The proposed statistics are applied to two data sets with pedigree structures. The results show that the smoothed FPCA has a much smaller p value than other statistics.

19.
The transmission/disequilibrium test (TDT) [Spielman et al.: Am J Hum Genet 1993;52:506-516] has been postulated as the future of gene mapping for complex diseases, provided one is able to genotype a dense enough map of markers across the genome. Risch and Merikangas [Science 1996;273:1516-1517] suggested a million-marker screen in affected sibpair (ASP) families, demonstrating that the TDT is a more powerful test of linkage than traditional linkage tests based on allele-sharing when there is also association between marker and disease alleles. While the future of genotyping has arrived, successes in family-based association studies have been modest. This is often attributed to excessive false positives in candidate gene studies. This problem is only exacerbated by the increasing numbers of whole genome association (WGA) screens. When applied in ASPs, the TDT statistic, which assumes transmissions to siblings are independent, is not expected to have a constant variance in the presence of variable linkage. This results in generally more extreme statistics, and hence will further aggravate the problem of having a large number of positive results to sort through. So an important question is how many positive TDT results will show up on a chromosome containing a disease gene due only to linkage, and whether they will obfuscate the true disease gene location. To answer this question we combined theory and computer simulations. These studies show that in ASPs the normal version of the TDT statistic has a mean of 0 and a variance of 1 in unlinked regions, but has a variance larger than 1 in linked regions. In contrast, the pedigree disequilibrium test (PDT) statistic adjusts for correlation between siblings due to linkage and maintains a constant variance of 1 at unassociated markers irrespective of linkage. The TDT statistic is generally larger than the PDT statistic across linked regions. This is true for unassociated as well as associated markers. To compare the two tests we ranked both statistics at the disease locus, or an associated marker, among statistics at all other markers. The TDT did a better job than the PDT of placing the score of the associated marker near the top. Though, strictly speaking, the TDT in ASPs should be interpreted as a test of linkage and not a test of association, there is a good chance that if a marker stands out, the marker is associated as well as linked. In conclusion, our results suggest that TDT is an effective screening tool for WGA studies, especially in multiplex families.
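The variance inflation described in this abstract can be reproduced in a toy model, which is a simplification and not the simulation design of the paper. Assume every parent is heterozygous, there is no association (the first affected sib receives either parental allele with probability 0.5), and the second affected sib receives the same parental allele with probability `p_same`, a hypothetical parameter equal to 0.5 in an unlinked region and larger than 0.5 in a linked one. With b and c the transmission counts of the two alleles, the normal-version statistic Z = (b - c) / sqrt(b + c) then has mean 0 and variance 2 * p_same.

```python
import random
from math import sqrt


def tdt_z_variance(p_same):
    """Variance of Z = (b - c) / sqrt(b + c) under the toy model: 2 * p_same."""
    return 2.0 * p_same


def simulate_tdt_z(n_parents, p_same, rng):
    """One simulated Z from n_parents heterozygous parents of affected sib pairs."""
    b = c = 0
    for _ in range(n_parents):
        first = rng.random() < 0.5    # True: sib 1 receives allele A
        same = rng.random() < p_same  # True: sib 2 receives the same allele
        second = first if same else not first
        for got_a in (first, second):
            if got_a:
                b += 1
            else:
                c += 1
    return (b - c) / sqrt(b + c)


def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


rng = random.Random(1)
for p_same in (0.5, 0.75):
    zs = [simulate_tdt_z(100, p_same, rng) for _ in range(2000)]
    print(p_same, round(sample_variance(zs), 2), tdt_z_variance(p_same))
```

The closed form follows because each parent contributes +2 or -2 to b - c when both sibs receive the same allele and 0 otherwise, giving a per-parent variance of 4 * p_same against a denominator b + c that is fixed at twice the number of parents.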

20.
Nuclear families with multiple affected sibs are often collected for genetic linkage analysis of complex diseases. Once linkage evidence is established, dense markers are often typed in the linked region for genetic association analysis based on linkage disequilibrium (LD). Detection of association in the presence of linkage localizes disease genes more accurately than the methods that rely on linkage alone. However, a test of association due to LD in the linked region needs to account for dependency of the allele transmissions to different sibs within a family. In this paper, we define a joint model for genetic linkage and association and derive the corresponding joint survival function of age of onset for the sibs within a sibship. The joint survival function is a function of both the inheritance vector and the genotypes at the candidate marker locus. Based on this joint survival function, we derive score tests for genetic association. The proposed methods utilize the phenotype data of all the sibs and have the advantages of family-based designs which can avoid the potential spurious association caused by population admixture. In addition, the methods can account for variable age of onset or age at censoring and possible covariate effects, and therefore provide important tools for modelling disease heterogeneity. Simulation studies and application to the data sets from the 12th Genetic Analysis Workshop indicate that the proposed methods have correct type I error rates and increased power over other existing methods for testing allelic association.
