Similar literature
 20 similar articles found (search time: 15 ms)
1.
Hsu L, Zhao LP, Aragaki C. Human Heredity 2000, 50(3): 194-200
The family-based association study design is a variation of the case-control study design, where unaffected family members instead of unrelated subjects are sampled as controls. This variation is useful in assessing the effects of candidate genes on disease, because it avoids false associations caused by admixture of populations. A complication of this design is that because of an inherited genotypic correlation among family members, the genotypic distributions between cases and relative controls may be distorted by the ascertainment criteria of families, which could involve not only cases and relative controls, but also other relatives. Analyzing such data naively may lead to biased estimates of relative risk. In this note, we will discuss the consistency of a conditional-likelihood approach. We show analytically that maximum conditional-likelihood estimators are consistent for the true relative risks, if genotypes for family members are exchangeable under the sampling process, for example, sibling clusters. Besides being straightforward conceptually and computationally, this approach is robust to ascertainment bias and naturally accommodates genetic heterogeneity across families.

2.
Detecting the association between genetic markers and complex diseases can be a critical first step toward identification of the genetic basis of disease. Misleading associations can be avoided by choosing as controls the parents of diseased cases, but the availability of parents often limits this design to early-onset disease. Alternatively, sib controls offer a valid design. A general multivariate score statistic is presented, to detect the association between a multiallelic genetic marker locus and affection status; this general approach is applicable to designs that use parents as controls, sibs as controls, or even unrelated controls whose genotypes do not fit Hardy-Weinberg proportions or that pool any combination of these different designs. The benefit of this multivariate score statistic is that it will tend to be the most powerful method when multiple marker alleles are associated with affection status. To plan these types of studies, we present methods to compute sample size and power, allowing for varying sibship sizes, ascertainment criteria, and genetic models of risk. The results indicate that sib controls have less power than parental controls and that the power of sib controls can be increased by increasing either the number of affected sibs per sibship or the number of unaffected control sibs. The sample-size results indicate that the use of sib controls to test for associations, by use of either a single-marker locus or a genomewide screen, will be feasible for markers that have a dominant effect and for common alleles having a recessive effect. The results presented will be useful for investigators planning studies using sibs as controls.

3.
We present a class of likelihood-based score statistics that accommodate genotypes of both unrelated individuals and families, thereby combining the advantages of case-control and family-based designs. The likelihood extends the one proposed by Schaid and colleagues (Schaid and Sommer 1993, 1994; Schaid 1996; Schaid and Li 1997) to arbitrary family structures with arbitrary patterns of missing data and to dense sets of multiple markers. The score statistic comprises two component test statistics. The first component statistic, the nonfounder statistic, evaluates disequilibrium in the transmission of marker alleles from parents to offspring. This statistic, when applied to nuclear families, generalizes the transmission/disequilibrium test to arbitrary numbers of affected and unaffected siblings, with or without typed parents. The second component statistic, the founder statistic, compares observed or inferred marker genotypes in the family founders with those of controls or those of some reference population. The founder statistic generalizes the statistics commonly used for case-control data. The strengths of the approach include both the ability to assess, by comparison of nonfounder and founder statistics, the potential bias resulting from population stratification and the ability to accommodate arbitrary family structures, thus eliminating the need for many different ad hoc tests. A limitation of the approach is the potential power loss and/or bias resulting from inappropriate assumptions on the distribution of founder genotypes. The systematic likelihood-based framework provided here should be useful in the evaluation of both the relative merits of case-control and various family-based designs and the relative merits of different tests applied to the same design. It should also be useful for genotype-disease association studies done with the use of a dense set of multiple markers.

4.
The association of a candidate gene with disease can be efficiently evaluated by a case-control study in which allele frequencies are compared for diseased cases and unaffected controls. However, when the distribution of genotypes in the population deviates from Hardy-Weinberg proportions, the frequency of genotypes, rather than alleles, should be compared by the Armitage test for trend. We present formulas for power and sample size for studies that use Armitage's trend test. The formulas make no assumptions about Hardy-Weinberg equilibrium, but do assume random ascertainment of cases and controls, all of whom are independent of one another. We demonstrate the accuracy of the formulas by simulations.
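
As a rough illustration of the trend test named in this abstract, the sketch below computes the standard Cochran-Armitage statistic for a 2 x 3 genotype table. It is not the authors' power or sample-size formula. The function name, the additive weights (0, 1, 2) and the example counts are assumptions made for illustration only.

```python
import math


def armitage_trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test for a 2 x 3 genotype table.

    cases, controls: genotype counts ordered by number of risk alleles (0, 1, 2).
    weights: scores per genotype; (0, 1, 2) is an assumed additive coding.
    Returns (chi-square statistic with 1 df, two-sided p-value).
    """
    r = sum(cases)
    s = sum(controls)
    n = r + s
    cols = [a + b for a, b in zip(cases, controls)]

    # T = sum_i w_i * (cases_i * S - controls_i * R)
    t = sum(w * (a * s - b * r) for w, a, b in zip(weights, cases, controls))

    # Var(T) = (R*S/N) * [sum_i w_i^2 c_i (N - c_i) - 2 sum_{i<j} w_i w_j c_i c_j]
    square_terms = sum(w * w * c * (n - c) for w, c in zip(weights, cols))
    cross_terms = sum(
        weights[i] * weights[j] * cols[i] * cols[j]
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
    )
    variance = (r * s / n) * (square_terms - 2 * cross_terms)
    if variance == 0:
        raise ValueError("trend statistic is undefined for this table")

    chi2 = t * t / variance
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, not taken from any study listed here.
chi2, p = armitage_trend_test(cases=(10, 20, 30), controls=(30, 20, 10))
```

Because the statistic is built from genotype counts rather than allele counts, it does not rely on Hardy-Weinberg proportions, which is the point the abstract makes.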

5.
Family-based association studies have been widely used to identify association between diseases and genetic markers. It is known that genotyping uncertainty is inherent in both directly genotyped or sequenced DNA variations and imputed data in silico. The uncertainty can lead to genotyping errors and missingness and can negatively impact the power and Type I error rates of family-based association studies even if the uncertainty is independent of disease status. Compared with studies using unrelated subjects, there are very few methods that address the issue of genotyping uncertainty for family-based designs. The limited attempts have mostly been made to correct the bias caused by genotyping errors. Without properly addressing the issue, the conventional testing strategy, i.e. family-based association tests using called genotypes, can yield invalid statistical inferences. Here, we propose a new test to address the challenges in analyzing case-parents data by using calls with high accuracy and modeling genotype-specific call rates. Our simulations show that compared with the conventional strategy and an alternative test, our new test has an improved performance in the presence of substantial uncertainty and has a similar performance when the uncertainty level is low. We also demonstrate the advantages of our new method by applying it to imputed markers from a genome-wide case-parents association study.

6.
Recently, genetic epidemiologists have begun using case-control family study designs to investigate the role of genetic and environmental risk factors in disease etiology. The objective of these studies is to assess the association of environmental factors with the disease trait; to characterize the disease genes using segregation analysis; and to quantify the residual familial aggregation after controlling for environmental and genetic factors. Typically these objectives are achieved by conducting separate studies and analyses. This paper describes an estimating equation based approach for a combined association, segregation and aggregation analysis on data from case-control family studies. Simulations indicate that the method performs well in a variety of settings. The method is illustrated using simulated family history data made available to participants in a recent Genetic Analysis Workshop.

7.
Aplenc R, Zhao H, Rebbeck TR, Propert KJ. Genetics 2003, 163(3): 1215-1219
Molecular epidemiological association studies use valuable biosamples and incur costs. Statistical methods for early genotyping termination may conserve biosamples and costs. Group sequential methods (GSM) allow early termination of studies on the basis of interim comparisons. Simulation studies evaluated the application of GSM using data from a case-control study of GST genotypes and prostate cancer. Group sequential boundaries (GSB) were defined in the EAST-2000 software and were evaluated for study termination when early evidence suggested that the null hypothesis of no association between genotype and disease was unlikely to be rejected. Early termination of GSTM1 genotyping, which demonstrated no association with prostate cancer, occurred in >90% of the simulated studies. On average, 36.4% of biosamples were saved from unnecessary genotyping. In contrast, for GSTT1, which demonstrated a positive association, inappropriate termination occurred in only 6.6%. GSM may provide significant cost and sample savings in molecular epidemiology studies.

8.
The simplest and most commonly used approach for genetic associations is the case-control study design of unrelated people. This design is susceptible to population stratification. This problem is obviated in family-based studies, but it is usually difficult to accumulate large enough samples of well-characterized families. We addressed empirically whether the two designs give similar estimates of association in 93 investigations where both unrelated case-control and family-based designs had been employed. Estimated odds ratios differed beyond chance between the two designs in only four instances (4%). The summary relative odds ratio (ROR) (the ratio of odds ratios obtained from unrelated case-control and family-based studies) was close to unity (0.96 [95% confidence interval, 0.91-1.01]). There was no heterogeneity in the ROR across studies (amount of heterogeneity beyond chance I² = 0%). Differences in whether results were nominally statistically significant (p < 0.05) or not with the two designs were common (opposite classification rates 14% and 17%); this reflected largely differences in power. Conclusions were largely similar in diverse subgroup analyses. Unrelated case-control and family-based designs give overall similar estimates of association. We cannot rule out rare large biases or common small biases.
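
The relative odds ratio described in this abstract can be sketched for a single investigation as follows. This is a minimal sketch that assumes the two estimates are independent and that each standard error can be recovered from a reported 95% confidence interval on the log scale. The function names and the input numbers are hypothetical, and the paper's summary estimate across 93 investigations would additionally need a meta-analytic step that is not shown.

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper):
    """Standard error of a log odds ratio recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)


def relative_odds_ratio(or_case_control, ci_case_control, or_family, ci_family):
    """ROR = OR(unrelated case-control) / OR(family-based), with a 95% CI.

    Assumes the two estimates are independent, so variances add on the log scale.
    Returns (ror, (lower, upper), z), where z tests log(ROR) = 0.
    """
    log_ror = math.log(or_case_control) - math.log(or_family)
    se = math.hypot(se_from_ci(*ci_case_control), se_from_ci(*ci_family))
    interval = (math.exp(log_ror - Z95 * se), math.exp(log_ror + Z95 * se))
    return math.exp(log_ror), interval, log_ror / se


# Hypothetical inputs for one gene-disease association.
ror, (lower, upper), z = relative_odds_ratio(1.40, (1.10, 1.78), 1.25, (0.90, 1.74))
```

An interval that contains 1 means the two designs do not differ beyond chance for that association, which is the comparison the abstract reports for most of its investigations.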

9.
Summary: Combining data collected from different sources can potentially enhance statistical efficiency in estimating effects of environmental or genetic factors or gene–environment interactions. However, combining data across studies becomes complicated when data are collected under different study designs, such as family-based and unrelated individual-based case–control design. In this article, we describe likelihood-based approaches that permit the joint estimation of covariate effects on disease risk under study designs that include cases, relatives of cases, and unrelated individuals. Our methods accommodate familial residual correlation and a variety of ascertainment schemes. Extensive simulation experiments demonstrate that the proposed methods for estimation and inference perform well in realistic settings. Efficiencies of different designs are contrasted in the simulation. We applied the methods to data from the Colorectal Cancer Family Registry.

10.
Sequencing and exome-chip technologies have motivated development of novel statistical tests to identify rare genetic variation that influences complex diseases. Although many rare-variant association tests exist for case-control or cross-sectional studies, far fewer methods exist for testing association in families. This is unfortunate, because cosegregation of rare variation and disease status in families can amplify association signals for rare variants. Many researchers have begun sequencing (or genotyping via exome chips) familial samples that were either recently collected or previously collected for linkage studies. Because many linkage studies of complex diseases sampled affected sibships, we propose a strategy for association testing of rare variants for use in this study design. The logic behind our approach is that rare susceptibility variants should be found more often on regions shared identical by descent by affected sibling pairs than on regions not shared identical by descent. We propose both burden and variance-component tests of rare variation that are applicable to affected sibships of arbitrary size and that do not require genotype information from unaffected siblings or independent controls. Our approaches are robust to population stratification and produce analytic p values, thereby enabling our approach to scale easily to genome-wide studies of rare variation. We illustrate our methods by using simulated data and exome chip data from sibships ascertained for hypertension collected as part of the Genetic Epidemiology Network of Arteriopathy (GENOA) study.

11.
There are two common designs for association mapping of complex diseases: case-control and family-based designs. A case-control sample is more powerful to detect genetic effects than a family-based sample that contains the same numbers of affected and unaffected persons, although additional markers may be required to control for spurious association. When family and unrelated samples are available, statistical analyses are often performed in the family and unrelated samples separately, conditioning on parental information for the former, thus resulting in reduced power. In this report, we propose a unified approach that can incorporate both family and case-control samples and, provided the additional markers are available, at the same time corrects for population stratification. We apply the principal components of a marker matrix to adjust for the effect of population stratification. This unified approach makes it unnecessary to perform a conditional analysis of the family data and is more powerful than the separate analyses of unrelated and family samples, or a meta-analysis performed by combining the results of the usual separate analyses. This property is demonstrated in both a variety of simulation models and empirical data. The proposed approach can be equally applied to the analysis of both qualitative and quantitative traits.

12.
We develop expressions for the power to detect associations between parental genotypes and offspring phenotypes for quantitative traits. Three different “indirect” experimental designs are considered: full-sib, half-sib, and full-sib–half-sib families. We compare the power of these designs to detect genotype–phenotype associations relative to the common, “direct,” approach of genotyping and phenotyping the same individuals. When heritability is low, the indirect designs can outperform the direct method. However, the extra power comes at a cost due to an increased phenotyping effort. By developing expressions for optimal experimental designs given the cost of phenotyping relative to genotyping, we show how the extra costs associated with phenotyping a large number of individuals will influence experimental design decisions. Our results suggest that indirect association studies can be a powerful means of detecting allelic associations in outbred populations of species for which genotyping and phenotyping the same individuals is impractical and for life history and behavioral traits that are heavily influenced by environmental variance and therefore best measured on groups of individuals. Indirect association studies are likely to be favored only on purely economical grounds, however, when phenotyping is substantially less expensive than genotyping. A web-based application implementing our expressions has been developed to aid in the design of indirect association studies.

13.
We present a conditional likelihood approach for testing linkage disequilibrium in nuclear families having multiple affected offspring. The likelihood, conditioned on the identity-by-descent (IBD) structure of the sibling genotypes, is unaffected by familial correlation in disease status that arises from linkage between a marker locus and the unobserved trait locus. Two such conditional likelihoods are compared: one that conditions on IBD and phase of the transmitted alleles and a second which conditions only on IBD of the transmitted alleles. Under the log-additive model, the first likelihood is equivalent to the allele-counting methods proposed in the literature. The second likelihood is valid under the added assumption of equal male and female recombination fractions. In a simulation study, we demonstrated that in sibships having two or three affected siblings the score test from each likelihood had the correct test size for testing disequilibrium. They also led to equivalent power to detect linkage disequilibrium at the 5% significance level.

14.
Genome-wide case-control association studies aim at identifying significant differential markers between sick and healthy populations. With the development of large-scale technologies allowing the genotyping of thousands of single nucleotide polymorphisms (SNPs) comes the multiple testing problem and the practical issue of selecting the most probable set of associated markers. Several False Discovery Rate (FDR) estimation methods have been developed and tuned mainly for differential gene expression studies. However, they are based on hypotheses and designs that are not necessarily relevant in genetic association studies. In this article we present a universal methodology to estimate the FDR of genome-wide association results. It uses a single global probability value per SNP and is applicable in practice for any study design, using any statistic. We have benchmarked this algorithm on simulated data and shown that it outperforms previous methods in cases requiring non-parametric estimation. We exemplified the usefulness of the method by applying it to the analysis of experimental genotyping data of three Multiple Sclerosis case-control association studies.
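
For orientation only, the sketch below shows the standard Benjamini-Hochberg step-up adjustment. It is not the estimation method this abstract proposes, whose details are not given here. Like that method, it takes one p-value per SNP as input, and it can serve as a familiar baseline. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    This is the classical step-up procedure, used here as a baseline;
    it is an assumption of this sketch, not the method of the abstract.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical per-SNP p-values.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
significant = [i for i, q in enumerate(adjusted) if q <= 0.05]
```

SNPs whose adjusted value falls at or below the chosen FDR level are retained; the procedure controls the FDR under independence or positive dependence of the tests, which may not hold for markers in strong linkage disequilibrium.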

15.
OBJECTIVE: Cohort and case-control genetic association studies offer the greatest power to detect small genotypic influences on disease phenotypes, relative to family-based designs. However, genetic subdivisions could confound studies involving unrelated individuals, a topic that has been little investigated. We examined geographical and interallelic association of SNP and microsatellite haplotypes of the Y chromosome, of regions of chromosome 11, and of autosomal SNP genotypes relevant to cardiovascular risk traits in a UK-wide epidemiological survey. RESULTS: We show evidence (p = 0.00001) of the Danelaw history of the UK, marked by a two-fold excess of a Viking Y haplotype in central England. We also found evidence for a (different) single-centre geographical over-representation of one haplotype, both for APOC3-A4-A5 and for IGF2. The basis of this remains obscure, but it neither reflects genotyping error nor correlates with the phenotypic associations by centre of these markers. A panel of SNPs relevant to cardiovascular risk traits showed association neither with geographical location nor with Y haplotypes. CONCLUSION: Combinations of Y haplotyping, autosomal haplotyping, and genome-wide SNP typing, taken together with phenotypic associations, should improve epidemiological recognition and interpretation of possible confounding by genetic subdivision.

16.
HLA-linked rheumatoid arthritis.   Cited by: 3 (self-citations: 1; citations by others: 2)
Twenty-eight pedigrees were ascertained through pairs of first-degree relatives diagnosed with rheumatoid arthritis (RA). RA was confirmed in 77 pedigree members including probands; the absence of disease was verified in an additional 261 pedigree members. Pedigree members were serologically typed for HLA. We used likelihood analysis to statistically characterize the HLA-linked RA susceptibility locus. The genetic model assumed tight linkage to HLA. The analysis supported the existence of an HLA-linked RA susceptibility locus, estimated the susceptibility allele frequency as 2.16%, and estimated the lifetime penetrance as 41% in male homozygotes and as 48% in female homozygotes. Inheritance was recessive in males and was nearly recessive in females. In addition, the analysis attributed 78% of the variance within genotypes to genetic or environmental effects shared by siblings. The genetic model inferred in this analysis is consistent with previous association, linkage, and familial aggregation studies of RA. The inferred HLA-linked RA susceptibility locus accounts for approximately one-half of familial RA, although it accounts for only approximately one-fifth of the RA in the population. Although other genes may account for the remaining familial RA, a large portion of RA cases may occur sporadically.

17.
The purpose of this work is to quantify the effects that errors in genotyping have on power and the sample size necessary to maintain constant asymptotic Type I and Type II error rates (SSN) for case-control genetic association studies between a disease phenotype and a di-allelic marker locus, for example a single nucleotide polymorphism (SNP) locus. We consider the effects of three published models of genotyping errors on the chi-square test for independence in the 2 x 3 table. After specifying genotype frequencies for the marker locus conditional on disease status and error model in both a genetic model-based and a genetic model-free framework, we compute the asymptotic power to detect association through specification of the test's non-centrality parameter. This parameter determines the functional dependence of SSN on the genotyping error rates. Additionally, we study the dependence of SSN on linkage disequilibrium (LD), marker allele frequencies, and genotyping error rates for a dominant disease model. Increased genotyping error rate requires a larger SSN. Every 1% increase in sum of genotyping error rates requires that both case and control SSN be increased by 2-8%, with the extent of increase dependent upon the error model. For the dominant disease model, SSN is a nonlinear function of LD and genotyping error rate, with greater SSN for lower LD and higher genotyping error rate. The combination of lower LD and higher genotyping error rates requires a larger SSN than the sum of the SSN for the lower LD and for the higher genotyping error rate.
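
The link between the non-centrality parameter and sample size can be sketched as follows. For a fixed significance level, power and degrees of freedom, the required non-centrality is fixed, so the required sample size is inversely proportional to the per-subject non-centrality of the 2 x 3 chi-square test. The abstract does not spell out its three error models, so the uniform miscall model below is a hypothetical stand-in, as are the function names and genotype frequencies.

```python
def noncentrality_per_subject(case_freqs, control_freqs, case_fraction=0.5):
    """Per-subject non-centrality of the 2 x 3 chi-square test of independence.

    case_freqs, control_freqs: genotype frequencies (each summing to 1).
    case_fraction: proportion of the total sample that are cases.
    """
    total = 0.0
    for p_case, p_control in zip(case_freqs, control_freqs):
        # Marginal frequency of this genotype in the pooled sample.
        pooled = case_fraction * p_case + (1 - case_fraction) * p_control
        total += case_fraction * (p_case - pooled) ** 2 / pooled
        total += (1 - case_fraction) * (p_control - pooled) ** 2 / pooled
    return total


def miscall(freqs, error_rate):
    """Hypothetical error model: each genotype is miscalled with probability
    error_rate, equally often as either of the other two genotypes."""
    return [(1 - error_rate) * p + (error_rate / 2) * (1 - p) for p in freqs]


def sample_size_inflation(case_freqs, control_freqs, error_rate, case_fraction=0.5):
    """Factor by which the sample size must grow to keep power constant.

    Assumes the true case and control frequencies differ, so the
    non-centrality is non-zero.
    """
    clean = noncentrality_per_subject(case_freqs, control_freqs, case_fraction)
    noisy = noncentrality_per_subject(
        miscall(case_freqs, error_rate),
        miscall(control_freqs, error_rate),
        case_fraction,
    )
    return clean / noisy


# Hypothetical genotype frequencies for cases and controls.
factor = sample_size_inflation((0.30, 0.50, 0.20), (0.49, 0.42, 0.09), error_rate=0.01)
```

Under this stand-in model the factor is 1 when there is no error and rises above 1 as the error rate grows, which mirrors the qualitative conclusion of the abstract; the specific 2-8% figures depend on the published error models and are not reproduced here.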

18.
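An effective test for mapping genes underlying complex diseases   Cited by: 1 (self-citations: 0; citations by others: 1)
Linkage disequilibrium (LD) is used to map genes for some complex diseases, and many LD mapping methods have been developed in recent years. Apart from the TDT, most LD mapping methods must first assume the absence of population admixture; admixture can increase the probability of a Type I error in disease gene mapping and produce invalid results. The method described here uses LD to test for linkage (in the presence of linkage disequilibrium) and association (in the presence of linkage) between a marker locus and a disease susceptibility locus (DSL). The analysis uses unrelated individuals whose parental genotypes are known and who have at least one heterozygous parent. The random sample is classified by genotype, and powerful statistical methods are then applied to the data from the different classes, both separately and jointly. This LD mapping method applies to both affected and unaffected individuals and, for samples classified by parental genotype, effectively removes the effect of population admixture on mapping. Analytical and simulation results also indicate that the method resolves the problem of population admixture when testing for linkage and association between a marker locus and a DSL. Compared with the TDT, however, when the tested locus is itself the DSL, the method [丙能; garbled in the source] make effective and full use of the corrected data; when the tested locus is not the DSL, the method and the TDT can complement each other to detect a linked DSL more effectively.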

19.
The aim of the present analysis is to combine evidence for association from the two most commonly used designs in genetic association analysis, the case-control design and the transmission disequilibrium test (TDT) design. The cases here are affected offspring from nuclear families and are used in both the case-control and TDT designs. As a result, inference from these designs is not independent. We applied a simple logistic regression method for combining evidence for association from case-control and TDT designs to single-nucleotide polymorphism data purchased on a region on chromosome 3, replicate 1 of the Aipotu population. Combining the evidence from the case-control and TDT designs yielded a 5-10% reduction in the standard errors of the relative risk estimates. The authors did not know the results before the analyses were conducted.

20.
Two likelihood-based score statistics are used to detect association between a disease and a single diallelic polymorphism, on the basis of data from arbitrary types of nuclear families. The first statistic, the nonfounder statistic, extends the transmission/disequilibrium test to accommodate affected and unaffected offspring and missing parental genotypes. The second statistic, the founder statistic, compares observed or inferred parental genotypes with those of some reference population. In this comparison, the genotypes of affected parents or of those with many affected offspring are weighted more heavily than are the genotypes of unaffected parents or of those with few affected offspring. Genotypes of single unrelated cases and controls can be included in this analysis. We illustrate the two statistics by applying them to data on a polymorphism of the SRD5A2 gene in nuclear families with multiple cases of prostate cancer. We also use simulations to compare the power of the nonfounder statistic with that of the score statistic, on the basis of the conditional logistic regression of offspring genotypes.
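
The transmission/disequilibrium test that the nonfounder statistic extends can be sketched for the simplest case: complete parent-offspring trios with one affected child and a diallelic marker. This is the classical test, not the score statistics of the abstract. The genotype coding (number of copies of allele A), the function names and the example trios are assumptions for illustration.

```python
import math


def count_transmissions(trios):
    """Count transmissions of allele A from heterozygous parents.

    trios: iterable of (father, mother, affected_child) genotypes, each coded
    as the number of copies of allele A (0, 1 or 2).
    Returns (transmitted, untransmitted) counts over heterozygous parents.
    """
    transmitted = untransmitted = 0
    for father, mother, child in trios:
        n_het = sum(1 for g in (father, mother) if g == 1)
        if n_het == 0:
            continue  # homozygous parents carry no transmission information
        # A homozygous AA parent always passes on one copy of A.
        from_homozygous = sum(1 for g in (father, mother) if g == 2)
        passed = child - from_homozygous  # copies of A from heterozygous parents
        if not 0 <= passed <= n_het:
            raise ValueError("genotypes are not Mendelian-consistent")
        transmitted += passed
        untransmitted += n_het - passed
    return transmitted, untransmitted


def tdt(transmitted, untransmitted):
    """McNemar-type TDT statistic (1 df) and its asymptotic p-value."""
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no heterozygous parents, test is undefined")
    chi2 = (transmitted - untransmitted) ** 2 / total
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical trios: (father, mother, affected child).
b, c = count_transmissions([(1, 0, 1), (1, 0, 1), (1, 1, 2), (1, 2, 1)])
chi2, p = tdt(b, c)
```

Because only heterozygous parents contribute and each is compared with its own untransmitted allele, the test is not distorted by population stratification, which is why family-based statistics such as the nonfounder statistic build on it.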
