Similar articles
 Found 20 similar articles.
1.
We present a conditional likelihood approach for testing linkage disequilibrium in nuclear families having multiple affected offspring. The likelihood, conditioned on the identity-by-descent (IBD) structure of the sibling genotypes, is unaffected by familial correlation in disease status that arises from linkage between a marker locus and the unobserved trait locus. Two such conditional likelihoods are compared: one that conditions on IBD and phase of the transmitted alleles and a second which conditions only on IBD of the transmitted alleles. Under the log-additive model, the first likelihood is equivalent to the allele-counting methods proposed in the literature. The second likelihood is valid under the added assumption of equal male and female recombination fractions. In a simulation study, we demonstrated that in sibships having two or three affected siblings the score test from each likelihood had the correct test size for testing disequilibrium. They also led to equivalent power to detect linkage disequilibrium at the 5% significance level.

2.
In studies of complex diseases, a common paradigm is to conduct association analysis at markers in regions identified by linkage analysis, to attempt to narrow the region of interest. Family-based tests for association based on parental transmissions to affected offspring are often used in fine-mapping studies. However, for diseases with late onset, parental genotypes are often missing. Without parental genotypes, family-based tests either compare allele frequencies in affected individuals with those in their unaffected siblings or use siblings to infer missing parental genotypes. An example of the latter approach is the score test implemented in the computer program TRANSMIT. The inference of missing parental genotypes in TRANSMIT assumes that transmissions from parents to affected siblings are independent, which is appropriate when there is no linkage. However, using computer simulations, we show that, when the marker and disease locus are linked and the data set consists of families with multiple affected siblings, this assumption leads to a bias in the score statistic under the null hypothesis of no association between the marker and disease alleles. This bias leads to an inflated type I error rate for the score test in regions of linkage. We present a novel test for association in the presence of linkage (APL) that correctly infers missing parental genotypes in regions of linkage by estimating identity-by-descent parameters, to adjust for correlation between parental transmissions to affected siblings. In simulated data, we demonstrate the validity of the APL test under the null hypothesis of no association and show that the test can be more powerful than the pedigree disequilibrium test and family-based association test. As an example, we compare the performance of the tests in a candidate-gene study in families with Parkinson disease.

3.
In two previous articles, we have considered sample sizes required to detect linkage for mapping quantitative-trait loci in humans, using extreme discordant sib pairs. Here, we examine further the use of extreme concordant sib pairs but consider the effect of parents' phenotypes. Sample sizes necessary to obtain a power of 80% with concordant sib pairs at a significance level of .0001 are given, stratified by parental phenotypes. When there is no residual correlation between sibs, the parental phenotypes have little impact on the sample sizes. When residual correlations between sibs exist, we show, however, that power can be considerably reduced by including extreme sib pairs when the parents also have similarly extreme values. Thus, we recommend the exclusion of such pairs from linkage studies. This recommendation reduces the required sample sizes by 3- to 28-fold. The degree of saving in the required sample sizes varies among different models and allele frequencies. The reduction is most dramatic (a 28-fold reduction) for a rare recessive gene.

4.
Genome-wide linkage and association studies of tens of thousands of clinical and molecular traits are currently underway, offering rich data for inferring causality between traits and genetic variation. However, the inference process is based on discovering subtle patterns in the correlation between traits and is therefore challenging and could create a flood of untrustworthy causal inferences. Here we introduce the concerns and show that they are already valid in simple scenarios of two traits linked to or associated with the same genomic region. We argue that more comprehensive analysis and Bayesian reasoning are needed and that these can overcome some of the pitfalls, although not in every conceivable case. We conclude that causal inference methods can still be of use in the iterative process of mathematical modeling and biological validation.

5.
This report describes computer implementation of a scheme for joint linkage and association analysis. The model implemented in the computer package Mendel estimates both recombination and linkage-disequilibrium parameters and conducts likelihood-ratio tests for (1) linkage alone, (2) linkage and association simultaneously, and (3) association in the presence of linkage. Application of the method to data from Finnish pedigrees with familial combined hyperlipidemia illustrates its potential for identification of associated SNP haplotypes in the presence of linkage. For the test results to be valid, good estimates of haplotype frequencies must be used in the analysis.

6.
The problem of ascertainment for linkage analysis.   Total citations: 2 (self-citations: 0; citations by others: 2)
It is generally believed that ascertainment corrections are unnecessary in linkage analysis, provided individuals are selected for study solely on the basis of trait phenotype and not on the basis of marker genotype. The theoretical rationale for this is that standard linkage analytic methods involve conditioning likelihoods on all the trait data, which may be viewed as an application of the ascertainment assumption-free (AAF) method of Ewens and Shute. In this paper, we show that when the observed pedigree structure depends on which relatives within a pedigree happen to have been the probands (proband-dependent, or PD, sampling) conditioning on all the trait data is not a valid application of the AAF method and will result in asymptotically biased estimates of genetic parameters (except under single ascertainment). Furthermore, this result holds even if the recombination fraction R is the only parameter of interest. Since the lod score is proportional to the likelihood of the marker data conditional on all the trait data, this means that when data are obtained under PD sampling the lod score will yield asymptotically biased estimates of R, and that so-called mod scores (i.e., lod scores maximized over both R and parameters theta of the trait distribution) will yield asymptotically biased estimates of R and theta. Furthermore, the problem appears to be intractable, in the sense that it is not possible to formulate the correct likelihood conditional on observed pedigree structure. In this paper we do not investigate the numerical magnitude of the bias, which may be small in many situations. On the other hand, virtually all linkage data sets are collected under PD sampling. Thus, the existence of this bias will be the rule rather than the exception in the usual applications.

7.
When affected probands and their biological parents are genotyped at a candidate gene or a marker, the resulting case-parents-triad data enable powerful tests for linkage in the presence of association. When linkage disequilibrium has been detected in such a study, the investigator may wish to look further for possible parent-of-origin effects. If, for example, the transmission/disequilibrium test restricted to fathers is statistically significant, whereas that restricted to mothers is not, the investigator might interpret this as evidence for nonexpression of the maternally derived disease gene (that is, imprinting). This report reviews existing methods for detection of parent-of-origin effects, showing that each can be invalid under certain scenarios. Two new methods are proposed, based on application of likelihood-based inference after stratification on both the parental mating type and the inherited number of copies of the allele under study. If there are no maternal genetic effects expressed prenatally during gestation, the parental-asymmetry test is powerful and provides valid estimation of a parent-of-origin parameter. For diseases for which there could be maternal effects on risk, the parent-of-origin likelihood-ratio test provides a robust alternative. Simulations based on an admixed population demonstrate good operating characteristics for these procedures, under diverse scenarios.
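The parent-stratified comparison mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard McNemar-type form of the transmission/disequilibrium test, computed from counts of heterozygous parents who did or did not transmit the allele under study. The function names and the counts are hypothetical, and this is only the naive stratified comparison that the abstract cautions can mislead, not the parental-asymmetry or likelihood-ratio tests the authors propose.

```python
from math import erfc, sqrt


def tdt_statistic(transmitted, untransmitted):
    """McNemar-type TDT chi-square (1 df).

    transmitted / untransmitted: numbers of heterozygous parents who did /
    did not pass the allele under study to an affected child.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    return (transmitted - untransmitted) ** 2 / total


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-square with 1 df: P(X > x) = erfc(sqrt(x/2))."""
    return erfc(sqrt(stat / 2.0))


# Hypothetical counts, chosen only to illustrate the calculation.
counts = {"fathers": (48, 27), "mothers": (39, 36)}
for parent, (b, c) in counts.items():
    stat = tdt_statistic(b, c)
    print(f"{parent}: chi2 = {stat:.2f}, p = {chi2_1df_pvalue(stat):.4f}")
```

A significant paternal statistic next to a non-significant maternal one is exactly the pattern the abstract says should not, on its own, be read as evidence of imprinting.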

8.
In this report, we present a simple and powerful way to incorporate individual-specific liability classes into linkage analysis. The proposed method is applicable to both quantitative and qualitative traits. In linkage studies, we may have information about different covariates. Incorporation of these covariates along with the estimates of residual familial effects, age-at-onset effects, and susceptibility in the definition of liability classes can increase the power to detect genetic linkage. In this study, we show how one can form individual-specific liability classes and use these classes in standard linkage-analysis programs, such as the widely used LINKAGE package, to perform more powerful genetic linkage analysis. Our simulation study shows that this approach yields higher LOD scores and more-accurate estimates of the recombination fraction in the families showing linkage. The proposed method is also applied to kindreds collected, at the M. D. Anderson Cancer Center, through probands with childhood soft-tissue sarcoma. Confirmed germ-line mutations in the p53 tumor-suppressor gene have been identified in these families. Application of our method to these families yielded significantly higher LOD scores and more-accurate recombination fractions than did analysis that did not account for individual-specific covariate information.

9.
The analysis of the structure of populations on the basis of genetic data is essential in population genetics. It is used, for instance, to study the evolution of species or to correct for population stratification in association studies. These genetic data, normally based on DNA polymorphisms, may contain irrelevant information that biases the inference of population structure. In this paper we adapt a recently proposed algorithm, named multistart EMA, to be used in the inference of population structure. This algorithm is able to deal with irrelevant information when obtaining the (probabilistic) population partition. Additionally, we present a marker selection test able to obtain the most relevant markers to retrieve that population partition. The proposed algorithm is compared with the widely used STRUCTURE software on the basis of the F(ST) metric and the log-likelihood score. It is shown that the proposed algorithm improves the recovery of the population structure. Moreover, information about relevant markers obtained by the multistart EMA can be used to improve the results obtained by other methods, to correct for population stratification, or even to reduce the economic cost of sequencing new samples. The software presented in this paper is available online at http://www.sc.ehu.es/ccwbayes/members/guzman.

10.
Ueki M, Cordell HJ. PLoS Genetics. 2012;8(4):e1002625
Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new "joint effects" statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. 
We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al.'s originally-proposed statistics, on account of the inflated error rate that can result.

11.
To estimate an overall treatment difference with data from a randomized comparative clinical study, baseline covariates are often utilized to increase the estimation precision. Using the standard analysis of covariance technique for making inferences about such an average treatment difference may not be appropriate, especially when the fitted model is nonlinear. On the other hand, the novel augmentation procedure recently studied, for example, by Zhang and others (2008. Improving efficiency of inferences in randomized clinical trials using auxiliary covariates. Biometrics 64, 707-715) is quite flexible. However, in general, it is not clear how to select covariates for augmentation effectively. An overly adjusted estimator may inflate the variance and in some cases be biased. Furthermore, the results from the standard inference procedure by ignoring the sampling variation from the variable selection process may not be valid. In this paper, we first propose an estimation procedure, which augments the simple treatment contrast estimator directly with covariates. The new proposal is asymptotically equivalent to the aforementioned augmentation method. To select covariates, we utilize the standard lasso procedure. Furthermore, to make valid inference from the resulting lasso-type estimator, a cross validation method is used. The validity of the new proposal is justified theoretically and empirically. We illustrate the procedure extensively with a well-known primary biliary cirrhosis clinical trial data set.

12.
When the mode of inheritance of a disease is unknown, the LOD-score method of linkage analysis must take into account uncertainties in model parameters. We have previously proposed a parametric linkage test called "MFLOD," which does not require specification of disease model parameters. In the present study, we introduce two new model-free parametric linkage tests, known as "MLOD" and "MALOD." These tests are defined, respectively, as the LOD score and the admixture LOD score, maximized (subject to the same constraints as MFLOD) over disease-model parameters. We compared the power of these three parametric linkage tests and that of two nonparametric linkage tests, NPLall and NPLpairs, which are implemented in GENEHUNTER. With the use of small pedigrees and a fully informative marker, we found the powers of MLOD, NPLall, and NPLpairs to be almost equivalent to each other and not far below that of a LOD-score analysis performed under the assumption of the correct genetic parameters. Thus, linkage analysis is not much hindered by uncertain mode of inheritance. The results also suggest that both parametric and nonparametric methods are suitable for linkage analysis of complex disorders in small pedigrees. However, whether these results apply to large pedigrees remains to be answered.

13.
Family-based tests of linkage disequilibrium typically are based on nuclear-family data including affected individuals and their parents or their unaffected siblings. A limitation of such tests is that they generally are not valid tests of association when data from related nuclear families from larger pedigrees are used. Standard methods require selection of a single nuclear family from any extended pedigrees when testing for linkage disequilibrium. Often data are available for larger pedigrees, and it would be desirable to have a valid test of linkage disequilibrium that can use all potentially informative data. In this study, we present the pedigree disequilibrium test (PDT) for analysis of linkage disequilibrium in general pedigrees. The PDT can use data from related nuclear families from extended pedigrees and is valid even when there is population substructure. Using computer simulations, we demonstrated validity of the test when the asymptotic distribution is used to assess the significance, and examined statistical power. Power simulations demonstrate that, when extended pedigree data are available, substantial gains in power can be attained by use of the PDT rather than existing methods that use only a subset of the data. Furthermore, the PDT remains more powerful even when there is misclassification of unaffected individuals. Our simulations suggest that there may be advantages to using the PDT even if the data consist of independent families without extended family information. Thus, the PDT provides a general test of linkage disequilibrium that can be widely applied to different data structures.

14.
A generalized case-control (GCC) study, like the standard case-control study, leverages outcome-dependent sampling (ODS) to extend to nonbinary responses. We develop a novel, unifying approach for analyzing GCC study data using the recently developed semiparametric extension of the generalized linear model (GLM), which is substantially more robust to model misspecification than existing approaches based on parametric GLMs. For valid estimation and inference, we use a conditional likelihood to account for the biased sampling design. We describe analysis procedures for estimation and inference for the semiparametric GLM under a conditional likelihood, and we discuss problems with estimation and inference under a conditional likelihood when the response distribution is misspecified. We demonstrate the flexibility of our approach over existing ones through extensive simulation studies, and we apply the methodology to an analysis of the Asset and Health Dynamics Among the Oldest Old study, which motivates our research. The proposed approach yields a simple yet versatile solution for handling ODS in a wide variety of possible response distributions and sampling schemes encountered in practice.

15.
Data errors and marker allele frequency misspecification can lead to incorrect inference in linkage analysis. Here we demonstrate the effect of each on an allele-sharing statistic in a sample of sib pairs. In the context of relationship testing, we propose a new test that compares the sample genome-wide sib-pair allele sharing to its expectation and show that this test can detect the presence of large-scale data and model errors.
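The idea of comparing sample-wide sib-pair sharing with its expectation can be sketched as a simple one-sample z-test. This is an illustration under stated assumptions, not the statistic the authors define: each pair is summarized by one hypothetical genome-wide sharing proportion, the expected value for full sibs is taken to be 0.5 (identity by descent), and pairs are treated as independent. The function name `sharing_z_test` is invented for this example.

```python
from math import erfc, sqrt
from statistics import mean, stdev


def sharing_z_test(pair_sharing, expected=0.5):
    """Compare mean genome-wide sharing across sib pairs with its expectation.

    pair_sharing: one sharing proportion per sib pair (assumed independent).
    Returns (z, two-sided normal p-value).
    """
    n = len(pair_sharing)
    if n < 2:
        raise ValueError("need at least two sib pairs")
    se = stdev(pair_sharing) / sqrt(n)
    if se == 0:
        raise ValueError("no variation in sharing; z is undefined")
    z = (mean(pair_sharing) - expected) / se
    p_value = erfc(abs(z) / sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Hypothetical sharing proportions for five pairs, inflated relative to 0.5.
z, p = sharing_z_test([0.70, 0.72, 0.68, 0.71, 0.69])
print(f"z = {z:.2f}, p = {p:.3g}")
```

Sharing far above or below the expected value across the whole sample would point to large-scale data or model errors rather than linkage at any single locus.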

16.
17.
Model-free linkage analysis using likelihoods.   Total citations: 6 (self-citations: 2; citations by others: 4)
Misspecification of transmission model parameters can produce artifactually negative lod scores at small recombination fractions and in multipoint analysis. To avoid this problem, we have tried to devise a test that aims to detect a genetic effect at a particular locus, rather than attempting to estimate the map position of a locus with specified effect. Maximizing likelihoods over transmission model parameters, as well as linkage parameters, can produce seriously biased parameter estimates and so yield tests that lack power for the detection of linkage. However, constraining the transmission model parameters to produce the correct population prevalence largely avoids this problem. For computational convenience, we recommend that the likelihoods under linkage and non-linkage are independently maximized over a limited set of transmission models, ranging from Mendelian dominant to null effect and from null effect to Mendelian recessive. In order to test for a genetic effect at a given map position, the likelihood under linkage is maximized over admixture, the proportion of families linked. Application to simulated data for a wide range of transmission models in both affected sib pairs and pedigrees demonstrates that the new method is well behaved under the null hypothesis and provides a powerful test for linkage when it is present. This test requires no specification of transmission model parameters, apart from an approximate estimate of the population prevalence. It can be applied equally to sib pairs and pedigrees, and, since it does not diminish the lod score at test positions very close to a marker, it is suitable for application to multipoint data.

18.
Zheng Y, Barlow WE, Cutter G. Biometrics. 2005;61(1):259-268
The performance of a medical diagnostic test is often evaluated by comparing the outcome of the test to the patient's true disease state. Receiver operating characteristic analysis may then be used to summarize test accuracy. However, such analysis may encounter several complications in actual practice. One complication is verification bias, i.e., gold standard assessment of disease status may only be partially available and the probability of ascertainment of disease may depend on both the test result and characteristics of the subject. A second issue is that tests interpreted by the same rater may not be independent. Using estimating equations, we generalize previous methods that address these problems. We contrast the performance of alternative estimators of accuracy using robust sandwich variance estimators to permit valid asymptotic inference. We suggest that in the context of an observational cohort study where rich covariate information is available, a weighted estimating equations approach may be preferable for its robustness against model misspecification. We apply the methodology to mammography as performed by community radiologists.
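The verification-bias problem described above can be made concrete with a small inverse-probability-weighting sketch. It is a generic illustration, not the authors' estimating-equation method (which also handles correlated readings by the same rater and provides sandwich variances). It assumes each verified subject comes with a known or previously estimated probability of having been verified; the function name, record layout, and numbers are all hypothetical.

```python
def ipw_sensitivity_specificity(records):
    """Inverse-probability-weighted sensitivity and specificity.

    records: iterable of (test_positive, diseased, p_verify) tuples for
    subjects whose disease status was verified; p_verify is the assumed
    probability that such a subject was selected for verification.
    """
    tp = fn = tn = fp = 0.0
    for test_positive, diseased, p_verify in records:
        if not 0.0 < p_verify <= 1.0:
            raise ValueError("verification probability must be in (0, 1]")
        weight = 1.0 / p_verify  # each verified subject stands in for 1/p subjects
        if diseased:
            if test_positive:
                tp += weight
            else:
                fn += weight
        else:
            if test_positive:
                fp += weight
            else:
                tn += weight
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need verified subjects with and without disease")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: test-positives are verified far more often (0.8 vs 0.2).
verified = (
    [(True, True, 0.8)] * 40 + [(False, True, 0.2)] * 5
    + [(True, False, 0.8)] * 8 + [(False, False, 0.2)] * 18
)
sens, spec = ipw_sensitivity_specificity(verified)
print(f"weighted sensitivity = {sens:.3f}, weighted specificity = {spec:.3f}")
```

With these made-up numbers the unweighted sensitivity among verified subjects would be 40/45, noticeably higher than the weighted estimate, which shows the direction of the bias when test-positives are preferentially verified.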

19.
Jiang N, Wang M, Jia T, Wang L, Leach L, Hackett C, Marshall D, Luo Z. PLoS ONE. 2011;6(8):e23192

Background

It has been well established that the theoretical kernel of the recently surging genome-wide association study (GWAS) is statistical inference of linkage disequilibrium (LD) between a tested genetic marker and a putative locus affecting a disease trait. However, LD analysis is vulnerable to several confounding factors, of which population stratification is the most prominent. Whilst many methods have been proposed to correct for this influence, either by predicting the structure parameters or by correcting the inflation in the test statistic caused by stratification, these may not be feasible or may introduce further statistical problems in practical implementation.

Methodology

We propose here a novel statistical method to control spurious LD arising from population structure in GWAS by incorporating a control marker into the test for association between a polymorphic marker and phenotypic variation in a complex trait. The method avoids the need for structure prediction, which may be infeasible or inadequate in practice, and accounts properly for the varying effect of population stratification on different regions of the genome under study. The utility and statistical properties of the new method were tested through an intensive computer simulation study and an association-based genome-wide mapping of expression quantitative trait loci in genetically divergent human populations.

Results/Conclusions

The analyses show that, compared with two other widely used methods in the GWAS literature, the new method confers improved statistical power for detecting genuine genetic association in subpopulations and effective control of spurious associations stemming from population structure.

20.
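Linkage analysis of candidate genes for familial restless legs syndrome   Total citations: 3 (self-citations: 0; citations by others: 3)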
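Restless legs syndrome (RLS) is a group of conditions characterized by crawling, aching, numb, or distended sensations in the lower limbs that prevent the limbs from resting. Because symptoms usually occur at night and cause motor restlessness, patients have long-term difficulty falling asleep and suffer severe secondary insomnia. RLS is a common neurological disorder with an incidence as high as 5%; primary RLS often shows a positive family history, with autosomal dominant inheritance determined by a single gene. RLS is now widely thought to be related to abnormal dopaminergic function in the nervous system and to brain iron deficiency, and a preliminary pathogenic model of the brain iron-dopaminergic system has been established. To explore the role of the brain iron-dopaminergic system in RLS, 16 candidate disease genes related to this system were selected, and several microsatellite polymorphic markers were chosen in the chromosomal region near each candidate gene. Using fluorescently labeled microsatellite primers and gene-scanning technology, a Han Chinese family with familial restless legs syndrome was genotyped and subjected to linkage analysis under an autosomal dominant model, in an attempt to confirm or exclude, at the molecular genetic level, important candidate genes possibly associated with RLS. The results showed that, at recombination fraction θ = 0.00, all LOD scores were below -2.00, so the selected loci are not linked to familial restless legs syndrome. It is concluded that, in this family, none of the candidate genes is related to the onset of familial restless legs syndrome; familial RLS may be caused by other genes related to dopamine transmission and brain iron metabolism, or an entirely new pathogenic mechanism may be involved in the occurrence of RLS.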
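The exclusion criterion used in the abstract above (a LOD score below -2 at recombination fraction θ = 0) can be illustrated with a deliberately simplified two-point LOD calculation. The sketch assumes phase-known, fully informative meioses, so the likelihood reduces to counting recombinants; the study itself analysed a pedigree under an autosomal dominant model, which requires a full pedigree likelihood. The function names and the counts are hypothetical.

```python
from math import inf, log10


def two_point_lod(recombinants, nonrecombinants, theta):
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10( theta^k * (1 - theta)^(n - k) / 0.5^n ),
    where k is the number of recombinants among n meioses.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    n = recombinants + nonrecombinants
    if n == 0:
        raise ValueError("no informative meioses")
    if theta == 0.0:
        # Any observed recombinant is impossible at theta = 0.
        return -inf if recombinants > 0 else n * log10(2.0)
    return (recombinants * log10(theta)
            + nonrecombinants * log10(1.0 - theta)
            - n * log10(0.5))


def is_excluded(lod, threshold=-2.0):
    """Conventional exclusion rule: linkage is rejected when LOD < -2."""
    return lod < threshold


# Hypothetical marker: 2 recombinants among 10 informative meioses.
lod_at_zero = two_point_lod(2, 8, 0.0)
print(lod_at_zero, is_excluded(lod_at_zero))
```

Under these simplifying assumptions a single obligate recombinant already drives the score at θ = 0 to minus infinity, which is why tight linkage to a candidate gene can be excluded with relatively few informative meioses.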
