Similar Articles (20 results)
1.
OBJECTIVES: Confidence intervals for genotype relative risks, for allele frequencies and for the attributable risk in the case-parent trio design for candidate-gene studies are proposed which can be easily calculated from the observed familial genotype frequencies. METHODS: Likelihood theory and the delta method were used to derive point estimates and confidence intervals. We used Monte Carlo simulations to show the validity of the formulae for a variety of given modes of inheritance and allele frequencies and illustrated their usefulness by applying them to real data. RESULTS: Generally, these formulae were found to be valid for 'sufficiently large' sample sizes. For smaller sample sizes, the estimators for genotype relative risks tended to be conservative, whereas the estimator for attributable risk was found to be anti-conservative for moderate to high allele frequencies. CONCLUSIONS: Since the proposed formulae provide quantitative information on the individual and epidemiological relevance of a genetic variant, they might be a useful addition to the traditional statistical significance level of TDT results.
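To illustrate the delta-method construction this abstract relies on (a generic sketch, not the authors' trio-specific formulae; the function name and inputs are hypothetical), a Wald-type interval for a relative risk can be built on the log scale from two binomial proportions:

```python
import math

def rr_confint(a, n1, b, n0, z=1.96):
    """Wald-type CI for a relative risk RR = p1/p0, computed on the
    log scale; by the delta method,
    Var(log RR) ~ (1-p1)/(n1*p1) + (1-p0)/(n0*p0)."""
    p1, p0 = a / n1, b / n0
    log_rr = math.log(p1 / p0)
    se = math.sqrt((1 - p1) / (n1 * p1) + (1 - p0) / (n0 * p0))
    return math.exp(log_rr - z * se), math.exp(log_rr + z * se)

# 30/100 events among exposed vs 15/100 among unexposed (RR = 2.0):
lo, hi = rr_confint(30, 100, 15, 100)
```

Exponentiating at the end keeps the interval's endpoints positive, which is why the log scale is the usual choice for ratio parameters.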

4.
J Benichou, M H Gail. Biometrics, 1990, 46(4): 991-1003
The attributable risk (AR), defined as AR = [Pr(disease) - Pr(disease | no exposure)]/Pr(disease), measures the proportion of disease risk that is attributable to an exposure. Recently Bruzzi et al. (1985, American Journal of Epidemiology 122, 904-914) presented point estimates of AR based on logistic models for case-control data to allow for confounding factors and secondary exposures. To produce confidence intervals, we derived variance estimates for AR under the logistic model and for various designs for sampling controls. Calculations for discrete exposure and confounding factors require covariances between estimates of the risk parameters of the logistic model and the proportions of cases with given levels of exposure and confounding factors. These covariances are estimated from Taylor series expansions applied to implicit functions. Similar calculations for continuous exposures are derived using influence functions. Simulations indicate that these asymptotic procedures yield reliable variance estimates and confidence intervals with near-nominal coverage. An example illustrates the usefulness of variance calculations in selecting a logistic model that is neither so simplified as to exhibit systematic lack of fit nor so complicated as to inflate the variance of the estimate of AR.
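The defining formula lends itself to direct computation. A toy sketch (the probabilities below are invented for illustration, not taken from the paper) also checks it against Levin's algebraically equivalent form for a single binary exposure:

```python
def attributable_risk(p_disease, p_disease_unexposed):
    """AR = [Pr(disease) - Pr(disease | no exposure)] / Pr(disease)."""
    return (p_disease - p_disease_unexposed) / p_disease

def attributable_risk_levin(prevalence, rr):
    """Levin's form AR = p(RR-1) / (1 + p(RR-1)); identical to the
    definition above for one binary exposure."""
    return prevalence * (rr - 1) / (1 + prevalence * (rr - 1))

# Exposure prevalence 0.2, relative risk 2.0, baseline risk 0.05:
p_d = 0.2 * 0.10 + 0.8 * 0.05   # overall Pr(disease) = 0.06
ar = attributable_risk(p_d, 0.05)  # = (0.06 - 0.05) / 0.06 = 1/6
```

The equivalence follows by expanding Pr(disease) over exposure strata; both functions return 1/6 here.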

5.
Genome-wide association studies (GWAS) have identified hundreds of associated loci across many common diseases. Most risk variants identified by GWAS will merely be tags for as-yet-unknown causal variants. It is therefore possible that identification of the causal variant, by fine mapping, will identify alleles with larger effects on genetic risk than those currently estimated from GWAS replication studies. We show that under plausible assumptions, whilst the majority of the per-allele relative risks (RR) estimated from GWAS data will be close to the true risk at the causal variant, some could be considerable underestimates. For example, for an estimated RR in the range 1.2-1.3, there is approximately a 38% chance that it exceeds 1.4 and a 10% chance that it is over 2. We show how these probabilities can vary depending on the true effects associated with low-frequency variants and on the minor allele frequency (MAF) of the most associated SNP. We investigate the consequences of the underestimation of effect sizes for predictions of an individual's disease risk and interpret our results for the design of fine mapping experiments. Although these effects mean that the amount of heritability explained by known GWAS loci is expected to be larger than current projections, this increase is likely to explain a relatively small amount of the so-called "missing" heritability.

6.
The US National Cancer Institute has recently sponsored the formation of a Cohort Consortium (http://2002.cancer.gov/scpgenes.htm) to facilitate the pooling of data on very large numbers of people, concerning the effects of genes and environment on cancer incidence. One likely goal of these efforts will be to generate a large population-based case-control series for which a number of candidate genes will be investigated using SNP haplotype as well as genotype analysis. The goal of this paper is to outline the issues involved in choosing a method of estimating haplotype-specific risk estimates for such data that is technically appropriate and yet attractive to epidemiologists who are already comfortable with odds ratios and logistic regression. Our interest is to develop and evaluate extensions of methods, based on haplotype imputation, that have been recently described (Schaid et al., Am J Hum Genet, 2002, and Zaykin et al., Hum Hered, 2002) as providing score tests of the null hypothesis of no effect of SNP haplotypes upon risk, which may be used for more complex tasks, such as providing confidence intervals, and tests of equivalence of haplotype-specific risks in two or more separate populations. In order to do so we (1) develop a cohort approach towards odds ratio analysis by expanding the E-M algorithm to provide maximum likelihood estimates of haplotype-specific odds ratios as well as genotype frequencies; (2) show how to correct the cohort approach, to give essentially unbiased estimates for population-based or nested case-control studies, by incorporating the probability of selection as a case or control into the likelihood, based on a simplified model of case and control selection; and (3) finally, in an example data set (CYP17 and breast cancer, from the Multiethnic Cohort Study), compare likelihood-based confidence interval estimates from the two methods with each other, and with the single-imputation approach of Zaykin et al., applied under both null and alternative hypotheses. We conclude that so long as haplotypes are well predicted by SNP genotypes (we use the R_h^2 criterion of Stram et al. [1]) the differences between the three methods are very small, and in particular that the single-imputation method may be expected to work extremely well.

7.
Selecting a control group that is perfectly matched for ethnic ancestry with a group of affected individuals is a major problem in studying the association of a candidate gene with a disease. This problem can be avoided by a design that uses parental data in place of nonrelated controls. Schaid and Sommer presented two new methods for the statistical analysis using this approach: (1) a likelihood method (Hardy-Weinberg equilibrium [HWE] method), which rests on the assumption that HWE holds, and (2) a conditional likelihood method (conditional on parental genotype [CPG] method) appropriate when HWE is absent. Schaid and Sommer claimed that the CPG method can be more efficient than the HWE method, even when equilibrium holds. It can be shown, however, that in the equilibrium situation the HWE method is always more efficient than the CPG method. For a dominant disease, the differences are slim. But for a recessive disease, the CPG method requires a much larger sample size to achieve a prescribed power than the HWE method. Additionally, we show how the relative risks for the various candidate-gene genotypes can be estimated without relying on iterative methods. For the CPG method, we present an asymptotic power approximation that is sufficiently precise for planning the sample size of an association study.

8.
Confidence intervals for the mean of one sample and the difference in means of two independent samples based on the ordinary t-statistic suffer deficiencies when samples come from skewed families. In this article we evaluate several existing techniques and propose new methods to improve coverage accuracy. The methods examined include the ordinary t, the bootstrap-t, the bias-corrected and accelerated (BCa) bootstrap, and three new intervals based on transformation of the t-statistic. Our study shows that our new transformation intervals and the bootstrap-t intervals give the best coverage accuracy for a variety of skewed distributions, and that our new transformation intervals have shorter interval lengths.
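A minimal sketch of the bootstrap-t construction for a one-sample mean (plain Python with invented sample values; the paper's transformation intervals are not reproduced here). The idea is to studentize each resampled mean and use bootstrap quantiles of t* in place of the normal or t table:

```python
import math
import random
import statistics

def bootstrap_t_ci(x, level=0.95, B=2000, seed=0):
    """Bootstrap-t CI for the mean: resample, studentize each
    resampled mean, and invert the empirical quantiles of t*."""
    rng = random.Random(seed)
    n = len(x)
    mean = statistics.fmean(x)
    se = statistics.stdev(x) / math.sqrt(n)
    ts = []
    for _ in range(B):
        xb = [rng.choice(x) for _ in range(n)]
        sb = statistics.stdev(xb) / math.sqrt(n)
        if sb > 0:  # skip degenerate all-equal resamples
            ts.append((statistics.fmean(xb) - mean) / sb)
    ts.sort()
    alpha = (1 - level) / 2
    t_hi = ts[int((1 - alpha) * len(ts)) - 1]  # upper t* quantile
    t_lo = ts[int(alpha * len(ts))]            # lower t* quantile
    return mean - t_hi * se, mean - t_lo * se

skewed = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 0.8, 1.5, 2.4, 5.0]
lo, hi = bootstrap_t_ci(skewed)
```

For right-skewed data the resulting interval is typically asymmetric about the sample mean, which is what gives the method its coverage advantage over the ordinary t interval.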

Studies of association between candidate genes and disease can be designed to use cases with disease, and in place of nonrelated controls, their parents. The advantage of this design is the elimination of spurious differences due to ethnic differences between cases and nonrelated controls. However, several statistical methods of analysis have been proposed in the literature, and the choice of analysis is not always clear. We review some of the statistical methods currently developed and present two new statistical methods aimed at specific genetic hypotheses of dominance and recessivity of the candidate gene. These new methods can be more powerful than other current methods, as demonstrated by simulations. The basis of these new statistical methods is a likelihood approach. The advantage of the likelihood framework is that regression models can be developed to assess genotype-environment interactions, as well as the relative contribution that alleles at the candidate-gene locus make to the relative risk (RR) of disease. This latter development allows testing of (1) whether interactions between alleles exist, on the scale of log RR, and (2) whether alleles originating from the mother or father of a case impart different risks, i.e., genomic imprinting.

13.
The distributions of genetic variance components and their ratios (heritability and type-B genetic correlation) from 105 pairs of six-parent disconnected half-diallels of a breeding population of loblolly pine (Pinus taeda L.) were examined. A series of simulations based on these estimates was carried out to study the coverage accuracy of confidence intervals based on the usual t-method and several alternative methods. Genetic variance estimates fluctuated greatly from one experiment to another. Both the general combining ability variance (σ²g) and the specific combining ability variance (σ²s) had a large positive skewness. For σ²g and σ²s, a skewness-adjusted t-method proposed by Boos and Hughes-Oliver (Am Stat 54:121-128, 2000) provided better upper-endpoint confidence intervals than t-intervals, whereas the two were similar for the lower endpoint. Bootstrap BCa intervals (Efron and Tibshirani, An Introduction to the Bootstrap. Chapman & Hall, London, 436 p, 1993) and Hall's transformation methods (Zhou and Gao, Am Stat 54:100-104, 2000) had poor coverage. Coverage accuracy of Fieller's interval endpoints (J R Stat Soc Ser B 16:175-185, 1954) and t-interval endpoints was similar for both h² and rB for sample sizes n ≤ 10, but for n = 30 Fieller's method was much better.
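One standard skewness adjustment in this family, which I sketch here on the assumption that it matches the spirit of the method cited above (this is Johnson's 1978 modified-t form, not necessarily the exact Boos and Hughes-Oliver variant; a normal quantile z replaces the t quantile for simplicity):

```python
import math

def skew_adjusted_ci(x, z=1.96):
    """Skewness-adjusted interval in the spirit of Johnson (1978):
    shift the center by mu3_hat / (6 * s^2 * n), where mu3_hat is the
    sample third central moment, then apply the usual +/- z * s/sqrt(n)."""
    n = len(x)
    m = sum(x) / n
    s2 = sum((v - m) ** 2 for v in x) / (n - 1)   # sample variance
    mu3 = sum((v - m) ** 3 for v in x) / n        # third central moment
    shift = mu3 / (6 * s2 * n)
    half = z * math.sqrt(s2 / n)
    return m + shift - half, m + shift + half

lo, hi = skew_adjusted_ci([0.2, 0.3, 0.4, 0.6, 1.0, 2.5])  # right-skewed
```

For symmetric data the shift term vanishes and the interval reduces to the ordinary z-interval; for positively skewed data (as with the variance components above) the whole interval shifts upward.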

15.
Several research fields frequently deal with the analysis of diverse classification results of the same entities. This calls for objective detection of overlaps and divergences between the resulting clusters. The congruence between classifications can be quantified by clustering agreement measures, including pairwise agreement measures. Several measures have been proposed, and the importance of obtaining confidence intervals for the point estimate in the comparison of these measures has been highlighted. A broad range of methods can be used for the estimation of confidence intervals. However, evidence is lacking about which methods are appropriate for calculating confidence intervals for most clustering agreement measures. Here we evaluate the resampling techniques of bootstrap and jackknife for the calculation of confidence intervals for clustering agreement measures. Contrary to what has been shown for some statistics, simulations showed that the jackknife performs better than the bootstrap at accurately estimating confidence intervals for pairwise agreement measures, especially when the agreement between partitions is low. The coverage of the jackknife confidence interval is robust to changes in cluster number and cluster size distribution.
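A sketch of the leave-one-entity-out jackknife for one such pairwise measure, the Rand index (the paper's full set of agreement measures and its simulation design are not reproduced; function names are illustrative):

```python
import math
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of element pairs on which two partitions agree
    (placed together in both, or apart in both)."""
    agree = total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b
        total += 1
    return agree / total

def jackknife_ci(labels_a, labels_b, z=1.96):
    """Leave-one-entity-out jackknife CI for the Rand index."""
    n = len(labels_a)
    theta = rand_index(labels_a, labels_b)
    loo = [rand_index(labels_a[:k] + labels_a[k + 1:],
                      labels_b[:k] + labels_b[k + 1:])
           for k in range(n)]
    mean_loo = sum(loo) / n
    var = (n - 1) / n * sum((v - mean_loo) ** 2 for v in loo)
    se = math.sqrt(var)
    return theta - z * se, theta + z * se
```

Because the Rand index compares pairs, removing one entity removes n-1 pairs at once, which is why resampling entities (rather than pairs) is the natural jackknife unit here.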

16.
DNA microarray data are affected by variations from a number of sources. Before these data can be used to infer biological information, the extent of these variations must be assessed. Here we describe an open source software package, lcDNA, that provides tools for filtering, normalizing, and assessing the statistical significance of cDNA microarray data. The program employs a hierarchical Bayesian model and Markov Chain Monte Carlo simulation to estimate gene-specific confidence intervals for each gene in a cDNA microarray data set. This program is designed to perform these primary analytical operations on data from two-channel spotted, or in situ synthesized, DNA microarrays.

17.
BACKGROUND: Predicting a system's behavior based on a mathematical model is a primary task in Systems Biology. If the model parameters are estimated from experimental data, the parameter uncertainty has to be translated into confidence intervals for model predictions. For dynamic models of biochemical networks, the nonlinearity in combination with the large number of parameters hampers the calculation of prediction confidence intervals and renders classical approaches hardly feasible. RESULTS: In this article reliable confidence intervals are calculated based on the prediction profile likelihood. Such prediction confidence intervals of the dynamic states can be utilized for a data-based observability analysis. The method is also applicable if there are non-identifiable parameters yielding some insufficiently specified model predictions that can be interpreted as non-observability. Moreover, a validation profile likelihood is introduced that should be applied when noisy validation experiments are to be interpreted. CONCLUSIONS: The presented methodology allows the propagation of uncertainty from experimental data to model predictions. Although presented in the context of ordinary differential equations, the concept is general and also applicable to other types of models. Matlab code which can be used as a template to implement the method is provided at http://www.fdmold.uni-freiburg.de/~ckreutz/PPL .

18.
Varied approaches to estimating confidence intervals for immunological and hybridization distances can be uniformly applied to any matrix of distances. One procedure bootstraps the pairwise dissimilarities between the distances of every pair of taxa to all others, creating a derived matrix of distances for which dispersions can be estimated. Another approach bootstraps the sample of differences between pairwise homologous branch lengths concerning each pair of taxa and between asymmetric halves of the matrix, to find a standard error of the dispersions. This allows comparison of the robustness of trees among different sources of data. DNA hybridization, transferrin immunology and protein immunodiffusion matrices all yield much the same result once standard deviations of dissimilarities are acknowledged: namely, unresolvable trichotomies among the human-chimp-gorilla clade and among this clade with orang and gibbon; conventional relationships among hominoids, cercopithecoids, ceboids and strepsirhines; and a polychotomy among anthropoids, strepsirhines, tarsiers, tupaiids and dermopterans.

19.
Genetic correlations are frequently estimated from natural and experimental populations, yet many of the statistical properties of estimators of the genetic correlation are not known, and accurate methods have not been described for estimating the precision of such estimates. Our objective was to assess the statistical properties of multivariate analysis of variance (MANOVA), restricted maximum likelihood (REML), and maximum likelihood (ML) estimators of the genetic correlation by simulating bivariate normal samples for the one-way balanced linear model. We estimated probabilities of non-positive definite MANOVA estimates of genetic variance-covariance matrices and the biases and variances of MANOVA, REML, and ML estimators, and assessed the accuracy of parametric, jackknife, and bootstrap variance and confidence interval estimators. MANOVA estimates were normally distributed. REML and ML estimates were normally distributed for low values of the genetic correlation but skewed for larger values (up to 0.9). All of the estimators were biased. The MANOVA estimator was less biased than the REML and ML estimators when heritability (H), the number of genotypes (n), and the number of replications (r) were low. The biases were otherwise nearly equal for the different estimators and could not be reduced by jackknifing or bootstrapping. The variance of the MANOVA estimator was greater than the variance of the REML or ML estimator for most H, n, and r. Bootstrapping produced estimates of the variance close to the known variance, especially for REML and ML. The observed coverages of the REML and ML bootstrap interval estimators were consistently close to the stated coverages, whereas the observed coverage of the MANOVA bootstrap interval estimator was unsatisfactory for some H, n, and r. The other interval estimators produced unsatisfactory coverages. REML and ML bootstrap interval estimates were narrower than MANOVA bootstrap interval estimates for most H, n, and r. Received: 6 July 1995 / Accepted: 8 March 1996
