Similar Literature
20 similar articles found.
1.
Both the absolute risk and the relative risk (RR) have a crucial role to play in epidemiology. RR is often approximated by the odds ratio (OR) under the rare-disease assumption in a conventional case-control study; however, such a design does not provide an estimate of absolute risk. The case-base study is an alternative approach that readily produces RR estimates without resorting to the rare-disease assumption. However, previous researchers considered only a single dichotomous exposure and did not elaborate how absolute risks can be estimated in a case-base study. In this paper, the authors propose a logistic model for the case-base study. The model is flexible enough to admit multiple exposures on any measurement scale (binary, categorical, or continuous) and can be easily fitted using common statistical packages. With one additional step of simple calculations on the model parameters, one readily obtains relative and absolute risk estimates as well as their confidence intervals. Monte Carlo simulations show that the proposed method produces unbiased estimates and confidence intervals with adequate coverage for ORs, RRs, and absolute risks. With these desirable properties and its methods of analysis fully developed in this paper, the case-base study may become a mainstay in epidemiology.
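The abstract does not reproduce the model's algebra, but the mechanics can be sketched: in a case-base sample (all cases plus a random sample of m members of a cohort of size N), the pseudo-sample odds of being a case record equal pi(x)*N/m, so an ordinary logistic fit gives exp(beta) as the RR and (m/N)*exp(alpha + beta*x) as the absolute risk. A minimal Python simulation under these assumptions (the scenario and variable names are illustrative, not taken from the paper):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)

# simulate a cohort with a log-linear (relative-risk) disease model
N = 50_000
x = rng.binomial(1, 0.3, N)                  # binary exposure
true_rr, baseline_risk = 2.0, 0.02
risk = baseline_risk * true_rr ** x          # P(disease | x)
y = rng.binomial(1, risk)

# case-base sampling: all cases plus a random sample of the base
m = 2_000                                    # size of the base sample
base_idx = rng.choice(N, size=m, replace=False)
cases_x = x[y == 1]                          # case series
base_x = x[base_idx]                         # base series, sampled regardless of outcome

pseudo_x = np.concatenate([cases_x, base_x])
label = np.concatenate([np.ones(len(cases_x)), np.zeros(len(base_x))])

# ordinary logistic regression on the pseudo-sample
X = sm.add_constant(pseudo_x)
fit = sm.Logit(label, X).fit(disp=False)
alpha, beta = fit.params

rr_hat = np.exp(beta)                                  # RR, no rare-disease assumption needed
abs_risk_unexposed = (m / N) * np.exp(alpha)           # absolute risk when x = 0
abs_risk_exposed = (m / N) * np.exp(alpha + beta)      # absolute risk when x = 1
print(f"RR ~ {rr_hat:.2f}, risk(x=0) ~ {abs_risk_unexposed:.3f}, "
      f"risk(x=1) ~ {abs_risk_exposed:.3f}")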

2.
Finding biomarkers and building risk scores to predict the occurrence of survival outcomes is a major concern of clinical epidemiology, and so is the evaluation of prognostic models. In this paper, we are concerned with the estimation of the time-dependent AUC (area under the receiver operating characteristic curve), which naturally extends the standard AUC to the setting of survival outcomes and makes it possible to evaluate the discriminative power of prognostic models. We establish a simple and useful relation between the predictiveness curve and the time-dependent AUC, denoted AUC(t). This relation confirms that the predictiveness curve is the key concept for evaluating both calibration and discrimination of prognostic models. It also highlights that accurate estimates of the conditional absolute risk function should yield accurate estimates of AUC(t). From this observation, we derive several estimators of AUC(t) relying on distinct estimators of the conditional absolute risk function. An empirical study was conducted to compare our estimators with existing ones and to assess the effect of model misspecification, when estimating the conditional absolute risk function, on the estimation of AUC(t). We further illustrate the methodology on the Mayo PBC and the VA lung cancer data sets.
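As a point of reference for what AUC(t) measures, here is a deliberately naive cumulative/dynamic estimator that ignores censoring entirely; the estimators proposed in the paper instead work through the conditional absolute risk function. A toy Python sketch (assumed setup, not the authors' estimator):

import numpy as np

def naive_auc_t(risk_score, event_time, t):
    """Cumulative/dynamic AUC(t): P(score_case > score_control) where
    'cases' have T <= t and 'controls' have T > t. Censoring is ignored,
    so this only illustrates the quantity being estimated."""
    cases = risk_score[event_time <= t]
    controls = risk_score[event_time > t]
    if len(cases) == 0 or len(controls) == 0:
        return np.nan
    greater = (cases[:, None] > controls[None, :]).mean()   # pairwise comparisons
    ties = (cases[:, None] == controls[None, :]).mean()     # half credit for ties
    return greater + 0.5 * ties

rng = np.random.default_rng(0)
n = 500
score = rng.normal(size=n)                           # higher score = higher risk
event_time = rng.exponential(scale=np.exp(-score))   # event times shrink as risk grows
print(round(naive_auc_t(score, event_time, t=1.0), 3))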

3.
J. O'Quigley, F. Pessione. Biometrics, 1989, 45(1):135-144.
A simple model, containing the proportional hazards regression model as a special case, is presented. The purpose of the model is to provide a framework in which specific alternatives to the proportional hazards assumption may be tested. Rank-invariant score tests for linear, quadratic, or exponential trends, for instance, can all be undertaken within this framework. In the case of the two-sample problem, the required calculations are shown to take a particularly simple form. Special consideration is given to the two-sample case in which there is an inversion of the regression effect, i.e., where the hazard functions cross at some given point. Both of the motivating examples are concerned with this problem. Computational aspects are relatively straightforward, and some discussion of them is provided.

4.
On criteria for evaluating models of absolute risk
Absolute risk is the probability that an individual who is free of a given disease at an initial age, a, will develop that disease in the subsequent interval (a, t]. Absolute risk is reduced by mortality from competing risks. Models of absolute risk that depend on covariates have been used to design intervention studies, to counsel patients regarding their risks of disease, and to inform clinical decisions, such as whether or not to take tamoxifen to prevent breast cancer. Several general criteria have been used to evaluate models of absolute risk, including how well the model predicts the observed numbers of events in subsets of the population ("calibration") and "discriminatory power," measured by the concordance statistic. In this paper we review some general criteria and develop specific loss-function-based criteria for two applications, namely whether or not to screen a population to select subjects for further evaluation or treatment, and whether or not to use a preventive intervention that has both beneficial and adverse effects. We find that high discriminatory power is much more crucial in the screening application than in the preventive-intervention application. These examples indicate that the usefulness of a general criterion such as concordance depends on the application, and that using specific loss functions can lead to more appropriate assessments.
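In symbols, with lambda_D(u) the cause-specific hazard of the disease and lambda_M(u) the hazard of competing mortality (notation assumed here, not quoted from the paper), the absolute risk over (a, t] for a person free of the disease at age a is

AR(a, t] = \int_a^t \lambda_D(u)\, \exp\!\left( -\int_a^u \big[ \lambda_D(v) + \lambda_M(v) \big] \, dv \right) du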

5.
Diagnostic studies in ophthalmology frequently involve binocular data, where pairs of eyes are evaluated, through some diagnostic procedure, for the presence of certain diseases or pathologies. The simplest approach to estimating measures of diagnostic accuracy, such as sensitivity and specificity, treats eyes as independent, consequently yielding incorrect estimates, especially of the standard errors. Approaches that account for the inter-eye correlation include regression methods using generalized estimating equations and likelihood techniques based on various correlated binomial models. The paper proposes a simple alternative statistical methodology for jointly estimating measures of diagnostic accuracy for binocular tests based on a flexible model for correlated binary data. Method-of-moments estimation of the model parameters is outlined and asymptotic inference is discussed. The resulting estimates are straightforward and easy to obtain, requiring no special statistical software but only elementary calculations. Results of simulations indicate that large-sample and bootstrap confidence intervals based on the estimates have relatively good coverage properties when the model is correctly specified. The computation of the estimates and their standard errors is illustrated with data from a study on diabetic retinopathy.

6.
We present prediction and variable importance (VIM) methods for longitudinal data sets containing continuous and binary exposures subject to missingness. We demonstrate the use of these methods for prognosis of medical outcomes in severe trauma patients, a field in which current practice relies on rules of thumb and scoring methods that use only a few variables and ignore the dynamic, high-dimensional nature of trauma recovery. Well-principled prediction and VIM methods can provide a tool for making care decisions informed by the patient's high-dimensional physiological and clinical history. Our VIM parameters are analogous to slope coefficients in adjusted regressions, but they do not depend on a specific statistical model, nor do they require a particular functional form of the prediction regression to be estimated. In addition, under causal and statistical assumptions they can be interpreted causally as expected outcomes under time-specific clinical interventions, related to changes in the mean of the outcome if each individual experiences a specified change in the variable (keeping the other variables in the model fixed). Moreover, the targeted maximum likelihood estimator used is doubly robust and locally efficient. Because the proposed VIM does not constrain the prediction model fit, we use a very flexible ensemble learner (the SuperLearner), which returns a linear combination of a list of user-supplied algorithms. Not only is such a prediction algorithm intuitively appealing, it has theoretical justification as being asymptotically equivalent to the oracle selector. The results of the analysis show effects whose size and significance would not have been found using a parametric approach (such as stepwise regression or the LASSO). The procedure is all the more compelling because the predictor on which it is based showed significant improvements in cross-validated fit, for instance in the area under the receiver operating characteristic (ROC) curve (AUC). Thus, given that (1) our VIM applies to any model-fitting procedure, (2) it has meaningful clinical (causal) interpretations under assumptions, and (3) it has robust, asymptotic (influence-curve-based) inference, it provides a compelling alternative to existing methods for estimating variable importance in high-dimensional clinical (or other) data.

7.
Whether the aim is to diagnose individuals or to estimate prevalence, many epidemiological studies have demonstrated the successful use of tests on pooled sera. These tests detect whether at least one sample in the pool is positive. Although originally designed to reduce diagnostic costs, testing pools also lowers false-positive and false-negative rates in low-prevalence settings and yields more precise prevalence estimates. Current methods aim at estimating the average population risk from diagnostic tests on pools. In this article, we extend the original class of risk estimators to adjust for covariates recorded on individual pool members. Maximum likelihood theory provides a flexible estimation method that handles different covariate values within a pool, different pool sizes, and errors in test results. In special cases, software for generalized linear models can be used. Pool design has a strong impact on precision and cost efficiency, with covariate-homogeneous pools carrying the largest amount of information. We perform joint pool- and sample-size calculations using information from individual contributors to the pool and show that a good design can sharply reduce cost and yet increase precision. The methods are illustrated using data from a Kenyan surveillance study of HIV. Compared to individual testing, age-homogeneous, optimally sized pools of average size seven reduce cost to 44% of the original price with virtually no loss in precision.
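To illustrate the basic idea behind risk estimation from pooled tests, before covariates and test error are added as in the article, here is a minimal sketch for equal-sized pools with a perfect test and constant prevalence; the numbers are made up:

import numpy as np

def pooled_prevalence(n_pools, n_positive_pools, pool_size):
    """MLE of individual prevalence from perfect tests on equal-sized pools.
    A pool is negative only if all its members are negative, so
    P(pool negative) = (1 - p)^k and p_hat = 1 - (negative fraction)^(1/k)."""
    q_hat = (n_pools - n_positive_pools) / n_pools        # fraction of negative pools
    p_hat = 1.0 - q_hat ** (1.0 / pool_size)
    # delta-method standard error of p_hat
    se = np.sqrt(q_hat * (1 - q_hat) / n_pools) * q_hat ** (1.0 / pool_size - 1) / pool_size
    return p_hat, se

print(pooled_prevalence(n_pools=300, n_positive_pools=45, pool_size=7))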

8.
Quantification of the uncertainty associated with risk estimates is an important part of risk assessment. In recent years, the use of second-order distributions and two-dimensional simulations has been suggested for quantifying both variability and uncertainty. These approaches are best interpreted within the Bayesian framework. To help practitioners better use such methods and interpret the results, in this article we describe the propagation and interpretation of uncertainty in the Bayesian paradigm. We consider both the estimation problem, where some summary measures of the risk distribution (e.g., mean, variance, or selected percentiles) are to be estimated, and the prediction problem, where the risk values for some specific individuals are to be predicted. We discuss some connections and differences between uncertainties in estimation and prediction problems, and present an interpretation of a decomposition of total variability/uncertainty into variability and uncertainty in terms of the expected squared error of prediction and its reduction under perfect information. We also discuss the role of Monte Carlo methods in characterizing uncertainty. We explain the basic ideas using a simple example and demonstrate Monte Carlo calculations using another example from the literature.
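A small sketch of the two-dimensional (nested) Monte Carlo idea discussed here: an outer loop draws from the uncertainty distribution of the parameters, an inner loop draws individual variability given those parameters, and each summary measure of the risk distribution then carries its own uncertainty interval. The distributions and numbers below are illustrative assumptions, not taken from the article:

import numpy as np

rng = np.random.default_rng(1)

# Outer loop: uncertainty about the parameters of a lognormal risk distribution.
# Inner loop: variability of risk across individuals given those parameters.
n_outer, n_inner = 1_000, 5_000
mu = rng.normal(loc=0.0, scale=0.1, size=n_outer)     # uncertain mean (log scale)
sigma = rng.uniform(0.4, 0.6, size=n_outer)           # uncertain sd (log scale)

mean_risk = np.empty(n_outer)
p95_risk = np.empty(n_outer)
for i in range(n_outer):
    risks = rng.lognormal(mean=mu[i], sigma=sigma[i], size=n_inner)   # variability
    mean_risk[i] = risks.mean()
    p95_risk[i] = np.quantile(risks, 0.95)

# each summary measure now has its own uncertainty interval
print("mean risk: %.2f (95%% uncertainty interval %.2f-%.2f)"
      % (mean_risk.mean(), *np.quantile(mean_risk, [0.025, 0.975])))
print("95th percentile of risk: %.2f (%.2f-%.2f)"
      % (p95_risk.mean(), *np.quantile(p95_risk, [0.025, 0.975])))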

9.
Pybus OG, Rambaut A, Harvey PH. Genetics, 2000, 155(3):1429-1437.
We describe a unified set of methods for the inference of demographic history using genealogies reconstructed from gene sequence data. We introduce the skyline plot, a graphical, nonparametric estimate of demographic history. We discuss both maximum-likelihood parameter estimation and demographic hypothesis testing. Simulations are carried out to investigate the statistical properties of maximum-likelihood estimates of demographic parameters. The simulations reveal that (i) the performance of exponential growth model estimates is determined by a simple function of the true parameter values and (ii) under some conditions, estimates from reconstructed trees perform as well as estimates from perfect trees. We apply our methods to HIV-1 sequence data and find strong evidence that subtypes A and B have different demographic histories. We also provide the first (albeit tentative) genetic evidence for a recent decrease in the growth rate of subtype B.
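A minimal sketch of the classic skyline estimate (the generalized skyline and the likelihood machinery developed in the paper go further): while k lineages persist, the coalescent waiting time has expectation 2N/(k(k-1)), so each inter-coalescent interval yields its own estimate of the effective size. The toy genealogy below is invented:

import numpy as np

def classic_skyline(coalescent_intervals):
    """Classic skyline estimate of effective population size.
    coalescent_intervals[i] is the waiting time between the i-th and
    (i+1)-th coalescent events on the genealogy. While k lineages persist,
    the waiting time has expectation 2N / (k (k - 1)), so each interval
    gives the estimate k (k - 1) * t / 2."""
    n = len(coalescent_intervals) + 1          # number of sampled sequences
    ks = np.arange(n, 1, -1)                   # k = n, n-1, ..., 2
    return ks * (ks - 1) * np.asarray(coalescent_intervals) / 2.0

# toy genealogy of 5 sequences: waiting times between successive coalescences
print(classic_skyline([0.1, 0.2, 0.5, 1.0]))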

10.
Right-truncated data arise when observations are ascertained retrospectively and only subjects who experience the event of interest by the time of sampling are selected. Such a selection scheme, without adjustment, leads to biased estimation of covariate effects in the Cox proportional hazards model. The existing methods for fitting the Cox model to right-truncated data, which are based on maximizing the likelihood or solving estimating equations with respect to both the baseline hazard function and the covariate effects, are numerically challenging. We consider two simple alternative methods based on inverse probability weighting (IPW) estimating equations, which allow consistent estimation of covariate effects under a positivity assumption and avoid estimation of the baseline hazard. We discuss problems of identifiability and consistency that arise when positivity does not hold and show that, although the partial tests for null effects based on these IPW methods can be used in some settings even in the absence of positivity, they are not valid in general. We propose adjusted estimating equations that incorporate the probability of observation when it is known from external sources, which results in consistent estimation. We compare the methods in simulations and apply them to analyses of human immunodeficiency virus latency.

11.
We consider two methods of estimating phenotype probabilities for a number of standard genetic markers, such as the ABO, MNSs, and PGM markers. The first method is based on the maximum likelihood estimates of the allele probabilities, and the second (multinomial) method uses the phenotype proportions in the sample. The latter is easy to use, its estimates are always unbiased, and simple formulae for variances are available. The former method, although giving more efficient estimates, requires the assumption of panmixia so that the Hardy-Weinberg law can be used. The two methods are compared theoretically, where possible, or by simulation. Under panmixia, the maximum likelihood estimates can be substantially more efficient than the multinomial estimates. The estimates are also compared in the codominant-allele case for nonpanmictic populations. The question of efficiency is of importance when estimating the probability of obtaining a given set of phenotypes, i.e., the product of individual phenotype estimators. This problem is discussed briefly.
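A small sketch contrasting the two routes for the ABO marker: phenotype probabilities computed from allele frequencies under Hardy-Weinberg (the route that requires panmixia) versus direct multinomial sample proportions. The frequencies and counts are illustrative only:

def abo_phenotype_probs(p_a, p_b):
    """Expected ABO phenotype probabilities under Hardy-Weinberg equilibrium
    (the allele-frequency / maximum-likelihood route), given allele
    frequencies p_a, p_b and p_o = 1 - p_a - p_b."""
    p_o = 1.0 - p_a - p_b
    return {
        "A":  p_a ** 2 + 2 * p_a * p_o,
        "B":  p_b ** 2 + 2 * p_b * p_o,
        "AB": 2 * p_a * p_b,
        "O":  p_o ** 2,
    }

def multinomial_estimates(counts):
    """The multinomial route: phenotype probabilities estimated directly by
    sample proportions, unbiased and free of the panmixia assumption."""
    n = sum(counts.values())
    return {k: v / n for k, v in counts.items()}

print(abo_phenotype_probs(0.28, 0.06))
print(multinomial_estimates({"A": 420, "B": 90, "AB": 35, "O": 455}))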

12.
Measures of conserved synteny are important for estimating the relative rates of chromosomal evolution in various lineages. We present a natural way to view the synteny conservation between two species through an Oxford grid: an r x c table summarizing the number of orthologous genes on each of chromosomes 1 through r of the first species that lie on each of chromosomes 1 through c of the second species. This viewpoint suggests a natural statistic, which we denote by rho and call the syntenic correlation, designed to measure the amount of synteny conservation between two species. This measure allows syntenic conservation to be compared across many pairs of species. We improve on previous methods for estimating the true number of conserved syntenies given the observed number by taking into account the dependency among the numbers of orthologues observed in the chromosome pairings between the two species and by determining both point and interval estimators. We also discuss the application of our methods to genomes that contain chromosomes of highly variable lengths and to estimators of the true number of conserved segments between species pairs.

13.
Pepe MS, Heagerty P, Whitaker R. Biometrics, 1999, 55(3):944-950.
Data collected longitudinally in time provide the opportunity to develop predictive models of future observations given current data for an individual. Such models may be of particular value in identifying individuals at high risk and thereby in suggesting subgroups to target in prevention intervention research. In this paper, we propose a method for estimating predictive functions. The method uses an extension of the marginal regression analysis methods of Liang and Zeger (1986, Biometrika 73, 13-22) and is implemented using simple estimating equations. A key feature of the models is that the regression coefficients are modelled as smooth functions of both the time at which the prediction is made and the time being predicted. Data from a study of obesity in childhood and early adulthood are used to demonstrate the methodology. Criteria for defining individuals to be at high risk can be defined on the basis of estimated predictive functions. We suggest methods for evaluating the diagnostic accuracy (sensitivity and specificity) of such rules using cross-validation. The method holds promise as a robust and technically easy way of evaluating information about future prognosis that may be gleaned from a patient's current and past clinical status.

14.
Pedigrees, depicting the genealogical relationships between individuals in a population, are of fundamental importance to several research areas, including conservation biology. For example, they are useful for estimating inbreeding, heritability, and selection, for studying kin selection, and for measuring gene flow between populations. Pedigrees constructed from direct observations of reproduction are usually unavailable for wild populations, so pedigrees for these populations are usually estimated from molecular marker data. Despite their obvious importance, and the fact that pedigrees are conceptually well understood, the methods and limitations of marker-based pedigree inference are often less well understood. Here we introduce animal conservation biologists to molecular-marker-based pedigrees. We briefly describe the history of pedigree inference research before explaining the underlying theory and basic mechanics of pedigree construction using standard methods. We explain the assumptions and limitations that accompany many of these methods, before going on to describe methods that relax several of these assumptions. Finally, we look to the future and discuss some exciting recent advances, such as the use of single-nucleotide polymorphisms, inference of multigenerational pedigrees, and incorporation of non-genetic data such as field observations into the calculations. We also provide some guidelines on efficient marker selection in order to maximize accuracy and power. Throughout, we use examples from the field of animal conservation and refer readers to appropriate software where possible. It is our hope that this review will help animal conservation biologists to understand, choose, and use the methods and tools of this fast-moving field.

15.
We describe and examine methods for estimating spatial correlations used in population ecology. We base our analyses on a hypothetical example of a species that has been censused at 30 different locations for 20 years. We assume that the population fluctuations can be described by a simple linear model on the logarithmic scale. Stochastic simulation is used to check how seven different resampling schemes perform when the goal is to find nominal 95% confidence intervals for the spatial correlation in growth rates at given distances. It turns out that resampling of locations performs badly, with true coverage as low as 30-40%, especially for small correlations at long distances. Resampling of time points performs much better, with coverage varying from 80 to 90%, depending on the strength of density regulation and on whether the spatial correlation is estimated for the response variable or for the error terms in the model. Assuming that the underlying model is known, the best results are achieved with parametric bootstrapping, a result that strongly emphasizes the importance of defining and estimating a proper population model when studying spatial processes.

16.
Excess weight in adults is a national concern, with over two-thirds of the US population deemed overweight. Because being overweight has been correlated with numerous diseases such as heart disease and type 2 diabetes, there is a need to understand mechanisms and predict outcomes of weight change and weight maintenance. A simple mathematical model that accurately predicts individual weight change offers opportunities to understand how individuals lose and gain weight and can be used to foster patient adherence to diets in clinical settings. For this purpose, we developed a one-dimensional differential equation model of weight change based on the energy balance equation, paired with an algebraic relationship between fat-free mass and fat mass derived from a large, nationally representative sample of recently released data collected by the Centers for Disease Control. We validate the model's ability to predict individual participants' weight change by comparing model estimates of final weight with data from two recent underfeeding studies and one overfeeding study. The mean absolute error and standard deviation between model predictions and observed measurements of final weight are less than 1.8±1.3 kg for the underfeeding studies and 2.5±1.6 kg for the overfeeding study. Comparison of the model predictions with other one-dimensional models of weight change shows improvement in mean absolute error, standard deviation of mean absolute error, and group mean predictions. The maximum absolute individual error decreased by approximately 60%, substantiating the reliability of individual weight-change predictions. The model provides a viable method for estimating individual weight change as a result of changes in intake and for determining individual dietary adherence during weight-change studies.
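A stripped-down sketch of the energy-balance idea (not the published model, which additionally ties fat-free mass to fat mass through an algebraic relationship): weight changes at a rate proportional to the gap between intake and expenditure, with expenditure growing with weight. All constants below are illustrative placeholders:

import numpy as np

def simulate_weight(w0_kg, intake_kcal_per_day, days,
                    kcal_per_kg_expenditure=24.0, rho_kcal_per_kg=7700.0):
    """Forward-Euler integration of a one-compartment energy-balance model:
        dW/dt = (intake - expenditure(W)) / rho,
    with expenditure approximated as proportional to body weight."""
    w = np.empty(days + 1)
    w[0] = w0_kg
    for day in range(days):
        expenditure = kcal_per_kg_expenditure * w[day]
        w[day + 1] = w[day] + (intake_kcal_per_day - expenditure) / rho_kcal_per_kg
    return w

# 80 kg person reducing intake to 1800 kcal/day for half a year
trajectory = simulate_weight(80.0, 1800.0, days=180)
print(round(trajectory[-1], 1), "kg after 180 days")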

17.
Estimating the false discovery rate using nonparametric deconvolution
van de Wiel MA, Kim KI. Biometrics, 2007, 63(3):806-815.
Given a set of microarray data, the problem is to detect differentially expressed genes using a false discovery rate (FDR) criterion. In contrast to common procedures in the literature, we do not base the selection criterion on statistical significance alone, but also on the effect size. Therefore, we select only those genes that are significantly more differentially expressed than some f-fold threshold (e.g., f = 2). This corresponds to the use of an interval null domain for the effect size. Based on a simple error model, we discuss a naive estimator of the FDR, interpreted as the probability that the parameter of interest lies in the null domain (e.g., mu < log2(2) = 1) given that the test statistic exceeds a threshold. We improve the naive estimator by using deconvolution; that is, the density of the parameter of interest is recovered from the data. We study the performance of the methods using simulations and real data.
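A sketch of a naive plug-in FDR estimator of the kind improved upon above: the expected number of null exceedances of a threshold divided by the observed number of exceedances. The standard-normal null, the pi0 argument, and the simulated data are assumptions for illustration; the paper's deconvolution step refines this by estimating the effect-size density itself:

import numpy as np
from scipy import stats

def naive_fdr(t_stats, threshold, pi0=1.0):
    """Plug-in FDR estimate at a given threshold on |t|: expected null
    exceedances divided by observed exceedances, assuming a standard-normal
    null and a proportion pi0 of null genes."""
    m = len(t_stats)
    observed = max((np.abs(t_stats) > threshold).sum(), 1)
    expected_null = pi0 * m * 2 * stats.norm.sf(threshold)
    return min(expected_null / observed, 1.0)

rng = np.random.default_rng(3)
# 900 null genes and 100 genes with a real shift
z = np.concatenate([rng.normal(0, 1, 900), rng.normal(3, 1, 100)])
print(round(naive_fdr(z, threshold=2.5), 3))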

18.
High-resolution genetic mapping of complex traits.
Positional cloning requires high-resolution genetic mapping. To plan a positional cloning project, one needs to know how many informative meioses will be required to narrow the search for a disease gene to an acceptably small region. For a simple Mendelian trait studied with linkage analysis, the answer is straightforward. In this paper, we address the situation of a complex trait studied with affected-relative-pair methods. We derive mathematical formulas for the size of an appropriate confidence region, as a function of the relative risk attributable to the gene. Using these results, we provide graphs showing the number of relative pairs required to narrow the gene hunt to an interval of a given size. For example, we show that localizing a gene to 1 cM requires a median of 200 sib pairs for a locus causing a fivefold increased risk to an offspring and 700 sib pairs for a locus causing a twofold increased risk. We discuss the implications of these results for the positional cloning of genes underlying complex traits.

19.
There exist a number of methods for determining age-dependent reference intervals. Some are based on standard parametric distribution classes, such as the normal or lognormal, together with standard parametric classes of age functions, such as linear or polynomial functions of some order. Others are based on more flexible distribution classes, such as the Box-Cox transformation of the normal distribution, which allows for skewness. There are also purely nonparametric methods, in which the bounds of the reference intervals are only assumed to be nondecreasing and are estimated directly by minimizing a suitable error function, without any distributional assumption. In this paper we propose a flexible four-parameter class of age functions for the reference interval bounds and a method to estimate them. The four parameters have concrete meanings: the starting value at age 0, the asymptotic value with increasing age, a time scale, and a shape. The function class satisfies some desirable properties, which are discussed. The estimation of the parameters in the model uses the same type of error function as the purely nonparametric methods. With our method we also obtain an estimate of the distributional position of an observation for a new individual given its age. The method is illustrated with an application example in which a 90% reference interval for ocular axis length in children up to age 18 years is determined.
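The paper's specific four-parameter family is not reproduced in the abstract; the sketch below uses one plausible form with exactly those four interpretable parameters (start, asymptote, time scale, shape), purely for illustration:

import numpy as np

def age_bound(age, start, asymptote, time_scale, shape):
    """One plausible four-parameter age curve with the interpretations listed
    above: value 'start' at age 0, approaching 'asymptote' as age grows, with
    'time_scale' and 'shape' governing how fast and in what manner. The
    functional form is an assumption for illustration; the paper defines its
    own class."""
    age = np.asarray(age, dtype=float)
    return asymptote + (start - asymptote) * np.exp(-(age / time_scale) ** shape)

# e.g. an upper reference bound for ocular axis length (mm) versus age in years
ages = np.array([0, 1, 3, 6, 10, 14, 18])
print(np.round(age_bound(ages, start=18.0, asymptote=24.5, time_scale=2.5, shape=0.8), 2))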

20.
The estimation of body segment properties is important in the biomechanical analysis of movement. Current subject-specific estimation methods, however, can be expensive and time-consuming, while other methods do not adequately take individual or group variability into account. We describe a simple procedure for estimating subject-specific geometric properties, independent of joint centres. The method requires only a small number of anthropometric measurements and digital images of the segment or subject, a 3-dimensional modeller program, and simple mathematical calculations to estimate segment volumes and centroids. Assuming that the segment is of uniform density, its mass and moment of inertia can also be derived. Future work should include generating segment density profiles for particular populations, to increase the accuracy of the method, and comparing the accuracy of the results obtained with those produced by other techniques.
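As an illustration of the kind of calculation involved, the sketch below computes volume, centroid, mass, and one moment of inertia from a voxelized segment under the uniform-density assumption; the voxel representation and the density value are assumptions, not the procedure described in the paper:

import numpy as np

def segment_properties(mask, voxel_size_m, density_kg_m3=1000.0):
    """Geometric and inertial properties of a body segment represented as a
    3-D boolean voxel mask (as might be produced from digitized images and a
    3-D modeller), assuming uniform density."""
    voxel_vol = voxel_size_m ** 3
    coords = np.argwhere(mask) * voxel_size_m          # voxel centre coordinates (m)
    volume = len(coords) * voxel_vol                   # m^3
    centroid = coords.mean(axis=0)                     # m
    mass = density_kg_m3 * volume                      # kg
    # moment of inertia about the third (long) axis through the centroid
    r2 = ((coords[:, :2] - centroid[:2]) ** 2).sum(axis=1)
    inertia = density_kg_m3 * voxel_vol * r2.sum()     # kg m^2
    return volume, centroid, mass, inertia

# toy example: a 30 cm x 10 cm x 10 cm block at 1 cm voxel resolution
mask = np.ones((10, 10, 30), dtype=bool)
vol, com, m, izz = segment_properties(mask, voxel_size_m=0.01)
print(round(vol, 4), "m^3 |", round(m, 1), "kg |", round(izz, 4), "kg m^2")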
