Similar Articles
20 similar articles found.
1.
In population‐based case‐control studies, it is of great public‐health importance to estimate the disease incidence rates associated with different levels of risk factors. This estimation is complicated by the fact that in such studies the selection probabilities for the cases and controls are unequal. A further complication arises when the subjects who are selected into the study do not participate (i.e. become nonrespondents) and nonrespondents differ systematically from respondents. In this paper, we show how to account for unequal selection probabilities as well as differential nonresponses in the incidence estimation. We use two logistic models, one relating the disease incidence rate to the risk factors, and one modelling the predictors that affect the nonresponse probability. After estimating the regression parameters in the nonresponse model, we estimate the regression parameters in the disease incidence model by a weighted estimating function that weights a respondent's contribution to the likelihood score function by the inverse of the product of his/her selection probability and his/her model‐predicted response probability. The resulting estimators of the regression parameters and the corresponding estimators of the incidence rates are shown to be consistent and asymptotically normal with easily estimated variances. Simulation results demonstrate that the asymptotic approximations are adequate for practical use and that failure to adjust for nonresponses could result in severe biases. An illustration with data from a cardiovascular study that motivated this work is presented.
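As a hedged sketch of the weighting just described (the notation here is assumed, not the authors'), the weighted estimating function solves

\[
\sum_{i=1}^{n} \frac{R_i}{\pi_i\,\hat{p}_i(\hat{\gamma})}\, S_i(\beta) \;=\; 0,
\]

where \(R_i\) indicates whether subject \(i\) responded, \(\pi_i\) is the subject's selection probability, \(\hat{p}_i(\hat{\gamma})\) is the response probability predicted by the fitted logistic nonresponse model, and \(S_i(\beta)\) is the subject's likelihood score under the logistic disease‐incidence model.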

2.
The problem of combining information from separate trials is a key consideration when performing a meta‐analysis or planning a multicentre trial. Although there is a considerable journal literature on meta‐analysis based on individual patient data (IPD), i.e. a one‐step IPD meta‐analysis, versus analysis based on summary data, i.e. a two‐step IPD meta‐analysis, recent articles in the medical literature indicate that there is still confusion and uncertainty as to the validity of an analysis based on aggregate data. In this study, we address one of the central statistical issues by considering the estimation of a linear function of the mean, based on linear models for summary data and for IPD. The summary data from a trial are assumed to comprise the best linear unbiased estimator, or maximum likelihood estimator, of the parameter, along with its covariance matrix. The setup, which allows for the presence of random effects and covariates in the model, is quite general and includes many of the commonly employed models, for example, linear models with fixed treatment effects and fixed or random trial effects. For this general model, we derive a condition under which the one‐step and two‐step IPD meta‐analysis estimators coincide, extending earlier work considerably. The implications of this result for the specific models mentioned above are illustrated in detail, both theoretically and in terms of two real data sets, and the roles of balance and heterogeneity are highlighted. Our analysis also shows that when covariates are present, which is typically the case, the two estimators coincide only under extra simplifying assumptions, which are somewhat unrealistic in practice.
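For orientation, the two‐step estimator in such settings is typically a generalized least squares combination of the trial‐level summaries; a generic sketch (notation assumed here) is

\[
\hat{\theta}_{\text{two-step}}
\;=\;
\Bigl(\sum_{k=1}^{K} X_k^{\top} V_k^{-1} X_k\Bigr)^{-1}
\sum_{k=1}^{K} X_k^{\top} V_k^{-1}\,\hat{\beta}_k ,
\]

where \(\hat{\beta}_k\) and \(V_k\) are the estimator and covariance matrix reported by trial \(k\), and \(X_k\) is the design matrix linking the common parameter \(\theta\) to trial \(k\)'s parameters. The paper's condition characterizes when this coincides with the estimator obtained by fitting the corresponding linear model directly to the pooled IPD.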

3.
Time‐varying individual covariates are problematic in experiments with marked animals because the covariate can typically only be observed when each animal is captured. We examine three methods to incorporate time‐varying individual covariates of the survival probabilities into the analysis of data from mark‐recapture‐recovery experiments: deterministic imputation, a Bayesian imputation approach based on modeling the joint distribution of the covariate and the capture history, and a conditional approach considering only the events for which the associated covariate data are completely observed (the trinomial model). After describing the three methods, we compare results from their application to the analysis of the effect of body mass on the survival of Soay sheep (Ovis aries) on the Isle of Hirta, Scotland. Simulations based on these results are then used to make further comparisons. We conclude that the trinomial model and the Bayesian imputation method each perform best in different situations. If the capture and recovery probabilities are all high, then the trinomial model produces precise, unbiased estimators that do not depend on any assumptions regarding the distribution of the covariate. In contrast, the Bayesian imputation method performs substantially better when capture and recovery probabilities are low, provided that the specified model of the covariate is a good approximation to the true data‐generating mechanism.

4.
Many studies have focused on determining the effect of the body mass index (BMI) on mortality in different cohorts. In this article, we propose an additive‐multiplicative mean residual life (MRL) model to assess the effects of BMI and other risk factors on the MRL function of survival time in a cohort of Chinese type 2 diabetic patients. The proposed model can accommodate additive and multiplicative risk factors simultaneously and provides a comprehensible interpretation of their effects on the MRL function of interest. We develop an estimation procedure through pseudo partial score equations to obtain parameter estimates. We establish the asymptotic properties of the proposed estimators and conduct simulations to demonstrate the performance of the proposed method. The application of the procedure to a study on the life expectancy of type 2 diabetic patients reveals new insights into the extension of the life expectancy of such patients.
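Additive–multiplicative MRL models of this kind typically take a form such as (a hedged sketch; the paper's exact specification may differ)

\[
m(t \mid x, z) \;=\; m_0(t)\exp\{\beta^{\top} x\} \;+\; \gamma^{\top} z,
\]

where \(m(t \mid x, z) = E[T - t \mid T > t, x, z]\) is the mean residual life at time \(t\), \(m_0(t)\) is a baseline MRL function, and \(x\) and \(z\) are the covariates acting multiplicatively and additively, respectively.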

5.
In capture–recapture models, survival and capture probabilities can be modelled as functions of time‐varying covariates, such as temperature or rainfall. The Cormack–Jolly–Seber (CJS) model allows for flexible modelling of these covariates; however, the functional relationship may not be linear. We extend the CJS model by semi‐parametrically modelling capture and survival probabilities using a frequentist approach via P‐spline techniques. We investigate the performance of the estimators by conducting simulation studies. We also apply and compare these models with known semi‐parametric Bayesian approaches on simulated and real data sets.
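A typical P‐spline formulation of such a covariate effect (an illustrative sketch; the basis and penalty choices here are assumptions) models, say, the survival probability as

\[
\operatorname{logit}\,\phi_t \;=\; \sum_{k=1}^{K} \beta_k B_k(x_t),
\qquad
\ell_p(\beta) \;=\; \ell(\beta) \;-\; \tfrac{\lambda}{2}\,\beta^{\top} D^{\top} D\,\beta,
\]

where the \(B_k\) are B‐spline basis functions of the time‐varying covariate \(x_t\), \(D\) is a difference‐penalty matrix, \(\lambda\) is a smoothing parameter, and the penalized log‐likelihood \(\ell_p\) is maximized in the frequentist fit; capture probabilities can be modelled analogously.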

6.
The proportion ratio (PR) of responses between an experimental treatment and a control treatment is one of the most commonly used indices to measure the relative treatment effect in a randomized clinical trial. We develop asymptotic and permutation‐based procedures for testing equality of treatment effects as well as derive confidence intervals of PRs for multivariate binary matched‐pair data under a mixed‐effects exponential risk model. To evaluate and compare the performance of these test procedures and interval estimators, we employ Monte Carlo simulation. When the number of matched pairs is large, we find that all test procedures presented here can perform well with respect to Type I error. When the number of matched pairs is small, the permutation‐based test procedures developed in this paper are of use. Furthermore, using test procedures (or interval estimators) based on a weighted linear average estimator of treatment effects can improve power (or gain precision) when the treatment effects on all response variables of interest are known to fall in the same direction. Finally, we use data from a crossover clinical trial that monitored several adverse events of an antidepressive drug to illustrate the practical use of the test procedures and interval estimators considered here.
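As a hedged illustration of the combined‐estimator idea (the weight choice below is an assumption, not necessarily the authors'), the PR for response variable \(j\) and a weighted linear average of the log‐scale effects are

\[
\mathrm{PR}_j \;=\; \frac{\pi_j^{(E)}}{\pi_j^{(C)}},
\qquad
\hat{\delta}_w \;=\; \frac{\sum_j w_j\,\hat{\delta}_j}{\sum_j w_j},
\quad \hat{\delta}_j = \log \widehat{\mathrm{PR}}_j ,
\]

with, for example, inverse‐variance weights \(w_j = 1/\widehat{\operatorname{Var}}(\hat{\delta}_j)\); pooling across endpoints in this way is what yields the power and precision gains noted when all effects fall in the same direction.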

7.
Gilbert, Rossini, and Shankarappa (2005, Biometrics 61, 106–117) present four U‐statistic based tests to compare genetic diversity between different samples. The proposed tests improved upon previously used methods by accounting for the correlations in the data. We find, however, that the same correlations introduce an unacceptable bias in the sample estimators used for the variance and covariance of the inter‐sequence genetic distances for modest sample sizes. Here, we compute unbiased estimators for these and test the resulting improvement using simulated data. We also show that, contrary to the claims in Gilbert et al., it is not always possible to apply the Welch–Satterthwaite approximate t‐test, and we provide explicit formulas for the degrees of freedom to be used when such an approximation is possible.
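For reference, the classical Welch–Satterthwaite approximation in its textbook two‐sample form (the paper's explicit formulas for correlated genetic distances are more involved) estimates the degrees of freedom as

\[
\nu \;\approx\;
\frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}
{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}},
\]

where \(s_i^2\) and \(n_i\) are the sample variance and sample size of group \(i\); it is this kind of approximation whose validity the authors show can fail under inter‐sequence correlation.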

8.
Estimates of relatedness coefficients, based on genetic marker data, are often necessary for studies of genetics and ecology. Whilst many method‐of‐moments and maximum‐likelihood estimators exist for diploid organisms, no such estimators exist for organisms with multiple ploidy levels, which occur in some insect and plant species. Here, we extend five estimators to account for different levels of ploidy: one relatedness‐coefficient estimator, three coancestry‐coefficient estimators, and one maximum‐likelihood estimator. We use arrhenotoky (when unfertilized eggs develop into haploid males) as an example in evaluations of estimator performance by Monte Carlo simulation. Three virtual sex‐determination systems are also simulated to evaluate performance at higher levels of ploidy. Additionally, we use two real data sets to test the robustness of these estimators under actual conditions. We make available a software package, PolyRelatedness, for other researchers to apply to organisms that have various levels of ploidy.

9.
There is a need for epidemiological and medical researchers to identify new biomarkers (biological markers) that are useful in determining exposure levels and/or for the purposes of disease detection. Often this process is hindered by the high testing costs associated with evaluating new biomarkers. Traditionally, biomarker assessments are individually tested within a target population. Pooling has been proposed to help alleviate the testing costs, where pools are formed by combining several individual specimens. Methods for using pooled biomarker assessments to estimate discriminatory ability have been developed. However, these procedures have not accounted for confounding factors. In this paper, we propose a regression methodology based on pooled biomarker measurements that allows assessment of the discriminatory ability of a biomarker of interest. In particular, we develop covariate‐adjusted estimators of the receiver‐operating characteristic curve, the area under the curve, and Youden's index. We establish the asymptotic properties of these estimators and develop inferential techniques that allow one to assess whether a biomarker is a good discriminator between cases and controls, while controlling for confounders. The finite sample performance of the proposed methodology is illustrated through simulation. We apply our methods to analyze myocardial infarction (MI) data, with the goal of determining whether the pro‐inflammatory cytokine interleukin‐6 is a good predictor of MI after controlling for the subjects' cholesterol levels.
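The accuracy summaries involved have the standard definitions (covariate adjustment replaces sensitivity and specificity with their covariate‐specific versions):

\[
\mathrm{AUC} \;=\; \int_0^1 \mathrm{ROC}(t)\,dt,
\qquad
J \;=\; \max_{c}\,\bigl\{\mathrm{Se}(c) + \mathrm{Sp}(c) - 1\bigr\},
\]

where \(\mathrm{ROC}(t)\) is the true‐positive rate at false‐positive rate \(t\), and Youden's index \(J\) is attained at the optimal threshold \(c\).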

10.
Completeness of registration is one of the quality indicators usually reported by cancer registries. This allows researchers to assess how useful and representative the data are. Several methods have been suggested to estimate completeness. In this paper, a multi‐state model for the process of cancer diagnosis and treatment is presented. In principle, every contact with a doctor during diagnosis, treatment, and aftercare can give rise to a cancer registry notification with a certain probability. The model therefore includes the states “incident tumour” and “death”, as well as contacts with doctors such as consultation of a general practitioner or specialist, diagnostic procedures, therapeutic interventions, and aftercare. In this model, transitions between states and possible notifications to a cancer registry after entering a state are simulated. Transition intensities are derived and used in the simulation. Several capture‐recapture methods have been applied to the simulated data. Both the simulated “true” numbers of new cases and the simulated numbers of registrations are available, which allows one to assess the validity of the completeness estimates and to compare the relative merits of the methods. In the scenarios investigated here, all capture‐recapture estimators tended to underestimate completeness. While a modified DCN method and one type of log‐linear model yielded quite reasonable estimates, other methods exhibited large variability or grossly underestimated completeness.

11.
To elucidate the molecular mechanisms underlying non‐alcoholic fatty liver disease (NAFLD), we recruited 86 subjects with varying degrees of hepatic steatosis (HS). We obtained experimental data on lipoprotein fluxes and used these individual measurements as personalized constraints of a hepatocyte genome‐scale metabolic model to investigate metabolic differences in liver, taking into account its interactions with other tissues. Our systems‐level analysis predicted an altered demand for NAD+ and glutathione (GSH) in subjects with high HS. Our analysis and metabolomic measurements showed that plasma levels of glycine, serine, and associated metabolites are negatively correlated with HS, suggesting that these GSH metabolism precursors might be limiting. Quantification of the hepatic expression levels of the associated enzymes further pointed to altered de novo GSH synthesis. To assess the effect of GSH and NAD+ repletion on the development of NAFLD, we added precursors for GSH and NAD+ biosynthesis to the Western diet and demonstrated that supplementation prevents HS in mice. In a proof‐of‐concept human study, we found improved liver function and decreased HS after supplementation with serine (a precursor to glycine) and thereby propose a strategy for NAFLD treatment.

12.
Count data sets are traditionally analyzed using the ordinary Poisson distribution. However, such a model has limited applicability, as it can be too restrictive to handle specific data structures. This creates the need for alternative models that accommodate, for example, (a) zero‐modification (inflation or deflation of the frequency of zeros), (b) overdispersion, and (c) individual heterogeneity arising from clustering or repeated (correlated) measurements made on the same subject. Cases (a)–(b) and (b)–(c) are often treated together in the statistical literature, with several practical applications, but models supporting all three at once are less common. Hence, this paper's primary goal was to jointly address these issues by deriving a mixed‐effects regression model based on the hurdle version of the Poisson–Lindley distribution. In this framework, the zero‐modification is incorporated by assuming that a binary probability model determines which outcomes are zero‐valued, and a zero‐truncated process is responsible for generating positive observations. Approximate posterior inferences for the model parameters were obtained from a fully Bayesian approach based on the Adaptive Metropolis algorithm. Intensive Monte Carlo simulation studies were performed to assess the empirical properties of the Bayesian estimators. The proposed model was considered for the analysis of a real data set, and its competitiveness regarding some well‐established mixed‐effects models for count data was evaluated. A sensitivity analysis to detect observations that may impact parameter estimates was performed based on standard divergence measures. The Bayesian p‐value and the randomized quantile residuals were considered for model diagnostics.
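The hurdle structure described above can be sketched as (notation assumed here)

\[
P(Y = 0) \;=\; \pi,
\qquad
P(Y = y) \;=\; (1 - \pi)\,\frac{f_{\mathrm{PL}}(y;\theta)}{1 - f_{\mathrm{PL}}(0;\theta)},
\quad y = 1, 2, \ldots,
\]

where \(f_{\mathrm{PL}}\) is the Poisson–Lindley probability mass function and \(\pi\) is the zero probability; in the mixed‐effects regression, both \(\pi\) and \(\theta\) would be linked to covariates and subject‐level random effects.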

13.
A nonparametric model for the multivariate one‐way design is discussed which entails continuous as well as discontinuous distributions and, therefore, allows for ordinal data. Nonparametric hypotheses are formulated in terms of the normalized versions of the marginal distribution functions as well as the common distribution functions. The differences between the distribution functions are described by means of the so‐called relative treatment effects, for which unbiased and consistent estimators are derived. The asymptotic distribution of the vector of the effect estimators is derived, and under the marginal hypothesis a consistent estimator for the asymptotic covariance matrix is given. Nonparametric versions of the Wald‐type statistic, the ANOVA‐type statistic, and the Lawley–Hotelling statistic are considered and compared by means of a simulation study. Finally, these tests are applied to a psychiatric clinical trial.
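Relative treatment effects in this nonparametric framework are commonly defined as (a sketch in standard notation; details may differ from the paper's)

\[
p_i \;=\; \int H \, dF_i,
\qquad
H \;=\; \frac{1}{d}\sum_{j=1}^{d} F_j,
\]

where \(F_i\) is the normalized distribution function of group \(i\) (the average of its left‐ and right‐continuous versions, which accommodates ties and hence ordinal data) and \(H\) is the mean distribution function; \(p_i > 1/2\) indicates that observations from group \(i\) tend to be larger than an observation drawn from the pooled distribution.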

14.
The meta‐analysis of diagnostic accuracy studies is often of interest in screening programs for many diseases. The typical summary statistics for studies chosen for a diagnostic accuracy meta‐analysis are two‐dimensional: sensitivities and specificities. The common statistical analysis approach for the meta‐analysis of diagnostic studies is based on the bivariate generalized linear mixed model (BGLMM), which has study‐specific interpretations. In this article, we present a population‐averaged (PA) model using generalized estimating equations (GEE) for making inference on the mean specificity and sensitivity of a diagnostic test in the population represented by the meta‐analytic studies. We also derive the marginalized counterparts of the regression parameters from the BGLMM. We illustrate the proposed PA approach through two data set examples and compare the performance of estimators of the marginal regression parameters from the PA model with those of the marginalized regression parameters from the BGLMM through Monte Carlo simulation studies. Overall, both the marginalized BGLMM and GEE with sandwich standard errors maintained nominal 95% confidence interval coverage levels for mean specificity and mean sensitivity in meta‐analyses of 25 or more studies, even under misspecification of the covariance structure of the bivariate positive test counts for diseased and nondiseased subjects.
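The study‐specific versus population‐averaged distinction can be sketched as follows (a logit link and normal random effects are assumed here): under a BGLMM with study‐level random effect \(b_i \sim N(0, \sigma^2)\), the marginalized (PA) sensitivity is

\[
\mathrm{Se}^{\mathrm{PA}} \;=\; \int \operatorname{expit}(\beta_{\mathrm{se}} + b)\,\phi(b; 0, \sigma^2)\, db,
\]

which generally differs from \(\operatorname{expit}(\beta_{\mathrm{se}})\), the sensitivity of a study with \(b_i = 0\); the GEE approach targets the PA quantity directly.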

15.
Designs incorporating more than one endpoint have become popular in drug development. One such design allows short‐term information to be incorporated in an interim analysis when the long‐term primary endpoint has not yet been observed for some of the patients. We first consider a two‐stage design with binary endpoints allowing for futility stopping only, based on conditional power under both fixed and observed effects. Design characteristics of three estimators are compared: one using the long‐term primary endpoint only, one using the short‐term endpoint only, and one combining data from both. For each approach, equivalent cut‐off values for fixed‐ and observed‐effect conditional power calculations can be derived, resulting in the same overall power. While in trials stopping for futility the Type I error rate cannot be inflated (it usually decreases), there is a loss of power. In this study, we consider different scenarios, including different thresholds for conditional power, different amounts of information available at the interim, and different correlations and probabilities of success. We further extend the methods to adaptive designs with unblinded sample size reassessment based on conditional power, with the inverse normal method as the combination function. Two different futility stopping rules are considered: one based on the conditional power, and one based on P‐values from the Z‐statistics of the estimators. The average sample size, the probability of stopping for futility, and the overall power of the trial are compared, and the influence of the choice of weights is investigated.
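The inverse normal combination mentioned above has the standard form

\[
Z \;=\; w_1\,\Phi^{-1}(1 - p_1) \;+\; w_2\,\Phi^{-1}(1 - p_2),
\qquad
w_1^2 + w_2^2 = 1,
\]

where \(p_1\) and \(p_2\) are the stage‐wise P‐values and the prespecified weights \(w_1, w_2\) control each stage's contribution; because the weights are fixed in advance, the Type I error rate is preserved even after a data‐driven sample size reassessment.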

16.
Recently, meta‐analysis has been widely utilized to combine information across multiple studies to evaluate a common effect. Integrating data from similar studies is particularly useful in genomic studies, where the individual study sample sizes are not large relative to the number of parameters of interest. In this article, we are interested in developing robust prognostic rules for the prediction of t‐year survival based on multiple studies. We propose to construct a composite score for prediction by fitting a stratified semiparametric transformation model that allows the studies to have related but not identical outcomes. To evaluate the accuracy of the resulting score, we provide point and interval estimators for the commonly used accuracy measures, including the time‐specific receiver operating characteristic curves, and positive and negative predictive values. We apply the proposed procedures to develop prognostic rules for the 5‐year survival of breast cancer patients based on five breast cancer genomic studies.
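Semiparametric transformation models of the kind referred to here are commonly written as (a hedged sketch; the stratified version lets the transformation vary by study)

\[
h_k(T) \;=\; -\beta^{\top} Z + \varepsilon,
\]

where \(T\) is the survival time, \(Z\) the covariate (gene expression) vector, \(h_k\) an unspecified monotone increasing function for study \(k\), and \(\varepsilon\) has a known distribution; an extreme‐value \(\varepsilon\) yields a proportional hazards model and a standard logistic \(\varepsilon\) a proportional odds model, while the common \(\beta\) provides the composite score \(\beta^{\top} Z\).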

17.
Multistate models can be successfully used for describing complex event history data, for example, stages in the disease progression of a patient. The so‐called “illness‐death” model plays a central role in the theory and practice of these models. Many time‐to‐event datasets from medical studies with multiple end points can be reduced to this generic structure. In these models, one important goal is the modeling of transition rates, but biomedical researchers are also interested in reporting interpretable results in a simple and summarized manner. These include estimates of predictive probabilities, such as the transition probabilities, occupation probabilities, cumulative incidence functions, and the sojourn time distributions. We give a review of some of the available methods for estimating such quantities in the progressive illness‐death model, conditionally (or not) on covariate measures. For some of these quantities, estimators based on subsampling are employed. Subsampling, also referred to as landmarking, leads to small sample sizes and usually to heavily censored data, leading to estimators with higher variability. To overcome this issue, estimators based on a preliminary estimation (presmoothing) of the probability of censoring may be used. Among these, the presmoothed estimators for the cumulative incidences are new. We also introduce feasible estimation methods for the cumulative incidence function conditionally on covariate measures. The proposed methods are illustrated using real data. A comparative simulation study of several estimation approaches is performed, and existing software in the form of R packages is discussed.
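The central quantities are the transition probabilities of the three‐state illness‐death process \(X(t)\) (healthy, ill, dead); for states \(h, j\) and \(0 \le s \le t\),

\[
p_{hj}(s, t) \;=\; P\bigl(X(t) = j \mid X(s) = h\bigr),
\]

which under a Markov assumption can be estimated by the Aalen–Johansen estimator, the product‐integral \(\widehat{\mathbf{P}}(s, t) = \prod_{(s, t]} \bigl(\mathbf{I} + d\widehat{\mathbf{A}}(u)\bigr)\) of the estimated cumulative transition intensities; the landmark (subsampling) estimators reviewed in the paper avoid the Markov assumption at the cost of smaller effective sample sizes.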

18.
The receiver operating characteristic (ROC) curve is often used to assess the usefulness of a diagnostic test. We present a new method to estimate the parameters of a popular semi‐parametric ROC model, called the binormal model. Our method is based on minimization of the functional distance between two estimators of an unknown transformation postulated by the model, and has a simple, closed‐form solution. We study the asymptotics of our estimators, show via simulation that they compare favorably with existing estimators, and illustrate how covariates may be incorporated into the norm minimization framework.
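For reference, the binormal model postulates an unknown monotone transformation after which the test results are normally distributed in both the diseased and nondiseased groups, giving the ROC curve

\[
\mathrm{ROC}(t) \;=\; \Phi\bigl(a + b\,\Phi^{-1}(t)\bigr), \qquad t \in (0, 1),
\]

where \(\Phi\) is the standard normal distribution function and \(a\) and \(b\) (the intercept and slope) are the parameters to be estimated.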

19.
Remarkably little attention has been paid to using the high resolution provided by genotyping‐by‐sequencing (i.e., RADseq and similar methods) for assessing relatedness in wildlife populations. A major hurdle is the genotyping error, especially allelic dropout, often found in this type of data, which could lead to downward‐biased, yet precise, estimates of relatedness. Here, we assess the applicability of genotyping‐by‐sequencing for relatedness inferences given its relatively high genotyping error rate. Individuals of known relatedness were simulated under genotyping error, allelic dropout, and missing data scenarios based on an empirical ddRAD data set, and their true relatedness was compared to that estimated by seven relatedness estimators. We found that an estimator chosen through such analyses can circumvent the influence of genotyping error, with the estimator of Ritland (Genetics Research, 67, 175) shown to be unaffected by allelic dropout and to be the most accurate when there is genotyping error. We also found that the choice of estimator should not rely solely on the strength of correlation between estimated and true relatedness, as a strong correlation does not necessarily mean estimates are close to true relatedness. We also demonstrated how even a large SNP data set with genotyping error (allelic dropout or otherwise) or missing data still performs better than a perfectly genotyped microsatellite data set of tens of markers. The simulation‐based approach used here can be easily implemented by others on their own genotyping‐by‐sequencing data sets to confirm the most appropriate and powerful estimator for their data.

20.
In the capture‐recapture problem for two independent samples, the traditional estimator, calculated as the product of the two sample sizes divided by the number of sampled subjects appearing in both samples, is well known to be a biased estimator of the population size and to have no finite variance under direct or binomial sampling. To alleviate these theoretical limitations, inverse sampling, in which we continue sampling subjects in the second sample until we obtain a desired number of marked subjects who appeared in the first sample, has been proposed elsewhere. In this paper, we consider five interval estimators of the population size, including the most commonly used interval estimator based on Wald's statistic, the interval estimator using the logarithmic transformation, the interval estimator derived from a quadratic equation developed here, the interval estimator using the χ2‐approximation, and the interval estimator based on the exact negative binomial distribution. To evaluate and compare the finite sample performance of these estimators, we employ Monte Carlo simulation to calculate the coverage probability and the standardized average length of the resulting confidence intervals in a variety of situations. To study the location of these interval estimators, we calculate the non‐coverage probability in the two tails of the confidence intervals. Finally, we briefly discuss optimal sample size determination for a given precision to minimize the expected total cost.
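The traditional (Lincoln–Petersen) estimator described in the opening sentence is

\[
\hat{N} \;=\; \frac{n_1 n_2}{m},
\]

where \(n_1\) and \(n_2\) are the two sample sizes and \(m\) is the number of marked subjects appearing in both samples; under inverse sampling, \(m\) is fixed in advance and \(n_2\) is the random number of draws needed to obtain it, which avoids the degenerate \(m = 0\) case and motivates the interval based on the exact negative binomial distribution.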
