Similar articles
20 similar articles found.
1.
When novel scientific questions arise after longitudinal binary data have been collected, the subsequent selection of subjects from the cohort for whom further detailed assessment will be undertaken is often necessary to efficiently collect new information. Key examples of additional data collection include retrospective questionnaire data, novel data linkage, or evaluation of stored biological specimens. In such cases, all data required for the new analyses are available except for the new target predictor or exposure. We propose a class of longitudinal outcome-dependent sampling schemes and detail a design-corrected conditional maximum likelihood analysis for highly efficient estimation of time-varying and time-invariant covariate coefficients when resource limitations prohibit exposure ascertainment on all participants. Additionally, we detail an important study planning phase that exploits available cohort data to proactively examine the feasibility of any proposed substudy as well as to inform decisions regarding the most desirable study design. The proposed designs and associated analyses are discussed in the context of a study that seeks to examine the modifying effect of an interleukin-10 cytokine single nucleotide polymorphism on asthma symptom regression in adolescents participating in the Childhood Asthma Management Program Continuation Study. Using this example we assume that all data necessary to conduct the study are available except subject-specific genotype data. We also assume that these data would be ascertained by analyzing stored blood samples, the cost of which limits the sample size.
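The core sampling idea, ascertaining the costly exposure on all subjects with extreme outcome profiles while subsampling the rest, can be sketched as below. The function name, the three-stratum rule, and the probabilities are illustrative assumptions; the paper's design-corrected conditional maximum likelihood analysis is not reproduced here.

```python
import random

def ods_select(outcomes, select_prob, seed=0):
    """Outcome-dependent selection of subjects for costly exposure
    ascertainment (a minimal sketch, not the paper's full design).

    outcomes    : dict subject_id -> list of 0/1 longitudinal responses
    select_prob : dict stratum ('none'/'some'/'all') -> sampling probability
    Returns the subject ids selected for exposure measurement."""
    rng = random.Random(seed)
    chosen = []
    for sid, ys in outcomes.items():
        total = sum(ys)
        # stratify by the subject's response total: never, sometimes, always
        stratum = 'none' if total == 0 else ('all' if total == len(ys) else 'some')
        if rng.random() < select_prob[stratum]:
            chosen.append(sid)
    return chosen
```

Setting, for example, `{'none': 1.0, 'all': 1.0, 'some': 0.25}` concentrates the genotyping budget on the most informative extreme response profiles.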

2.
The nested case–control design is a relatively new type of observational study whereby a case–control approach is employed within an established cohort. In this design, we observe cases and controls longitudinally by sampling all cases whenever they occur but controls at certain time points. Controls can be obtained at time points randomly scheduled or prefixed for operational convenience. This design with longitudinal observations is efficient in terms of cost and duration, especially when the disease is rare and the assessment of exposure levels is difficult. In our design, we propose sequential sampling methods and study both (group) sequential testing and estimation methods so that the study can be stopped as soon as the stopping rule is satisfied. To make such longitudinal sampling more efficient in terms of both the number of subjects and the number of replications, we propose applying sequential sampling methods to subjects and replications simultaneously, until the information criterion is fulfilled. This simultaneous sequential sampling of subjects and replicates is more flexible for practitioners designing their sampling schemes, and differs from the classical approaches used in longitudinal studies. We define a new σ-field to accommodate our proposed sampling scheme, which contains mixtures of independent and correlated observations, and prove the asymptotic optimality of sequential estimation based on martingale theory. We also prove that the independent increment structure is retained so that the group sequential method is applicable. Finally, we present results by employing sequential estimation and group sequential testing on both simulated data and real data on children's diarrhea.

3.
Cai J, Zeng D. Biometrics 2004, 60(4):1015-1024
In epidemiologic studies and disease prevention trials, interest often involves estimation of the relationship between some disease endpoints and individual exposure. In some studies, due to the rarity of the disease and the cost of collecting the exposure information for the entire cohort, a case-cohort design, which consists of a small random sample of the whole cohort and all the diseased subjects, is often used. Previous work has focused on analyzing data from the case-cohort design, and few authors have discussed sample size issues. In this article, we describe two tests for the case-cohort design, which can be treated as natural generalizations of the log-rank test in the full-cohort design. We derive an explicit form for power/sample size calculation based on these two tests. A number of simulation studies are used to illustrate the efficiency of the tests for the case-cohort design. An example illustrates how to use the formula.
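The flavor of such power/sample size calculations can be seen from Schoenfeld's classical required-events formula for the log-rank test. Note this sketch is the standard full-cohort formula only, not the case-cohort correction derived in the article.

```python
from math import ceil, log
from statistics import NormalDist

def logrank_events(hr, alpha=0.05, power=0.8, alloc=0.5):
    """Required number of events for a two-sided log-rank test
    (Schoenfeld's formula, full-cohort design).

    hr    : hazard ratio to detect
    alloc : proportion of subjects allocated to one arm"""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # critical value
    z_b = NormalDist().inv_cdf(power)          # power quantile
    return ceil((z_a + z_b) ** 2 / (alloc * (1 - alloc) * log(hr) ** 2))
```

For example, detecting a hazard ratio of 0.5 with 80% power at two-sided α = 0.05 and 1:1 allocation requires 66 events; hazard ratios closer to 1 require many more.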

4.
In a typical case-control study, exposure information is collected at a single time point for the cases and controls. However, case-control studies are often embedded in existing cohort studies containing a wealth of longitudinal exposure history about the participants. Recent medical studies have indicated that incorporating past exposure history, or a constructed summary measure of cumulative exposure derived from the past exposure history, when available, may lead to more precise and clinically meaningful estimates of the disease risk. In this article, we propose a flexible Bayesian semiparametric approach to model the longitudinal exposure profiles of the cases and controls and then use measures of cumulative exposure based on a weighted integral of this trajectory in the final disease risk model. The estimation is done via a joint likelihood. In the construction of the cumulative exposure summary, we introduce an influence function, a smooth function of time that characterizes the association pattern of the exposure profile with the disease status, with different time windows potentially having differential influence/weights. This enables us to analyze how the present disease status of a subject is influenced by his/her past exposure history conditional on the current exposure. The joint likelihood formulation allows us to properly account for uncertainties associated with both stages of the estimation process in an integrated manner. Analysis is carried out in a hierarchical Bayesian framework using reversible jump Markov chain Monte Carlo algorithms. The proposed methodology is motivated by, and applied to, a case-control study of prostate cancer where longitudinal biomarker information is available for the cases and controls.

5.
Chen J, Chatterjee N. Biometrics 2006, 62(1):28-35
Genetic epidemiologic studies often collect genotype data at multiple loci within a genomic region of interest from a sample of unrelated individuals. One popular method for analyzing such data is to assess whether haplotypes, i.e., the arrangements of alleles along individual chromosomes, are associated with the disease phenotype or not. For many study subjects, however, the exact haplotype configuration on the pair of homologous chromosomes cannot be derived with certainty from the available locus-specific genotype data (phase ambiguity). In this article, we consider estimating haplotype-specific association parameters in the Cox proportional hazards model, using genotype, environmental exposure, and the disease endpoint data collected from cohort or nested case-control studies. We study alternative Expectation-Maximization algorithms for estimating haplotype frequencies from cohort and nested case-control studies. Based on a hazard function of the disease derived from the observed genotype data, we then propose a semiparametric method for joint estimation of relative-risk parameters and the cumulative baseline hazard function. The method is greatly simplified under a rare disease assumption, for which an asymptotic variance estimator is also proposed. The performance of the proposed estimators is assessed via simulation studies. An application of the proposed method is presented, using data from the Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study.
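The phase-ambiguity problem and its EM treatment can be illustrated in the simplest two-locus setting, where only the double heterozygote has ambiguous phase. This is a generic textbook EM for haplotype frequencies, assuming Hardy-Weinberg equilibrium, not the cohort/nested case-control variants studied in the article.

```python
def em_haplotype(genotypes, n_iter=200):
    """EM estimation of two-locus haplotype frequencies from unphased
    genotypes. Each genotype is (copies of variant allele at locus 1,
    copies at locus 2), values in {0, 1, 2}. Haplotypes (a, b) are
    indexed as 2*a + b: 00, 01, 10, 11."""
    idx = lambda a, b: 2 * a + b
    p = [0.25] * 4                      # start from uniform frequencies
    n = len(genotypes)
    for _ in range(n_iter):
        c = [0.0] * 4                   # E-step: expected haplotype counts
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # double heterozygote: phase ambiguous (00/11 vs 01/10);
                # split the two haplotypes by current relative likelihood
                num = p[idx(0, 0)] * p[idx(1, 1)]
                den = num + p[idx(0, 1)] * p[idx(1, 0)]
                w = num / den if den > 0 else 0.5
                c[idx(0, 0)] += w
                c[idx(1, 1)] += w
                c[idx(0, 1)] += 1 - w
                c[idx(1, 0)] += 1 - w
            elif g1 == 1:               # heterozygous at locus 1 only
                c[idx(0, g2 // 2)] += 1
                c[idx(1, g2 // 2)] += 1
            elif g2 == 1:               # heterozygous at locus 2 only
                c[idx(g1 // 2, 0)] += 1
                c[idx(g1 // 2, 1)] += 1
            else:                       # homozygous at both loci
                c[idx(g1 // 2, g2 // 2)] += 2
        p = [ci / (2 * n) for ci in c]  # M-step: normalize counts
    return p
```

With genotypes {(0,0), (0,0), (1,1)} the double heterozygote is resolved toward the 00/11 phase because the 00 haplotype dominates, giving frequencies near (5/6, 0, 0, 1/6).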

6.
Biomedical researchers are often interested in estimating the effect of an environmental exposure in relation to a chronic disease endpoint. However, the exposure variable of interest may be measured with errors. In a subset of the whole cohort, a surrogate variable is available for the true unobserved exposure variable. The surrogate variable satisfies an additive measurement error model, but it may not have repeated measurements. The subset in which the surrogate variables are available is called a calibration sample. In addition to the surrogate variables that are available among the subjects in the calibration sample, we consider the situation when there is an instrumental variable available for all study subjects. An instrumental variable is correlated with the unobserved true exposure variable, and hence can be useful in the estimation of the regression coefficients. In this paper, we propose a nonparametric method for Cox regression using the observed data from the whole cohort. The nonparametric estimator is the best linear combination of a nonparametric correction estimator from the calibration sample and the difference of the naive estimators from the calibration sample and the whole cohort. The asymptotic distribution is derived, and the finite sample performance of the proposed estimator is examined via intensive simulation studies. The methods are applied to the Nutritional Biomarkers Study of the Women's Health Initiative.

7.
Chan KC, Wang MC. Biometrics 2012, 68(2):521-531
A prevalent sample consists of individuals who have experienced disease incidence but not the failure event at the sampling time. We discuss methods for estimating the distribution function of a random vector defined at baseline for an incident disease population when data are collected by prevalent sampling. A prevalent sampling design is often more focused and economical than an incident study design for studying the survival distribution of a diseased population, but prevalent samples are biased by design. Subjects with longer survival time are more likely to be included in a prevalent cohort, and other baseline variables of interest that are correlated with survival time are also subject to sampling bias induced by the prevalent sampling scheme. Without recognition of the bias, applying the empirical distribution function to estimate the population distribution of baseline variables can lead to serious bias. In this article, nonparametric and semiparametric methods are developed for distribution estimation of baseline variables using prevalent data.
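The bias described here is length bias: under stationary disease incidence, a subject enters the prevalent sample with probability proportional to the survival time T, so inverse-size (1/T) weighting recovers the incident-population distribution. Below is a minimal Horvitz-Thompson-style sketch of this correction; the function names are illustrative and the article's estimators are more general.

```python
def naive_mean(x):
    """Unweighted sample mean: biased under prevalent (length-biased) sampling."""
    return sum(x) / len(x)

def length_bias_corrected_mean(x, t):
    """Inverse-size weighted mean of a baseline covariate x, with
    observed survival times t, assuming sampling probability is
    proportional to t (stationarity assumption)."""
    w = [1.0 / ti for ti in t]
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
```

As a check, build a prevalent sample by replicating each incident subject in proportion to survival time: with incident subjects (x=0, T=1) and (x=1, T=3), the prevalent sample is one copy of the first and three of the second. The naive mean of x is 0.75, while the weighted mean recovers the incident-population mean of 0.5.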

8.
Chorionic villus sampling (CVS) is a valued method of prenatal diagnosis that is often preferred over amniocentesis because it can be performed earlier, but which has also raised concern over a possible association with increased risk of terminal transverse limb deficiency (TTLD). We present and apply a meta-analytic method for estimating a combined dose-response effect from a series of case-control and cohort studies in which the exposure variable is interval-censored. Assuming coarsening at random for the interval-censoring, and calling upon the familiar result of Cornfield to pool case-control and cohort information on the association between a rare binary outcome and a multilevel exposure variable, we form a likelihood-based model to assess the effect of gestational age at the time of CVS on the presence or absence of a rare birth defect. Effect estimates are computed with a variant of the EM algorithm termed the method of weights, which enables the use of standard weighted regression software. Our findings suggest that CVS exposure at early gestational age leads to an increased risk of TTLD.

9.
With the rapid development of biomarkers and new technologies, large-scale biologically based cohort studies present expanding opportunities for population-based research on disease etiology and early detection markers. The Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial is a large randomized trial designed to determine whether screening for these cancers leads to mortality reduction for these diseases. Within the trial, the PLCO Etiology and Early Marker Study (EEMS) identifies risk factors for cancer and other diseases and evaluates biologic markers for the early detection of disease. EEMS includes 155,000 volunteers who provide basic risk factor information. Serial blood samples are collected at each of six screening rounds (including one collection for cryopreserved whole blood) from screening-arm participants (77,000 subjects), and buccal cells are collected from those in the control arm of the trial. Etiologic studies consider environmental (e.g., diet), biochemical, and genetic factors. Early detection studies focus on blood-based biologic markers of early disease. Clinical epidemiology is also an important component of the PLCO trial.

10.
The design and analysis of case-control studies with biased sampling
A design is proposed for case-control studies in which selection of subjects for full variable ascertainment is based jointly on disease status and on easily obtained "screening" variables that may be related to the disease. Recruitment of subjects follows an independent Bernoulli sampling scheme, with recruitment probabilities set by the investigator in advance. In particular, the sampling can be set up to achieve, on average, frequency matching, provided prior estimates of the disease rates or odds ratios associated with screening variables such as age and sex are available. Alternatively--for example, when studying a rare exposure--one can enrich the sample with certain categories of subject. Following such a design, there are two valid approaches to logistic regression analysis, both of which allow for efficient estimation of effects associated with the screening variables that were allowed to bias the recruitment. The statistical properties of the estimators are compared, both for large samples, based on asymptotics, and for small samples, based on simulations.
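A sketch of how the Bernoulli recruitment probabilities might be set to achieve frequency matching on average, assuming prior stratum-specific case and control counts are available. The interface and names are illustrative, not the paper's notation.

```python
def recruitment_probs(case_counts, control_counts, ratio=1.0):
    """Per-stratum Bernoulli recruitment probabilities for controls that
    frequency-match cases on average at the given control:case ratio.
    Cases are assumed recruited with probability 1.

    case_counts / control_counts : dict stratum -> expected count"""
    probs = {}
    for s, n_case in case_counts.items():
        n_ctrl = control_counts[s]
        # expected recruited controls = prob * n_ctrl = ratio * n_case
        probs[s] = min(1.0, ratio * n_case / n_ctrl) if n_ctrl else 0.0
    return probs
```

For example, with 10 young and 40 old cases expected against 200 young and 100 old potential controls, probabilities of 0.05 and 0.4 recruit, on average, 10 young and 40 old controls, matching the case frequencies.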

11.
We introduce a method of parameter estimation for a random effects cure rate model. We also propose a methodology that allows us to account for nonignorable missing covariates in this class of models. The proposed method corrects for possible bias introduced by complete case analysis when missing data are not missing completely at random and is motivated by data from a pair of melanoma studies conducted by the Eastern Cooperative Oncology Group in which clustering by cohort or time of study entry was suspected. In addition, these models allow estimation of cure rates, which is desirable when we do not wish to assume that all subjects remain at risk of death or relapse from disease after sufficient follow-up. We develop an EM algorithm for the model and provide an efficient Gibbs sampling scheme for carrying out the E-step of the algorithm.

12.
BACKGROUND: Uveitis is an autoimmune disease of the eye that refers to any of a number of intraocular inflammatory conditions. Because it is a rare disease, uveitis is often overlooked, and the possible associations between uveitis and extra-ocular disease manifestations are not well known. The aim of this study was to characterise uveitis in a large sample of patients and to evaluate the relationship between uveitis and systemic diseases. METHODS: The present study is a cross-sectional study of a cohort of patients with uveitis. Records from consecutive uveitis patients who were seen by the Uveitis Service in the Department of Ophthalmology at the Medical University of Vienna between 1995 and 2009 were selected from the clinical databases. The cases were classified according to the Standardization of Uveitis Nomenclature Study Group criteria for uveitis. RESULTS: Data were available for 2619 patients, of whom 59.9% suffered from anterior, 14.8% from intermediate, 18.3% from posterior and 7.0% from panuveitis. 37.2% of all cases showed an association between uveitis and extra-ocular diseases; diseases with primarily arthritic manifestations were seen in 10.1% of all cases, non-infectious systemic diseases (e.g., Behcet's disease, sarcoidosis or multiple sclerosis) in 8.4% and infectious uveitis in 18.7%. 49.4% of subjects suffering from anterior uveitis tested positively for the HLA-B27 antigen. Among posterior uveitis cases, 29% were caused by ocular toxoplasmosis and 17.7% by multifocal choroiditis. CONCLUSION: Ophthalmologists, rheumatologists, infectiologists, neurologists and general practitioners should be familiar with the differential diagnosis of uveitis. A better interdisciplinary approach could help in tailoring the work-up, earlier diagnosis of co-existing diseases and management of uveitis patients.

13.
Inherited malformations and genetic disorders together affect around 3-5% of newborns. This frequency is much higher in the early stages of pregnancy, because serious malformations and genetic disorders usually lead to spontaneous abortion. Prenatal diagnosis allows identification of malformations and/or some genetic syndromes in fetuses during the first trimester of pregnancy. A decision about the subsequent course of the pregnancy should then be made in light of the severity of the disorder, the possibilities for treatment, the parents' acceptance of a handicapped child and, in some cases, the possibility of terminating the pregnancy. Prenatal testing includes both screening and diagnostic procedures. Screening procedures such as first- and second-trimester biochemical and/or ultrasound screening, first-trimester combined ultrasound/biochemical screening and integrated screening should be widely offered to pregnant women. However, interpretation of screening results requires awareness of both the sensitivity and the predictive value of these procedures. In prenatal diagnosis, ultrasound/MRI examination as well as genetic procedures are offered to pregnant women. A variety of approaches for genetic prenatal analyses are now available, including preimplantation diagnosis, chorionic villus sampling, amniocentesis and fetal blood sampling, as well as promising experimental procedures (e.g., fetal cell and DNA isolation from maternal blood). Remarkable progress in genetic methods has opened new possibilities for accurate genetic diagnosis. Although karyotyping is widely accepted as the gold standard, a discussion is ongoing throughout Europe about shifting to new genetic techniques that allow rapid results in prenatal diagnosis of aneuploidy (e.g., RAPID-FISH, MLPA, quantitative PCR).

14.
Cook RJ, Brumback BB, Wigg MB, Ryan LM. Biometrics 2001, 57(3):671-680
We describe a method for assessing dose-response effects from a series of case-control and cohort studies in which the exposure information is interval censored. The interval censoring of the exposure variable is dealt with through the use of retrospective models in which the exposure is treated as a multinomial response and disease status as a binary covariate. Polychotomous logistic regression models are adopted in which the dose-response relationship between exposure and disease may be modeled in a discrete or continuous fashion. Partial conditioning is possible to eliminate some of the nuisance parameters. The methods are applied to the motivating study of the relationship between chorionic villus sampling and the occurrence of terminal transverse limb reduction.

15.
The thyroid gland in children is one of the organs that is most sensitive to external exposure to X and gamma rays. However, data on the risk of thyroid cancer in children after exposure to radioactive iodines are sparse. The Chornobyl accident in Ukraine in 1986 led to the exposure of large populations to radioactive iodines, particularly (131)I. This paper describes an ongoing cohort study being conducted in Belarus and Ukraine that includes 25,161 subjects under the age of 18 years in 1986 who are being screened for thyroid diseases every 2 years. Individual thyroid doses are being estimated for all study subjects based on measurement of the radioactivity of the thyroid gland made in 1986 together with a radioecological model and interview data. Approximately 100 histologically confirmed thyroid cancers were detected as a consequence of the first round of screening. The data will enable fitting of appropriate dose-response models, which are important in both radiation epidemiology and public health for prediction of risks from exposure to radioactive iodines from medical sources and any future nuclear accidents. Plans are to continue to follow up the cohort for at least three screening cycles, which will lead to more precise estimates of risk.

16.
Case-control designs are widely used in rare disease studies. In a typical case-control study, data are collected from a sample of all available subjects who have experienced a disease (cases) and a sub-sample of subjects who have not experienced the disease (controls) in a study cohort. Cases are oversampled in case-control studies. Logistic regression is a common tool to estimate the relative risks of the disease with respect to a set of covariates. Very often in such a study, information on ages at onset of the disease for all cases and ages at survey of controls is known. Standard logistic regression analysis using age as a covariate is based on a dichotomous outcome and does not efficiently use such age-at-onset (time-to-event) information. We propose to analyze age-at-onset data using a modified case-cohort method by treating the control group as an approximation of a subcohort, assuming rare events. We investigate the asymptotic bias of this approximation and show that the asymptotic bias of the proposed estimator is small when the disease rate is low. We evaluate the finite sample performance of the proposed method through a simulation study and illustrate the method using a breast cancer case-control data set.
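The premise that cases can be heavily oversampled yet logistic regression still recovers the covariate effect (only the intercept absorbs the sampling) can be checked numerically. The sketch below illustrates that standard result with a hand-rolled Newton-Raphson fit; the simulation parameters are illustrative, and this is not the modified case-cohort method proposed in the article.

```python
import numpy as np

def fit_logistic(x, y, n_iter=25):
    """Newton-Raphson fit of a logistic model with intercept (no penalty)."""
    Z = np.column_stack([np.ones(len(y)), x])
    b = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ b))
        H = Z.T @ (Z * (p * (1 - p))[:, None])   # Fisher information
        b = b + np.linalg.solve(H, Z.T @ (y - p))
    return b

# simulate a cohort with logit P(Y=1 | x) = -2 + 1.0 * x
rng = np.random.default_rng(42)
n = 20000
x = rng.normal(size=n)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-2.0 + x))))

# case-control sample: keep all cases, an equal number of random controls
cases = np.flatnonzero(y == 1)
controls = rng.choice(np.flatnonzero(y == 0), size=cases.size, replace=False)
idx = np.concatenate([cases, controls])
b_cc = fit_logistic(x[idx], y[idx])
```

The fitted slope `b_cc[1]` stays near the true value 1.0 despite the biased sampling, while the intercept `b_cc[0]` is shifted upward from -2 by roughly the log of the control sampling-fraction ratio.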

17.
With advances in modern medicine and clinical diagnosis, case–control data with characterization of finer subtypes of cases are often available. In matched case–control studies, missingness in exposure values often leads to deletion of the entire stratum, and thus entails a significant loss of information. When subtypes of cases are treated as categorical outcomes, the data are further stratified and deletion of observations becomes even more expensive in terms of precision of the category-specific odds-ratio parameters, especially under the multinomial logit model. The stereotype regression model for categorical responses lies intermediate between the proportional odds model and the multinomial or baseline category logit model. The use of this class of models has been limited, as the structure of the model implies certain inferential challenges with nonidentifiability and nonlinearity in the parameters. We illustrate how to handle missing data in matched case–control studies with finer disease subclassification within the cases under a stereotype regression model. We present both a Monte Carlo-based fully Bayesian approach and an expectation/conditional maximization algorithm for the estimation of model parameters in the presence of a completely general missingness mechanism. We illustrate our methods by using data from an ongoing matched case–control study of colorectal cancer. Simulation results are presented under various missing data mechanisms and departures from modeling assumptions.

18.
The performance of diagnostic tests is often evaluated by estimating their sensitivity and specificity with respect to a traditionally accepted standard test regarded as a "gold standard" in making the diagnosis. Correlated samples of binary data arise in many fields of application. In site-specific studies, the fundamental unit for analysis is occasionally the site rather than the subject. Because site-specific results within a subject can be highly correlated, statistical methods that take the within-subject correlation into account should be employed to estimate the sensitivity and the specificity of diagnostic tests. I introduce several statistical methods for the estimation of the sensitivity and the specificity of site-specific diagnostic tests. I apply these techniques to data from a study involving an enzymatic diagnostic test to motivate and illustrate the estimation of the sensitivity and the specificity of periodontal diagnostic tests. I present results from a simulation study for the estimation of diagnostic sensitivity when the data are correlated within subjects. Through the simulation study, I compare the performance of the binomial estimator, the ratio estimator, the weighted estimator, the intracluster correlation estimator, and the generalized estimating equation (GEE) estimator in terms of bias, observed variance, mean squared error (MSE), relative efficiency of their variances and 95 per cent coverage proportions. I recommend using the binomial estimator when σ = 0, the weighted estimator when σ = 0.6, and the GEE estimator when σ = 0.2 or σ = 0.4 and the number of subjects is at least 30.
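A minimal sketch of two of the compared estimators for a common proportion from clustered (site-within-subject) binary data: the pooled ("binomial") estimator, which ignores clustering, and the cluster-weighted estimator, which gives each subject equal weight. This is a simplification; the ratio, intracluster-correlation, and GEE estimators from the abstract are not shown.

```python
def pooled_proportion(clusters):
    """'Binomial' estimator: pool all sites across subjects,
    ignoring the within-subject correlation."""
    num = sum(sum(c) for c in clusters)      # total positive sites
    den = sum(len(c) for c in clusters)      # total sites
    return num / den

def cluster_weighted_proportion(clusters):
    """Weighted estimator: average the per-subject proportions,
    so each subject contributes equally regardless of site count."""
    return sum(sum(c) / len(c) for c in clusters) / len(clusters)
```

With one subject contributing four positive sites and another contributing two negative sites, the pooled estimator gives 4/6 ≈ 0.667 (dominated by the larger cluster) while the cluster-weighted estimator gives 0.5; the gap widens as cluster sizes and within-subject correlation grow.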

19.
McNamee R. Biometrics 2004, 60(3):783-792
Two-phase designs for estimation of prevalence, where the first-phase classification is fallible and the second is accurate but relatively expensive, are not necessarily justified on efficiency grounds. However, they might be advantageous for dual-purpose studies, for example where prevalence estimation is followed by a clinical trial or case-control study, if they can identify cases of disease for the second study in a cost-effective way. Alternatively, they may be justified on ethical grounds if they can identify more, previously undetected but treatable cases of disease, than a simple random sample design. An approach to sampling is proposed, which formally combines the goals of efficient prevalence estimation and case detection by setting different notional study costs for investigating cases and noncases. Two variants of the method are compared with an "ethical" two-phase scheme proposed by Shrout and Newman (1989, Biometrics 45, 549-555), and with the most efficient scheme for prevalence estimation alone, in terms of the standard error of the prevalence estimate, the expected number of cases, and the fraction of cases among second-phase subjects, given a fixed budget. One variant yields the highest fraction and expected number of cases but also the largest standard errors. The other yields a higher fraction than Shrout and Newman's scheme and a similar number of cases but appears to do so more efficiently.
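The basic stratified estimator that underlies all such two-phase prevalence designs, where accurate phase-2 verification is applied to a subsample within each fallible phase-1 screening stratum, can be sketched as follows. The interface is illustrative, and the paper's cost-based optimization of the phase-2 sampling fractions is not implemented.

```python
def two_phase_prevalence(n_screen, verified):
    """Stratified two-phase prevalence estimate.

    n_screen : dict stratum -> number of phase-1 subjects screened
               into that stratum (e.g. 'pos'/'neg' on the fallible test)
    verified : dict stratum -> list of gold-standard 0/1 results for
               the phase-2 subsample drawn from that stratum
    Estimate: sum over strata of (stratum share) * (verified prevalence)."""
    N = sum(n_screen.values())
    return sum(n_screen[s] / N * (sum(v) / len(v)) for s, v in verified.items())
```

For example, with 100 screen-positives and 900 screen-negatives, verified prevalences of 3/4 among sampled positives and 1/5 among sampled negatives give an estimate of 0.1 * 0.75 + 0.9 * 0.2 = 0.255.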

20.
Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazards model. The MLE is computed by the EM-algorithm, which is easy to implement in the proportional hazards setting. Standard errors are estimated by a numerical profile likelihood approach based on EM aided differentiation. The work was motivated by a nested case-control study that hypothesized that insulin-like growth factor I was associated with ischemic heart disease. The study was based on a population of 3784 Danes and 231 cases of ischemic heart disease where controls were matched on age and gender. We illustrate the use of the MLE for these data and show how the maximum likelihood framework can be used to obtain information additional to the relative risk estimates of covariates.

