Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
A predictive continuous time model is developed for continuous panel data to assess the effect of time-varying covariates on the general direction of the movement of a continuous response that fluctuates over time. This is accomplished by reparameterizing the infinitesimal mean of an Ornstein–Uhlenbeck process in terms of its equilibrium mean and a drift parameter, which assesses the rate at which the process reverts to its equilibrium mean. The equilibrium mean is modeled as a linear predictor of covariates. This model can be viewed as a continuous time first-order autoregressive regression model with time-varying lag effects of covariates and the response, which is more appropriate for unequally spaced panel data than its discrete time analog. Both maximum likelihood and quasi-likelihood approaches are considered for estimating the model parameters, and their performances are compared through simulation studies. The simpler quasi-likelihood approach is suggested because it yields an estimator that is of high efficiency relative to the maximum likelihood estimator and it yields a variance estimator that is robust to the diffusion assumption of the model. To illustrate the proposed model, an application to diastolic blood pressure data from a follow-up study on cardiovascular diseases is presented. Missing observations are handled naturally with this model.
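The reparameterization described in this abstract can be written out as a short sketch. The symbols below ($\theta$ for the drift parameter, $\mu$ for the equilibrium mean, $\sigma$ for the diffusion coefficient, $\beta$ for the regression coefficients) are notation assumed here for illustration and are not taken from the paper; the transition moments are the standard ones for an Ornstein–Uhlenbeck process with covariates held fixed over the interval.

```latex
% Ornstein--Uhlenbeck process written in terms of its equilibrium mean
% (notation assumed for illustration)
\begin{align*}
  dY(t) &= \theta\,\bigl(\mu - Y(t)\bigr)\,dt + \sigma\,dW(t),
  \qquad \mu = x^{\top}\beta, \quad \theta > 0, \\
  \mathrm{E}\bigl[Y(t+\Delta) \mid Y(t)\bigr]
        &= \mu + \bigl(Y(t) - \mu\bigr)\,e^{-\theta\Delta}, \\
  \mathrm{Var}\bigl[Y(t+\Delta) \mid Y(t)\bigr]
        &= \frac{\sigma^{2}}{2\theta}\,\bigl(1 - e^{-2\theta\Delta}\bigr).
\end{align*}
```

Because the gap $\Delta$ enters the autoregressive weight $e^{-\theta\Delta}$ directly, unequally spaced visits need no special treatment, which is consistent with the abstract's remark about unequally spaced panel data.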

2.
In follow-up studies, the disease event time can be subject to left truncation and right censoring. Furthermore, medical advancements have made it possible for patients to be cured of certain types of diseases. In this article, we consider a semiparametric mixture cure model for the regression analysis of left-truncated and right-censored data. The model combines a logistic regression for the probability of event occurrence with the class of transformation models for the time of occurrence. We investigate two techniques for estimating model parameters. The first approach is based on martingale estimating equations (EEs). The second approach is based on the conditional likelihood function given truncation variables. The asymptotic properties of both proposed estimators are established. Simulation studies indicate that the conditional maximum-likelihood estimator (cMLE) performs well, while the estimator based on EEs is very unstable even though it is shown to be consistent. This is a special and intriguing phenomenon for the EE approach under the cure model. We provide insights into this issue and find that the EE approach can be improved significantly by assigning appropriate weights to the censored observations in the EEs. This finding is useful in overcoming the instability of the EE approach in some more complicated situations, where the likelihood approach is not feasible. We illustrate the proposed estimation procedures by analyzing the age at onset of the occiput-wall distance event for patients with ankylosing spondylitis.
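For orientation, a mixture cure model of the kind this abstract describes is usually written as below. This is a generic textbook form under assumed notation ($\pi$ for the probability of being susceptible, $S_u$ for the survival function of susceptible subjects, $H$ for an unspecified increasing transformation, $\varepsilon$ for an error with known distribution); the paper's own parameterization may differ.

```latex
% Generic mixture cure model with a logistic incidence part and a
% linear transformation model for latency (notation assumed)
\begin{align*}
  S_{\mathrm{pop}}(t \mid x, z) &= 1 - \pi(z) + \pi(z)\,S_u(t \mid x), \\
  \pi(z) &= \frac{\exp(z^{\top}\gamma)}{1 + \exp(z^{\top}\gamma)}, \\
  H(T) &= -x^{\top}\beta + \varepsilon
  \quad \text{for susceptible subjects.}
\end{align*}
```

As $t \to \infty$ the population survival function levels off at $1 - \pi(z)$, the cured fraction, rather than at zero.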

3.
Lam KF  Lee YW  Leung TL 《Biometrics》2002,58(2):316-323
In this article, the focus is on the analysis of multivariate survival time data with various types of dependence structures. Examples of multivariate survival data include clustered data and repeated measurements from the same subject, such as the interrecurrence times of cancer tumors. A random effect semiparametric proportional odds model is proposed as an alternative to the proportional hazards model. The distribution of the random effects is assumed to be multivariate normal and the random effect is assumed to act additively to the baseline log-odds function. This class of models, which includes the usual shared random effects model, the additive variance components model, and the dynamic random effects model as special cases, is highly flexible and is capable of modeling a wide range of multivariate survival data. A unified estimation procedure is proposed to estimate the regression and dependence parameters simultaneously by means of a marginal-likelihood approach. Unlike the fully parametric case, the regression parameter estimate is not sensitive to the choice of correlation structure of the random effects. The marginal likelihood is approximated by the Monte Carlo method. Simulation studies are carried out to investigate the performance of the proposed method. The proposed method is applied to two well-known data sets, including clustered data and recurrent event times data.

4.
The common endpoints for the evaluation of reproductive and developmental toxic effects are the number of dead/resorbed fetuses, the number of malformed fetuses, and the number of normal fetuses for each litter. The joint distribution of the three endpoints could be modelled by a Dirichlet-trinomial distribution or by a product of two beta-binomial distributions. A simulation experiment is used to investigate the biases of the maximum likelihood estimate (MLE) for the probability of adverse effects under the Dirichlet-trinomial model and the beta-binomial model. Also, the type I errors and powers of the likelihood ratio test for comparing the difference between treatment and control are evaluated for the two underlying models. In estimation, the two MLEs are comparable and the bias estimates are small. In testing, the likelihood ratio test is generally more powerful under the Dirichlet-trinomial model than under the beta-binomial model. The type I error rate is greater than the nominal level using the Dirichlet-trinomial model in some cases, when the data are generated from the two-beta-binomial model, and it is less than the nominal level using the beta-binomial model in other cases, when the data are generated from the Dirichlet-trinomial model.
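The two candidate litter-level distributions can be sketched as probability mass functions. This is a minimal illustration, not the authors' code: the function names and parameter names are hypothetical, and the factorization used for the two-beta-binomial model (deaths among all implants, then malformations among survivors) is an assumption about how the product is formed.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k, n, a, b):
    """P(K = k) for a beta-binomial with n trials and Beta(a, b) mixing."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


def two_beta_binomial_pmf(dead, malformed, n, a1, b1, a2, b2):
    """Assumed factorization: deaths out of n, then malformations out of survivors."""
    return (beta_binomial_pmf(dead, n, a1, b1)
            * beta_binomial_pmf(malformed, n - dead, a2, b2))


def dirichlet_trinomial_pmf(dead, malformed, n, alpha):
    """Dirichlet-trinomial pmf for (dead, malformed, normal) with 3 alpha values."""
    counts = (dead, malformed, n - dead - malformed)
    total = sum(alpha)
    log_p = math.lgamma(n + 1) + math.lgamma(total) - math.lgamma(n + total)
    for x, a in zip(counts, alpha):
        log_p += math.lgamma(x + a) - math.lgamma(a) - math.lgamma(x + 1)
    return math.exp(log_p)
```

Either function can be summed over litters on the log scale to give a likelihood for maximization; both are overdispersed relative to a plain trinomial, which is why they suit litter data.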

5.
We derive a multivariate survival model for age of onset data of a sibship from an additive genetic gamma frailty model constructed on the basis of the inheritance vectors, and investigate the properties of this model. Based on this model, we propose a retrospective likelihood approach for genetic linkage analysis using sibship data. This test is an allele-sharing-based test, and does not require specification of genetic models or the penetrance functions. This new approach can incorporate both affected and unaffected sibs, environmental covariates and age of onset or age at censoring information and, therefore, provides a practical solution for mapping genes for complex diseases with variable age of onset. A small simulation study indicates that the proposed method performs better than the commonly used allele-sharing-based methods for linkage analysis, especially when the population disease rate is high. We applied this method to a type 1 diabetes sib pair data set and a small breast cancer data set. Both simulated and real data sets also indicate that the method is relatively robust to misspecification of the baseline hazard function.

6.
Summary: The nested case–control (NCC) design is a popular sampling method in large epidemiological studies because of its cost-effectiveness in investigating the temporal relationship of diseases with environmental exposures or biological precursors. Thomas' maximum partial likelihood estimator is commonly used to estimate the regression parameters in Cox's model for NCC data. In this article, we consider a situation in which failure/censoring information and some crude covariates are available for the entire cohort in addition to NCC data and propose an improved estimator that is asymptotically more efficient than Thomas' estimator. We adopt a projection approach that, heretofore, has only been employed in situations of random validation sampling and show that it can be well adapted to NCC designs where the sampling scheme is a dynamic process and is not independent for controls. Under certain conditions, consistency and asymptotic normality of the proposed estimator are established and a consistent variance estimator is also developed. Furthermore, a simplified approximate estimator is proposed when the disease is rare. Extensive simulations are conducted to evaluate the finite sample performance of our proposed estimators and to compare the efficiency with Thomas' estimator and other competing estimators. Moreover, sensitivity analyses are conducted to demonstrate the behavior of the proposed estimator when model assumptions are violated, and we find that the biases are reasonably small in realistic situations. We further demonstrate the proposed method with data from studies on Wilms' tumor.

7.
Semiparametric analysis of zero-inflated count data   Cited by: 1 (self-citations: 0, citations by others: 1)
Lam KF  Xue H  Cheung YB 《Biometrics》2006,62(4):996-1003
Medical and public health research often involves the analysis of count data that exhibit a substantially large proportion of zeros, such as the number of heart attacks and the number of days of missed primary activities in a given period. A zero-inflated Poisson regression model, which hypothesizes a two-point heterogeneity in the population characterized by a binary random effect, is generally used to model such data. Subjects are broadly categorized into a low-risk group, leading to structural zero counts, and a high-risk (or normal) group, whose counts can be modeled by a Poisson regression model. The main aim is to identify the explanatory variables that have significant effects on (i) the probability that the subject is from the low-risk group by means of a logistic regression formulation; and (ii) the magnitude of the counts, given that the subject is from the high-risk group by means of a Poisson regression where the effects of the covariates are assumed to be linearly related to the natural logarithm of the mean of the counts. In this article we consider a semiparametric zero-inflated Poisson regression model that postulates a possibly nonlinear relationship between the natural logarithm of the mean of the counts and a particular covariate. A sieve maximum likelihood estimation method is proposed. Asymptotic properties of the proposed sieve maximum likelihood estimators are discussed. Under some mild conditions, the estimators are shown to be asymptotically efficient and normally distributed. Simulation studies were carried out to investigate the performance of the proposed method. For illustration purposes, the method is applied to a data set from a public health survey conducted in Indonesia where the variable of interest is the number of days of missed primary activities due to illness in a 4-week period.
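The two-part structure described in this abstract (a logistic model for membership in the structural-zero group, a Poisson model for the counts otherwise) can be sketched as a log-likelihood. This is a minimal sketch of the ordinary parametric zero-inflated Poisson model with one covariate, not the semiparametric sieve estimator the paper proposes; the function names and the coefficient layout `(intercept, slope)` are hypothetical.

```python
import math


def zip_pmf(y, p_zero, mean):
    """P(Y = y): structural zero with probability p_zero, else Poisson(mean)."""
    poisson = math.exp(-mean + y * math.log(mean) - math.lgamma(y + 1))
    if y == 0:
        return p_zero + (1.0 - p_zero) * poisson
    return (1.0 - p_zero) * poisson


def zip_loglik(counts, covariates, gamma, beta):
    """Log-likelihood with logit(p_zero) = gamma[0] + gamma[1] * x
    and log(mean) = beta[0] + beta[1] * x (both linear, by assumption)."""
    total = 0.0
    for y, x in zip(counts, covariates):
        p_zero = 1.0 / (1.0 + math.exp(-(gamma[0] + gamma[1] * x)))
        mean = math.exp(beta[0] + beta[1] * x)
        total += math.log(zip_pmf(y, p_zero, mean))
    return total
```

In the semiparametric version described above, the linear term for one covariate in `log(mean)` would be replaced by an unspecified smooth function approximated within a sieve (for example a spline basis); that step is not shown here.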

8.
Wang  Xuexia  Boekstegers  Felix  Brinster  Regina 《BMC genetics》2018,19(1):109-117

Background

X chromosome inactivation (XCI) is an important gene regulation mechanism in females that equalizes the expression levels of the X chromosome between the two sexes. Generally, one of the two X chromosomes in females is randomly chosen to be inactivated. Nonrandom XCI (XCI skewing) is also observed in females and has been reported to play an important role in many X-linked diseases. However, there is no statistical measure available for the degree of XCI skewing based on family data in population genetics.

Results

In this article, we propose a statistical approach to measure the degree of XCI skewing based on family trios, which is represented by a ratio of two genotypic relative risks in females. The point estimate of the ratio is obtained from the maximum likelihood estimates of the two genotypic relative risks. When parental genotypes are missing in some family trios, the expectation-conditional-maximization algorithm is adopted to obtain the corresponding maximum likelihood estimates. Further, the confidence interval of the ratio is derived based on the likelihood ratio test. Simulation results show that the likelihood-based confidence interval has an accurate coverage probability under the situations considered. We also apply our proposed method to rheumatoid arthritis data from the USA to illustrate its practical use, and find that a locus, rs2238907, may undergo XCI skewing against the at-risk allele. However, this needs to be further confirmed by molecular genetics.

Conclusions

The proposed statistical measure for the skewness of XCI is applicable to complete family trio data or family trio data with some paternal genotypes missing. The likelihood-based confidence interval has an accurate coverage probability under the situations considered. Therefore, our proposed statistical measure is generally recommended in practice for discovering the potential loci which undergo the XCI skewing.

9.
We consider the estimation of a nonparametric smooth function of some event time in a semiparametric mixed effects model from repeatedly measured data when the event time is subject to right censoring. The within-subject correlation is captured by both cross-sectional and time-dependent random effects, where the latter is modeled by a nonhomogeneous Ornstein–Uhlenbeck stochastic process. When the censoring probability depends on other variables in the model, which often happens in practice, the event time data are not missing completely at random. Hence, a complete case analysis that eliminates all the censored observations may yield biased estimates of the regression parameters, including the smooth function of the event time, and is less efficient. To remedy this, we derive the likelihood function for the observed data by modeling the event time distribution given other covariates. We propose a two-stage pseudo-likelihood approach for the estimation of model parameters by first plugging an estimator of the conditional event time distribution into the likelihood and then maximizing the resulting pseudo-likelihood function. Empirical evaluation shows that the proposed method yields negligible biases while significantly reducing the estimation variability. This research is motivated by the project of hormone profile estimation around age at the final menstrual period for the cohort of women in the Michigan Bone Health and Metabolism Study.

10.
We present a new modification of nonlinear regression models for repeated measures data with heteroscedastic error structures by combining the transform-both-sides and weighting model from Carroll and Ruppert (1988) with the nonlinear random effects model from Lindstrom and Bates (1990). The proposed parameter estimators are a combination of pseudo maximum likelihood estimators for the transform-both-sides and weighting model and maximum likelihood (ML) or restricted maximum likelihood (REML) estimators for linear mixed effects models. The new method is investigated by analyzing simulated enzyme kinetic data published by Jones (1993).

11.
This paper presents a computer program for analyzing disease prevalence data from animal survival experiments in which there may also be some serial sacrifice. The method has been described in Biometrics 35 (1979) 221-234. The user is prompted for the details of the particular models to be fitted. Then a generalized EM algorithm is used to compute maximum likelihood estimates of various quantities of interest concerning the effects of treatment, time and presence of other diseases on the prevalences and lethalities of specific diseases of interest.

12.
Larsen K 《Biometrics》2004,60(1):85-92
Multiple categorical variables are commonly used in medical and epidemiological research to measure specific aspects of human health and functioning. To analyze such data, models have been developed considering these categorical variables as imperfect indicators of an individual's "true" status of health or functioning. In this article, the latent class regression model is used to model the relationship between covariates, a latent class variable (the unobserved status of health or functioning), and the observed indicators (e.g., variables from a questionnaire). The Cox model is extended to encompass a latent class variable as predictor of time-to-event, while using information about latent class membership available from multiple categorical indicators. The expectation-maximization (EM) algorithm is employed to obtain maximum likelihood estimates, and standard errors are calculated based on the profile likelihood, treating the nonparametric baseline hazard as a nuisance parameter. A sampling-based method for model checking is proposed. It allows for graphical investigation of the assumption of proportional hazards across latent classes. It may also be used for checking other model assumptions, such as no additional effect of the observed indicators given latent class. The usefulness of the model framework and the proposed techniques is illustrated in an analysis of data from the Women's Health and Aging Study concerning the effect of severe mobility disability on time-to-death for elderly women.

13.
In this article, we propose a two-stage approach to modeling multilevel clustered non-Gaussian data with sufficiently large numbers of continuous measures per cluster. Such data are common in biological and medical studies utilizing monitoring or image-processing equipment. We consider a general class of hierarchical models that generalizes the model in the global two-stage (GTS) method for nonlinear mixed effects models by using any square-root-n-consistent and asymptotically normal estimators from stage 1 as pseudodata in the stage 2 model, and by extending the stage 2 model to accommodate random effects from multiple levels of clustering. The second-stage model is a standard linear mixed effects model with normal random effects, but the cluster-specific distributions, conditional on random effects, can be non-Gaussian. This methodology provides a flexible framework for modeling not only a location parameter but also other characteristics of conditional distributions that may be of specific interest. For estimation of the population parameters, we propose a conditional restricted maximum likelihood (CREML) approach and establish the asymptotic properties of the CREML estimators. The proposed general approach is illustrated using quartiles as cluster-specific parameters estimated in the first stage, and applied to the data example from a collagen fibril development study. We demonstrate using simulations that in samples with small numbers of independent clusters, the CREML estimators may perform better than conditional maximum likelihood estimators, which are a direct extension of the estimators from the GTS method.

14.
For complex diseases, recent interest has focused on methods that take into account joint effects at interacting loci. Conditioning on effects of disease loci at known locations can lead to increased power to detect effects at other loci. Moreover, use of joint models allows investigation of the etiologic mechanisms that may be involved in the disease. Here we present a method for simultaneous analysis of the joint genetic effects at several loci that uses affected relative pairs. The method is a generalization of the two-locus LOD-score analysis for affected sib pairs proposed by Cordell et al. We derive expressions for the relative risk, λ_R, to a relative of an affected individual, in terms of the additive and epistatic components of variance at an arbitrary number of disease loci, and we show how these can be used to fit a likelihood model to the identity-by-descent sharing among pairs of affected relatives in extended pedigrees. We implement the method by use of a stepwise strategy in which, given evidence of linkage to disease at m-1 locations on the genome, we calculate the conditional likelihood curve across the genome for an mth disease locus, using multipoint methods similar to those proposed by Kruglyak et al. We evaluate the properties of our method by use of simulated data and present an application to real data from families with insulin-dependent diabetes mellitus.

15.
Random-effects models for serial observations with binary response   Cited by: 9 (self-citations: 0, citations by others: 9)
R Stiratelli  N Laird  J H Ware 《Biometrics》1984,40(4):961-971
This paper presents a general mixed model for the analysis of serial dichotomous responses provided by a panel of study participants. Each subject's serial responses are assumed to arise from a logistic model, but with regression coefficients that vary between subjects. The logistic regression parameters are assumed to be normally distributed in the population. Inference is based upon maximum likelihood estimation of fixed effects and variance components, and empirical Bayes estimation of random effects. Exact solutions are analytically and computationally infeasible, but an approximation based on the mode of the posterior distribution of the random parameters is proposed, and is implemented by means of the EM algorithm. This approximate method is compared with a simpler two-step method proposed by Korn and Whittemore (1979, Biometrics 35, 795-804), using data from a panel study of asthmatics originally described in that paper. One advantage of the estimation strategy described here is the ability to use all of the data, including that from subjects with insufficient data to permit fitting of a separate logistic regression model, as required by the Korn and Whittemore method. However, the new method is computationally intensive.

16.
A method for classifying chemicals with respect to carcinogenic potential based on short-term test results is presented. The method utilizes the logistic regression model to translate results from short-term toxicity assays into predictions of the likelihood that a chemical will be carcinogenic if tested in a long-term bioassay. The proposed method differs from previous approaches in two ways. First, statistical confidence limits on probabilities of cancer rather than central estimates of those probabilities are used for classification. Second, the method does not classify all chemicals in a data base with respect to carcinogenic potential. Instead, it identifies chemicals with highest and lowest likelihood of testing positive for carcinogenicity in the bioassay. A subset of chemicals with intermediate likelihood of being positive remains unclassified, and will require further testing, perhaps in a long-term bioassay. Two data bases of binary short-term and long-term test results from the literature are used to illustrate and evaluate the proposed procedure. A cross-validation analysis of one of the data sets suggests that, for a sufficiently rich data base of chemicals, the development of a robust predictive system to replace the bioassay for some unknown chemicals is a realistic goal.
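The idea of classifying on confidence limits rather than on the central estimate can be sketched as a three-way rule. Everything specific here is an assumption for illustration: the Wald-type interval on the logit scale, the cutoffs 0.2 and 0.8, the one-sided 95% multiplier 1.645, and the function and label names are not taken from the paper.

```python
import math


def expit(value):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-value))


def classify_chemical(linear_predictor, std_error,
                      low_cut=0.2, high_cut=0.8, z=1.645):
    """Classify from confidence limits on the predicted probability.

    linear_predictor and std_error are on the logit scale, as produced by a
    fitted logistic regression (assumed inputs). The chemical is called
    positive only if even the lower limit is high, negative only if even the
    upper limit is low, and is otherwise left unclassified.
    """
    lower = expit(linear_predictor - z * std_error)
    upper = expit(linear_predictor + z * std_error)
    if lower >= high_cut:
        return "likely positive"
    if upper <= low_cut:
        return "likely negative"
    return "unclassified"
```

A chemical with a high point estimate but a wide interval stays unclassified under this rule, which mirrors the abstract's point that intermediate cases are sent on for further testing.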

17.
Yin G 《Biometrics》2005,61(2):552-558
Due to natural or artificial clustering, multivariate survival data often arise in biomedical studies, for example, a dental study involving multiple teeth from each subject. A certain proportion of subjects in the population who are not expected to experience the event of interest are considered to be "cured" or insusceptible. To model correlated or clustered failure time data incorporating a surviving fraction, we propose two forms of cure rate frailty models. One model naturally introduces frailty based on biological considerations while the other is motivated from the Cox proportional hazards frailty model. We formulate the likelihood functions based on piecewise constant hazards and derive the full conditional distributions for Gibbs sampling in the Bayesian paradigm. As opposed to the Cox frailty model, the proposed methods demonstrate great potential in modeling multivariate survival data with a cure fraction. We illustrate the cure rate frailty models with a root canal therapy data set.

18.

Background

In genetic studies of rare complex diseases it is common to ascertain familial data from population based registries through all incident cases diagnosed during a pre-defined enrollment period. Such an ascertainment procedure is typically taken into account in the statistical analysis of the familial data by constructing either a retrospective or prospective likelihood expression, which conditions on the ascertainment event. Both of these approaches lead to a substantial loss of valuable data.

Methodology and Findings

Here we consider instead the possibilities provided by a Bayesian approach to risk analysis, which also incorporates the ascertainment procedure and reference information concerning the genetic composition of the target population into the considered statistical model. Furthermore, the proposed Bayesian hierarchical survival model does not require the considered genotype or haplotype effects to be expressed as functions of corresponding allelic effects. Our modeling strategy is illustrated by a risk analysis of type 1 diabetes mellitus (T1D) in the Finnish population based on the HLA-A, HLA-B and DRB1 human leucocyte antigen (HLA) information available for both ascertained sibships and a large number of unrelated individuals from the Finnish bone marrow donor registry. The heterozygous genotype DR3/DR4 at the DRB1 locus was associated with the lowest predictive probability of T1D-free survival to the age of 15, the estimate being 0.936 (95% credible interval: 0.926 to 0.945) compared to the average population T1D-free survival probability of 0.995.

Significance

The proposed statistical method can be adapted to other population-based family data ascertained from a disease registry provided that the ascertainment process is well documented, and that external information concerning the sizes of birth cohorts and a suitable reference sample are available. We confirm the earlier findings from the same data concerning the HLA-DR3/4 related risks for T1D, and also provide here estimated predictive probabilities of disease-free survival as a function of age.

19.
Biswas S  Lin S 《Biometrics》2012,68(2):587-597
Rare variants have been heralded as key to uncovering "missing heritability" in complex diseases. These variants can now be genotyped using next-generation sequencing technologies; nonetheless, rare haplotypes may also result from combination of common single nucleotide polymorphisms available from genome-wide association studies (GWAS). The National Eye Institute's data on age-related macular degeneration (AMD) is such an example. Studies on AMD had identified potential rare variants; however, due to lack of appropriate statistical tools, effects of individual rare haplotypes were never studied. Here we develop a method for identifying association with rare haplotypes for case-control design. A logistic regression based retrospective likelihood is formulated and is regularized using logistic Bayesian LASSO (LBL). In particular, we penalize the regression coefficients using appropriate priors to weed out unassociated haplotypes, making it possible for the rare associated ones to stand out. We applied LBL to the AMD data and identified common and rare haplotypes in the complement factor H gene, gaining insights into rare variants' contributions to AMD beyond the current literature. This analysis also demonstrates the richness of GWAS data for mapping rare haplotypes, a potential that is largely unexplored. Additionally, we conducted simulations to investigate the performance of LBL and compare it with Hapassoc. Our results show that LBL is much more powerful in identifying rare associated haplotypes when the false positive rates for both approaches are kept the same.

20.

Background

Neonatal mortality contributes a large proportion towards early childhood mortality in developing countries, with considerable geographical variation at small areas within countries.

Methods

A geo-additive logistic regression model is proposed for quantifying small-scale geographical variation in neonatal mortality and for estimating risk factors of neonatal mortality. Random effects are introduced to capture spatial correlation and heterogeneity. The spatial correlation can be modelled using Markov random fields (MRF) when data are aggregated, while two-dimensional P-splines apply when exact locations are available; the unstructured spatial effects are assigned an independent Gaussian prior. Socio-economic and bio-demographic factors which may affect the risk of neonatal mortality are simultaneously estimated as fixed effects and as nonlinear effects for continuous covariates. The smooth effects of continuous covariates are modelled by second-order random walk priors. Modelling and inference use the empirical Bayesian approach via a penalized likelihood technique. The methodology is applied to analyse the likelihood of neonatal deaths, using data from the 2000 Malawi demographic and health survey. The spatial effects are quantified through MRF and two-dimensional P-spline priors.

Results

Findings indicate that both fixed and spatial effects are associated with neonatal mortality.

Conclusions

Our study, therefore, suggests that the challenge of reducing neonatal mortality goes beyond addressing individual factors and also requires understanding unmeasured covariates in order to design potentially effective interventions.
