Similar articles
20 similar articles found (search time: 31 ms)
1.
Bivariate time series of counts with excess zeros relative to the Poisson process are common in many bioscience applications. Failure to account for the extra zeros in the analysis may result in biased parameter estimates and misleading inferences. A class of bivariate zero-inflated Poisson autoregression models is presented to accommodate the zero-inflation and the inherent serial dependency between successive observations. An autoregressive correlation structure is assumed in the random component of the compound regression model. Parameter estimation is achieved via an EM algorithm, by maximizing an appropriate log-likelihood function to obtain residual maximum likelihood estimates. The proposed method is applied to analyze a bivariate series from an occupational health study, in which the zero-inflated injury count events are classified as either musculoskeletal or non-musculoskeletal in nature. The approach enables the evaluation of the effectiveness of a participatory ergonomics intervention at the population level, in terms of both a reduction in the overall incidence of lost-time injury and a simultaneous decline in the two mean injury rates.
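As a rough illustration of what "zero-inflated Poisson" means, the sketch below evaluates the probability mass function and log-likelihood of a plain univariate zero-inflated Poisson distribution. It is an assumption-laden simplification, not the bivariate autoregressive model of the abstract: there is no second series, no serial correlation, and no EM step, and the names `zip_pmf`, `zip_loglik`, `lam`, and `pi_zero` are hypothetical.

```python
import math


def zip_pmf(y, lam, pi_zero):
    """P(Y = y) for a univariate zero-inflated Poisson.

    With probability pi_zero the count is a structural zero; otherwise it is
    drawn from Poisson(lam). Illustrative only, not the abstract's model.
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


def zip_loglik(counts, lam, pi_zero):
    """Log-likelihood of independent counts under the same simplified model."""
    return sum(math.log(zip_pmf(y, lam, pi_zero)) for y in counts)


if __name__ == "__main__":
    sample = [0, 0, 0, 1, 0, 3, 0, 2]  # made-up counts with many zeros
    print(zip_loglik(sample, lam=1.5, pi_zero=0.4))
```

Setting `pi_zero = 0` recovers the ordinary Poisson likelihood, which is one way to see how ignoring the extra zeros forces the rate parameter to absorb them.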

2.
In clinical trials examining the incidence of pneumonia it is common practice to measure infection via both invasive and non-invasive procedures. In the context of a recently completed randomized trial comparing two treatments, the invasive procedure was utilized only in certain scenarios due to the added risk involved, and only when the level of the non-invasive procedure surpassed a given threshold. Hence, what was observed was bivariate data with a pattern of missingness in the invasive variable dependent upon the value of the non-invasive observation within a given pair. In order to compare two treatments with bivariate observed data exhibiting this pattern of missingness, we developed a semi-parametric methodology utilizing the density-based empirical likelihood approach in order to provide a non-parametric approximation to Neyman-Pearson-type test statistics. This novel empirical likelihood approach has both parametric and non-parametric components. The non-parametric component utilizes the observations for the non-missing cases, while the parametric component is utilized to tackle the case where observations are missing with respect to the invasive variable. The method is illustrated through its application to the actual data obtained in the pneumonia study and is shown to be an efficient and practical method.

3.
A statistic, derived from the combination of two dependent tests, is proposed for testing the hypothesis of equality of the means of a bivariate normal distribution with unknown common variance and correlation coefficient when observations are missing on one or both variates. The null distribution of the statistic is approximated by a well-known distribution. The empirical powers of the statistic are computed and compared with some of the known statistics. The comparisons support the use of the proposed test.

4.
Z Li, J Möttönen, M J Sillanpää. Heredity 2015, 115(6): 556-564
Linear regression-based quantitative trait loci/association mapping methods such as least squares commonly assume normality of residuals. In genetic studies of plants or animals, some quantitative traits may not follow a normal distribution because the data include outlying observations or are collected from multiple sources, and in such cases the normal regression methods may lose some statistical power to detect quantitative trait loci. In this work, we propose a robust multiple-locus regression approach for analyzing multiple quantitative traits without a normality assumption. In our method, the objective function is least absolute deviation (LAD), which corresponds to the assumption of multivariate Laplace distributed residual errors. This distribution has heavier tails than the normal distribution. In addition, we adopt a group LASSO penalty to produce shrinkage estimation of the marker effects and to describe the genetic correlation among phenotypes. Our LAD-LASSO approach is less sensitive to outliers and is more appropriate for the analysis of data with skewed phenotype distributions. Another application of our robust approach is to the missing-phenotype problem in multiple-trait analysis, where the missing phenotype items can simply be filled with some extreme values and treated as outliers. The efficiency of the LAD-LASSO approach is illustrated on both simulated and real data sets.
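To make the contrast between least squares and least absolute deviation concrete, here is a minimal sketch for the simplest possible case, an intercept-only model, where the least-squares fit is the sample mean and the LAD fit is the sample median. The data and helper names are made up for illustration; the group LASSO penalty and the multiple-trait structure of the abstract are deliberately left out.

```python
import statistics


def lad_loss(values, fit):
    """Sum of absolute residuals (the LAD objective) for a constant fit."""
    return sum(abs(v - fit) for v in values)


def squared_loss(values, fit):
    """Sum of squared residuals (the least-squares objective) for a constant fit."""
    return sum((v - fit) ** 2 for v in values)


# Made-up phenotype values with one gross outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

mean_fit = statistics.mean(data)      # minimizes squared_loss; dragged by the outlier
median_fit = statistics.median(data)  # minimizes lad_loss; stays near the bulk of the data

if __name__ == "__main__":
    print("least-squares fit:", mean_fit, "LAD fit:", median_fit)
    print("LAD loss at each fit:", lad_loss(data, mean_fit), lad_loss(data, median_fit))
```

The same mechanism is what lets an extreme placeholder value stand in for a missing phenotype: under an absolute-value loss a single far-away point has bounded pull on the fit.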

5.
Attempts to estimate human energy expenditure by use of doubly labeled water have produced three methods currently used for calculating carbon dioxide production from isotope disappearance data: 1) the two-point method, 2) the regression method, and 3) the integration method. An ideal data set was used to determine the error produced in the calculated energy expenditure for each method when specific variables were perturbed. The analysis indicates that some of the calculation methods are more susceptible to perturbations in certain variables than others. Results from an experiment on one adult human subject are used to illustrate the potential for error in actual data. Samples of second void urine, 24-h urine, and breath collected every other day for 21 days are used to calculate the average daily energy expenditure by three calculation methods. The difference between calculated energy expenditure and metabolizable energy on a weight-maintenance diet is used to estimate the error associated with the doubly labeled water method.
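The two-point method named above can be sketched as follows, assuming simple exponential isotope elimination and the uncorrected textbook relation in which carbon dioxide production is half the body water pool times the difference between the oxygen and hydrogen elimination rates. The function names, units, and numbers are hypothetical, and no fractionation or pool-size corrections are applied, so this is not the calculation used in the study.

```python
import math


def elimination_rate(enrichment_start, enrichment_end, days):
    """Two-point estimate of a first-order elimination rate (per day).

    Assumes enrichment above baseline decays exponentially between the
    two sampling times.
    """
    return math.log(enrichment_start / enrichment_end) / days


def co2_production_simplified(body_water_mol, k_oxygen, k_hydrogen):
    """Uncorrected CO2 production (mol/day): (N / 2) * (k_O - k_H).

    Assumption: no isotope fractionation and a single shared pool size N.
    """
    return body_water_mol / 2.0 * (k_oxygen - k_hydrogen)


if __name__ == "__main__":
    # Made-up enrichments (arbitrary units above baseline) 14 days apart.
    k_o = elimination_rate(150.0, 28.0, 14.0)
    k_h = elimination_rate(120.0, 30.0, 14.0)
    print(co2_production_simplified(2000.0, k_o, k_h))
```

Because only two samples enter each rate, an error in either one propagates directly into the result, which is the kind of sensitivity the abstract's perturbation analysis examines.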

6.
Longitudinal data are common in clinical trials and observational studies, where missing outcomes due to dropouts are always encountered. In this context, under the assumption of missing at random, the weighted generalized estimating equation (WGEE) approach is widely adopted for marginal analysis. Model selection on marginal mean regression is a crucial aspect of data analysis, and identifying an appropriate correlation structure for model fitting may also be of interest and importance. However, the existing information criteria for model selection in WGEE have limitations, such as separate criteria for the selection of marginal mean and correlation structures, unsatisfactory selection performance in small‐sample setups, and so forth. In particular, few studies have developed joint information criteria for selection of both marginal mean and correlation structures. In this work, by embedding empirical likelihood into the WGEE framework, we propose two innovative information criteria, a joint empirical Akaike information criterion and a joint empirical Bayesian information criterion, which can simultaneously select the variables for marginal mean regression and the correlation structure. In extensive simulation studies, these empirical‐likelihood‐based criteria are robust and flexible and outperform the other criteria, including the weighted quasi‐likelihood under the independence model criterion, the missing longitudinal information criterion, and the joint longitudinal information criterion. In addition, we provide a theoretical justification of our proposed criteria, and present two real-data examples for further illustration.

7.
Longitudinal data usually consist of a number of short time series. A group of subjects or groups of subjects are followed over time and observations are often taken at unequally spaced time points, and may be at different times for different subjects. When the errors and random effects are Gaussian, the likelihood of these unbalanced linear mixed models can be directly calculated, and nonlinear optimization used to obtain maximum likelihood estimates of the fixed regression coefficients and parameters in the variance components. For binary longitudinal data, a two state, non-homogeneous continuous time Markov process approach is used to model serial correlation within subjects. Formulating the model as a continuous time Markov process allows the observations to be equally or unequally spaced. Fixed and time varying covariates can be included in the model, and the continuous time model allows the estimation of the odds ratio for an exposure variable based on the steady state distribution. Exact likelihoods can be calculated. The initial probability distribution on the first observation on each subject is estimated using logistic regression that can involve covariates, and this estimation is embedded in the overall estimation. These models are applied to an intervention study designed to reduce children's sun exposure.
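The reason a continuous time formulation copes with unequally spaced observations can be seen in a small sketch of a two-state Markov process. The version below assumes constant transition rates (a time-homogeneous simplification of the non-homogeneous process in the abstract), and the names `alpha`, `beta`, and `transition_probs` are hypothetical.

```python
import math


def transition_probs(alpha, beta, dt):
    """2x2 transition matrix over an interval of length dt.

    alpha is the rate of moving 0 -> 1 and beta the rate of moving 1 -> 0.
    Assumption: rates are constant in time, unlike the abstract's
    non-homogeneous model.
    """
    total = alpha + beta
    stationary_1 = alpha / total  # steady-state probability of state 1
    decay = math.exp(-total * dt)
    p01 = stationary_1 * (1.0 - decay)
    p11 = stationary_1 + (1.0 - stationary_1) * decay
    return [[1.0 - p01, p01], [1.0 - p11, p11]]


def steady_state_odds(alpha, beta):
    """Odds of being in state 1 under the steady-state distribution."""
    return alpha / beta


if __name__ == "__main__":
    # The same rates give valid probabilities for any gap between visits.
    for gap in (0.5, 1.0, 3.7):
        print(gap, transition_probs(0.4, 0.6, gap))
```

Each pair of consecutive observations contributes one entry of the matrix evaluated at its own gap `dt`, so the likelihood needs no equal-spacing assumption, and the steady-state odds give a natural scale for comparing exposure groups.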

8.
Missing data are a great concern in longitudinal studies, because few subjects will have complete data and missingness could be an indicator of an adverse outcome. Analyses that exclude potentially informative observations due to missing data can be inefficient or biased. To assess the extent of these problems in the context of genetic analyses, we compared case-wise deletion to two multiple imputation methods available in the popular SAS package, the propensity score and regression methods. For both the real and simulated data sets, the propensity score and regression methods produced results similar to case-wise deletion. However, for the simulated data, the estimates of heritability for case-wise deletion and the two multiple imputation methods were much lower than for the complete data. This suggests that if missingness patterns are correlated within families, then imputation methods that do not allow this correlation can yield biased results.

9.
Zhang N, Little RJ. Biometrics 2012, 68(3): 933-942
Summary: We consider the linear regression of outcome Y on regressors W and Z with some values of W missing, when our main interest is the effect of Z on Y, controlling for W. Three common approaches to regression with missing covariates are (i) complete‐case analysis (CC), which discards the incomplete cases; (ii) ignorable likelihood methods, which base inference on the likelihood of the observed data, assuming the missing data are missing at random (Rubin, 1976b); and (iii) nonignorable modeling, which posits a joint distribution of the variables and missing data indicators. Another simple practical approach that has not received much theoretical attention is to drop the regressor variables containing missing values from the regression modeling (DV, for drop variables). DV does not lead to bias when either (i) the regression coefficient of W is zero or (ii) W and Z are uncorrelated. We propose a pseudo‐Bayesian approach for regression with missing covariates that compromises between the CC and DV estimates, exploiting information in the incomplete cases when the data support DV assumptions. We illustrate favorable properties of the method by simulation, and apply the proposed method to a liver cancer study. Extension of the method to more than one missing covariate is also discussed.

10.
GEE with Gaussian estimation of the correlations when data are incomplete   Total citations: 4 (self-citations: 0, citations by others: 4)
This paper considers a modification of generalized estimating equations (GEE) for handling missing binary response data. The proposed method uses Gaussian estimation of the correlation parameters, i.e., the estimating function that yields an estimate of the correlation parameters is obtained from the multivariate normal likelihood. The proposed method yields consistent estimates of the regression parameters when data are missing completely at random (MCAR). However, when data are missing at random (MAR), consistency may not hold. In a simulation study with repeated binary outcomes that are missing at random, the magnitude of the potential bias that can arise is examined. The results of the simulation study indicate that, when the working correlation matrix is correctly specified, the bias is almost negligible for the modified GEE. In the simulation study, the proposed modification of GEE is also compared to the standard GEE, multiple imputation, and weighted estimating equations approaches. Finally, the proposed method is illustrated using data from a longitudinal clinical trial comparing two therapeutic treatments, zidovudine (AZT) and didanosine (ddI), in patients with HIV.

11.
In this paper, a repeated measures model with intraclass correlation is considered when the observations are missing at random. An exact test for the equality of the mean components and simultaneous confidence intervals (Scheffé and Bonferroni inequality types) are given for linear contrasts of the mean components when the missing observations are of a monotone type. When the missing observations are not of the monotone type, the maximum likelihood estimates are obtained numerically by iterative methods given in Srivastava and Carter (1986). These estimators are then used to obtain asymptotic tests and confidence intervals for the equality of mean components and linear contrasts, respectively. An example is given to illustrate the method.

12.
The concordance correlation coefficient (CCC) and the probability of agreement (PA) are two frequently used measures for evaluating the degree of agreement between measurements generated by two different methods. In this paper, we consider the CCC and the PA using the bivariate normal distribution for modeling the observations obtained by two measurement methods. The main aim of this paper is to develop diagnostic tools for the detection of those observations that are influential on the maximum likelihood estimators of the CCC and the PA using the local influence methodology but not based on the likelihood displacement. Thus, we derive first‐ and second‐order measures considering the case‐weight perturbation scheme. The proposed methodology is illustrated through a Monte Carlo simulation study and using a dataset from a clinical study on transient sleep disorder. Empirical results suggest that under certain circumstances first‐order local influence measures may be more powerful than second‐order measures for the detection of influential observations.
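For readers unfamiliar with the CCC, the sketch below computes the usual sample version of Lin's concordance correlation coefficient from paired measurements, using moments with divisor n. It only illustrates the agreement measure itself; the local influence diagnostics of the abstract are not attempted, and the function name and data are made up.

```python
def concordance_correlation(x, y):
    """Sample concordance correlation coefficient for paired measurements.

    CCC = 2 * s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2), with variances
    and covariance computed using divisor n.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


if __name__ == "__main__":
    method_a = [1.0, 2.0, 3.0, 4.0]
    method_b = [2.0, 3.0, 4.0, 5.0]  # perfectly correlated but shifted by 1
    print(concordance_correlation(method_a, method_b))
```

Unlike the Pearson correlation, the CCC drops below 1 for a constant shift between methods, because the squared mean difference enters the denominator.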

13.
Several statistics are proposed for testing the hypothesis of equality of the means of a bivariate normal distribution with unknown variances and correlation coefficient when observations are missing on both variates. The null distributions of the statistics are approximated by well-known distributions. The empirical sizes and powers of the statistics are computed and compared with the paired t test and some of the known statistics based on available data. The comparisons support the use of two of the statistics proposed in this paper.

14.
This paper considers the use of ante-dependence models in problems with repeated measures through time. These are conditional regression models which reflect the dependence of a measure on some of the previous observations from the same subject. We present maximum likelihood estimators of the covariance matrix and procedures for selecting the order of ante-dependence based on penalized likelihoods. Extensions to missing data situations are discussed. We propose Wald-type test statistics and apply them in two situations common in experiments with repeated measures: one with pre-study observations and another one with small sample size relative to the number of time periods. In these examples, tests assuming ante-dependence find effects which are not detected using competing procedures.

15.
D F Heitjan. Biometrics 1991, 47(2): 549-562
The problem of accounting for the grouping of continuous, bivariate data in regression analyses is considered. Reasons why grouping must be taken seriously are advanced, and a strategy for accounting for grouping is demonstrated. The specific model asserts that, in the absence of grouping, the data would be bivariate normal. This model is used to adjust estimates of parameters in a regression relating disease severity to a grouped exposure variable, using data on pneumoconiosis in English coal miners (Ashford, 1959, Biometrics 15, 573-581). The choice of computing methods is discussed and likelihood formulas are presented.

16.
Yu ZF, Catalano PJ. Biometrics 2005, 61(3): 757-766
The neurotoxic effects of chemical agents are often investigated in controlled studies on rodents, with multiple binary and continuous endpoints routinely collected. One goal is to conduct quantitative risk assessment to determine safe dose levels. Such studies face two major challenges for continuous outcomes. First, characterizing risk and defining a benchmark dose are difficult. Usually associated with an adverse binary event, risk is clearly definable in quantal settings as presence or absence of an event; finding a similar probability scale for continuous outcomes is less clear. Often, an adverse event is defined for continuous outcomes as any value below a specified cutoff level in a distribution assumed normal or log normal. Second, while continuous outcomes are traditionally analyzed separately for such studies, recent literature advocates also using multiple outcomes to assess risk. We propose a method for modeling and quantitative risk assessment for bivariate continuous outcomes that addresses both difficulties by extending existing percentile regression methods. The model is likelihood based; it allows separate dose-response models for each outcome while accounting for the bivariate correlation and overall characterization of risk. The approach to estimation of a benchmark dose is analogous to that for quantal data without the need to specify arbitrary cutoff values. We illustrate our methods with data from a neurotoxicity study of triethyl tin exposure in rats.

17.
Two statistics are proposed for testing the hypothesis of equality of the means of a bivariate normal distribution with unknown common variance and correlation coefficient when observations are missing on both variates. One of the statistics reduces to the one proposed by Bhoj (1978, 1984) when the unpaired observations on the variates are equal. The distributions of the statistics are approximated by well known distributions under the null hypothesis. The empirical powers of the tests are computed and compared with those of some known statistics. The comparison supports the use of one of the statistics proposed in this paper.

18.
Measurement of total energy expenditure using [²H,¹⁸O]water requires both accurate and precise determination of the rates of disappearance of ²H and ¹⁸O from body water over time and determination of the ²H and ¹⁸O pool sizes. However, the impact of the isotopic determination of body water upon the determination of energy expenditure is often overlooked. For measurement of total body water per se, the delay after administration before sampling body fluids becomes important, and saliva sampling can be used to resolve the timing of early samples for body water determination. For energy expenditure measurement per se, linear regression can be used to define the initial dilution. Because the hydrogen tracer dilutes into a pool significantly larger than body water pool per se due to the presence of labile hydrogens, a correction to the isotope pool size must be applied. The theoretical calculations of the exchangeable hydrogen pool presented here suggest that the hydrogen pool size is <3% greater than the body water pool and data are provided to support this idea. Finally, the two approaches used to define the body water pool space contribution to the calculation of energy expenditure using ²H₂¹⁸O are reviewed. Using a pool size based upon the average of the two pool spaces limits the effect of pool size error in the calculation of energy expenditure.

19.
Chen Q, Ibrahim JG. Biometrics 2006, 62(1): 177-184
We consider a class of semiparametric models for the covariate distribution and missing data mechanism for missing covariate and/or response data for general classes of regression models including generalized linear models and generalized linear mixed models. Ignorable and nonignorable missing covariate and/or response data are considered. The proposed semiparametric model can be viewed as a sensitivity analysis for model misspecification of the missing covariate distribution and/or missing data mechanism. The semiparametric model consists of a generalized additive model (GAM) for the covariate distribution and/or missing data mechanism. Penalized regression splines are used to express the GAMs as a generalized linear mixed effects model, in which the variance of the corresponding random effects provides an intuitive index for choosing between the semiparametric and parametric model. Maximum likelihood estimates are then obtained via the EM algorithm. Simulations are given to demonstrate the methodology, and a real data set from a melanoma cancer clinical trial is analyzed using the proposed methods.

20.