Similar articles
Found 20 similar articles (search time: 15 ms)
1.
Various inference procedures for linear regression models with censored failure times have been studied extensively. Recent developments on efficient algorithms to implement these procedures enhance the practical usage of such models in survival analysis. In this article, we present robust inferences for certain covariate effects on the failure time in the presence of “nuisance” confounders under a semiparametric, partial linear regression setting. Specifically, the estimation procedures for the regression coefficients of interest are derived from a working linear model and are valid even when the function of the confounders in the model is not correctly specified. The new proposals are illustrated with two examples and their validity for cases with practical sample sizes is demonstrated via a simulation study.

2.
Scientists may wish to analyze correlated outcome data with constraints among the responses. For example, piecewise linear regression in a longitudinal data analysis can require use of a general linear mixed model combined with linear parameter constraints. Although well developed for standard univariate models, there are no general results that allow a data analyst to specify a mixed model equation in conjunction with a set of constraints on the parameters. We resolve the difficulty by precisely describing conditions that allow specifying linear parameter constraints that ensure the validity of estimates and tests in a general linear mixed model. The recommended approach requires only straightforward and noniterative calculations to implement. We illustrate the convenience and advantages of the methods with a comparison of cognitive developmental patterns in a study of individuals from infancy to early adulthood for children from low-income families.

3.
Constraints arise naturally in many scientific experiments and studies, such as in epidemiology, biology, and toxicology, yet researchers often ignore such information when analyzing their data and use standard methods such as the analysis of variance (ANOVA). Such methods may not only result in a loss of power and efficiency in costs of experimentation but may also result in poor interpretation of the data. In this paper we discuss constrained statistical inference in the context of linear mixed effects models that arise naturally in many applications, such as in repeated measurements designs, familial studies and others. We introduce a novel methodology that is broadly applicable for a variety of constraints on the parameters. Since in many applications sample sizes are small and/or the data are not necessarily normally distributed, and furthermore error variances need not be homoscedastic (i.e. heterogeneity in the data), we use an empirical best linear unbiased predictor (EBLUP) type residual based bootstrap methodology for deriving critical values of the proposed test. Our simulation studies suggest that the proposed procedure maintains the desired nominal Type I error while competing well with other tests in terms of power. We illustrate the proposed methodology by re-analyzing clinical trial data on blood mercury level. The methodology introduced in this paper can be easily extended to other settings such as nonlinear and generalized regression models.

4.
The method of mixed regression is considered for the estimation of coefficients in a linear regression model when incomplete prior information is available, and two families of improved estimators stemming from the Stein rule are proposed. Their properties are studied when disturbances are normal but small.

5.
A new test of random subject effects in linear regression models is presented. The test is robust against heteroskedasticity and its asymptotic distribution is derived under a sequence of local alternatives. The finite sample properties of the test are studied in a simulation experiment and an empirical example. The results presented show that the new test is to be preferred over earlier proposed tests.

6.
Percentage is widely used to describe different results in food microbiology, e.g., probability of microbial growth, percent inactivated, and percent of positive samples. Four sets of percentage data, percent-growth-positive, germination extent, probability for one cell to grow, and maximum fraction of positive tubes, were obtained from our own experiments and the literature. These data were modeled using linear and logistic regression. Five methods were used to compare the goodness of fit of the two models: percentage of predictions closer to observations, range of the differences (predicted value minus observed value), deviation of the model, linear regression between the observed and predicted values, and bias and accuracy factors. Logistic regression was a better predictor of at least 78% of the observations in all four data sets. In all cases, the deviation of logistic models was much smaller. The linear correlation between observations and logistic predictions was always stronger. Validation (accomplished using part of one data set) also demonstrated that the logistic model was more accurate in predicting new data points. Bias and accuracy factors were found to be less informative when evaluating models developed for percentage data, since neither of these indices can compare predictions at zero. Model simplification for the logistic model was demonstrated with one data set. The simplified model was as powerful in making predictions as the full linear model, and it also gave clearer insight in determining the key experimental factors.
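The remark that bias and accuracy factors cannot compare predictions at zero can be illustrated with a minimal Python sketch. The abstract does not define the two indices; the log-ratio forms below are the definitions commonly used in predictive microbiology and are an assumption here, as are the function names and the made-up numbers.

```python
import math


def bias_factor(observed, predicted):
    """Assumed form: 10 ** mean(log10(predicted / observed)).

    A value near 1 means over- and under-predictions cancel out.
    """
    logs = [math.log10(p / o) for o, p in zip(observed, predicted)]
    return 10 ** (sum(logs) / len(logs))


def accuracy_factor(observed, predicted):
    """Assumed form: 10 ** mean(|log10(predicted / observed)|).

    A value near 1 means predictions are close to observations on average.
    """
    logs = [abs(math.log10(p / o)) for o, p in zip(observed, predicted)]
    return 10 ** (sum(logs) / len(logs))


# Made-up percentage data, for illustration only.
observed = [10.0, 50.0, 90.0]
predicted = [20.0, 50.0, 45.0]

print(bias_factor(observed, predicted))      # close to 1: errors cancel
print(accuracy_factor(observed, predicted))  # above 1: errors do not vanish
```

Because both indices are built on the ratio predicted / observed, an observed value of 0 raises `ZeroDivisionError` and a predicted value of 0 raises `ValueError` from `math.log10`, which matches the limitation described for percentage data.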

7.
The minimum dispersion linear unbiased estimators of the vector of parameters in a linear regression model are compared when the parameters of the model are subject to stochastic linear restrictions with different dispersion matrices of the disturbances involved in them.

8.
It is shown that the Kruskal-Wallis statistic may be decomposed into components designed to detect trends in the underlying distributions. Extensions to other K sample linear rank statistics and to nonparametric regression statistics are noted.

9.
In sample surveys, it is usual to make use of auxiliary information to increase the precision of the estimators. We propose a new chain ratio estimator and regression estimator of a finite population mean using a linear combination of two auxiliary variables and obtain the mean squared error (MSE) equations for the proposed estimators. We find theoretical conditions that make the proposed estimators more efficient than the traditional multivariate ratio estimator and the regression estimator using information from two auxiliary variables.

10.

Background

Populational linkage disequilibrium and within-family linkage are commonly used for QTL mapping and marker assisted selection. The combination of both results in more robust and accurate locations of the QTL, but models proposed so far have been either single marker, complex in practice or well fit to a particular family structure.

Results

We herein present linear model theory to come up with additive effects of the QTL alleles in any member of a general pedigree, conditional on observed markers and pedigree, accounting for possible linkage disequilibrium among QTLs and markers. The model is based on association analysis in the founders; further, the additive effect of the QTLs transmitted to the descendants is a weighted (by the probabilities of transmission) average of the substitution effects of founders' haplotypes. The model allows for non-complete linkage disequilibrium between QTL and markers in the founders. Two submodels are presented: a simple and easy to implement Haley-Knott type regression for half-sib families, and a general mixed (variance component) model for general pedigrees. The model can use information from all markers. The performance of the regression method is compared by simulation with a more complex IBD method by Meuwissen and Goddard. Numerical examples are provided.

Conclusion

The linear model theory provides a useful framework for QTL mapping with dense marker maps. Results show similar accuracies but a bias of the IBD method towards the center of the region. Computations for the linear regression model are extremely simple, in contrast with IBD methods. Extensions of the model to genomic selection and multi-QTL mapping are straightforward.

11.
Iwao’s mean crowding-mean density relation can be treated either as a linear function describing the biological characteristics of a species at a population level, or as a regression model fitted to empirical data (Iwao’s patchiness regression). In this latter form its parameters are commonly used to construct sampling plans for insect pests, which are characteristically patchily distributed or overdispersed. It is shown in this paper that modifying both the linear function and statistical model to force the intercept or lower functional limit through the origin results in more intuitive biological interpretation of parameters and better sampling economy. Firstly, forcing the function through the origin has the effect of ensuring that zero crowding occurs when zero individuals occupy a patch. Secondly, it ensures that negative values of the intercept, which do not yield an intuitive biological interpretation, will not arise. It is shown analytically that sequential sampling plans based on regression through the origin should be more efficient compared to plans based on conventional regression. For two overdispersed data sets, through-origin based plans collected a significantly lower sample size during validation than plans based on conventional regression, but the improvement in sampling efficiency was not large enough to be of practical benefit. No difference in sample size was observed when through-origin and conventional regression based plans were validated using underdispersed data. A field researcher wishing to adopt a through-origin form of Iwao’s regression for the biological reasons outlined above can therefore be confident that their sampling strategies will not be affected by doing so.
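The difference between the conventional and through-origin forms of the regression can be sketched in a few lines of Python. This is only an illustration of the two least-squares fits, not the paper's sampling-plan derivation; the function names and the data are hypothetical.

```python
def fit_conventional(mean_density, mean_crowding):
    """Ordinary least squares for m* = alpha + beta * m; returns (alpha, beta)."""
    n = len(mean_density)
    mx = sum(mean_density) / n
    my = sum(mean_crowding) / n
    sxx = sum((x - mx) ** 2 for x in mean_density)
    sxy = sum((x - mx) * (y - my) for x, y in zip(mean_density, mean_crowding))
    beta = sxy / sxx
    alpha = my - beta * mx
    return alpha, beta


def fit_through_origin(mean_density, mean_crowding):
    """Least squares for m* = beta * m (intercept forced to zero); returns beta."""
    sxy = sum(x * y for x, y in zip(mean_density, mean_crowding))
    sxx = sum(x * x for x in mean_density)
    return sxy / sxx


# Made-up values of mean density (m) and mean crowding (m*).
m = [1.0, 2.0, 4.0]
m_star = [3.0, 5.0, 9.0]

alpha, beta = fit_conventional(m, m_star)
beta_origin = fit_through_origin(m, m_star)
print(alpha, beta, beta_origin)
```

With the through-origin fit, predicted crowding is zero at zero density by construction, and no negative intercept can arise, which is the biological argument made in the abstract.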

12.
The one‐degree‐of‐freedom Cochran‐Armitage (CA) test statistic for linear trend has been widely applied in various dose‐response studies (e.g., anti‐ulcer medications and short‐term antibiotics, animal carcinogenicity bioassays and occupational toxicant studies). This approximate statistic relies, however, on asymptotic theory that is reliable only when the sample sizes are reasonably large and well balanced across dose levels. For small, sparse, or skewed data, the asymptotic theory is suspect and the exact conditional method (based on the CA statistic) seems to provide a dependable alternative. Unfortunately, the exact conditional method is only practical for the linear logistic model from which the sufficient statistics for the regression coefficients can be obtained explicitly. In this article, a simple and efficient recursive polynomial multiplication algorithm for an exact unconditional test (based on the CA statistic) for detecting a linear trend in proportions is derived. The method is applicable for all choices of the model with monotone trend including logistic, probit, arcsine, extreme value and one hit. We also show that this algorithm can be easily extended to exact unconditional power calculation for studies with up to a moderately large sample size. A real example is given to illustrate the applicability of the proposed method.
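For orientation, the asymptotic CA statistic that the abstract takes as its starting point can be computed as in the Python sketch below. It uses the standard textbook form of the statistic and a normal approximation for the p-value; it is not the exact unconditional algorithm proposed in the article. The function names and the dose-response counts are made up.

```python
import math


def cochran_armitage_z(doses, totals, responders):
    """Standardized Cochran-Armitage trend statistic (asymptotic version).

    doses:      dose score for each group
    totals:     number of subjects in each group
    responders: number of responding subjects in each group
    """
    n_all = sum(totals)
    p_bar = sum(responders) / n_all  # pooled response proportion
    # Numerator: dose-weighted deviations of observed from expected responders.
    t = sum(d * (r - n * p_bar) for d, n, r in zip(doses, totals, responders))
    # Variance of the numerator under the null hypothesis of no trend.
    sum_nd = sum(n * d for d, n in zip(doses, totals))
    sum_nd2 = sum(n * d * d for d, n in zip(doses, totals))
    var = p_bar * (1 - p_bar) * (sum_nd2 - sum_nd ** 2 / n_all)
    return t / math.sqrt(var)


def one_sided_p(z):
    """Upper-tail probability of a standard normal variate."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Made-up counts: three dose groups of ten subjects each.
z = cochran_armitage_z([0, 1, 2], [10, 10, 10], [1, 3, 5])
print(z, one_sided_p(z))
```

With groups this small the normal approximation is exactly the kind of calculation the abstract calls suspect, which is the motivation for an exact test.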

13.
This article considers the problem of simultaneous prediction of actual and average values of the study variable in a linear regression model when a set of linear restrictions binding the regression coefficients is available, and analyzes the performance properties of predictors arising from the methods of restricted regression and mixed regression besides least squares.

14.
In a recent publication, A. Lundin, P. Arner, and J. Hellmér [Anal. Biochem. 177, 125-131 (1989)] describe a method whereby kinetic substrate assays can be performed when the assay mixture includes significant contaminating levels of substrate. Their method requires various rearrangements of the data, and involves three separate linear regression calculations. We show how the same data may be analyzed directly, and far more simply, by nonlinear regression. Unlike the linear regression method, nonlinear regression allows direct calculation of the actual values for Km, Vmax, and the concentration of contaminating substrate (as well as estimates of their standard errors); the former method gives only apparent values. The nonlinear regression technique is also statistically a more valid means of analysis, as the rearrangements required to give linearized equations will considerably distort the error distribution and render simple unweighted linear regression inappropriate. The ease of incorporating extra parameters into standard equations when nonlinear regression is used is further illustrated by fitting enzyme reaction data which describe a first-order process when a significant nonspecific background is present. For this equation no simple rearranged linear plot is possible, but nonlinear regression is easily applied to determine the kinetic parameters.

15.
OBJECTIVES: The question of interest is estimating the relationship between haplotypes and an outcome measure, based upon unphased genotypes. The outcome of interest might be predicting the presence of disease in a logistic model, predicting a numeric drug response in a linear model, or predicting survival time in a parametric survival model with censoring. Explanatory variables may include phased haplotype design variables, environmental variables, or interactions between them. METHODS: We extend existing generalized linear haplotype models to parametric survival outcomes. To improve the stability of model variance estimates, a profile likelihood solution is proposed. An adjustment for population stratification is also considered. Here we investigate data sampled from known 'strata' (e.g., gender or ethnicity) that influence haplotype prior probabilities and thus the regression model weights. Differing linear model variance estimates, and the effect of stratification and departures from Hardy-Weinberg Equilibrium (HWE) on parameter estimates, are compared and contrasted via simulation. RESULTS: From simulations, we observed an improvement in statistical power when using a solution to profile likelihood equations. We also saw that stratification had little impact on estimates. Haplotypes that are not in HWE had a negative impact on power to test hypotheses. Finally, profile likelihood solutions for haplotypes deviating from HWE had improved power and confidence interval coverage of regression model coefficients.

16.
Exposure measurement error can result in a biased estimate of the association between an exposure and outcome. When the exposure–outcome relationship is linear on the appropriate scale (e.g. linear, logistic) and the measurement error is classical, that is, the result of random noise, the result is attenuation of the effect. When the relationship is non‐linear, measurement error distorts the true shape of the association. Regression calibration is a commonly used method for correcting for measurement error, in which each individual's unknown true exposure in the outcome regression model is replaced by its expectation conditional on the error‐prone measure and any fully measured covariates. Regression calibration is simple to execute when the exposure is untransformed in the linear predictor of the outcome regression model, but less straightforward when non‐linear transformations of the exposure are used. We describe a method for applying regression calibration in models in which a non‐linear association is modelled by transforming the exposure using a fractional polynomial model. It is shown that taking a Bayesian estimation approach is advantageous. By use of Markov chain Monte Carlo algorithms, one can sample from the distribution of the true exposure for each individual. Transformations of the sampled values can then be performed directly and used to find the expectation of the transformed exposure required for regression calibration. A simulation study shows that the proposed approach performs well. We apply the method to investigate the relationship between usual alcohol intake and subsequent all‐cause mortality using an error model that adjusts for the episodic nature of alcohol consumption.

17.
We compared the accuracies of four genomic-selection prediction methods as affected by marker density, level of linkage disequilibrium (LD), quantitative trait locus (QTL) number, sample size, and level of replication in populations generated from multiple inbred lines. Marker data on 42 two-row spring barley inbred lines were used to simulate high and low LD populations from multiple inbred line crosses: the first included many small full-sib families and the second was derived from five generations of random mating. True breeding values (TBV) were simulated on the basis of 20 or 80 additive QTL. Methods used to derive genomic estimated breeding values (GEBV) were random regression best linear unbiased prediction (RR–BLUP), Bayes-B, a Bayesian shrinkage regression method, and BLUP from a mixed model analysis using a relationship matrix calculated from marker data. Using the best methods, accuracies of GEBV were comparable to accuracies from phenotype for predicting TBV without requiring the time and expense of field evaluation. We identified a trade-off between a method's ability to capture marker-QTL LD vs. marker-based relatedness of individuals. The Bayesian shrinkage regression method primarily captured LD, the BLUP methods captured relationships, while Bayes-B captured both. Under most of the study scenarios, mixed-model analysis using a marker-derived relationship matrix (BLUP) was more accurate than methods that directly estimated marker effects, suggesting that relationship information was more valuable than LD information. When markers were in strong LD with large-effect QTL, or when predictions were made on individuals several generations removed from the training data set, however, the ranking of method performance was reversed and BLUP had the lowest accuracy.

18.
It is assumed that a known, correct, linear regression model (model I) is given. Let the problem be based on a Bayesian estimation of the regression parameter so that any available a priori information regarding this parameter can be used. This Bayesian estimation is, under squared loss, an optimal strategy for the overall problem, which is divided into an estimation and a design problem. For practical reasons, the effort involved in performing the experiment will be taken into account as costs. In other words, the experimental design must result in the greatest possible accuracy for a given total cost (restriction of the sample size n). The linear cost function k(x) = 1 + c (x - a)/(b - a) is used to construct cost-optimal experimental designs for simple linear regression by means of V = H = [a, b] in a way similar to that used for classical optimality criteria. The complicated structures of these designs and the difficulty in determining them by a direct approach have made it appear advisable to describe an iterative procedure for the construction of cost-optimal designs.
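How the linear cost function restricts the sample size can be shown with a small Python sketch. Only the formula k(x) = 1 + c (x - a)/(b - a) comes from the abstract; the helper names, the representation of a design as (point, replicates) pairs, and all numeric values are hypothetical, and the sketch does not implement the iterative construction of cost-optimal designs.

```python
def cost(x, a, b, c):
    """Cost of one observation at design point x in [a, b]: k(x) = 1 + c (x - a)/(b - a)."""
    return 1 + c * (x - a) / (b - a)


def total_cost(design, a, b, c):
    """Total cost of a design given as (design point, number of replicates) pairs."""
    return sum(reps * cost(x, a, b, c) for x, reps in design)


def max_observations(x, budget, a, b, c):
    """Largest number of observations affordable at a single point x."""
    return int(budget // cost(x, a, b, c))


# Made-up setting: region [0, 1], cost slope c = 1, budget of 12 cost units.
a, b, c, budget = 0.0, 1.0, 1.0, 12.0

print(cost(a, a, b, c), cost(b, a, b, c))           # cheapest and dearest points
print(total_cost([(a, 4), (b, 3)], a, b, c))        # a two-point design
print(max_observations(a, budget, a, b, c),
      max_observations(b, budget, a, b, c))         # n depends on where one samples
```

Since observations at b cost 1 + c times as much as those at a, a fixed budget buys fewer observations the further the design moves toward b, which is why the sample size n becomes part of the design problem.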

19.
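A condition-number method for selecting the independent variables of a regression equation and its application to RK surgery   Total citations: 1, self-citations: 1, citations by others: 1
Selecting suitable independent variables is the first problem in specifying a linear regression model. With the aim of eliminating multicollinearity among the independent variables, this paper introduces a condition-number method for selecting the independent variables of a regression equation and applies it to predicting the outcome of RK surgery.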

20.
In multiple linear regression, the test for the discordancy of a single outlier in the response variable is usually based on the ‘maximum studentized residual’ statistic. Exact critical values for the test statistic t are not available. Upper bounds for the critical values have been found by Srikantan (1961), Prescott (1975) and Lund (1975). In this note we show that all these upper bounds are algebraically equivalent.
