Similar articles
Found 20 similar articles (search time: 31 ms)
1.
In this article, we have considered two families of predictors for the simultaneous prediction of actual and average values of the study variable in a linear regression model when a set of stochastic linear constraints binding the regression coefficients is available. These families arise from the method of mixed regression estimation. Performance properties of these families are analyzed when the objective is to predict values outside the sample and within the sample.

2.
Saville BR, Herring AH. Biometrics 2009, 65(2):369-376
Summary. Deciding which predictor effects may vary across subjects is a difficult issue. Standard model selection criteria and test procedures are often inappropriate for comparing models with different numbers of random effects due to constraints on the parameter space of the variance components. Testing on the boundary of the parameter space changes the asymptotic distribution of some classical test statistics and causes problems in approximating Bayes factors. We propose a simple approach for testing random effects in the linear mixed model using Bayes factors. We scale each random effect to the residual variance and introduce a parameter that controls the relative contribution of each random effect free of the scale of the data. We integrate out the random effects and the variance components using closed-form solutions. The resulting integrals needed to calculate the Bayes factor are low-dimensional integrals lacking variance components and can be efficiently approximated with Laplace's method. We propose a default prior distribution on the parameter controlling the contribution of each random effect and conduct simulations to show that our method has good properties for model selection problems. Finally, we illustrate our methods on data from a clinical trial of patients with bipolar disorder and on data from an environmental study of water disinfection by-products and male reproductive outcomes.
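As a rough illustration of the Laplace's method mentioned in this abstract, the sketch below approximates a one-dimensional integral of exp(h(x)) around the mode of h. It is a generic textbook version, not the authors' Bayes factor computation: the function name `laplace_1d`, the finite-difference step, and the Newton iteration are all assumptions made for this example.

```python
import math


def laplace_1d(h, x0, step=1e-4, iterations=50):
    """Approximate the integral of exp(h(x)) over the real line.

    Hypothetical helper: h must be smooth with a single interior maximum,
    and x0 must be a starting guess reasonably close to that maximum.
    """
    x = x0
    for _ in range(iterations):
        # Central finite differences for h'(x) and h''(x).
        first = (h(x + step) - h(x - step)) / (2 * step)
        second = (h(x + step) - 2 * h(x) + h(x - step)) / step ** 2
        x_new = x - first / second  # Newton step towards the mode
        if abs(x_new - x) < 1e-10:
            x = x_new
            break
        x = x_new
    curvature = -(h(x + step) - 2 * h(x) + h(x - step)) / step ** 2
    # Laplace approximation: exp(h(mode)) * sqrt(2*pi / -h''(mode)).
    return math.exp(h(x)) * math.sqrt(2 * math.pi / curvature)


# A Gaussian log-kernel is recovered essentially exactly.
gaussian = laplace_1d(lambda x: -0.5 * (x - 1.0) ** 2, x0=0.0)

# The integrand x**10 * exp(-x) integrates to 10!; Laplace gives Stirling's formula.
stirling = laplace_1d(lambda x: 10 * math.log(x) - x, x0=8.0)
```

For the second integrand the approximation is within about one percent of 10! = 3628800, which is the usual accuracy of Stirling's formula at this order.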

3.
Zhang D, Davidian M. Biometrics 2001, 57(3):795-802
Normality of random effects is a routine assumption for the linear mixed model, but it may be unrealistic, obscuring important features of among-individual variation. We relax this assumption by approximating the random effects density by the seminonparametric (SNP) representation of Gallant and Nychka (1987, Econometrica 55, 363-390), which includes normality as a special case and provides flexibility in capturing a broad range of nonnormal behavior, controlled by a user-chosen tuning parameter. An advantage is that the marginal likelihood may be expressed in closed form, so inference may be carried out using standard optimization techniques. We demonstrate that standard information criteria may be used to choose the tuning parameter and detect departures from normality, and we illustrate the approach via simulation and using longitudinal data from the Framingham study.

4.
Lin X, Ryan L, Sammel M, Zhang D, Padungtod C, Xu X. Biometrics 2000, 56(2):593-601
We propose a scaled linear mixed model to assess the effects of exposure and other covariates on multiple continuous outcomes. The most general form of the model allows a different exposure effect for each outcome. An important special case is a model that represents the exposure effects using a common global measure that can be characterized in terms of effect sizes. Correlations among different outcomes within the same subject are accommodated using random effects. We develop two approaches to model fitting, including the maximum likelihood method and the working parameter method. A key feature of both methods is that they can be easily implemented by repeatedly calling software for fitting standard linear mixed models, e.g., SAS PROC MIXED. Compared to the maximum likelihood method, the working parameter method is easier to implement and yields fully efficient estimators of the parameters of interest. We illustrate the proposed methods by analyzing data from a study of the effects of occupational pesticide exposure on semen quality in a cohort of Chinese men.

5.
The functional form of spillover, measured as a gradient of abundance of fish, may provide insight into the processes that control the spatial distribution of fish inside and outside the MPA. In this study, we aimed to infer the spillover mechanism of Diplodus spp. (family Sparidae) from a Mediterranean MPA (Carry-le-Rouet, France) using visual censuses and artisanal fisheries data. From the existing literature, three potential functional forms of spillover are defined: a linear gradient, an exponential gradient and a logistic gradient. Each functional form is included in a spatial generalized linear mixed model that accounts for spatial autocorrelation of the data. We select between the different forms of gradients by using a Bayesian model selection procedure. In a first step, the functional form of the spillover for visual census and artisanal fishing data is assessed separately. For both sets of data, our model selection favoured the negative exponential model, evidencing a decrease of the spatial abundance of fish vanishing around 1000 m from the MPA border. We combined both datasets in a joint model by including an observability parameter. This parameter captures how the different sources of data quantify the underlying spatial distribution of the harvested species. This enabled us to demonstrate that the different sampling methods do not affect the estimation of the underlying spatial distribution of Diplodus spp. inside and outside the MPA. We show that data from different sources can be pooled through a spatial generalized linear mixed model. Our findings allow a better understanding of the underlying mechanisms that control spillover of fish from the MPA.
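The three candidate gradient shapes named in this abstract can be sketched as simple functions of distance from the MPA border. The parameterizations, function names and numeric values below are assumptions chosen for illustration; the abstract does not give the authors' exact formulas, and the spatial mixed-model machinery is not reproduced here.

```python
import math


def linear_gradient(distance_m, abundance_at_border, slope_per_m):
    """Abundance falls linearly with distance and is floored at zero."""
    return max(0.0, abundance_at_border - slope_per_m * distance_m)


def exponential_gradient(distance_m, abundance_at_border, decay_per_m):
    """Negative exponential decay of abundance with distance."""
    return abundance_at_border * math.exp(-decay_per_m * distance_m)


def logistic_gradient(distance_m, plateau, steepness_per_m, midpoint_m):
    """Plateau near the border, then a sigmoidal drop centred on midpoint_m."""
    return plateau / (1.0 + math.exp(steepness_per_m * (distance_m - midpoint_m)))


# Hypothetical values: abundance 10 at the border, evaluated 1000 m away.
at_1000_m = exponential_gradient(1000, abundance_at_border=10, decay_per_m=0.005)
```

With these made-up parameters the exponential form has dropped to under one percent of its border value at 1000 m, which is the kind of vanishing gradient the abstract describes.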

6.
We present a method for estimating the parameters in random effects models for survival data when covariates are subject to missingness. Our method is more general than the usual frailty model as it accommodates a wide range of distributions for the random effects, which are included as an offset in the linear predictor in a manner analogous to that used in generalized linear mixed models. We propose using a Monte Carlo EM algorithm along with the Gibbs sampler to obtain parameter estimates. This method is useful in reducing the bias that may be incurred using complete-case methods in this setting. The methodology is applied to data from Eastern Cooperative Oncology Group melanoma clinical trials in which observations were believed to be clustered and several tumor characteristics were not always observed.

7.
Random effects selection in linear mixed models (cited 2 times: 0 self-citations, 2 by others)
Chen Z, Dunson DB. Biometrics 2003, 59(4):762-769
We address the important practical problem of how to select the random effects component in a linear mixed model. A hierarchical Bayesian model is used to identify any random effect with zero variance. The proposed approach reparameterizes the mixed model so that functions of the covariance parameters of the random effects distribution are incorporated as regression coefficients on standard normal latent variables. We allow random effects to effectively drop out of the model by choosing mixture priors with point mass at zero for the random effects variances. Due to the reparameterization, the model enjoys a conditionally linear structure that facilitates the use of normal conjugate priors. We demonstrate that posterior computation can proceed via a simple and efficient Markov chain Monte Carlo algorithm. The methods are illustrated using simulated data and real data from a study relating prenatal exposure to polychlorinated biphenyls and psychomotor development of children.

8.
In this article, we propose a two-stage approach to modeling multilevel clustered non-Gaussian data with sufficiently large numbers of continuous measures per cluster. Such data are common in biological and medical studies utilizing monitoring or image-processing equipment. We consider a general class of hierarchical models that generalizes the model in the global two-stage (GTS) method for nonlinear mixed effects models by using any square-root-n-consistent and asymptotically normal estimators from stage 1 as pseudodata in the stage 2 model, and by extending the stage 2 model to accommodate random effects from multiple levels of clustering. The second-stage model is a standard linear mixed effects model with normal random effects, but the cluster-specific distributions, conditional on random effects, can be non-Gaussian. This methodology provides a flexible framework for modeling not only a location parameter but also other characteristics of conditional distributions that may be of specific interest. For estimation of the population parameters, we propose a conditional restricted maximum likelihood (CREML) approach and establish the asymptotic properties of the CREML estimators. The proposed general approach is illustrated using quartiles as cluster-specific parameters estimated in the first stage, and applied to the data example from a collagen fibril development study. We demonstrate using simulations that in samples with small numbers of independent clusters, the CREML estimators may perform better than conditional maximum likelihood estimators, which are a direct extension of the estimators from the GTS method.

9.
OBJECTIVES: The question of interest is estimating the relationship between haplotypes and an outcome measure, based upon unphased genotypes. The outcome of interest might be predicting the presence of disease in a logistic model, predicting a numeric drug response in a linear model, or predicting survival time in a parametric survival model with censoring. Explanatory variables may include phased haplotype design variables, environmental variables, or interactions between them. METHODS: We extend existing generalized linear haplotype models to parametric survival outcomes. To improve the stability of model variance estimates, a profile likelihood solution is proposed. An adjustment for population stratification is also considered. Here we investigate data sampled from known 'strata' (e.g., gender or ethnicity) that influence haplotype prior probabilities and thus the regression model weights. Differing linear model variance estimates, and the effect of stratification and departures from Hardy-Weinberg Equilibrium (HWE) on parameter estimates, are compared and contrasted via simulation. RESULTS: From simulations, we observed an improvement in statistical power when using a solution to profile likelihood equations. We also saw that stratification had little impact on estimates. Haplotypes that are not in HWE had a negative impact on power to test hypotheses. Finally, profile likelihood solutions for haplotypes deviating from HWE had improved power and confidence interval coverage of regression model coefficients.

10.
The paper deals with quadratic invariant estimators of linear functions of variance components in the mixed linear model. The estimator with locally minimal mean square error with respect to a parameter ? is derived. Under the condition of normality of the vector Y, the theoretical values of the MSE of several types of estimators are compared in two different mixed models; under different types of distributions, a simulation study is carried out on the behaviour of the derived estimators.

11.
The potency, or fitness, of a protein-based drug can be enhanced by changing the sequence of its underlying protein. We present a novel stochastic model for the sequence-fitness relation, and estimate its four parameters from industrial data. Using this model, we formulate and analyze two variants of the protein design problem. In the single-period design problem, the designer needs to decide under capacity constraints which set of sequences to screen in order to maximize the expected fitness of the best sequence in the set. In the more general two-period design problem, the designer can afford two screening rounds and needs to allocate resources optimally across the two periods to maximize the same objective function. Analytical and simulation results allow us to assess the utility of the proposed design strategies for various parameter regimes.

12.
Liu D, Lin X, Ghosh D. Biometrics 2007, 63(4):1079-1088
We consider a semiparametric regression model that relates a normal outcome to covariates and a genetic pathway, where the covariate effects are modeled parametrically and the pathway effect of multiple gene expressions is modeled parametrically or nonparametrically using least-squares kernel machines (LSKMs). This unified framework allows a flexible function for the joint effect of multiple genes within a pathway by specifying a kernel function and allows for the possibility that each gene expression effect might be nonlinear and the genes within the same pathway are likely to interact with each other in a complicated way. This semiparametric model also makes it possible to test for the overall genetic pathway effect. We show that the LSKM semiparametric regression can be formulated using a linear mixed model. Estimation and inference hence can proceed within the linear mixed model framework using standard mixed model software. Both the regression coefficients of the covariate effects and the LSKM estimator of the genetic pathway effect can be obtained using the best linear unbiased predictor in the corresponding linear mixed model formulation. The smoothing parameter and the kernel parameter can be estimated as variance components using restricted maximum likelihood. A score test is developed to test for the genetic pathway effect. Model/variable selection within the LSKM framework is discussed. The methods are illustrated using a prostate cancer data set and evaluated using simulations.

13.
In community-intervention trials, communities, rather than individuals, are randomized to experimental arms. Generalized linear mixed models offer a flexible parametric framework for the evaluation of community-intervention trials, incorporating both systematic and random variations at the community and individual levels. We propose here a simple two-stage inference method for generalized linear mixed models, specifically tailored to the analysis of community-intervention trials. In the first stage, community-specific random effects are estimated from individual-level data, adjusting for the effects of individual-level covariates. This reduces the model approximately to a linear mixed model with the unit of analysis being the community. Because the number of communities is typically small in community-intervention studies, we apply the small-sample inference method of Kenward and Roger (1997, Biometrics 53, 983-997) to the linear mixed model of the second stage. We show by simulation that, under typical settings of community-intervention studies, the proposed approach improves the inference on the intervention-effect parameter uniformly over both the linearized mixed-effect approach and the adaptive Gaussian quadrature approach for generalized linear mixed models. This work is motivated by a series of large randomized trials that test community interventions for promoting cancer preventive lifestyles and behaviors.

14.

Background

Populational linkage disequilibrium and within-family linkage are commonly used for QTL mapping and marker assisted selection. The combination of both results in more robust and accurate locations of the QTL, but models proposed so far have been either single marker, complex in practice or well fit to a particular family structure.

Results

We herein present linear model theory to derive the additive effects of the QTL alleles in any member of a general pedigree, conditional on observed markers and pedigree, accounting for possible linkage disequilibrium among QTLs and markers. The model is based on association analysis in the founders; further, the additive effect of the QTLs transmitted to the descendants is a weighted (by the probabilities of transmission) average of the substitution effects of the founders' haplotypes. The model allows for incomplete linkage disequilibrium between QTLs and markers in the founders. Two submodels are presented: a simple and easy to implement Haley-Knott type regression for half-sib families, and a general mixed (variance component) model for general pedigrees. The model can use information from all markers. The performance of the regression method is compared by simulation with a more complex IBD method by Meuwissen and Goddard. Numerical examples are provided.

Conclusion

The linear model theory provides a useful framework for QTL mapping with dense marker maps. Results show similar accuracies but a bias of the IBD method towards the center of the region. Computations for the linear regression model are extremely simple, in contrast with IBD methods. Extensions of the model to genomic selection and multi-QTL mapping are straightforward.

15.
We examine memory models for multisite capture–recapture data. This is an important topic, as animals may exhibit behavior that is more complex than simple first‐order Markov movement between sites, when it is necessary to devise and fit appropriate models to data. We consider the Arnason–Schwarz model for multisite capture–recapture data, which incorporates just first‐order Markov movement, and also two alternative models that allow for memory, the Brownie model and the Pradel model. We use simulation to compare two alternative tests which may be undertaken to determine whether models for multisite capture–recapture data need to incorporate memory. Increasing the complexity of models runs the risk of introducing parameters that cannot be estimated, irrespective of how much data are collected, a feature which is known as parameter redundancy. Rouan et al. (JABES, 2009, pp 338–355) suggest a constraint that may be applied to overcome parameter redundancy when it is present in multisite memory models. For this case, we apply symbolic methods to derive a simpler constraint, which allows more parameters to be estimated, and give general results not limited to a particular configuration. We also consider the effect sparse data can have on parameter redundancy and recommend minimum sample sizes. Memory models for multisite capture–recapture data can be highly complex and difficult to fit to data. We emphasize the importance of a structured approach to modeling such data, by considering a priori which parameters can be estimated, which constraints are needed in order for estimation to take place, and how much data need to be collected. We also give guidance on the amount of data needed to use two alternative families of tests for whether models for multisite capture–recapture data need to incorporate memory.

16.
Constraints arise naturally in many scientific experiments and studies, such as in epidemiology, biology and toxicology, yet researchers often ignore such information when analyzing their data and use standard methods such as the analysis of variance (ANOVA). Such methods may not only result in a loss of power and efficiency in costs of experimentation but may also result in poor interpretation of the data. In this paper we discuss constrained statistical inference in the context of linear mixed effects models that arise naturally in many applications, such as in repeated measurements designs, familial studies and others. We introduce a novel methodology that is broadly applicable for a variety of constraints on the parameters. Since in many applications sample sizes are small and/or the data are not necessarily normally distributed, and furthermore error variances need not be homoscedastic (i.e. there is heterogeneity in the data), we use an empirical best linear unbiased predictor (EBLUP) type residual-based bootstrap methodology for deriving critical values of the proposed test. Our simulation studies suggest that the proposed procedure maintains the desired nominal Type I error while competing well with other tests in terms of power. We illustrate the proposed methodology by re-analyzing clinical trial data on blood mercury level. The methodology introduced in this paper can be easily extended to other settings such as nonlinear and generalized regression models.

17.
The traditional method for estimating a linear function of fixed parameters in a mixed linear model is a two-stage procedure. In the first stage, estimators of the variance components are calculated; in the second stage, these estimators are taken as the true values of the variance components in order to estimate the linear function of fixed parameters by the generalized least squares method. In this paper the general mixed linear model is considered in which a matrix related to the fixed parameters and/or the dispersion matrix of the observation vector may be deficient in rank. It is shown that the estimators of a set of functions of fixed parameters obtained in the second stage are unbiased provided that the observation vector is symmetrically distributed about its expected value and the estimators of variance components from the first stage are translation-invariant and are even functions of the observation vector.
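The two-stage idea in this abstract can be sketched on a deliberately simple special case: a common mean observed in several groups, each with its own unknown variance. This toy model, the function name `two_stage_mean` and the example data are assumptions for illustration only; the paper's general rank-deficient setting is not reproduced.

```python
from statistics import fmean, variance


def two_stage_mean(groups):
    """Estimate a common mean from groups with unequal, unknown variances.

    Hypothetical toy model: y_ij = mu + e_ij with Var(e_ij) = sigma_i^2.
    Every group needs at least two observations and a non-zero sample variance.
    """
    # Stage 1: estimate the variance components. Sample variances are
    # translation-invariant and even functions of the centred data.
    variances = [variance(group) for group in groups]

    # Stage 2: plug the estimates into generalized least squares, which for
    # a diagonal dispersion matrix is an inverse-variance weighted mean.
    weights = [len(group) / var for group, var in zip(groups, variances)]
    means = [fmean(group) for group in groups]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)


noisy = [1.0, 3.0]            # sample variance 2.0
precise = [10.0, 10.2, 9.8]   # sample variance about 0.04
estimate = two_stage_mean([noisy, precise])
```

Because the precise group receives a far larger weight, the estimate lands near 10 rather than near the unweighted average of the two group means.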

18.
A class of generalized linear mixed models can be obtained by introducing random effects in the linear predictor of a generalized linear model, e.g. a split plot model for binary data or count data. Maximum likelihood estimation, for normally distributed random effects, involves high-dimensional numerical integration, with severe limitations on the number and structure of the additional random effects. An alternative estimation procedure based on an extension of the iterative re-weighted least squares procedure for generalized linear models will be illustrated on a practical data set involving carcass classification of cattle. The data are analysed as overdispersed binomial proportions with fixed and random effects and associated components of variance on the logit scale. Estimates are obtained with standard software for normal data mixed models. Numerical restrictions pertain to the size of matrices to be inverted. This can be dealt with by absorption techniques familiar from e.g. mixed models in animal breeding. The final model fitted to the classification data includes four components of variance and a multiplicative overdispersion factor. Basically the estimation procedure is a combination of iterated least squares procedures and no full distributional assumptions are needed. A simulation study based on the classification data is presented. This includes a study of procedures for constructing confidence intervals and significance tests for fixed effects and components of variance. The simulation results increase confidence in the usefulness of the estimation procedure.

19.
Summary. In a microarray experiment, one experimental design is used to obtain expression measures for all genes. One popular analysis method involves fitting the same linear mixed model for each gene, obtaining gene‐specific p‐values for tests of interest involving fixed effects, and then choosing a threshold for significance that is intended to control false discovery rate (FDR) at a desired level. When one or more random factors have zero variance components for some genes, the standard practice of fitting the same full linear mixed model for all genes can result in failure to control FDR. We propose a new method that combines results from the fit of full and selected linear mixed models to identify differentially expressed genes and provide FDR control at target levels when the true underlying random effects structure varies across genes.
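The thresholding step mentioned in this abstract, choosing a significance cut-off for gene-specific p-values to control FDR, is often done with the Benjamini-Hochberg step-up procedure. The abstract does not say which procedure the authors use, so the sketch below is an assumption, and the function name and example p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: find the largest rank k with p_(k) <= k * q / m and
    reject the k hypotheses with the smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        if pvalues[index] <= rank * q / m:
            largest_rank = rank
    rejected = [False] * m
    for index in order[:largest_rank]:
        rejected[index] = True
    return rejected


# Hypothetical gene-specific p-values for ten genes.
gene_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(gene_pvalues, q=0.05)
```

With these ten values only the two smallest p-values fall below their rank-specific thresholds (0.005 and 0.01), so two genes are flagged.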

20.
This article demonstrates the use of mixed effects models for characterizing individual and sample average growth curves based on serial anthropometric data. These models are an advancement over conventional general linear regression because they effectively handle the hierarchical nature of serial growth data. Using body weight data on 70 infants in the Born in Bradford study, we demonstrate how a mixed effects model provides a better fit than a conventional regression model. Further, we demonstrate how mixed effects models can be used to explore the influence of environmental factors on the sample average growth curve. Analyzing data from 183 infant boys (aged 3–15 months) from rural South India, we show how maternal education shapes infant growth patterns as early as within the first 6 months of life. The presented analyses highlight the utility of mixed effects models for analyzing serial growth data because they allow researchers to simultaneously predict individual curves, estimate sample average curves, and investigate the effects of environmental exposure variables. Am J Phys Anthropol, 2013. © 2012 Wiley Periodicals, Inc.
