Similar articles
 20 similar articles found
1.
In the development of structural equation models (SEMs), observed variables are usually assumed to be normally distributed. However, this assumption is likely to be violated in much practical research. As the non‐normality of observed variables in an SEM can arise from non‐normal latent variables, non‐normal residuals, or both, semiparametric modeling with an unknown distribution of latent variables or an unknown distribution of residuals is needed. In this article, we find that an SEM becomes nonidentifiable when both the latent variable distribution and the residual distribution are unknown. Hence, it is impossible to estimate reliably both the latent variable distribution and the residual distribution without parametric assumptions on one or the other. We also find that the residuals in the measurement equation are more sensitive to the normality assumption than the latent variables, and the negative impact on the estimation of parameters and distributions due to the non‐normality of residuals is more serious. Therefore, when there is no prior knowledge about parametric distributions for either the latent variables or the residuals, we recommend making a parametric assumption on the latent variables and modeling the residuals nonparametrically. We propose a semiparametric Bayesian approach using the truncated Dirichlet process with a stick‐breaking prior to tackle the non‐normality of residuals in the measurement equation. Simulation studies and a real data analysis demonstrate our findings and reveal the empirical performance of the proposed methodology. Free WinBUGS code to perform the analysis is available in Supporting Information.
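
As a rough illustration of the truncated stick-breaking construction mentioned in this abstract, the sketch below draws one set of mixture weights. It is not the authors' WinBUGS code; the function name, the concentration value `alpha=1.0`, and the truncation level of 10 are assumptions chosen only for the example.

```python
import random


def truncated_stick_breaking(alpha, num_components, rng):
    """Draw mixture weights from a truncated stick-breaking prior.

    Each break fraction is Beta(1, alpha); the last component takes
    whatever is left of the stick so the weights sum to one.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for k in range(num_components):
        if k == num_components - 1:
            fraction = 1.0  # truncation: final component absorbs the remainder
        else:
            fraction = rng.betavariate(1.0, alpha)
        weights.append(fraction * remaining)
        remaining *= 1.0 - fraction
    return weights


# Hypothetical settings, chosen only for illustration.
rng = random.Random(0)
weights = truncated_stick_breaking(alpha=1.0, num_components=10, rng=rng)
```

In a residual model of the kind described, such weights would be attached to component distributions (for example normals with their own means and variances); that part is omitted here.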

2.
Larsen K 《Biometrics》2004,60(1):85-92
Multiple categorical variables are commonly used in medical and epidemiological research to measure specific aspects of human health and functioning. To analyze such data, models have been developed considering these categorical variables as imperfect indicators of an individual's "true" status of health or functioning. In this article, the latent class regression model is used to model the relationship between covariates, a latent class variable (the unobserved status of health or functioning), and the observed indicators (e.g., variables from a questionnaire). The Cox model is extended to encompass a latent class variable as predictor of time-to-event, while using information about latent class membership available from multiple categorical indicators. The expectation-maximization (EM) algorithm is employed to obtain maximum likelihood estimates, and standard errors are calculated based on the profile likelihood, treating the nonparametric baseline hazard as a nuisance parameter. A sampling-based method for model checking is proposed. It allows for graphical investigation of the assumption of proportional hazards across latent classes. It may also be used for checking other model assumptions, such as no additional effect of the observed indicators given latent class. The usefulness of the model framework and the proposed techniques is illustrated in an analysis of data from the Women's Health and Aging Study concerning the effect of severe mobility disability on time-to-death for elderly women.

3.
When observing data on a patient-reported outcome measure in, for example, clinical trials, the variables observed are often correlated and intended to measure a latent variable. In addition, such data are also often characterized by a hierarchical structure, meaning that the outcome is repeatedly measured within patients. To analyze such data, it is important to use an appropriate statistical model, such as structural equation modeling (SEM). However, researchers may rely on simpler statistical models that are applied to an aggregated data structure. For example, correlated variables are combined into one sum score that approximates a latent variable. This may have implications when, for example, the sum score consists of indicators that relate differently to the latent variable being measured. This study compares three models that can be applied to analyze such data: the multilevel multiple indicators multiple causes (ML-MIMIC) model, a univariate multilevel model, and a mixed analysis of variance (ANOVA) model. The focus is on the estimation of a cross-level interaction effect that represents the difference over time on the patient-reported outcome between two treatment groups. The ML-MIMIC model is an SEM-type model that considers the relationship between the indicators and the latent variable in a multilevel setting, whereas the univariate multilevel and mixed ANOVA models rely on sum scores to approximate the latent variable. In addition, the mixed ANOVA model uses aggregated second-level means as the outcome. This study showed that the ML-MIMIC model produced unbiased cross-level interaction effect estimates when the relationships between the indicators and the latent variable being measured varied across indicators. In contrast, under similar conditions, the univariate multilevel and mixed ANOVA models underestimated the cross-level interaction effect.

4.
Many applications of biomedical science involve unobservable constructs, from measurement of health states to severity of complex diseases. The primary aim of measurement is to identify relevant pieces of observable information that thoroughly describe the construct of interest. Validation of the construct is often performed separately. Noting the increasing popularity of latent variable methods in biomedical research, we propose a Multiple Indicator Multiple Cause (MIMIC) latent variable model that combines item reduction and validation. Our joint latent variable model accounts for the bias that occurs in the traditional 2-stage process. The methods are motivated by an example from the Physical Activity and Lymphedema clinical trial in which the objectives were to describe lymphedema severity through self-reported Likert scale symptoms and to determine the relationship between symptom severity and a "gold standard" diagnostic measure of lymphedema. The MIMIC model identified 1 symptom as a potential candidate for removal. We present this paper as an illustration of the advantages of joint latent variable models and as an example of the applicability of these models for biomedical research.

5.
Studies of latent traits often collect data for multiple items measuring different aspects of the trait. For such data, it is common to consider models in which the different items are manifestations of a normal latent variable, which depends on covariates through a linear regression model. This article proposes a flexible Bayesian alternative in which the unknown latent variable density can change dynamically in location and shape across levels of a predictor. Scale mixtures of underlying normals are used in order to model flexibly the measurement errors and allow mixed categorical and continuous scales. A dynamic mixture of Dirichlet processes is used to characterize the latent response distributions. Posterior computation proceeds via a Markov chain Monte Carlo algorithm, with predictive densities used as a basis for inferences and evaluation of model fit. The methods are illustrated using data from a study of DNA damage in response to oxidative stress.

6.
Bursting excitable cell models by a slow Ca²⁺ current   (total citations: 2; self-citations: 0; citations by others: 2)
Bursting in excitable cells is a phenomenon that has attracted the interest of many electrophysiologists and non-linear dynamicists. In this paper, we present two models that give rise to bursting in action potentials. The membrane of the first model contains a voltage-activated Ca²⁺ channel that inactivates very slowly upon depolarization and a delayed K⁺ channel that is activated by voltage. This model consists of three dynamic variables: the gating variable of the K⁺ channel (n), the inactivation gating variable of the Ca²⁺ channel (f), and the membrane potential (V). The membrane of the second model contains a voltage-activated Na⁺ channel that inactivates rather fast upon depolarization. This model contains five dynamic variables in total: the Na⁺ inactivation gating variable (h) and the Ca²⁺ activation variable (d), in addition to the three dynamic variables of the first model. With the first model, we show how various interesting bursting patterns may arise from such a simple three-variable model. We also demonstrate that a slowly inactivating voltage-dependent Ca²⁺ channel may play the key role in the genesis of bursting. With the second model, we show how the participation of a quickly inactivating fast inward current may lead to a neuronal type of bursting, multi-peaked oscillations, and chaos, as the rates of the gating variables change.

7.
To model processes we propose merging idiographic filter measurement with dynamic factor analysis. This involves testing whether or not the same latent dynamics (concurrent and lagged factor interrelations) can describe different individuals' observed multivariate time series. The methodology allows fitting, across different individuals, dynamic factor models that are invariant with respect to the latent dynamics, but not necessarily the factor loadings (measurement model). This methodology allows the same latent process to manifest differently from one individual to another, thus recognizing that the process is general but its realization in a given person is to some degree idiosyncratic. The approach is illustrated with empirical data.

8.
Latent class regression on latent factors   (total citations: 1; self-citations: 0; citations by others: 1)
In public health, psychology, and social science research, many research questions investigate the relationship between a categorical outcome variable and continuous predictor variables. The focus of this paper is to develop a model of this relationship when both the categorical outcome and the predictor variables are latent (i.e., not directly observable). This model extends the latent class regression model so that it can include regression on latent predictors. Maximum likelihood estimation is used, and two numerical methods for performing it are described: the Monte Carlo expectation-maximization algorithm, and Gaussian quadrature followed by a quasi-Newton algorithm. A simulation study is carried out to examine the behavior of the model under different scenarios. A data example involving adolescent health is used for demonstration, in which the latent classes of eating disorder risk are predicted by the latent factor body satisfaction.

9.
Dunson DB  Perreault SD 《Biometrics》2001,57(1):302-308
This article describes a general class of factor analytic models for the analysis of clustered multivariate data in the presence of informative missingness. We assume that there are distinct sets of cluster-level latent variables related to the primary outcomes and to the censoring process, and we account for dependency between these latent variables through a hierarchical model. A linear model is used to relate covariates and latent variables to the primary outcomes for each subunit. A generalized linear model accounts for covariate and latent variable effects on the probability of censoring for subunits within each cluster. The model accounts for correlation within clusters and within subunits through a flexible factor analytic framework that allows multiple latent variables and covariate effects on the latent variables. The structure of the model facilitates implementation of Markov chain Monte Carlo methods for posterior estimation. Data from a spermatotoxicity study are analyzed to illustrate the proposed approach.

10.
Houseman EA  Coull BA  Betensky RA 《Biometrics》2006,62(4):1062-1070
Genomic data are often characterized by a moderate to large number of categorical variables observed for relatively few subjects. Some of the variables may be missing or noninformative. An example of such data is loss of heterozygosity (LOH), a dichotomous variable, observed on a moderate number of genetic markers. We first consider a latent class model where, conditional on unobserved membership in one of k classes, the variables are independent with probabilities determined by a regression model of low dimension q. Using a family of penalties including the ridge and LASSO, we extend this model to address higher-dimensional problems. Finally, we present an orthogonal map that transforms marker space to a space of "features" for which the constrained model has better predictive power. We demonstrate these methods on LOH data collected at 19 markers from 93 brain tumor patients. For this data set, the existing unpenalized latent class methodology does not produce estimates. Additionally, we show that posterior classes obtained from this method are associated with survival for these patients.
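
The conditional-independence structure of a latent class model, and the posterior classes it yields, can be sketched as follows. This is a minimal illustration for dichotomous variables only; it omits the regression model and the ridge/LASSO penalties described in the abstract, and the function name, class weights, and item probabilities are hypothetical values, not results from the paper.

```python
def class_posteriors(responses, class_weights, item_probs):
    """Posterior class membership probabilities for one subject.

    Assumes binary responses that are independent given the class:
    P(class | y) is proportional to weight * prod_j p_j^y_j (1 - p_j)^(1 - y_j).
    """
    joint = []
    for weight, probs in zip(class_weights, item_probs):
        likelihood = weight
        for y, p in zip(responses, probs):
            likelihood *= p if y == 1 else 1.0 - p
        joint.append(likelihood)
    total = sum(joint)
    return [value / total for value in joint]


# Hypothetical two-class, two-marker example.
posterior = class_posteriors(
    responses=[1, 1],
    class_weights=[0.5, 0.5],
    item_probs=[[0.9, 0.8], [0.2, 0.1]],
)
```

Here the first class explains two positive markers far better than the second, so nearly all posterior mass falls on it.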

11.
We consider the problem of testing a statistical hypothesis where the scientifically meaningful test statistic is a function of latent variables. In particular, we consider detection of genetic linkage, where the latent variables are patterns of inheritance at specific genome locations. Introduced by Geyer & Meeden (2005), fuzzy p-values are random variables, described by their probability distributions, that are interpreted as p-values. For latent variable problems, we introduce the notion of a fuzzy p-value as having the conditional distribution of the latent p-value given the observed data, where the latent p-value is the random variable that would be the p-value if the latent variables were observed. The fuzzy p-value provides an exact test using two sets of simulations of the latent variables under the null hypothesis, one unconditional and the other conditional on the observed data. It provides not only an expression of the strength of the evidence against the null hypothesis but also an expression of the uncertainty in that expression owing to lack of knowledge of the latent variables. We illustrate these features with an example of simulated data mimicking a real example of the detection of genetic linkage.

12.
Patient-reported outcomes (PRO) have gained importance in clinical and epidemiological research and aim at assessing, for instance, quality of life, anxiety, or fatigue. Item Response Theory (IRT) models are increasingly used to validate and analyse PRO. Such models relate observed variables to a latent (unobservable) variable, which is commonly assumed to be normally distributed. A priori sample size determination is important to obtain adequately powered studies to determine clinically important changes in PRO. In previous developments, the Raschpower method has been proposed for the determination of the power of the test of group effect for the comparison of PRO in cross-sectional studies with an IRT model, the Rasch model. The objective of this work was to evaluate the robustness of this method (which assumes a normal distribution for the latent variable) to violations of this distributional assumption. The statistical power of the test of group effect was estimated by the empirical rejection rate in data sets simulated using a non-normally distributed latent variable. It was compared to the power obtained with the Raschpower method. In both cases, the data were analyzed using a latent regression Rasch model including a binary covariate for group effect. For all situations, both methods gave comparable results whatever the deviations from the model assumptions. Given the results, the Raschpower method seems to be robust to the non-normality of the latent trait for determining the power of the test of group effect.
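
For orientation, the Rasch model referred to here gives the probability of a positive response as a logistic function of the latent trait minus the item difficulty. The sketch below simulates one respondent under a latent regression with a binary group covariate. It is not the Raschpower method itself; it uses a normally distributed latent trait, and the function names, item difficulties, and group effect of 0.5 are assumptions made for the example.

```python
import math
import random


def rasch_probability(theta, difficulty):
    """Probability of a positive response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def simulate_responses(group, group_effect, difficulties, rng):
    """Simulate one respondent's binary item responses.

    Latent regression: trait = group_effect * group + N(0, 1) noise.
    """
    theta = group_effect * group + rng.gauss(0.0, 1.0)
    return [
        1 if rng.random() < rasch_probability(theta, b) else 0
        for b in difficulties
    ]


# Hypothetical settings, chosen only for illustration.
rng = random.Random(1)
difficulties = [-1.0, 0.0, 1.0]
responses = simulate_responses(
    group=1, group_effect=0.5, difficulties=difficulties, rng=rng
)
p_matched = rasch_probability(0.0, 0.0)
```

A robustness study of the kind described would replace the `rng.gauss` draw with a non-normal distribution and repeat the simulation many times to obtain an empirical rejection rate.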

13.
Finite mixture modeling with mixture outcomes using the EM algorithm   (total citations: 10; self-citations: 0; citations by others: 10)
Muthén B  Shedden K 《Biometrics》1999,55(2):463-469
This paper discusses the analysis of an extended finite mixture model where the latent classes corresponding to the mixture components for one set of observed variables influence a second set of observed variables. The research is motivated by a repeated measurement study using a random coefficient model to assess the influence of latent growth trajectory class membership on the probability of a binary disease outcome. More generally, this model can be seen as a combination of latent class modeling and conventional mixture modeling. The EM algorithm is used for estimation. As an illustration, a random-coefficient growth model for the prediction of alcohol dependence from three latent classes of heavy alcohol use trajectories among young adults is analyzed.

14.

Background

Computerized adaptive testing (CAT) utilizes latent variable measurement model parameters that are typically assumed to be equivalently applicable to all people. Biased latent variable scores may be obtained in samples that are heterogeneous with respect to a specified measurement model. We examined the implications of sample heterogeneity with respect to CAT-predicted patient-reported outcomes (PRO) scores for the measurement of pain.

Methods

A latent variable mixture modeling (LVMM) analysis was conducted using data collected from a heterogeneous sample of people in British Columbia, Canada, who were administered the 36 pain domain items of the CAT-5D-QOL. The fitted LVMM was then used to produce data for a simulation analysis. We evaluated bias by comparing the referent PRO scores of the LVMM with PRO scores predicted by a “conventional” CAT (ignoring heterogeneity) and a LVMM-based “mixture” CAT (accommodating heterogeneity).

Results

The LVMM analysis indicated support for three latent classes with class proportions of 0.25, 0.30 and 0.45, which suggests that the sample was heterogeneous. The simulation analyses revealed differences between the referent PRO scores and the PRO scores produced by the “conventional” CAT. The “mixture” CAT produced PRO scores that were nearly equivalent to the referent scores.

Conclusion

Bias in PRO scores based on latent variable models may result when population heterogeneity is ignored. Improved accuracy could be obtained by using CATs that are parameterized using LVMM.

15.
Lee SY  Song XY 《Biometrics》2004,60(3):624-636
A general two-level latent variable model is developed to provide a comprehensive framework for model comparison of various submodels. Nonlinear relationships among the latent variables in the structural equations at both levels, as well as the effects of fixed covariates in the measurement and structural equations at both levels, can be analyzed within the framework. Moreover, the methodology can be applied to hierarchically mixed continuous, dichotomous, and polytomous data. A Monte Carlo EM algorithm is implemented to produce the maximum likelihood estimate. The E-step is completed by approximating the conditional expectations through observations that are simulated by Markov chain Monte Carlo methods, while the M-step is completed by conditional maximization. A procedure is proposed for computing the complicated observed-data log likelihood and the BIC for model comparison. The methods are illustrated by using a real data set.

16.
Roy J  Lin X 《Biometrics》2000,56(4):1047-1054
Multiple outcomes are often used to properly characterize an effect of interest. This paper proposes a latent variable model for the situation where repeated measures over time are obtained on each outcome. These outcomes are assumed to measure an underlying quantity of main interest from different perspectives. We relate the observed outcomes using regression models to a latent variable, which is then modeled as a function of covariates by a separate regression model. Random effects are used to model the correlation due to repeated measures of the observed outcomes and the latent variable. An EM algorithm is developed to obtain maximum likelihood estimates of model parameters. Unit-specific predictions of the latent variables are also calculated. This method is illustrated using data from a national panel study on changes in methadone treatment practices.

17.
This paper proposes a two-part model for studying transitions between health states over time when multiple, discrete health indicators are available. The model includes a measurement model positing underlying latent health states and a transition model between latent health states over time. Full maximum likelihood estimation procedures are computationally complex in this latent variable framework, making only a limited class of models feasible and estimation of standard errors problematic. For this reason, an estimating equations analogue of the pseudo-likelihood method for the parameters of interest, namely the transition model parameters, is considered. The finite sample properties of the proposed procedure are investigated through a simulation study and the importance of choosing strong indicators of the latent variable is demonstrated. The applicability of the methodology is illustrated with health survey data measuring disability in the elderly from the Longitudinal Study of Aging.

18.
Larsen K 《Biometrics》2005,61(4):1049-1055
This article is motivated by the Women's Health and Aging Study, where information about physical functioning was recorded along with death information in a group of elderly women. The focus is on determining whether having difficulties in daily living tasks is accompanied by a higher mortality rate. To this end, a two-parameter logistic regression model is used for the modeling of binary questionnaire data assuming an underlying continuous latent variable, difficulty in daily living. The Cox model is used for the survival information, and the continuous latent variable is included as an explanatory variable along with other observed variables. Parameters are estimated by maximizing the likelihood for the joint distribution of the items and the time-to-event information. In addition to presenting a new statistical model, this article also illustrates the use of the model in a real data setting and addresses the more practical issues of model building, diagnostics, and parameter interpretation.

19.
Learning causality from data is known as the causal discovery problem, an important and relatively new field. In many applications latent variables exist, and ignoring them completely can seriously bias the estimation results. In this paper, a method combining exploratory factor analysis and path analysis (EFA-PA) is proposed to infer causality in the presence of latent variables. Our method extends the model with latent variables and their linear causal relationships with observed variables, which enhances the accuracy of causal models. Such a model can be thought of as the simplest possible causal model for continuous data. EFA-PA is very similar to the structural equation model, but the theoretical model established in structural equation modeling must be modified during data fitting until an ideal model is obtained. The model obtained by EFA-PA not only avoids subjectivity but also reduces estimation complexity. It is found that the EFA-PA estimation model is superior to the other models. EFA-PA can provide a basis for correctly estimating the causal relationships between observed variables in the presence of latent variables. The experiment shows that EFA-PA performs better than the structural equation model.

20.
High costs associated with many fermentation processes in an increasingly competitive industry make any prompt application of modern control techniques to industrial bioprocesses very desirable. However, this is often hampered by the lack of adequate mathematical models, on the one hand, and by the absence of continuous, on-line measurement of the most relevant process variables, on the other hand. This paper addresses these problems and offers a new strategy to control continuous bioprocesses using a hierarchical structure such that neither structured process models nor continuous measurement of all relevant variables have to be available. The control system consists of two layers. The lower layer represents a dynamic adaptive follow-up control of a continuously measured output (in our case, dissolved oxygen concentration). This variable is supposed to be strongly correlated with the key output variable (in our case, cellular concentration), which is not continuously available for measurement. The higher layer is then designed to maintain a desired profile of the process key output using a set-point optimising control technique. The Integrated System Optimisation and Parameter Estimation method used operates on an appropriately chosen steady-state performance criterion. A prerequisite for successful application of the proposed approach is an approximate steady-state model describing the relationship between the measured output and the process key output variable. Furthermore, occasional in situ, off-line, or laboratory measurement values of the key output variable are needed. Promising simulation results of biomass concentration control by manipulating the air flow rate in a continuous bakers' yeast culture are presented.
