1.

GOLDmineR: improving models for classifying patients with chest pain

Bernstein L Bradley K Zarich S 《The Yale journal of biology and medicine》2002,75(4):183-198

The laboratory is dealing with reporting tests as information needed to make clinical decisions. The traditional statistical quality control measure, which assigns reference ranges based on 95 percent confidence intervals, is insufficient for diagnostic tests that assign risk. We construct a basis for risk assignment by a method that builds on the 2 x 2 contingency table used to calculate the χ² goodness-of-fit and Bayesian estimates. The widely used logistic regression is a subset of the regression method, as it only considers dichotomous outcome choices. We use examples of multivalued predictor(s) and a multivalued as well as dichotomous outcome. Outcomes analyses are quite easy using the ordinal logit regression model.

2.

Assessing Toxicities in a Clinical Trial: Bayesian Inference for Ordinal Data Nested within Categories

L.G. Leon‐Novelo X. Zhou B. Nebiyou Bekele P. Müller 《Biometrics》2010,66(3):966-974

Summary: This article addresses modeling and inference for ordinal outcomes nested within categorical responses. We propose a mixture of normal distributions for latent variables associated with the ordinal data. This mixture model allows us to fix without loss of generality the cutpoint parameters that link the latent variable with the observed ordinal outcome. Moreover, the mixture model is shown to be more flexible in estimating cell probabilities when compared to the traditional Bayesian ordinal probit regression model with random cutpoint parameters. We extend our model to take into account possible dependence among the outcomes in different categories. We apply the model to a randomized phase III study to compare treatments on the basis of toxicities recorded by type of toxicity and grade within type. The data include the different (categorical) toxicity types exhibited in each patient. Each type of toxicity has an (ordinal) grade associated with it. The dependence among the different types of toxicity exhibited by the same patient is modeled by introducing patient‐specific random effects.

3.

A penalized latent class model for ordinal data

Desantis SM Houseman EA Coull BA Stemmer-Rachamimov A Betensky RA 《Biostatistics (Oxford, England)》2008,9(2):249-262

Latent class models provide a useful framework for clustering observations based on several features. Application of latent class methodology to correlated, high-dimensional ordinal data poses many challenges. Unconstrained analyses may not result in an estimable model. Thus, information contained in ordinal variables may not be fully exploited by researchers. We develop a penalized latent class model to facilitate analysis of high-dimensional ordinal data. By stabilizing maximum likelihood estimation, we are able to fit an ordinal latent class model that would otherwise not be identifiable without application of strict constraints. We illustrate our methodology in a study of schwannoma, a peripheral nerve sheath tumor, that included 3 clinical subtypes and 23 ordinal histological measures.

4.

A general class of pattern mixture models for nonignorable dropout with many possible dropout times

Roy J Daniels MJ 《Biometrics》2008,64(2):538-545

Summary: In this article we consider the problem of fitting pattern mixture models to longitudinal data when there are many unique dropout times. We propose a marginally specified latent class pattern mixture model. The marginal mean is assumed to follow a generalized linear model, whereas the mean conditional on the latent class and random effects is specified separately. Because the dimension of the parameter vector of interest (the marginal regression coefficients) does not depend on the assumed number of latent classes, we propose to treat the number of latent classes as a random variable. We specify a prior distribution for the number of classes, and calculate (approximate) posterior model probabilities. In order to avoid the complications with implementing a fully Bayesian model, we propose a simple approximation to these posterior probabilities. The ideas are illustrated using data from a longitudinal study of depression in HIV-infected women.

5.

Factors affecting the fruiting of bilberries: an analysis of categorical data set

Jussi Kuusipalo 《Plant Ecology》1988,76(1-2):71-77

The present study is an application of categorical data analysis in ecological research. The approach is based on logistic regression following an exploratory graphical analysis. The material was collected in an extensive forest inventory in which a set of observations was made by eye in a stratified random sample of 262 mature upland forest stands in South Finland. The problem was to interpret the variation in the fertility of the tillers of bilberry (Vaccinium myrtillus). In the field, the fertility was recorded as a three-class ordinal variable. The information available for the interpretation included the visually estimated density of the tree crowns, the soil fertility class determined using Cajander's forest site types and the percent cover of V. myrtillus. The GLM framework was employed in successive stages of the data analysis in order to find a model to fit the data. For this, the three-class ordinal response variable was reduced to two classes: stands characterized by (a) sterile and (b) fertile bilberry tillers. Successful prediction of the distribution of these two types of forest stands was achieved with a logistic-regression model by using canopy coverage and soil fertility classes as predictor variables. The generalized linear modelling framework is suitable for studying many ecological problems even when only rough categorical estimates of environmental scalars are available.

6.

A Bayesian Latent Variable Mixture Model for Longitudinal Fetal Growth

James C. Slaughter Amy H. Herring John M. Thorp 《Biometrics》2009,65(4):1233-1242

Summary: Fetal growth restriction is a leading cause of perinatal morbidity and mortality that could be reduced if high‐risk infants are identified early in pregnancy. We propose a Bayesian model for aggregating 18 longitudinal ultrasound measurements of fetal size and blood flow into three underlying, continuous latent factors. Our procedure is more flexible than typical latent variable methods in that we relax the normality assumptions by allowing the latent factors to follow finite mixture distributions. Using mixture distributions also permits us to cluster individuals with similar observed characteristics and identify latent classes of subjects who are more likely to be growth or blood flow restricted during pregnancy. We also use our latent variable mixture distribution model to identify a clinically meaningful latent class of subjects with low birth weight and early gestational age. We then examine the association of latent classes of intrauterine growth restriction with latent classes of birth outcomes as well as observed maternal covariates including fetal gender and maternal race, parity, body mass index, and height. Our methods identified a latent class of subjects who have increased blood flow restriction and below average intrauterine size during pregnancy. These subjects were more likely to be growth restricted at birth than a class of individuals with typical size and blood flow.

7.

A score test for linkage analysis of ordinal traits based on IBD sharing

Feng R Zhang H 《Biostatistics (Oxford, England)》2008,9(1):114-127

Statistical methods for linkage analysis are well established for both binary and quantitative traits. However, numerous diseases including cancer and psychiatric disorders are rated on discrete ordinal scales. To analyze pedigree data with ordinal traits, we recently proposed a latent variable model which has higher power to detect linkage using ordinal traits than methods using the dichotomized traits. The challenge with the latent variable model is that the likelihood is usually very complicated, and as a result, the computation of the likelihood ratio statistic is too intensive for large pedigrees. In this paper, we derive a computationally efficient score statistic based on the identity-by-descent sharing information between relatives. Using simulation studies, we examined the asymptotic distribution of the test statistic and the power of our proposed test under various levels of heritability. We compared the computing time as well as power of the score test with the likelihood ratio test. We then applied our method to the Collaborative Study on the Genetics of Alcoholism and performed a genome scan to map susceptibility genes for alcohol dependence. We found a strong linkage signal on chromosome 4.

8.

Calculating Ordinal Regression Models in SAS and S‐Plus

Ralf Bender Axel Benner 《Biometrical journal. Biometrische Zeitschrift》2000,42(6):677-699

Although a number of regression models for ordinal responses have been proposed, these models are not widely known and applied in epidemiology and biomedical research. Overviews of these models are either highly technical or consider only a small part of this class of models so that it is difficult to understand the features of the models and to recognize important relations between them. In this paper we give an overview of logistic regression models for ordinal data based upon cumulative and conditional probabilities. We show how the most popular ordinal regression models, namely the proportional odds model and the continuation ratio model, are embedded in the framework of generalized linear models. We describe the characteristics and interpretations of these models and show how the calculations can be performed by means of SAS and S‐Plus. We illustrate and compare the methods by applying them to data of a study investigating the effect of several risk factors on diabetic retinopathy. A special aspect is the violation of the usual assumption of equal slopes which makes the correct application of standard models impossible. We show how to use extensions of the standard models to work adequately with this situation.
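
For readers unfamiliar with the two models named in this abstract, the following is a minimal Python sketch (not the authors' SAS or S‐Plus code) of how category probabilities arise from cumulative and conditional logits. The function names, the sign convention `cutpoint - eta`, and the example cutpoints are assumptions chosen for illustration only.

```python
import math


def logistic(z):
    """Inverse logit: maps a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def proportional_odds_probs(cutpoints, eta):
    """Category probabilities P(Y = k), k = 0..K, assuming
    logit P(Y <= k) = cutpoints[k] - eta with increasing cutpoints.
    The same eta at every cutpoint is the 'equal slopes' assumption."""
    cum = [logistic(c - eta) for c in cutpoints] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]


def continuation_ratio_probs(cutpoints, eta):
    """Category probabilities assuming the conditional logit
    logit P(Y = k | Y >= k) = cutpoints[k] - eta."""
    probs = []
    remaining = 1.0  # probability mass not yet assigned, P(Y >= k)
    for c in cutpoints:
        p_stop = logistic(c - eta)
        probs.append(remaining * p_stop)
        remaining *= 1.0 - p_stop
    probs.append(remaining)  # last category takes what is left
    return probs


if __name__ == "__main__":
    # Hypothetical cutpoints for a four-category response.
    print(proportional_odds_probs([-1.0, 0.5, 2.0], eta=0.0))
    print(continuation_ratio_probs([-1.0, 0.5, 2.0], eta=0.0))
```

Under this parameterization, shifting `eta` by one unit multiplies the cumulative odds at every cutpoint by the same factor, which is what "proportional odds" refers to.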

9.

Logit model in prospective coronary heart disease (CHD) risk factors prediction in Saudi population

《Saudi Journal of Biological Sciences》2021,28(12):7027-7036

Logistic regression was explored to investigate the relationship between a binary or multivariable ordinal response probability and one or more explanatory variables. The main objective of this study was to investigate advanced prediction of risk factors for coronary heart disease (CHD) using a logit model. Attempts were made to reduce risk factors and to increase public or professional awareness. The logit model was used to evaluate the probability that a person develops CHD, considering factors such as age, gender, high low-density lipoprotein (LDL) cholesterol, low high-density lipoprotein (HDL) cholesterol, high blood pressure, family history of CHD younger than 45, diabetes, smoking, being post-menopausal for women and being older than 45 for men. The logit concept is described briefly, with slight modification, covering estimation of the parameters, testing for the significance of the coefficients, confidence intervals, and fitting of simple and multiple logit models. In addition, interpretation of the fitted logit regression model is introduced. Variables showing the best results within the scientific context and a good explanation of the data were assessed to fit an estimated logit model containing the chosen variables; the statistical inference procedures used were the chi-square distribution, the likelihood ratio, score, or Wald test, and goodness-of-fit. Health promotion that starts with increased public or professional awareness, improved for early detection of CHD to reduce the risk of mortality, is an aim of Saudi Vision 2030.

10.

Feature-specific penalized latent class analysis for genomic data

Houseman EA Coull BA Betensky RA 《Biometrics》2006,62(4):1062-1070

Genomic data are often characterized by a moderate to large number of categorical variables observed for relatively few subjects. Some of the variables may be missing or noninformative. An example of such data is loss of heterozygosity (LOH), a dichotomous variable, observed on a moderate number of genetic markers. We first consider a latent class model where, conditional on unobserved membership in one of k classes, the variables are independent with probabilities determined by a regression model of low dimension q. Using a family of penalties including the ridge and LASSO, we extend this model to address higher-dimensional problems. Finally, we present an orthogonal map that transforms marker space to a space of "features" for which the constrained model has better predictive power. We demonstrate these methods on LOH data collected at 19 markers from 93 brain tumor patients. For this data set, the existing unpenalized latent class methodology does not produce estimates. Additionally, we show that posterior classes obtained from this method are associated with survival for these patients.

11.

A land-cover map for South and Southeast Asia derived from SPOT-VEGETATION data (Cited by: 1; self-citations: 0; citations by others: 1)

H.-J. Stibig A.S. Belward P.S. Roy U. Rosalina-Wasrin S. Agrawal P.K. Joshi Hildanus R. Beuchle S. Fritz S. Mubareka C. Giri 《Journal of Biogeography》2007,34(4):625-637

12.

Bayesian mapping of genomewide interacting quantitative trait loci for ordinal traits (Cited by: 2; self-citations: 0; citations by others: 2)

Yi N Banerjee S Pomp D Yandell BS 《Genetics》2007,176(3):1855-1864

Development of statistical methods and software for mapping interacting QTL has been the focus of much recent research. We previously developed a Bayesian model selection framework, based on the composite model space approach, for mapping multiple epistatic QTL affecting continuous traits. In this study we extend the composite model space approach to complex ordinal traits in experimental crosses. We jointly model main and epistatic effects of QTL and environmental factors on the basis of the ordinal probit model (also called threshold model) that assumes a latent continuous trait underlies the generation of the ordinal phenotypes through a set of unknown thresholds. A data augmentation approach is developed to jointly generate the latent data and the thresholds. The proposed ordinal probit model, combined with the composite model space framework for continuous traits, offers a convenient way for genomewide interacting QTL analysis of ordinal traits. We illustrate the proposed method by detecting new QTL and epistatic effects for an ordinal trait, dead fetuses, in an F2 intercross of mice. Utility and flexibility of the method are also demonstrated using a simulated data set. Our method has been implemented in the freely available package R/qtlbim, which greatly facilitates the general usage of the Bayesian methodology for genomewide interacting QTL analysis for continuous, binary, and ordinal traits in experimental crosses.
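
The threshold idea described in this abstract can be sketched in a few lines of Python. This is an illustration only, not code from R/qtlbim: the function names, the unit-variance normal latent trait, and the example thresholds are assumptions.

```python
import bisect
import math


def ordinal_from_latent(latent, thresholds):
    """Map a latent continuous value to an ordinal category 0..K,
    where K = len(thresholds) and thresholds are sorted ascending.
    The category is the number of thresholds at or below the value."""
    return bisect.bisect_right(thresholds, latent)


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ordinal_probit_probs(mean, thresholds):
    """Category probabilities when the latent trait is assumed to be
    Normal(mean, 1) and is cut at the given thresholds."""
    cum = [normal_cdf(t - mean) for t in thresholds] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]


if __name__ == "__main__":
    thresholds = [0.0, 1.0]  # hypothetical thresholds, three categories
    print(ordinal_from_latent(0.5, thresholds))
    print(ordinal_probit_probs(0.0, thresholds))
```

In a data augmentation scheme of the kind the abstract mentions, the latent values and the thresholds would be sampled rather than fixed; the sketch only shows the deterministic link between them and the observed categories.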

13.

Parametric and Nonparametric Analyses of Repeated Ordinal Categorical Data

Julio M. Singer Frederico Z. Poleto Patrícia Rosa 《Biometrical journal. Biometrische Zeitschrift》2004,46(4):460-473

We compare two models for the analysis of repeated ordinal categorical data: the classical parametric model for means of scores assigned to the categories of the response variable and a nonparametric model based on relative effects derived from the marginal distribution functions of the response. An example in the field of Dentistry is used to illustrate and to compare the models. We also consider a simulation study to evaluate the type‐I error rates and the power of tests under both models in a balanced design setup. The simulation results suggest that both approaches behave similarly for equally spaced scores but may perform differently otherwise. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

14.

Use of ordinal categorical variables in skeletal assessment of sex from the cranium

Lyle W. Konigsberg Samantha M. Hens 《American journal of physical anthropology》1998,107(1):97-112

In anthropological studies, visual indicators of sex are traditionally scored on an ordinal categorical scale. Logistic and probit regression models are commonly used statistical tools for the analysis of ordinal categorical data. These models provide unbiased estimates of the posterior probabilities of sex conditional on observed indicators, but they do so only under certain conditions. We suggest a more general method for sexing using a multivariate cumulative probit model and examine both single indicator and multivariate indicator models on a sample of 138 crania from a Late Mississippian site in middle Tennessee. The crania were scored for five common sex indicators: superciliary arch form, chin form, size of mastoid process, shape of the supraorbital margin, and nuchal cresting. Independent assessment of sex for each individual is based on pubic indicators. The traditional logistic regressions are cumbersome because of limitations imposed by missing data. The logistic regression correctly classified 66/74 males and 46/64 females, with an overall correct classification of 81%. The cumulative probit model classified 64/74 males correctly and 51/64 females correctly for an overall correct classification rate of 83%. Finally, we apply parameters estimated from the logit and probit models to find posterior probabilities of sex assignment for 296 additional crania for which pubic indicators were absent or ambiguous. Am J Phys Anthropol 107:97–112, 1998. © 1998 Wiley-Liss, Inc.
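
The overall classification rates quoted in this abstract follow directly from the per-sex counts. A small Python sketch of that arithmetic is given below; the helper name `classification_rates` is hypothetical, and the only inputs are the counts stated in the abstract.

```python
def classification_rates(correct_m, total_m, correct_f, total_f):
    """Per-sex and overall proportions of correctly classified crania."""
    return {
        "male": correct_m / total_m,
        "female": correct_f / total_f,
        "overall": (correct_m + correct_f) / (total_m + total_f),
    }


if __name__ == "__main__":
    # Counts as reported in the abstract (74 males, 64 females, 138 crania).
    logistic_rates = classification_rates(66, 74, 46, 64)
    probit_rates = classification_rates(64, 74, 51, 64)
    for name, rates in (("logistic", logistic_rates), ("probit", probit_rates)):
        print(name, {k: round(100 * v, 1) for k, v in rates.items()})
```

The overall figures are 112/138 for the logistic regression and 115/138 for the cumulative probit model, which round to the 81% and 83% reported. The probit model's gain comes from the female crania, while it classifies two fewer males correctly.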

15.

The influence of management history on spatial prediction of Eryngium spinalba, an endangered endemic species

Damien Marage Luc Garraud Jean‐Claude Rameau 《Applied Vegetation Science》2008,11(1):139-148

Question: Spatial prediction of plant populations is essential for conservation management. This is especially true for rare and/or threatened endemic species, for which knowledge of determinants of distribution is necessary to mitigate threats and counteract decline. We therefore ask if the distribution of an endemic species can be accurately predicted by georeferenced environmental variables or if anthropogenic variables also need to be taken into account. Location: Alps, Hautes‐Alpes, France. Methods: Potential distribution area and abundance of Eryngium spinalba were predicted with logistic regression and ordinal logistic regression, respectively, in a 57‐km² watershed. Results: Aspect, global solar radiation in March, elevation and grazing pressure were the main predictors of the probability of occurrence of Eryngium spinalba. Taking into account the persistence of agro‐pastoral activities by diachronic analysis (Napoleonic cadastral map and orthorectified photographs) improved predictions from the model and the level of spatial concordance with independent surveys. Conclusions: Niche modelling improved our understanding of the distribution of this threatened species which, in the context of land abandonment, is diminishing as a result of the decline of its favoured habitats. The key role of pastoral activities and historic continuity for its distribution and persistence was clearly demonstrated.

16.

The logistic transform for bounded outcome scores

Lesaffre E Rizopoulos D Tsonaka R 《Biostatistics (Oxford, England)》2007,8(1):72-85

The logistic transformation, originally suggested by Johnson (1949), is applied to analyze responses that are restricted to a finite interval (e.g. (0,1)), so-called bounded outcome scores. Bounded outcome scores often have a non-standard distribution, e.g. J- or U-shaped, precluding classical parametric statistical approaches for analysis. Applying the logistic transformation to a normally distributed random variable gives rise to a logit-normal (LN) distribution. This distribution can take a variety of shapes on (0,1). Further, the model can be extended to correct for (baseline) covariates. Therefore, the method could be useful for comparative clinical trials. Bounded outcomes can be found in many research areas, e.g. drug compliance research, quality-of-life studies, and pain (and pain relief) studies using visual analog scores, but all these scores can attain the boundary values 0 or 1. A natural extension of the above approach is therefore to assume a latent score on (0,1) having a LN distribution. Two cases are considered: (a) the bounded outcome score is a proportion where the true probabilities have a LN distribution on (0,1) and (b) the bounded outcome score on [0,1] is a coarsened version of a latent score with a LN distribution on (0,1). We also allow the variance (on the transformed scale) to depend on treatment. The usefulness of our approach for comparative clinical trials will be assessed in this paper. It turns out to be important to distinguish the case of equal and unequal variances. For a bounded outcome score of the second type and with equal variances, our approach comes close to ordinal probit (OP) regression. However, ignoring the inequality of variances can lead to highly biased parameter estimates. A simulation study compares the performance of our approach with the two-sample Wilcoxon test and with OP regression. Finally, the different methods are illustrated on two data sets.
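
As a rough illustration of the logit-normal construction described in this abstract, the Python sketch below draws a normal variable, pushes it through the logistic transformation, and then coarsens the result so that the boundary values 0 and 1 become attainable. The function names, the fixed seed, and the rounding-to-a-grid coarsening rule are assumptions for illustration; the paper's own coarsening mechanism is not specified in the abstract.

```python
import math
import random


def logistic(z):
    """Inverse logit: maps a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Logit transform: maps (0, 1) back to the real line."""
    return math.log(p / (1.0 - p))


def sample_logit_normal(mu, sigma, n, seed=0):
    """Draw n logit-normal values: logistic of Normal(mu, sigma) draws."""
    rng = random.Random(seed)
    return [logistic(rng.gauss(mu, sigma)) for _ in range(n)]


def coarsen(score, grid=10):
    """Assumed coarsening rule: round a latent score in (0, 1) to the
    nearest of grid + 1 equally spaced values on [0, 1]."""
    return round(score * grid) / grid


if __name__ == "__main__":
    latent = sample_logit_normal(mu=0.0, sigma=1.0, n=5)
    observed = [coarsen(s) for s in latent]
    print(latent)
    print(observed)
```

Changing `mu` and `sigma` changes the shape of the distribution on (0,1), and letting `sigma` differ between groups corresponds to the unequal-variance case the abstract warns about.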

17.

Climate and satellite-derived land cover for predicting breeding bird distribution in the Great Lakes Basin (Cited by: 5; self-citations: 0; citations by others: 5)

L. A. Venier J. Pearce J. E. McKee D. W. McKenney G. J. Niemi 《Journal of Biogeography》2004,31(2):315-331

Aim: We examined relationships between breeding bird distribution of 10 forest songbirds in the Great Lakes Basin, large‐scale climate and the distribution of land cover types as estimated by advanced very high resolution radiometer (AVHRR) and multi‐spectral scanner (MSS) land cover classifications. Our objective was to examine the ability of regional climate, AVHRR (1 km resolution) land cover and MSS (200 m resolution) land cover to predict the distribution of breeding forest birds at the scale of the Great Lakes Basin and at the resolution of Breeding Bird Atlas data (5–10 km²). Specifically we addressed the following questions. (1) How well do AVHRR or MSS classifications capture the variation in distribution of bird species? (2) Is one land cover classification more useful than the other for predicting distribution? (3) How do models based on climate compare with models based on land cover? (4) Can the combination of both climate and land cover improve the predictive ability of these models? Location: Modelling was conducted over the area of the Great Lakes Basin including parts of Ontario, Canada and parts of Illinois, Indiana, Michigan, New York, Ohio, Pennsylvania, Wisconsin, and Minnesota, USA. Methods: We conducted single variable logistic regression with the forest classes of AVHRR and MSS land cover using evidence of breeding as the response variable. We conducted multiple logistic regression with stepwise selection to select models from five sets of explanatory variables (AVHRR, MSS, climate, AVHRR + climate, MSS + climate). Results: Generally, species were related to both AVHRR and MSS land cover types in the direction expected based on the known local habitat use of the species. Neither land cover classification appeared to produce consistently more intuitive results. Good models were generated using each of the explanatory data sets examined here, and at least one but usually all five variable sets produced acceptable or excellent models for each species. Main conclusions: Both climate and large scale land cover were effective predictors of the distribution of the 10 forest bird species examined here. Models generated from these data had good classification accuracy of independent validation data. Good models were produced from all explanatory data sets or combinations, suggesting that the distribution of climate, AVHRR land cover, and MSS land cover all captured similar variance in the distribution of the birds. It is difficult to separate the effects of climate and vegetation on the species’ distributions at this scale.

18.

Non‐proportional odds multivariate logistic regression of ordinal family data

Sophie G. Zaloumis Katrina J. Scurrah Stephen B. Harrap Justine A. Ellis Lyle C. Gurrin 《Biometrical journal. Biometrische Zeitschrift》2015,57(2):286-303

Methods to examine whether genetic and/or environmental sources can account for the residual variation in ordinal family data usually assume proportional odds. However, standard software to fit the non‐proportional odds model to ordinal family data is limited because the correlation structure of family data is more complex than for other types of clustered data. To perform these analyses we propose the non‐proportional odds multivariate logistic regression model and take a simulation‐based approach to model fitting using Markov chain Monte Carlo methods, such as partially collapsed Gibbs sampling and the Metropolis algorithm. We applied the proposed methodology to male pattern baldness data from the Victorian Family Heart Study.

19.

Bayesian multilocus association mapping on ordinal and censored traits and its application to the analysis of genetic variation among Oryza sativa L. germplasms

Hiroyoshi Iwata Kaworu Ebana Shuichi Fukuoka Jean-Luc Jannink Takeshi Hayashi 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》2009,118(5):865-880

Association mapping can be a powerful tool for detecting quantitative trait loci (QTLs) without requiring line-crossing experiments. We previously proposed a Bayesian approach for simultaneously mapping multiple QTLs by a regression method that directly incorporates estimates of the population structure. In the present study, we extended our method to analyze ordinal and censored traits, since both types of traits are common in the evaluation of germplasm collections. Ordinal-probit and tobit models were employed to analyze ordinal and censored traits, respectively. In both models, we postulated the existence of a latent continuous variable associated with the observable data, and we used a Markov-chain Monte Carlo algorithm to sample the latent variable and determine the model parameters. We evaluated the efficiency of our approach by using simulated- and real-trait analyses of a rice germplasm collection. Simulation analyses based on real marker data showed that our models could reduce both false-positive and false-negative rates in detecting QTLs to reasonable levels. Simulation analyses based on highly polymorphic marker data, which were generated by coalescent simulations, showed that our models could be applied to genotype data based on highly polymorphic marker systems, like simple sequence repeats. For the real traits, we analyzed heading date as a censored trait and amylose content and the shape of milled rice grains as ordinal traits. We found significant markers that may be linked to previously reported QTLs. Our approach will be useful for whole-genome association mapping of ordinal and censored traits in rice germplasm collections.

20.

Multilevel Latent Class Models with Dirichlet Mixing Distribution

Chong‐Zhi Di Karen Bandeen‐Roche 《Biometrics》2011,67(1):86-96

Summary: Latent class analysis (LCA) and latent class regression (LCR) are widely used for modeling multivariate categorical outcomes in social science and biomedical studies. Standard analyses assume data of different respondents to be mutually independent, excluding application of the methods to familial and other designs in which participants are clustered. In this article, we consider multilevel latent class models, in which subpopulation mixing probabilities are treated as random effects that vary among clusters according to a common Dirichlet distribution. We apply the expectation‐maximization (EM) algorithm for model fitting by maximum likelihood (ML). This approach works well, but is computationally intensive when either the number of classes or the cluster size is large. We propose a maximum pairwise likelihood (MPL) approach via a modified EM algorithm for this case. We also show that a simple latent class analysis, combined with robust standard errors, provides another consistent, robust, but less‐efficient inferential procedure. Simulation studies suggest that the three methods work well in finite samples, and that the MPL estimates often enjoy precision comparable to that of the ML estimates. We apply our methods to the analysis of comorbid symptoms in the obsessive compulsive disorder study. Our models' random effects structure has a more straightforward interpretation than those of competing methods, and thus should usefully augment tools available for LCA of multilevel data.