Similar literature (20 results)
1.
MOTIVATION: In clinical practice, pathological phenotypes are often labelled with ordinal scales rather than binary ones, e.g. the Gleason grading system for tumour cell differentiation. However, in the literature of microarray analysis, these ordinal labels have rarely been treated in a principled way. This paper describes a gene selection algorithm based on Gaussian processes to discover consistent gene expression patterns associated with ordinal clinical phenotypes. The technique of automatic relevance determination is applied to represent the significance level of the genes in a Bayesian inference framework. RESULTS: The usefulness of the proposed algorithm for ordinal labels is demonstrated by the gene expression signature associated with the Gleason score for prostate cancer data. Our results demonstrate how multi-gene markers that may be initially developed with a diagnostic or prognostic application in mind are also useful as an investigative tool to reveal associations between specific molecular and cellular events and features of tumour physiology. Our algorithm can also be applied to microarray data with binary labels with results comparable to other methods in the literature.

2.
3.
Variance-component methods are popular and flexible analytic tools for elucidating the genetic mechanisms of complex quantitative traits from pedigree data. However, variance-component methods typically assume that the trait of interest follows a multivariate normal distribution within a pedigree. Studies have shown that violation of this normality assumption can lead to biased parameter estimates and inflated type-I error. This limits the application of variance-component methods to more general trait outcomes, whether continuous or categorical in nature. In this paper, we develop and apply a general variance-component framework for pedigree analysis of continuous and categorical outcomes. We develop appropriate models using generalized-linear mixed model theory and fit such models using approximate maximum-likelihood procedures. Using our proposed method, we demonstrate that one can perform variance-component pedigree analysis on outcomes that follow any exponential-family distribution. We also show how one can modify the method to perform pedigree analysis of ordinal outcomes. We further discuss extensions of our variance-component framework to accommodate pedigrees ascertained based on trait outcome. We demonstrate the feasibility of our method using both simulated data and data from a genetic study of ovarian insufficiency.

4.
Latent class models provide a useful framework for clustering observations based on several features. Application of latent class methodology to correlated, high-dimensional ordinal data poses many challenges. Unconstrained analyses may not result in an estimable model. Thus, information contained in ordinal variables may not be fully exploited by researchers. We develop a penalized latent class model to facilitate analysis of high-dimensional ordinal data. By stabilizing maximum likelihood estimation, we are able to fit an ordinal latent class model that would otherwise not be identifiable without application of strict constraints. We illustrate our methodology in a study of schwannoma, a peripheral nerve sheath tumor, that included 3 clinical subtypes and 23 ordinal histological measures.

5.
6.
Although there has been a recent proliferation in maximum‐likelihood (ML)‐based tree estimation methods based on a fixed multiple sequence alignment (MSA), little research has been done on incorporating indel information in this traditional framework. We show, using a simple model on a single character example, that a trivial alignment of a different form than that previously identified for parsimony is optimal in ML under standard assumptions treating indels as “missing” data, but that it is not optimal when indels are incorporated into the character alphabet. We show that the optimality of the trivial alignment is not an artefact of simplified theory assumptions by demonstrating that trivial alignment likelihoods of five different multiple sequence alignment datasets exhibit this phenomenon. These results demonstrate the need for use of indel information in likelihood analysis on fixed MSAs, and suggest that caution must be exercised when drawing conclusions from software implementations claiming improvements in likelihood scores under an indels‐as‐missing assumption. © The Willi Hennig Society 2012.

7.
Although a number of regression models for ordinal responses have been proposed, these models are not widely known and applied in epidemiology and biomedical research. Overviews of these models are either highly technical or consider only a small part of this class of models so that it is difficult to understand the features of the models and to recognize important relations between them. In this paper we give an overview of logistic regression models for ordinal data based upon cumulative and conditional probabilities. We show how the most popular ordinal regression models, namely the proportional odds model and the continuation ratio model, are embedded in the framework of generalized linear models. We describe the characteristics and interpretations of these models and show how the calculations can be performed by means of SAS and S‐Plus. We illustrate and compare the methods by applying them to data from a study investigating the effect of several risk factors on diabetic retinopathy. A special aspect is the violation of the usual assumption of equal slopes, which makes the correct application of standard models impossible. We show how to use extensions of the standard models to work adequately with this situation.
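
For orientation, the two models named in this abstract are usually written as follows. This is the common textbook parameterization, not necessarily the notation of the paper: the symbols $Y$ (ordinal response with categories $1,\dots,k$), $x$ (covariate vector), $\alpha_j$ (category-specific intercepts) and $\beta$ (shared slope vector) are assumptions introduced here, and sign conventions differ between software packages.

```latex
% Proportional odds (cumulative logit) model, j = 1, ..., k-1
\operatorname{logit} P(Y \le j \mid x) = \alpha_j + x^{\top}\beta

% Continuation ratio model (conditional probabilities), j = 1, ..., k-1
\operatorname{logit} P(Y = j \mid Y \ge j,\, x) = \alpha_j + x^{\top}\beta

% "Equal slopes" means that \beta does not depend on j.
% One way to relax it is to allow category-specific slopes \beta_j:
\operatorname{logit} P(Y \le j \mid x) = \alpha_j + x^{\top}\beta_j
```

The first form models cumulative probabilities and the second models conditional probabilities, which matches the distinction the abstract draws.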

8.
9.
Alzheimer's disease gradually affects several components including the cerebral dimension with brain atrophies, the cognitive dimension with a decline in various functions, and the functional dimension with impairment in the daily living activities. Understanding how such dimensions interconnect is crucial for Alzheimer's disease research. However, this requires simultaneously capturing the dynamic and multidimensional aspects and exploring temporal relationships between dimensions. We propose an original dynamic structural model that accounts for all these features. The model defines dimensions as latent processes and combines a multivariate linear mixed model and a system of difference equations to model trajectories and temporal relationships between latent processes in finely discrete time. Dimensions are simultaneously related to their observed (possibly multivariate) markers through nonlinear equations of observation. Parameters are estimated in the maximum likelihood framework, with a closed form for the likelihood. We demonstrate in a simulation study that this dynamic model in discrete time benefits from the same causal interpretation of temporal relationships as models defined in continuous time as long as the discretization step remains small. The model is then applied to the data of the Alzheimer's Disease Neuroimaging Initiative. Three longitudinal dimensions (cerebral anatomy, cognitive ability, and functional autonomy) measured by six markers are analyzed, and their temporal structure is contrasted between different clinical stages of Alzheimer's disease.

10.
This paper compares the fine‐scale genetic structure of quantitative traits and allozyme markers within a natural population of Centaurea jacea s.l. To that end, a spatial autocorrelation approach is developed based on pairwise correlation coefficients between individuals and using sib families. Statistical properties of the proposed statistics are investigated with numerical simulations. Our results show that most quantitative traits have a significant spatial structure for their genetic component. On average, allozyme markers and the genetic component of quantitative traits have similar patterns of spatial autocorrelation that are consistent with a neutral model of isolation by distance. We also show evidence that environmental heterogeneity generates a spatial structure for the environmental component of quantitative traits. Results are discussed in terms of mechanisms generating spatial structure and are compared with those obtained on a large geographical scale.

11.

Stress fibers (SFs) in cells transmit external forces to cell nuclei, altering the DNA structure, gene expression, and cell activity. To determine whether SFs are involved in mechanosignal transduction upon intraluminal pressure, this study investigated the SF direction in smooth muscle cells (SMCs) in aortic tissue and strain in the SF direction. Aortic tissues were fixed under physiological pressure of 120 mmHg. First, we observed fluorescently labeled SFs using two-photon microscopy. It was revealed that SFs in the same smooth muscle layers were aligned in almost the same direction, and the absolute value of the alignment angle from the circumferential direction was 16.8° ± 5.2° (n = 96, mean ± SD). Second, we quantified the strain field in the aortic tissue in reference to photo-bleached markers. It was found in the radial-circumferential plane that the largest strain direction was −21.3° ± 11.1°, and the zero normal strain direction was 28.1° ± 10.2°. Thus, the SFs in aortic SMCs were in line with neither the largest strain direction nor the zero strain direction, although their orientation was relatively close to the zero strain direction. These results suggest that SFs in aortic SMCs undergo stretch, though not maximal stretch, and transmit the force to nuclei under intraluminal pressure.


12.
For an r × c table with ordinal responses, odds ratios are commonly used to describe the relationship between the row and column variables. This article shows two types of ordinal odds ratios where local‐global odds ratios are used to compare several groups on a c‐category ordinal response and a global odds ratio is used to measure the global association between a pair of ordinal responses. When there is a stratification factor, we consider Mantel‐Haenszel (MH) type estimators of these odds ratios to summarize the association from several strata. Like the ordinary MH estimator of the common odds ratio for several 2 × 2 contingency tables, the estimators are used when the association is not expected to vary drastically among the strata. Also, the estimators are consistent under the ordinary asymptotic framework in which the number of strata is fixed and also under sparse asymptotics in which the number of strata grows with the sample size. Compared to the maximum likelihood estimators, simulations find that the MH type estimators perform better, especially when each stratum has few observations. This article provides variance and covariance formulae for the local‐global odds ratio estimators and applies the bootstrap method to obtain a standard error for the global odds ratio estimator. At the end, we discuss possible ways of testing the homogeneity assumption.
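
As a point of reference, the ordinary MH estimator of a common odds ratio for several 2 × 2 tables, which the abstract uses as its analogy, can be sketched as below. This is not the paper's ordinal (local‐global or global) estimator. The function name `mantel_haenszel_or`, the `(a, b, c, d)` cell layout and the example counts are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Tuple


def mantel_haenszel_or(tables: Iterable[Tuple[float, float, float, float]]) -> float:
    """Ordinary Mantel-Haenszel common odds ratio over 2x2 strata.

    Each stratum is (a, b, c, d), read row by row:
        a = group 1 / outcome 1,  b = group 1 / outcome 2
        c = group 2 / outcome 1,  d = group 2 / outcome 2
    The estimate is sum(a*d/n) / sum(b*c/n), with n the stratum total.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("MH estimate undefined: sum of b*c/n is zero")
    return numerator / denominator


# Made-up counts for two strata (illustrative only).
strata = [(10, 5, 4, 8), (6, 3, 2, 5)]
print(mantel_haenszel_or(strata))
```

With a single stratum the estimate reduces to the familiar cross-product ratio a·d / (b·c). With several strata it is a weighted combination of the stratum-specific ratios, which is why it only makes sense when the association does not vary drastically among strata.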

13.
Genetic drift of influenza virus genomic sequences occurs through the combined effects of sequence alterations introduced by a low-fidelity polymerase and the varying selective pressures experienced as the virus migrates through different host environments. While traditional phylogenetic analysis is useful in tracking the evolutionary heritage of these viruses, the specific genetic determinants that dictate important phenotypic characteristics are often difficult to discern within the complex genetic background arising through evolution. Here we describe a novel influenza virus sequence feature variant type (Flu-SFVT) approach, made available through the public Influenza Research Database resource (www.fludb.org), in which variant types (VTs) identified in defined influenza virus protein sequence features (SFs) are used for genotype-phenotype association studies. Since SFs have been defined for all influenza virus proteins based on known structural, functional, and immune epitope recognition properties, the Flu-SFVT approach allows the rapid identification of the molecular genetic determinants of important influenza virus characteristics and their connection to underlying biological functions. We demonstrate the use of the SFVT approach to obtain statistical evidence for effects of NS1 protein sequence variations in dictating influenza virus host range restriction.

14.
PLoS ONE, 2015, 10(5)
Analytical ultracentrifugation (AUC) is a first-principles-based method to determine absolute sedimentation coefficients and buoyant molar masses of macromolecules and their complexes, reporting on their size and shape in free solution. The purpose of this multi-laboratory study was to establish the precision and accuracy of basic data dimensions in AUC and validate previously proposed calibration techniques. Three kits of AUC cell assemblies containing radial and temperature calibration tools and a bovine serum albumin (BSA) reference sample were shared among 67 laboratories, generating 129 comprehensive data sets. These allowed for an assessment of many parameters of instrument performance, including accuracy of the reported scan time after the start of centrifugation, the accuracy of the temperature calibration, and the accuracy of the radial magnification. The range of sedimentation coefficients obtained for BSA monomer in different instruments and using different optical systems was from 3.655 S to 4.949 S, with a mean and standard deviation of (4.304 ± 0.188) S (4.4%). After the combined application of correction factors derived from the external calibration references for elapsed time, scan velocity, temperature, and radial magnification, the range of s-values was reduced 7-fold with a mean of 4.325 S and a 6-fold reduced standard deviation of ± 0.030 S (0.7%). In addition, the large data set provided an opportunity to determine the instrument-to-instrument variation of the absolute radial positions reported in the scan files, the precision of photometric or refractometric signal magnitudes, and the precision of the calculated apparent molar mass of BSA monomer and the fraction of BSA dimers. These results highlight the necessity and effectiveness of independent calibration of basic AUC data dimensions for reliable quantitative studies.
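
The percentages quoted in this abstract appear to be relative standard deviations (standard deviation divided by mean); that reading is an assumption, but the arithmetic below is consistent with it. The helper name `relative_sd_percent` is hypothetical. The numbers are taken from the abstract.

```python
def relative_sd_percent(mean: float, sd: float) -> float:
    """Relative standard deviation (coefficient of variation) in percent."""
    return 100.0 * sd / mean


# Values reported in the abstract, in Svedberg units (S).
before = relative_sd_percent(mean=4.304, sd=0.188)  # before calibration corrections
after = relative_sd_percent(mean=4.325, sd=0.030)   # after calibration corrections
sd_fold_reduction = 0.188 / 0.030                   # ratio of the two standard deviations

print(f"before: {before:.1f}%  after: {after:.1f}%  SD reduced {sd_fold_reduction:.1f}-fold")
```

The 7-fold reduction in range cannot be checked the same way, because the abstract gives the range only before correction (3.655 S to 4.949 S).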

15.
We consider the bivariate situation of some quantitative, ordinal, binary or censored response variable and some quantitative or ordinal exposure variable (dose) with a hypothetical effect on the response. Data can either be the outcome of a planned dose‐response experiment with only a few dose levels or of an observational study where, for example, both exposure and response variable are observed within each individual. We are interested in testing the null hypothesis of no effect of the dose variable vs. a dose‐response function depending on an unknown ‘threshold’ parameter. The variety of dose‐response functions considered ranges from no observed effect level (NOEL) models to umbrella alternatives. Here we discuss generalizations of the method of Lausen & Schumacher (Biometrics, 1992, 48, 73–85) which are based on combinations of two‐sample rank statistics and rank statistics for trend. Our approach may be seen as a generalization of a proposal for change‐point problems. Using the approach of Davies (Biometrika, 1987, 74, 33–43) we derive and approximate the asymptotic null distribution for a large number of thresholds considered. We use an improved Bonferroni inequality as approximation for a small number of thresholds considered. Moreover, we analyse the small sample behaviour by means of a Monte‐Carlo study. Our paper is illustrated by examples from clinical research and epidemiology.

16.
17.
Direct optimization (DO) of 126 nuclear‐encoded SSU rRNA diatom sequences was conducted. The optimal phylogeny indicated several unique relationships with respect to those recovered from a maximum likelihood (ML) analysis of an alignment based on maximizing primary and secondary structural similarity between 126 nuclear‐encoded SSU rRNA diatom sequences (Medlin and Kaczmarska, 2004). Dividing diatoms into the subdivisions Coscinodiscophytina and Bacillariophytina was not supported by the DO phylogeny, due to the paraphyly of the former. The same pertains to Coscinodiscophyceae, Mediophyceae, Thalassiosira, Fragilaria and Amphora. The ordinal‐level classification of the diatoms proposed by Round et al. (1990) was for the most part found to be unsupported. The DO phylogeny represented a more rigorous hypothesis than the ML tree because DO maximized character congruence during the homology testing (i.e., alignment/tree search) process whereas the non‐phylogenetic similarity‐based alignment used in the ML analysis did not. The above statement is supported by “controlled” parsimony analyses of 35 sequences, which strongly suggested that dissimilarities in the DO and ML tree structure were due to the specific homology testing approach used. It could not be precluded that differences in taxon sampling and the use of dissimilar optimality criteria contributed to discrepancies in the structure of the optimal ML and DO trees.

18.
Large‐scale agreement studies are becoming increasingly common in medical settings to gain better insight into discrepancies often observed between experts' classifications. Ordered categorical scales are routinely used to classify subjects' disease and health conditions. Summary measures such as Cohen's weighted kappa are popular approaches for reporting levels of association for pairs of raters' ordinal classifications. However, in large‐scale studies with many raters, assessing levels of association can be challenging due to dependencies between many raters each grading the same sample of subjects' results and the ordinal nature of the ratings. Further complexities arise when the focus of a study is to examine the impact of rater and subject characteristics on levels of association. In this paper, we describe a flexible approach based upon the class of generalized linear mixed models to assess the influence of rater and subject factors on association between many raters' ordinal classifications. We propose novel model‐based measures for large‐scale studies to provide simple summaries of association similar to Cohen's weighted kappa while avoiding prevalence and marginal distribution issues that Cohen's weighted kappa is susceptible to. The proposed summary measures can be used to compare association between subgroups of subjects or raters. We demonstrate the use of hypothesis tests to formally determine if rater and subject factors have a significant influence on association, and describe approaches for evaluating the goodness‐of‐fit of the proposed model. The performance of the proposed approach is explored through extensive simulation studies and is applied to a recent large‐scale breast cancer screening study.
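
For readers unfamiliar with the baseline measure, Cohen's weighted kappa for a single pair of raters can be sketched as follows. This is the standard two-rater statistic that the abstract takes as its point of comparison, not the model‐based measures the paper proposes. The function name, the 0-based integer coding of categories and the example ratings are assumptions made for illustration.

```python
from typing import Sequence


def weighted_kappa(rater_a: Sequence[int], rater_b: Sequence[int],
                   n_categories: int, scheme: str = "linear") -> float:
    """Cohen's weighted kappa for two raters on an ordinal scale.

    Ratings are integers 0 .. n_categories-1. Disagreement weights are
    |i - j| / (k - 1) for "linear" and the square of that for "quadratic".
    kappa_w = 1 - (observed weighted disagreement) / (expected under independence).
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of subjects")
    if n_categories < 2:
        raise ValueError("need at least two categories")
    if scheme not in ("linear", "quadratic"):
        raise ValueError("scheme must be 'linear' or 'quadratic'")

    k, n = n_categories, len(rater_a)
    # Joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]                       # rater A marginals
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]  # rater B marginals

    power = 1 if scheme == "linear" else 2
    obs_dis = 0.0
    exp_dis = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power
            obs_dis += w * observed[i][j]
            exp_dis += w * row[i] * col[j]
    if exp_dis == 0:
        raise ValueError("kappa undefined: both raters used a single identical category")
    return 1.0 - obs_dis / exp_dis


# Made-up ratings on a 3-point scale (illustrative only).
a = [0, 1, 2, 2, 1, 0, 1, 2]
b = [0, 1, 2, 1, 1, 0, 2, 2]
print(weighted_kappa(a, b, n_categories=3))
```

Because the expected disagreement depends on each rater's marginal distribution, the value shifts with category prevalence even when the raters behave the same way. That sensitivity is the issue the abstract says its proposed measures avoid.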

19.
Marginal methods have been widely used for the analysis of longitudinal ordinal and categorical data. These models do not require full parametric assumptions on the joint distribution of repeated response measurements but only specify the marginal or even association structures. However, inference results obtained from these methods often incur serious bias when variables are subject to error. In this paper, we tackle the problem that misclassification exists in both response and categorical covariate variables. We develop a marginal method for misclassification adjustment, which utilizes second‐order estimating functions and a functional modeling approach, and can yield consistent estimates and valid inference for mean and association parameters. We propose a two‐stage estimation approach for cases in which validation data are available. Our simulation studies show good performance of the proposed method under a variety of settings. Although the proposed method is phrased for data with a longitudinal design, it also applies to correlated data arising from clustered and family studies, in which association parameters may be of scientific interest. The proposed method is applied to analyze a dataset from the Framingham Heart Study as an illustration.

20.
Although many of the statistical techniques used in comparative biology were originally developed in quantitative genetics, subsequent development of comparative techniques has progressed in relative isolation. Consequently, many of the new and planned developments in comparative analysis already have well‐tested solutions in quantitative genetics. In this paper, we take three recent publications that develop phylogenetic meta‐analysis, either implicitly or explicitly, and show how they can be considered as quantitative genetic models. We highlight some of the difficulties with the proposed solutions, and demonstrate that standard quantitative genetic theory and software offer solutions. We also show how results from Bayesian quantitative genetics can be used to create efficient Markov chain Monte Carlo algorithms for phylogenetic mixed models, thereby extending their generality to non‐Gaussian data. Of particular utility is the development of multinomial models for analysing the evolution of discrete traits, and the development of multi‐trait models in which traits can follow different distributions. Meta‐analyses often include a nonrandom collection of species for which the full phylogenetic tree has only been partly resolved. Using missing data theory, we show how the presented models can be used to correct for nonrandom sampling and show how taxonomies and phylogenies can be combined to give a flexible framework with which to model dependence.
