Similar Articles
20 similar articles found.
1.
Gaussian process functional regression modeling for batch data (cited 2 times: 0 self-citations, 2 by others)
A Gaussian process functional regression model is proposed for the analysis of batch data. The covariance structure and the mean structure are considered simultaneously, with the covariance structure modeled by a Gaussian process regression model and the mean structure modeled by a functional regression model. The model allows the inclusion of covariates in both the covariance structure and the mean structure, and it captures the nonlinear relationship between a functional output variable and a set of functional and nonfunctional covariates. Several applications and simulation studies show that the method gives very good results for curve fitting and prediction.
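As a rough illustration of the mean-plus-GP-covariance idea described above, the sketch below fits a simple linear mean structure to batch-level covariates and a Gaussian process to the de-meaned curves. It is a minimal stand-in, not the authors' GPFR model; the data, kernel, and shapes are invented.

```python
# Minimal sketch: regression mean structure + GP-modelled covariance structure.
# Synthetic batch data for illustration only.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
n_batches, n_time = 10, 30
t = np.linspace(0, 1, n_time)                      # common time grid
x = rng.normal(size=(n_batches, 2))                # non-functional batch covariates
true_mean = 1.0 + 2.0 * x[:, [0]] - x[:, [1]]      # batch-level mean shift
curves = true_mean + np.sin(2 * np.pi * t) + 0.1 * rng.normal(size=(n_batches, n_time))

# 1) Mean structure: regress the batch-average response on the covariates.
mean_model = LinearRegression().fit(x, curves.mean(axis=1))
fitted_mean = mean_model.predict(x)[:, None]

# 2) Covariance structure: a GP over time fitted to the de-meaned curves.
residuals = curves - fitted_mean
gp = GaussianProcessRegressor(kernel=RBF(0.2) + WhiteKernel(0.01), normalize_y=True)
gp.fit(np.tile(t, n_batches)[:, None], residuals.ravel())

# Prediction for a new batch = regression mean + GP residual curve (with uncertainty).
resid_pred, resid_sd = gp.predict(t[:, None], return_std=True)
new_x = np.array([[0.5, -0.3]])
pred_curve = mean_model.predict(new_x)[0] + resid_pred
```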

2.
Joint regression analysis of correlated data using Gaussian copulas (cited 2 times: 0 self-citations, 2 by others)
Song PX, Li M, Yuan Y. Biometrics, 2009, 65(1): 60-68
Summary. This article concerns a new joint modeling approach for correlated data analysis. Utilizing Gaussian copulas, we present a unified and flexible machinery to integrate separate one-dimensional generalized linear models (GLMs) into a joint regression analysis of continuous, discrete, and mixed correlated outcomes. This essentially leads to a multivariate analogue of the univariate GLM theory and hence an efficiency gain in the estimation of regression coefficients. The availability of joint probability models enables us to develop full maximum likelihood inference. Numerical illustrations focus on regression models for discrete correlated data, including multidimensional logistic regression models and a joint model for mixed normal and binary outcomes. In the simulation studies, the proposed copula-based joint model is compared with the popular generalized estimating equations, a moment-based method for joining univariate GLMs. Two real-world data examples illustrate the approach.
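The copula mechanism can be illustrated with a small simulation: a Gaussian copula couples a Gaussian margin and a logistic (binary) margin that each keep their own GLM. This is a toy sketch with made-up parameters, not the article's likelihood machinery.

```python
# Toy sketch of the Gaussian-copula idea: correlated normal and binary outcomes
# are generated from one latent bivariate normal while each margin keeps its
# own GLM. Parameter values are illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)                                  # shared covariate

# Marginal GLMs: a Gaussian margin (identity link) and a logistic margin (logit link).
mu_cont = 1.0 + 0.5 * x
p_bin = 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * x)))

# Gaussian copula with correlation 0.6 ties the two margins together.
R = np.array([[1.0, 0.6], [0.6, 1.0]])
z = rng.multivariate_normal(np.zeros(2), R, size=n)
u = stats.norm.cdf(z)                                   # uniforms carrying the copula dependence

y_cont = stats.norm.ppf(u[:, 0], loc=mu_cont, scale=1.0)
y_bin = (u[:, 1] < p_bin).astype(int)

# The margins follow their GLMs, yet the two outcomes are correlated.
print(np.corrcoef(y_cont, y_bin)[0, 1])
```

Estimation works in the reverse direction: the marginal GLM coefficients and the copula correlation are estimated jointly by maximum likelihood, which is the efficiency gain the abstract refers to.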

3.
Esme Isik. Luminescence, 2022, 37(8): 1321-1327
Thermoluminescence (TL) is a luminescence phenomenon observed when an insulator or semiconductor is thermally stimulated. Crystals with defects store energy from radiation until they are stimulated, which makes TL a method for monitoring the dose absorbed by a dosimeter: the irradiated crystal is heated to 500°C and the absorbed dose is read out as emitted light. In this study, the TL dosimetric properties of natural calcite were investigated, and Gaussian process regression (GPR) was applied to model the measured TL characteristics. According to the experimental results, the TL glow curve had two main peaks, at 90°C and 240°C, with good dosimetric properties. Among the four GPR regression models, the data for the heating rate of 3°C s−1 gave the lowest residual.
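A minimal sketch of applying GPR to a glow curve is given below; the two-peak curve, kernel, and noise level are synthetic stand-ins for the calcite measurements, not the paper's data.

```python
# Illustrative sketch: fit a Gaussian process regression to a synthetic TL glow
# curve with two peaks near 90°C and 240°C, then inspect the fit residuals.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(2)
temp = np.linspace(30, 400, 150)[:, None]               # heating temperature (°C)
glow = (np.exp(-0.5 * ((temp.ravel() - 90) / 20) ** 2)
        + 0.7 * np.exp(-0.5 * ((temp.ravel() - 240) / 30) ** 2))
intensity = glow + 0.02 * rng.normal(size=temp.shape[0])

gpr = GaussianProcessRegressor(kernel=RBF(30.0) + WhiteKernel(1e-3), normalize_y=True)
gpr.fit(temp, intensity)
pred, sd = gpr.predict(temp, return_std=True)
print("mean absolute residual:", np.abs(pred - intensity).mean())
```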

4.
5.
We propose a novel methodology for predicting human gait kinematics based on a statistical and stochastic approach, Gaussian process regression (GPR). We selected 14 body parameters that significantly affect the gait pattern and 14 joint motions that represent gait kinematics. The body parameter and gait kinematics data were recorded from 113 subjects using anthropometric measurements and a motion capture system. From this database we built a GPR model that defines a stochastic mapping from body parameters to gait kinematics, and validated it with cross-validation. The mapping not only produces trajectories for the joint motions but also estimates the associated uncertainties. The result is a low-cost, subject-specific method for predicting gait kinematics that requires only the subject's body parameters as input, and it also provides a comprehensive picture of the correlation and uncertainty between body parameters and gait kinematics.
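The sketch below shows the general shape of such a GPR mapping: synthetic body parameters predict a single summary kinematic target with cross-validated accuracy and per-prediction uncertainty. The variables and data are placeholders, not the study's 14-dimensional kinematics.

```python
# Minimal sketch of a GPR mapping from body parameters to a gait-kinematics
# target, with predictive uncertainty and cross-validation. All data synthetic.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n_subjects, n_params = 113, 14
body = rng.normal(size=(n_subjects, n_params))          # standardized body parameters
peak_knee_flexion = 60 + 5 * body[:, 0] - 3 * body[:, 1] + rng.normal(0, 2, n_subjects)

gpr = GaussianProcessRegressor(kernel=RBF(np.ones(n_params)) + WhiteKernel(1.0),
                               normalize_y=True)
scores = cross_val_score(gpr, body, peak_knee_flexion, cv=5, scoring="r2")
print("5-fold CV R^2:", scores.mean())

gpr.fit(body, peak_knee_flexion)
new_subject = rng.normal(size=(1, n_params))
mean, sd = gpr.predict(new_subject, return_std=True)    # prediction plus uncertainty
```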

6.

Background  

Quality assessment of microarray data is an important and often challenging aspect of gene expression analysis. This task frequently involves the examination of a variety of summary statistics and diagnostic plots. The interpretation of these diagnostics is often subjective, and generally requires careful expert scrutiny.

7.
Purpose: Four-dimensional computed tomography (4D-CT) plays a useful role in many clinical situations. However, due to hardware limitations of the system, dense sampling along the superior-inferior direction is often not practical. In this paper, we develop a novel multiple Gaussian process regression model to enhance the superior-inferior resolution of lung 4D-CT based on transversal structures.
Methods: The proposed strategy is based on the observation that high-resolution transversal images can recover missing pixels in the superior-inferior direction. Based on this observation, and motivated by the random forest algorithm, we employ multiple Gaussian process regression models learned from transversal images to improve superior-inferior resolution. Specifically, we first randomly sample 3 × 3 patches from the original transversal images. The central pixel of each patch and the eight-neighbour pixels of its degraded version form the label and input of the training data, respectively. A multiple Gaussian process regression model is then built on multiple training subsets obtained by random sampling. Finally, the central pixel of a patch is estimated by the proposed model, with the eight-neighbour pixels of each 3 × 3 patch from the interpolated superior-inferior direction images as inputs.
Results: The performance of our method is extensively evaluated using simulated and publicly available datasets. Our experiments show the remarkable performance of the proposed method.
Conclusions: In this paper, we propose a new approach to improve 4D-CT resolution that does not require any external data or hardware support and can produce clear coronal/sagittal images for easy viewing.
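A toy version of the patch-based step is sketched below: an ensemble of GP regressors, each trained on a random subset of 3 × 3 patches, predicts a patch's central pixel from its eight neighbours. The image and ensemble size are invented; this is not the authors' full 4D-CT pipeline.

```python
# Sketch of the patch-based idea: predict a 3x3 patch's central pixel from its
# eight neighbours with an ensemble of GP regressors trained on random subsets.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(4)
img = rng.random((64, 64))                              # stand-in for a transversal slice

# Collect 3x3 patches: eight neighbours -> input, central pixel -> label.
patches, centres = [], []
for _ in range(2000):
    i, j = rng.integers(1, 63, size=2)
    p = img[i - 1:i + 2, j - 1:j + 2].ravel()
    centres.append(p[4])
    patches.append(np.delete(p, 4))
X, y = np.array(patches), np.array(centres)

# "Multiple GPR": several GPs, each fitted to a random subsample (random-forest style).
models = []
for _ in range(5):
    idx = rng.choice(len(X), size=200, replace=False)
    gp = GaussianProcessRegressor(kernel=RBF(1.0) + WhiteKernel(1e-2), normalize_y=True)
    gp.fit(X[idx], y[idx])
    models.append(gp)

# Estimate a central pixel by averaging the ensemble's predictions.
neighbours = X[:1]
estimate = np.mean([m.predict(neighbours)[0] for m in models])
```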

8.
9.

Background

Faecal egg counts are a common indicator of nematode infection and, being a heritable trait, they provide a marker for selective breeding. However, since resistance to disease changes as the adaptive immune system develops, quantifying temporal changes in heritability could help improve selective breeding programs. Faecal egg counts can be extremely skewed and difficult to handle statistically. Therefore, previous heritability analyses have log-transformed faecal egg counts to estimate heritability on a latent scale, but such transformations may not always be appropriate. In addition, analyses of faecal egg counts have typically been univariate rather than multivariate analyses, such as random regression, that are appropriate when traits are correlated. We present a method for estimating the heritability of untransformed faecal egg counts over the grazing season using random regression.

Results

Replicating standard univariate analyses, we showed the dependence of heritability estimates on the choice of transformation. Then, using a multitrait model, we exposed temporal correlations, highlighting the need for a random regression approach. Since random regression can involve estimating more parameters than there are observations, or result in computationally intractable problems, we investigated reduced rank random regression. Using standard software (WOMBAT), we discuss the estimation of variance components for log-transformed data using both full and reduced rank analyses. We then modelled the untransformed data as negative binomially distributed and used Metropolis-Hastings sampling to fit a generalized reduced rank random regression model with additive genetic, permanent environmental and maternal effects. These three variance components explained more than 80% of the total phenotypic variation, whereas the variance components for the log-transformed data accounted for considerably less. The heritability, on the link scale, increased from around 0.25 at the beginning of the grazing season to around 0.4 at the end.

Conclusions

Random regressions are a useful tool for quantifying sources of variation across time. Our MCMC (Markov chain Monte Carlo) algorithm provides a flexible approach to fitting random regression models to non-normal data. Here we applied the algorithm to negative binomially distributed faecal egg count data, but this method is readily applicable to other types of overdispersed data.
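As a rough illustration of the sampling idea in entry 9, the sketch below runs a random-walk Metropolis-Hastings sampler for a negative-binomial count model with a log link. It fixes the dispersion and omits the additive genetic, permanent environmental and maternal effects of the full reduced rank random regression model; data and parameters are synthetic.

```python
# Minimal random-walk Metropolis-Hastings sketch for a negative-binomial count
# model (log link) applied to overdispersed, untransformed counts.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 300
week = rng.uniform(0, 1, n)                              # time across the grazing season
true_beta = np.array([2.0, 1.0])                         # intercept, slope on the log scale
disp = 1.5                                               # NB dispersion, assumed known here
mu = np.exp(true_beta[0] + true_beta[1] * week)
counts = stats.nbinom.rvs(disp, disp / (disp + mu), random_state=6)

def log_post(beta):
    m = np.exp(beta[0] + beta[1] * week)
    loglik = stats.nbinom.logpmf(counts, disp, disp / (disp + m)).sum()
    return loglik + stats.norm.logpdf(beta, 0, 10).sum()  # vague normal prior

beta = np.zeros(2)
lp = log_post(beta)
samples = []
for it in range(5000):
    prop = beta + rng.normal(0, 0.1, size=2)             # random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:              # accept/reject step
        beta, lp = prop, lp_prop
    samples.append(beta.copy())
print("posterior mean after burn-in:", np.mean(samples[1000:], axis=0))
```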

10.

Background  

Normalization is a basic step in microarray data analysis. A proper normalization procedure ensures that the intensity ratios provide meaningful measures of relative expression values.

11.
Stationary points, where a function's derivative vanishes, are often critical for a model to be interpretable and may be considered key features of interest in many applications. We propose a semiparametric Bayesian model to efficiently infer the locations of the stationary points of a nonparametric function, which also produces an estimate of the function itself. We use Gaussian processes as a flexible prior for the underlying function and impose derivative constraints to control the function's shape via conditioning. We develop an inferential strategy that intentionally restricts estimation to the case of at least one stationary point, bypassing possible mis-specification of the number of stationary points and avoiding the varying-dimension problem that often brings computational complexity. We illustrate the proposed method using simulations and then apply it to the estimation of event-related potentials derived from electroencephalography (EEG) signals. The method automatically identifies characteristic components and their latencies at the individual level, avoiding the excessive averaging across subjects that is routinely done in the field to obtain smooth curves. By applying this approach to EEG data collected from younger and older adults during a speech perception task, we demonstrate how the time course of speech perception processes changes with age.
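A much-simplified numerical sketch of the goal is shown below: fit a GP to a noisy ERP-like waveform and read stationary points off sign changes in the derivative of the posterior mean. The actual method conditions the GP on derivative constraints rather than post-processing the fit; the waveform here is synthetic.

```python
# Simplified sketch: locate stationary points from the derivative of a GP fit.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(7)
t = np.linspace(0, 1, 200)
signal = np.sin(3 * np.pi * t) * np.exp(-t)             # toy waveform with peaks and troughs
y = signal + 0.05 * rng.normal(size=t.size)

gp = GaussianProcessRegressor(kernel=RBF(0.1) + WhiteKernel(1e-3), normalize_y=True)
gp.fit(t[:, None], y)

grid = np.linspace(0, 1, 1000)
mean = gp.predict(grid[:, None])
deriv = np.gradient(mean, grid)                         # numerical derivative of the posterior mean
crossings = grid[:-1][np.sign(deriv[:-1]) != np.sign(deriv[1:])]
print("estimated stationary-point latencies:", crossings)
```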

12.
A semiparametric additive regression model for longitudinal data (cited 2 times: 0 self-citations, 2 by others)
Martinussen T, Scheike TH. Biometrika, 1999, 86(3): 691-702

13.
A latent-class mixture model for incomplete longitudinal Gaussian data (cited 2 times: 1 self-citation, 1 by others)
Summary. In analyses of incomplete longitudinal clinical trial data, there has been a shift away from simple methods that are valid only if the data are missing completely at random, toward more principled ignorable analyses, which are valid under the less restrictive missing-at-random assumption. The availability of the necessary standard statistical software nowadays allows such analyses in practice. While the possibility of data missing not at random (MNAR) cannot be ruled out, it is argued that analyses valid under MNAR are not well suited for the primary analysis in clinical trials. Rather than either ignoring MNAR or blindly shifting to an MNAR framework, the optimal place for MNAR analyses is within a sensitivity-analysis context. One such route for sensitivity analysis is to consider, next to selection models, pattern-mixture models or shared-parameter models. The latter can also be extended to a latent-class mixture model, the approach taken in this article. The performance of the resulting flexible model is assessed through simulations, and the model is applied to data from a depression trial.

14.
M Ghandi, MA Beer. PLoS ONE, 2012, 7(8): e38695
Data normalization is a crucial preliminary step in analyzing genomic datasets. The goal of normalization is to remove global variation to make readings across different experiments comparable. In addition, most genomic loci have non-uniform sensitivity to any given assay because of variation in local sequence properties. In microarray experiments, this non-uniform sensitivity is due to different DNA hybridization and cross-hybridization efficiencies, known as the probe effect. In this paper we introduce a new scheme, called Group Normalization (GN), to remove both global and local biases in one integrated step, whereby we determine the normalized probe signal by finding a set of reference probes with similar responses. Compared to conventional normalization methods such as Quantile normalization and physically motivated probe effect models, our proposed method is general in the sense that it does not require the assumption that the underlying signal distribution be identical for the treatment and control, and is flexible enough to correct for nonlinear and higher order probe effects. The Group Normalization algorithm is computationally efficient and easy to implement. We also describe a variant of the Group Normalization algorithm, called Cross Normalization, which efficiently amplifies biologically relevant differences between any two genomic datasets.
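The reference-group idea can be sketched as follows: for each probe, take the k probes with the most similar control responses and normalize the treatment signal against their mean. This is a simplification of the published Group Normalization algorithm, with synthetic probe data.

```python
# Rough sketch of group normalization: normalize each probe's treatment signal
# against a group of reference probes with similar control responses.
import numpy as np

rng = np.random.default_rng(8)
n_probes, k = 5000, 50
probe_effect = rng.normal(0, 0.5, n_probes)             # local sequence bias
control = probe_effect + rng.normal(0, 0.1, n_probes)
treatment = probe_effect + rng.normal(0, 0.1, n_probes)
treatment[:100] += 1.0                                   # truly enriched probes

order = np.argsort(control)                              # probes sorted by control response
rank = np.argsort(order)
normalized = np.empty(n_probes)
for p in range(n_probes):
    # reference group = the k probes with the closest control responses (by rank)
    lo = np.clip(rank[p] - k // 2, 0, n_probes - k)
    ref = order[lo:lo + k]
    normalized[p] = treatment[p] - treatment[ref].mean()

print("enriched vs background:", normalized[:100].mean(), normalized[100:].mean())
```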

15.

Background  

Recently, a large number of methods for the analysis of microarray data have been proposed but there are few comparisons of their relative performances. By using so-called spike-in experiments, it is possible to characterize the analyzed data and thereby enable comparisons of different analysis methods.

16.
17.
Yi G, Shi JQ, Choi T. Biometrics, 2011, 67(4): 1285-1294
A model based on a Gaussian process (GP) prior and a kernel covariance function can be used to fit nonlinear data with multidimensional covariates. It has been used as a flexible nonparametric approach for curve fitting, classification, clustering, and other statistical problems, and has been widely applied to complex nonlinear systems in many areas, particularly in machine learning. However, the model becomes challenging for large-scale and high-dimensional data, for example the meat data discussed in this article, which have 100 highly correlated covariates. For such data it suffers from large variance in parameter estimation, high predictive error, and numerically unstable computation. In this article, a penalized likelihood framework is applied to the GP-based model. Different penalties are investigated, and their suitability for the characteristics of GP models is discussed. The asymptotic properties are also discussed, with the relevant proofs. Several applications to real biomechanical and bioinformatics data sets are reported.
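A minimal sketch of a penalized GP likelihood is given below: an ARD squared-exponential kernel whose per-covariate relevance weights receive an L1 penalty, so that irrelevant covariates are shrunk toward zero. The kernel form, penalty, and data are illustrative assumptions, not the exact penalties studied in the article.

```python
# Sketch of a penalized GP marginal likelihood: ARD kernel with an L1 penalty
# on the per-covariate relevance weights, optimized by L-BFGS-B.
import numpy as np
from scipy.optimize import minimize
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(9)
n, d = 80, 10
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.normal(size=n)   # only 2 covariates matter

def neg_pen_loglik(params, lam=1.0):
    w, noise = params[:d], np.exp(params[d])                     # relevance weights, noise variance
    sq = ((X[:, None, :] - X[None, :, :]) ** 2 * w).sum(-1)
    K = np.exp(-0.5 * sq) + (noise + 1e-6) * np.eye(n)
    c, low = cho_factor(K)
    alpha = cho_solve((c, low), y)
    # negative log marginal likelihood + lasso penalty on the (non-negative) weights
    nll = 0.5 * y @ alpha + np.log(np.diag(c)).sum() + 0.5 * n * np.log(2 * np.pi)
    return nll + lam * w.sum()

x0 = np.r_[np.full(d, 0.5), np.log(0.1)]
bounds = [(0.0, None)] * d + [(None, None)]
res = minimize(neg_pen_loglik, x0, bounds=bounds, method="L-BFGS-B")
print("estimated relevance weights:", np.round(res.x[:d], 3))
```

The choice of an L1 penalty here is one option; the article compares several penalties, and the point of the sketch is only how a penalty term attaches to the GP marginal likelihood.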

18.
19.
The names used by biologists to label the observations they make are imprecise. This is an issue as workers increasingly seek to exploit data gathered from multiple, unrelated sources online. Even when the international codes of nomenclature are followed strictly, the resulting names (Taxon Names) do not uniquely identify the taxa (Taxon Concepts) that have been described by taxonomists, but merely groups of type specimens. A standard data model for the exchange of taxonomic information is described. It addresses this issue by facilitating explicit communication of information about Taxon Concepts and their associated names. A representation of this model as an XML Schema is introduced, and the implications of using Globally Unique Identifiers are discussed.
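The separation of names from concepts can be mirrored in a few lines of code; the sketch below uses Python dataclasses and UUIDs as a stand-in for the data-model idea. All field names are illustrative assumptions, not the actual elements of the XML Schema.

```python
# Illustrative stand-in for the data model: a TaxonName (governed by a
# nomenclatural code) is distinct from a TaxonConcept (that name according to a
# particular treatment), and both carry globally unique identifiers.
import uuid
from dataclasses import dataclass, field

@dataclass
class TaxonName:
    scientific_name: str                       # e.g. "Quercus robur L."
    nomenclatural_code: str                    # e.g. "ICN"
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))

@dataclass
class TaxonConcept:
    name: TaxonName                            # the label...
    according_to: str                          # ...plus the treatment that defines the concept
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))

name = TaxonName("Quercus robur L.", "ICN")
concept_a = TaxonConcept(name, "Flora Europaea 1993")
concept_b = TaxonConcept(name, "Stace 2019")   # same name, potentially different circumscription
```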

20.
Liu LC, Hedeker D. Biometrics, 2006, 62(1): 261-268
A mixed-effects item response theory model that allows for three-level multivariate ordinal outcomes and accommodates multiple random subject effects is proposed for the analysis of longitudinal studies. The model allows the estimation of different item factor loadings (item discrimination parameters) for the multiple outcomes. The covariates in the model need not satisfy the proportional odds assumption and can enter at any level. Assuming either a probit or a logistic response function, maximum marginal likelihood estimation is proposed, utilizing multidimensional Gauss-Hermite quadrature for integration over the random effects. An iterative Fisher scoring solution, which provides standard errors for all model parameters, is used. An analysis of a longitudinal substance use data set, in which four items of substance use behavior (cigarette use, alcohol use, marijuana use, and getting drunk or high) are repeatedly measured over time, illustrates the application of the proposed model.
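The quadrature step can be illustrated with a minimal marginal likelihood for a two-parameter logistic model with one normal subject effect, evaluated by Gauss-Hermite quadrature. The full model in the article is three-level and ordinal with multiple random effects; the data and parameters below are synthetic.

```python
# Minimal sketch: marginal log-likelihood of a 2PL logistic item response model
# with one normal random subject effect, integrated out by Gauss-Hermite quadrature.
import numpy as np

rng = np.random.default_rng(10)
n_subj, n_items = 200, 4
discrim = np.array([1.0, 1.2, 0.8, 1.5])        # item discriminations (factor loadings)
difficulty = np.array([-0.5, 0.0, 0.5, 1.0])
sigma = 1.0                                     # SD of the random subject effect

theta = rng.normal(0, sigma, n_subj)
p = 1 / (1 + np.exp(-(discrim * theta[:, None] - difficulty)))
y = (rng.random((n_subj, n_items)) < p).astype(int)

# Gauss-Hermite nodes/weights; change of variables to integrate against N(0, sigma^2).
nodes, weights = np.polynomial.hermite.hermgauss(15)
theta_q = np.sqrt(2.0) * sigma * nodes
w_q = weights / np.sqrt(np.pi)

def marginal_loglik(a, b):
    # P(y_ij | theta) at every quadrature node, for every subject and item
    logits = a[None, :, None] * theta_q[None, None, :] - b[None, :, None]
    p_q = 1 / (1 + np.exp(-logits))
    lik_q = np.where(y[:, :, None] == 1, p_q, 1 - p_q).prod(axis=1)  # product over items
    return np.log(lik_q @ w_q).sum()                                  # sum over subjects

print("marginal log-likelihood at the true parameters:", marginal_loglik(discrim, difficulty))
```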
