Similar articles
 20 similar articles found
1.
Screening mammography aims to identify breast cancer early and secondarily measures breast density to classify women at higher or lower than average risk for future breast cancer in the general population. Despite the strong association of individual mammography features to breast cancer risk, the statistical literature on mammogram imaging data is limited. While functional principal component analysis (FPCA) has been studied in the literature for extracting image-based features, it is conducted independently of the time-to-event response variable. With the consideration of building a prognostic model for precision prevention, we present a set of flexible methods, supervised FPCA (sFPCA) and functional partial least squares (FPLS), to extract image-based features associated with the failure time while accommodating the added complication from right censoring. Throughout the article, we hope to demonstrate that one method is favored over the other under different clinical setups. The proposed methods are applied to the motivating data set from the Joanne Knight Breast Health cohort at Siteman Cancer Center. Our approaches not only obtain the best prediction performance compared to the benchmark model, but also reveal different risk patterns within the mammograms.

2.
Chenlin Zhang  Huazhen Lin  Li Liu  Jin Liu  Yi Li 《Biometrics》2023,79(3):2232-2245
Functional data analysis has emerged as a powerful tool in response to the ever-increasing resources and efforts devoted to collecting information about response curves or anything that varies over a continuum. However, limited progress has been made with regard to linking the covariance structures of response curves to external covariates, as most functional models assume a common covariance structure. We propose a new functional regression model with covariate-dependent mean and covariance structures. Particularly, by allowing variances of random scores to be covariate-dependent, we identify eigenfunctions for each individual from the set of eigenfunctions that govern the variation patterns across all individuals, resulting in high interpretability and prediction power. We further propose a new penalized quasi-likelihood procedure that combines regularization and B-spline smoothing for model selection and estimation and establish the convergence rate and asymptotic normality of the proposed estimators. The utility of the developed method is demonstrated via simulations, as well as an analysis of the Avon Longitudinal Study of Parents and Children concerning parental effects on the growth curves of their offspring, which yields biologically interesting results.

3.
Many biomedical studies have identified important imaging biomarkers that are associated with both repeated clinical measures and a survival outcome. The functional joint model (FJM) framework, proposed by Li and Luo in 2017, investigates the association between repeated clinical measures and survival data, while adjusting for both high-dimensional images and low-dimensional covariates based on the functional principal component analysis (FPCA). In this paper, we propose a novel algorithm for the estimation of FJM based on the functional partial least squares (FPLS). Our numerical studies demonstrate that, compared to FPCA, the proposed FPLS algorithm can yield more accurate and robust estimation and prediction performance in many important scenarios. We apply the proposed FPLS algorithm to a neuroimaging study. Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database.

4.
Gaussian process functional regression modeling for batch data   Total citations: 2 (self-citations: 0; citations by others: 2)
A Gaussian process functional regression model is proposed for the analysis of batch data. Covariance structure and mean structure are considered simultaneously, with the covariance structure modeled by a Gaussian process regression model and the mean structure modeled by a functional regression model. The model allows the inclusion of covariates in both the covariance structure and the mean structure. It models the nonlinear relationship between a functional output variable and a set of functional and nonfunctional covariates. Several applications and simulation studies are reported and show that the method provides very good results for curve fitting and prediction.

5.
Shrinkage Estimators for Covariance Matrices   Total citations: 1 (self-citations: 0; citations by others: 1)
Estimation of covariance matrices in small samples has been studied by many authors. Standard estimators, like the unstructured maximum likelihood estimator (ML) or restricted maximum likelihood (REML) estimator, can be very unstable with the smallest estimated eigenvalues being too small and the largest too big. A standard approach to more stably estimating the matrix in small samples is to compute the ML or REML estimator under some simple structure that involves estimation of fewer parameters, such as compound symmetry or independence. However, these estimators will not be consistent unless the hypothesized structure is correct. If interest focuses on estimation of regression coefficients with correlated (or longitudinal) data, a sandwich estimator of the covariance matrix may be used to provide standard errors for the estimated coefficients that are robust in the sense that they remain consistent under misspecification of the covariance structure. With large matrices, however, the inefficiency of the sandwich estimator becomes worrisome. We consider here two general shrinkage approaches to estimating the covariance matrix and regression coefficients. The first involves shrinking the eigenvalues of the unstructured ML or REML estimator. The second involves shrinking an unstructured estimator toward a structured estimator. For both cases, the data determine the amount of shrinkage. These estimators are consistent and give consistent and asymptotically efficient estimates for regression coefficients. Simulations show the improved operating characteristics of the shrinkage estimators of the covariance matrix and the regression coefficients in finite samples. The final estimator chosen includes a combination of both shrinkage approaches, i.e., shrinking the eigenvalues and then shrinking toward structure. 
We illustrate our approach on a sleep EEG study that requires estimation of a 24 x 24 covariance matrix and for which inferences on mean parameters critically depend on the covariance estimator chosen. We recommend making inference using a particular shrinkage estimator that provides a reasonable compromise between structured and unstructured estimators.
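
The second approach in the abstract above, shrinking an unstructured estimator toward a structured one, can be illustrated with a convex combination of the two matrices. The following is only a minimal sketch under stated assumptions, not the authors' estimator: the function name `shrink_toward_diagonal` is hypothetical, the structured target is taken to be the diagonal (independence) structure, and the shrinkage weight is passed in by hand, whereas in the abstract the data determine the amount of shrinkage.

```python
def shrink_toward_diagonal(sample_cov, weight):
    """Return (1 - weight) * S + weight * diag(S) for a square matrix S.

    `weight` is a hypothetical, user-supplied shrinkage amount in [0, 1];
    the abstract describes a data-driven choice, which is not reproduced here.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    size = len(sample_cov)
    shrunk = []
    for i in range(size):
        row = []
        for j in range(size):
            # Independence target: keep variances, zero out covariances.
            target = sample_cov[i][j] if i == j else 0.0
            row.append((1.0 - weight) * sample_cov[i][j] + weight * target)
        shrunk.append(row)
    return shrunk


# Illustrative 2 x 2 unstructured estimate (made-up numbers).
sample_cov = [[4.0, 1.2], [1.2, 9.0]]
half_shrunk = shrink_toward_diagonal(sample_cov, 0.5)
```

With a weight of 0 the unstructured estimate is returned unchanged, and with a weight of 1 the result is the structured (diagonal) target; intermediate weights give the kind of compromise the abstract recommends.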

6.
Chen H  Wang Y 《Biometrics》2011,67(3):861-870
In this article, we propose penalized spline (P-spline)-based methods for functional mixed effects models with varying coefficients. We decompose longitudinal outcomes as a sum of several terms: a population mean function, covariates with time-varying coefficients, functional subject-specific random effects, and residual measurement error processes. Using P-splines, we propose nonparametric estimation of the population mean function, varying coefficient, random subject-specific curves, and the associated covariance function that represents between-subject variation and the variance function of the residual measurement errors which represents within-subject variation. Proposed methods offer flexible estimation of both the population- and subject-level curves. In addition, decomposing variability of the outcomes into between- and within-subject sources is useful in identifying the dominant variance component and therefore in optimally modeling the covariance function. We use a likelihood-based method to select multiple smoothing parameters. Furthermore, we study the asymptotics of the baseline P-spline estimator with longitudinal data. We conduct simulation studies to investigate performance of the proposed methods. The benefit of the between- and within-subject covariance decomposition is illustrated through an analysis of Berkeley growth data, where we identified clearly distinct patterns of the between- and within-subject covariance functions of children's heights. We also apply the proposed methods to estimate the effect of antihypertensive treatment from the Framingham Heart Study data.

7.
Zhang Z  Wriggers W 《Proteins》2006,64(2):391-403
Multivariate statistical methods are widely used to extract functional collective motions from macromolecular molecular dynamics (MD) simulations. In principal component analysis (PCA), a covariance matrix of positional fluctuations is diagonalized to obtain orthogonal eigenvectors and corresponding eigenvalues. The first few eigenvectors usually correspond to collective modes that approximate the functional motions in the protein. However, PCA representations are globally coherent by definition and, for a large biomolecular system, do not converge on the time scales accessible to MD. Also, the forced orthogonalization of modes leads to complex dependencies that are not necessarily consistent with the symmetry of biological macromolecules and assemblies. Here, we describe for the first time the application of local feature analysis (LFA) to construct a topographic representation of functional dynamics in terms of local features. The LFA representations are low dimensional, and like PCA provide a reduced basis set for collective motions, but they are sparsely distributed and spatially localized. This yields a more reliable assignment of essential dynamics modes across different MD time windows. Also, the intrinsic dynamics of local domains is more extensively sampled than that of globally coherent PCA modes.

8.
Classical multivariate mixed models that acknowledge the correlation of patients through the incorporation of normal error terms are widely used in cohort studies. Violation of the normality assumption can make the statistical inference vague. In this paper, we propose a Bayesian parametric approach by relaxing this assumption and substituting some flexible distributions in fitting multivariate mixed models. This strategy allows for the skewness and the heavy tails of error‐term distributions and thus makes inferences robust to the violation. This approach uses flexible skew‐elliptical distributions, including skewed, fat, or thin‐tailed distributions, and includes the normal model as a special case. We use real data obtained from a prospective cohort study on low back pain to illustrate the usefulness of our proposed approach.

9.
In this paper, we propose a functional partially linear regression model with latent group structures to accommodate the heterogeneous relationship between a scalar response and functional covariates. The proposed model is motivated by a salinity tolerance study of barley families, whose main objective is to detect salinity tolerant barley plants. Our model is flexible, allowing for heterogeneous functional coefficients while being efficient by pooling information within a group for estimation. We develop an algorithm in the spirit of the K-means clustering to identify latent groups of the subjects under study. We establish the consistency of the proposed estimator, derive the convergence rate and the asymptotic distribution, and develop inference procedures. We show by simulation studies that the proposed method has higher accuracy for recovering latent groups and for estimating the functional coefficients than existing methods. The analysis of the barley data shows that the proposed method can help identify groups of barley families with different salinity tolerant abilities.

10.
Gervini  Daniel 《Biometrika》2008,95(3):587-600
We present robust estimators for the mean and the principal components of a stochastic process in . Robustness and asymptotic properties of the estimators are studied theoretically, by simulation and by example. It is shown that the proposed estimators are generally more robust to outliers than the commonly used sample mean and principal components, although their properties depend on the spacings of the eigenvalues of the covariance function.

11.
Yue Wei  Yi Liu  Tao Sun  Wei Chen  Ying Ding 《Biometrics》2020,76(2):619-629
Several gene-based association tests for time-to-event traits have been proposed recently to detect whether a gene region (containing multiple variants), as a set, is associated with the survival outcome. However, for bivariate survival outcomes, to the best of our knowledge, there is no statistical method that can be directly applied for gene-based association analysis. Motivated by a genetic study to discover the gene regions associated with the progression of a bilateral eye disease, age-related macular degeneration (AMD), we implement a novel functional regression (FR) method under the copula framework. Specifically, the effects of variants within a gene region are modeled through a functional linear model, which then contributes to the marginal survival functions within the copula. Generalized score test statistics are derived to test for the association between bivariate survival traits and the genetic region. Extensive simulation studies are conducted to evaluate the type I error control and power performance of the proposed approach, with comparisons to several existing methods for a single survival trait, as well as the marginal Cox FR model using the robust sandwich estimator for bivariate survival traits. Finally, we apply our method to a large AMD study, the Age-related Eye Disease Study, to identify the gene regions that are associated with AMD progression.

12.
Case–cohort sampling is a commonly used and efficient method for studying large cohorts. Most existing methods of analysis for case–cohort data have concerned the analysis of univariate failure time data. However, clustered failure time data are commonly encountered in public health studies. For example, patients treated at the same center are unlikely to be independent. In this article, we consider methods based on estimating equations for case–cohort designs for clustered failure time data. We assume a marginal hazards model, with a common baseline hazard and common regression coefficient across clusters. The proposed estimators of the regression parameter and cumulative baseline hazard are shown to be consistent and asymptotically normal, and consistent estimators of the asymptotic covariance matrices are derived. The regression parameter estimator is easily computed using any standard Cox regression software that allows for offset terms. The proposed estimators are investigated in simulation studies, and demonstrated empirically to have increased efficiency relative to some existing methods. The proposed methods are applied to a study of mortality among Canadian dialysis patients.

13.
Motivated by recent work involving the analysis of biomedical imaging data, we present a novel procedure for constructing simultaneous confidence corridors for the mean of imaging data. We propose to use flexible bivariate splines over triangulations to handle an irregular domain of the images that is common in brain imaging studies and in other biomedical imaging applications. The proposed spline estimators of the mean functions are shown to be consistent and asymptotically normal under some regularity conditions. We also provide a computationally efficient estimator of the covariance function and derive its uniform consistency. The procedure is also extended to the two-sample case in which we focus on comparing the mean functions from two populations of imaging data. Through Monte Carlo simulation studies, we examine the finite sample performance of the proposed method. Finally, the proposed method is applied to analyze brain positron emission tomography data in two different studies. One data set used in preparation of this article was obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database.

14.
The MobilEe‐study was the first cross‐sectional population‐based study to investigate possible health effects of mobile communication networks on children using personal dosimetry. Exposure was assessed every second resulting in 86,400 measurements over 24 h for each participant. Therefore, a functional approach to analyze the exposure data was considered appropriate. The aim was to categorize exposure taking into account the course of the measurements over 24 h. The analyses were based on the 480 maxima of each 3 min time interval. Exposure was classified using a nonparametric functional method. Heterogeneity of a sample of functional data was assessed by comparing the functional mode and mean of the distribution of a functional variable. The partition was built within a descending hierarchical method. The resulting exposure groups were compared with categories derived from a standard method, which used the average exposure over 24 h and set the cut‐off at the 90th percentile. The functional classification resulted in a splitting of the exposure data into two groups. Plots of the mean curves showed that the groups could be interpreted as children with “low exposure” (88%) and “higher exposure” (12%). These groups were comparable with categories of the standard method. No association between the categorized exposure and well‐being was observed in logistic regression models. The functional classification approach yielded a plausible partition of the exposure data. The comparability with the standard approach might be due to the data structure and should not be generalized to other exposures. Bioelectromagnetics 30:261–269, 2009. © 2009 Wiley‐Liss, Inc.
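
The data reduction described in the abstract above (86,400 per-second measurements collapsed to 480 three-minute maxima, since 86,400 / 180 = 480) can be sketched in a few lines. This is a minimal illustration only: the function name `interval_maxima` and the synthetic input are assumptions, and the study's actual preprocessing is not described beyond what the abstract states.

```python
def interval_maxima(samples, window=180):
    """Return the maximum of each consecutive block of `window` samples.

    `window=180` corresponds to 3 minutes of per-second measurements.
    """
    if window <= 0 or len(samples) % window != 0:
        raise ValueError("sample count must be a positive multiple of window")
    return [max(samples[start:start + window])
            for start in range(0, len(samples), window)]


# Synthetic stand-in for one participant's 24 h of per-second readings.
per_second = list(range(86400))
maxima = interval_maxima(per_second)
```

Each participant's exposure curve is then represented by 480 values instead of 86,400, which is the form on which the abstract's functional classification operates.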

15.
MOTIVATION: Temporal gene expression profiles provide an important characterization of gene function, as biological systems are predominantly developmental and dynamic. We propose a method of classifying collections of temporal gene expression curves in which individual expression profiles are modeled as independent realizations of a stochastic process. The method uses a recently developed functional logistic regression tool based on functional principal components, aimed at classifying gene expression curves into known gene groups. The number of eigenfunctions in the classifier can be chosen by leave-one-out cross-validation with the aim of minimizing the classification error. RESULTS: We demonstrate that this methodology provides low-error-rate classification for both yeast cell-cycle gene expression profiles and Dictyostelium cell-type specific gene expression patterns. It also works well in simulations. We compare our functional principal components approach with a B-spline implementation of functional discriminant analysis for the yeast cell-cycle data and simulations. This indicates comparative advantages of our approach which uses fewer eigenfunctions/base functions. The proposed methodology is promising for the analysis of temporal gene expression data and beyond. AVAILABILITY: MATLAB programs are available upon request.

16.
Spatial partitioning methods correct for nonstationarity in spatially related data by partitioning the space into regions of local stationarity. Existing spatial partitioning methods can only estimate linear partitioning boundaries. This is inadequate for detecting an arbitrarily shaped anomalous spatial region within a larger area. We propose a novel Bayesian functional spatial partitioning (BFSP) algorithm, which estimates closed curves that act as partitioning boundaries around anomalous regions of data with a distinct distribution or spatial process. Our method utilizes transitions between a fixed Cartesian and moving polar coordinate system to model the smooth boundary curves using functional estimation tools. Using adaptive Metropolis-Hastings, the BFSP algorithm simultaneously estimates the partitioning boundary and the parameters of the spatial distributions within each region. Through simulation we show that our method is robust to the shape of the target zone and to region-specific spatial processes. We illustrate our method through the detection of prostate cancer lesions using magnetic resonance imaging.

17.
Predictive margins with survey data   Total citations: 12 (self-citations: 0; citations by others: 12)
Graubard BI  Korn EL 《Biometrics》1999,55(2):652-659
In the analysis of covariance, the display of adjusted treatment means allows one to compare mean (treatment) group outcomes controlling for different covariate distributions in the groups. Predictive margins are a generalization of adjusted treatment means to nonlinear models. The predictive margin for group r represents the average predicted response if everyone in the sample had been in group r. This paper discusses the use of predictive margins with complex survey data, where an important consideration is the choice of covariate distribution used to standardize the predictive margin. It is suggested that the textbook formula for the standard error of an adjusted treatment mean from the analysis of covariance may be inappropriate for applications involving survey data. Applications are given using data from the 1992 National Health Interview Survey (NHIS) and the Epidemiologic Followup Study to the first National Health and Nutrition Examination Survey (NHANES I).
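
The definition in the abstract above, the average predicted response if everyone in the sample had been in group r, can be made concrete for a logistic model. The sketch below is illustrative only: the function name `predictive_margin`, the single covariate, the coefficients, and the sample weights are all made-up assumptions, the weighted sample is used as the standardizing covariate distribution, and the standard-error question raised in the abstract is not addressed.

```python
import math


def predictive_margin(rows, group, intercept, group_effects, slope):
    """Weighted mean predicted probability if every row were in `group`.

    `rows` is a list of (covariate, sample_weight) pairs. The logistic model
    and all of its coefficients are hypothetical inputs for illustration.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for covariate, weight in rows:
        # Counterfactually assign the row to `group`, keep its own covariate.
        linear = intercept + group_effects[group] + slope * covariate
        weighted_sum += weight * (1.0 / (1.0 + math.exp(-linear)))
        total_weight += weight
    return weighted_sum / total_weight


# Made-up sample: (covariate value, survey weight).
rows = [(0.0, 1.0), (1.0, 2.0), (2.0, 1.0)]
group_effects = {"a": 0.0, "b": 0.7}
margin_a = predictive_margin(rows, "a", -1.0, group_effects, 0.5)
margin_b = predictive_margin(rows, "b", -1.0, group_effects, 0.5)
```

Because both margins average over the same covariate distribution, their difference reflects the group effect rather than differing covariate mixes, which is the point of the standardization.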

18.
The recent availability of residual dipolar coupling measurements in a variety of different alignment media raises the question to what extent biomolecular structure and dynamics are differentially affected by their presence. A computational method is presented that allows the sensitive assessment of such changes using dipolar couplings measured in six or more alignment media. The method is based on a principal component analysis of the covariance matrix of the dipolar couplings. It does not require a priori structural or dynamic information nor knowledge of the alignment tensors and their orientations. In the absence of experimental errors, the covariance matrix has at most five nonzero eigenvalues if the structure and dynamics of the biomolecule are the same in all media. In contrast, differential structural and dynamic changes lead to additional nonzero eigenvalues. Characteristic features of the eigenvalue distribution in the absence and presence of noise are discussed using dipolar coupling data calculated from conformational ensembles taken from a molecular dynamics trajectory of native ubiquitin.
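
The "at most five nonzero eigenvalues" statement above follows from a rank argument: an alignment tensor has five independent elements, so if the molecule is the same in every medium, the table of couplings factors into a (couplings × 5) matrix times a (5 × media) matrix and has rank at most five; a covariance matrix built from that table cannot have more nonzero eigenvalues than this rank. The sketch below checks the rank bound on random integer data with exact arithmetic. It is a toy illustration under those assumptions, not the authors' method: the matrices are random stand-ins rather than real couplings, and the names `matmul` and `rank` are hypothetical helpers.

```python
from fractions import Fraction
import random


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(value) for value in row] for row in matrix]
    found = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(found, len(rows))
                      if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for i in range(found + 1, len(rows)):
            factor = rows[i][col] / rows[found][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[found])]
        found += 1
    return found


rng = random.Random(0)
# Random stand-ins: 12 couplings x 5 geometry terms, 5 tensor elements x 7 media.
geometry = [[rng.randint(-9, 9) for _ in range(5)] for _ in range(12)]
tensors = [[rng.randint(-9, 9) for _ in range(7)] for _ in range(5)]
couplings = matmul(geometry, tensors)  # 12 couplings x 7 media
shared_rank = rank(couplings)

# Mimic a medium-specific change by overwriting the last medium's column.
perturbed = [row[:-1] + [rng.randint(-9, 9)] for row in couplings]
perturbed_rank = rank(perturbed)
```

Here `shared_rank` can never exceed five, while `perturbed_rank` may reach six, which corresponds to the additional nonzero eigenvalue the abstract associates with differential structural or dynamic changes.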

19.
This paper presents the R package BioFTF, which is a tool for statistical biodiversity assessment in the functional data analysis framework. Diversity is a key topic in many research fields; however, in the literature, it is demonstrated that the existing indices do not capture the different aspects of this concept. Thus, a main drawback is that different indicators may lead to different orderings among communities according to their biodiversity. A possible method to evaluate biodiversity consists in using diversity profiles that are curves depending on a specific parameter. In this setting, it is possible to adopt some functional instruments proposed in the literature, such as the first and second derivatives, the curvature, the radius of curvature and the arc length. Specifically, the derivatives and the curvature (or the radius of curvature) highlight any peculiar behaviour of the profiles, whereas the arc length helps in ranking curves, given the richness. Because these instruments do not solve the issue of ranking communities with different numbers of species, we propose an important methodological contribution that introduces the surface area. Indeed, this tool is a scalar measure that reflects the information provided by the biodiversity profile and allows for ordering communities with different richness. However, this approach requires mathematical skills that the average user may not have; thus, our idea is to provide a user-friendly tool for both non-statistician and statistician practitioners to measure biodiversity in a functional context.

20.
In functional data analysis for longitudinal data, the observation process is typically assumed to be noninformative, which is often violated in real applications. Thus, methods that fail to account for the dependence between observation times and longitudinal outcomes may result in biased estimation. For longitudinal data with informative observation times, we find that under a general class of shared random effect models, a commonly used functional data method may lead to inconsistent model estimation while another functional data method results in consistent and even rate-optimal estimation. Indeed, we show that the mean function can be estimated appropriately via penalized splines and that the covariance function can be estimated appropriately via penalized tensor-product splines, both with specific choices of parameters. For the proposed method, theoretical results are provided, and simulation studies and a real data analysis are conducted to demonstrate its performance.
