Similar articles
2.
In functional data analysis for longitudinal data, the observation process is typically assumed to be noninformative, which is often violated in real applications. Thus, methods that fail to account for the dependence between observation times and longitudinal outcomes may result in biased estimation. For longitudinal data with informative observation times, we find that under a general class of shared random effect models, a commonly used functional data method may lead to inconsistent model estimation while another functional data method results in consistent and even rate-optimal estimation. Indeed, we show that the mean function can be estimated appropriately via penalized splines and that the covariance function can be estimated appropriately via penalized tensor-product splines, both with specific choices of parameters. For the proposed method, theoretical results are provided, and simulation studies and a real data analysis are conducted to demonstrate its performance.

3.
Clinical treatment outcomes are the quality and cost targets that health-care providers aim to improve. Most existing outcome analysis focuses on a single disease or all diseases combined. Motivated by the success of molecular and phenotypic human disease networks (HDNs), this article develops a clinical treatment network that describes the interconnections among diseases in terms of inpatient length of stay (LOS) and readmission. Here one node represents one disease, and two nodes are linked with an edge if their LOS and number of readmissions are conditionally dependent. This is the first HDN that jointly analyzes multiple clinical treatment outcomes at the pan-disease level. To accommodate the unique data characteristics, we propose a modeling approach based on two-part generalized linear models and estimation based on penalized integrative analysis. Analysis is conducted on the Medicare inpatient data of 100,000 randomly selected subjects for the period of January 2010 to December 2018. The resulting network has 1008 edges for 106 nodes. We analyze key network properties including connectivity, module/hub, and temporal variation. The findings are biomedically sensible. For example, high connectivity and hub conditions, such as disorders of lipid metabolism and essential hypertension, are identified. There are also findings that have received little or no attention in the literature. Overall, this study can provide additional insight into diseases' properties and their interconnections and support more efficient disease management and health-care resource allocation.

4.
Clustering analysis is a promising data-driven method for the analysis of functional magnetic resonance imaging (fMRI) data. The huge computational load, however, makes it difficult to use in practice. We use affinity propagation clustering (APC), a new clustering algorithm designed especially for large data sets, to detect brain functional activation from fMRI. It considers all data points as possible exemplars through the minimisation of an energy function and a message-passing architecture, and obtains the optimal set of exemplars and their corresponding clusters. Four simulation studies and three in vivo fMRI studies reveal that brain functional activation can be effectively detected and that different response patterns can be distinguished using this method. Our results demonstrate that APC is superior to k-centres clustering, as revealed by their performance measures in the weighted Jaccard coefficient and average squared error. These results suggest that the proposed APC will be useful in detecting brain functional activation from fMRI data.
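The exemplar idea in this abstract can be illustrated with a minimal sketch. It does not implement affinity propagation's message passing; instead it brute-forces the energy that exemplar-based clustering tries to minimise, on a toy one-dimensional data set. The similarity (negative squared distance), the preference value of -1.0, the data points, and the function names `energy` and `best_exemplars` are all assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations


def energy(S, exemplars):
    """Negative net similarity for a given exemplar set.

    Each exemplar contributes its own preference S[k][k]; every other
    point contributes its similarity to the most similar exemplar.
    """
    total = 0.0
    for i in range(len(S)):
        if i in exemplars:
            total += S[i][i]
        else:
            total += max(S[i][k] for k in exemplars)
    return -total


def best_exemplars(S):
    """Exhaustive search over all non-empty exemplar sets (toy sizes only)."""
    n = len(S)
    candidates = (set(c) for r in range(1, n + 1)
                  for c in combinations(range(n), r))
    return min(candidates, key=lambda e: energy(S, e))


# Hypothetical toy data: two well-separated groups on a line.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
preference = -1.0  # assumed self-similarity; lower values favour fewer exemplars

# Similarity matrix: preference on the diagonal, negative squared distance elsewhere.
S = [[preference if i == j else -(a - b) ** 2
      for j, b in enumerate(points)]
     for i, a in enumerate(points)]

exemplars = sorted(best_exemplars(S))
print(exemplars)  # the middle point of each group is chosen
```

Exhaustive search is exponential in the number of points, which is why message-passing schemes such as affinity propagation are used on real fMRI data; the sketch only shows what quantity is being optimised.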

5.
It has been a challenging task to integrate high-throughput data into investigations of the systematic and dynamic organization of biological networks. Here, we present a simple hierarchical clustering algorithm that goes a long way to achieve this aim. Our method effectively reveals the modular structure of the yeast protein-protein interaction network and distinguishes protein complexes from functional modules by integrating high-throughput protein-protein interaction data with the added subcellular localization and expression profile data. Furthermore, we take advantage of the detected modules to provide a reliable functional context for the uncharacterized components within modules. On the other hand, the integration of various protein-protein association information makes our method robust to false positives, especially for derived protein complexes. More importantly, this simple method can be extended naturally to other types of data fusion and provides a framework for the study of more comprehensive properties of the biological network and other forms of complex networks.

9.
This paper presents a novel semiparametric joint model for multivariate longitudinal and survival data (SJMLS) by relaxing the normality assumption of the longitudinal outcomes, leaving the baseline hazard functions unspecified and allowing the history of the longitudinal response to have an effect on the risk of dropout. Using Bayesian penalized splines to approximate the unspecified baseline hazard function and combining the Gibbs sampler and the Metropolis–Hastings algorithm, we propose a Bayesian Lasso (BLasso) method to simultaneously estimate unknown parameters and select important covariates in SJMLS. Simulation studies are conducted to investigate the finite sample performance of the proposed techniques. An example from the International Breast Cancer Study Group (IBCSG) is used to illustrate the proposed methodologies.

10.
In this paper, we develop a Gaussian estimation (GE) procedure to estimate the parameters of a regression model for correlated (longitudinal) binary response data using a working correlation matrix. A two‐step iterative procedure is proposed for estimating the regression parameters by the GE method and the correlation parameters by the method of moments. Consistency properties of the estimators are discussed. A simulation study was conducted to compare 11 estimators of the regression parameters, namely, four versions of the GE, five versions of the generalized estimating equations (GEEs), and two versions of the weighted GEE. Simulations show that (i) the Gaussian estimates have the smallest mean square error and best coverage probability if the working correlation structure is correctly specified and (ii) when the working correlation structure is correctly specified, the GE and the GEE with exchangeable correlation structure perform best as opposed to when the correlation structure is misspecified.

12.
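A blind analysis method for epilepsy fMRI based on temporal clustering analysis and independent component analysis   Total citations: 3, self-citations: 0, citations by others: 3
We propose a blind analysis method for epilepsy fMRI data based on temporal clustering analysis and independent component analysis, combining the two methods to extract spatiotemporal information on interictal epileptic fMRI activation. The method first uses temporal clustering analysis to obtain an activation-related temporal kurtosis characteristic curve, which serves as the temporal reference information. Spatial independent component analysis then decomposes the fMRI signal into spatially independent components. Finally, the time course of each independent component is correlated with the reference curve to extract the corresponding brain activation maps. The proposed method requires no prior assumptions about epilepsy fMRI, effectively solves the problem of ordering the independent components, and achieves a blind analysis of the data. Simulation results demonstrate the effectiveness and reliability of the method, and results on epilepsy data show that its spatial localization accuracy is better than that of statistical parametric mapping.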
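The final step described in this abstract, ordering independent components by how well their time courses match a reference curve, can be sketched as follows. This is a minimal illustration under stated assumptions: the reference curve and component time courses are made-up numbers, the function names `pearson` and `rank_components` are hypothetical, and the use of the absolute correlation (because the sign of an independent component is arbitrary) is a choice made here, not one confirmed by the abstract.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rank_components(time_courses, reference):
    """Return component indices sorted by |correlation| with the reference,
    strongest match first."""
    scores = [(abs(pearson(tc, reference)), idx)
              for idx, tc in enumerate(time_courses)]
    return [idx for _, idx in sorted(scores, reverse=True)]


# Hypothetical reference curve (e.g. from a temporal clustering step).
reference = [0, 0, 1, 3, 1, 0, 0, 2, 0, 0]

# Hypothetical component time courses.
time_courses = [
    [1, -1, 1, -1, 1, -1, 1, -1, 1, -1],           # oscillating noise
    [-0.1, 0, -1.9, -6.1, -2, 0.1, 0, -4, 0, 0.1],  # sign-flipped, scaled reference
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],                 # slow drift
]

order = rank_components(time_courses, reference)
print(order)
```

The spatial map belonging to the top-ranked component would then be taken as the candidate activation map.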

13.
Luo Y  Lin S 《Biometrics》2003,59(2):393-401
Genetic marker data has been increasingly incorporated into segregation analysis, as combined segregation and linkage analysis has been performed more frequently. In this article, we study the extent of information gains with incorporation of marker data in segregation analysis, a topic that has not been investigated rigorously. Specifically, the current study investigates the influence of marker data on genetic model parameter estimation. A variance matrix criterion (as the inverse of the Fisher information matrix) and a relative entropy criterion (a measure of flatness of the expected log-likelihood surface) are used to quantify the information gains. Our results indicate that substantial information gain can be achieved with the incorporation of marker data. The amount of variance reduction increases as the heterozygosity of the linked marker increases and as the trait gets closer to the linked marker(s). Incorporation of marker data in larger pedigrees also yields greater information gains based on both criteria. The effect of pedigree structure is also studied.
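The two criteria named in this abstract correspond to standard likelihood quantities, written out below as a reminder. The notation is assumed here rather than taken from the paper: \theta denotes the genetic model parameters, \ell(\theta) the log-likelihood, and \theta_0 the true parameter value. The paper's exact definitions may differ in detail.

```latex
% Variance criterion: asymptotic variance as the inverse Fisher information.
\mathcal{I}(\theta) = -\,\mathbb{E}_{\theta}\!\left[
  \frac{\partial^{2}\ell(\theta)}{\partial\theta\,\partial\theta^{\top}}
\right],
\qquad
\operatorname{Var}(\hat{\theta}) \approx \mathcal{I}(\theta)^{-1}

% Relative entropy criterion: expected log-likelihood drop away from the truth.
D(\theta_{0}\,\|\,\theta) = \mathbb{E}_{\theta_{0}}\!\left[
  \ell(\theta_{0}) - \ell(\theta)
\right] \ge 0
```

Under this reading, a smaller inverse information matrix and a larger relative entropy away from \theta_0 (a less flat expected log-likelihood surface) both indicate that the data are more informative.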

15.
We study the use of simultaneous confidence bands for low-dose risk estimation with quantal response data, and derive methods for estimating simultaneous upper confidence limits on predicted extra risk under a multistage model. By inverting the upper bands on extra risk, we obtain simultaneous lower bounds on the benchmark dose (BMD). Monte Carlo evaluations explore characteristics of the simultaneous limits under this setting, and a suite of actual data sets is used to compare existing methods for placing lower limits on the BMD.
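The relationship between extra risk and the benchmark dose can be made concrete with a small sketch. It assumes a two-stage multistage model, P(d) = 1 - exp(-(b0 + b1*d + b2*d^2)) with non-negative coefficients, and computes only the point-estimate BMD by inverting the extra risk function. The simultaneous confidence bands that are the subject of the abstract are not implemented, and the model order, coefficient values, and function names are assumptions made for illustration.

```python
from math import exp, log, sqrt


def response_probability(dose, b0, b1, b2):
    """Two-stage multistage model: P(d) = 1 - exp(-(b0 + b1*d + b2*d^2))."""
    return 1.0 - exp(-(b0 + b1 * dose + b2 * dose ** 2))


def extra_risk(dose, b1, b2):
    """Extra risk (P(d) - P(0)) / (1 - P(0)); the background term b0 cancels."""
    return 1.0 - exp(-(b1 * dose + b2 * dose ** 2))


def benchmark_dose(bmr, b1, b2):
    """Dose at which the extra risk equals the benchmark response `bmr`.

    Solves b2*d^2 + b1*d = -log(1 - bmr) for the non-negative root.
    Requires b1, b2 >= 0 and not both zero.
    """
    target = -log(1.0 - bmr)
    if b2 == 0.0:
        return target / b1
    return (-b1 + sqrt(b1 * b1 + 4.0 * b2 * target)) / (2.0 * b2)


# Hypothetical coefficients, chosen only for illustration.
b0, b1, b2 = 0.2, 0.1, 0.05
bmd10 = benchmark_dose(0.10, b1, b2)
print(bmd10, extra_risk(bmd10, b1, b2))
```

A lower confidence limit on the BMD would be obtained the same way, by inverting an upper confidence band on the extra risk instead of the fitted curve.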

16.
F Fogolari  S Tessari  H Molinari 《Proteins》2002,46(2):161-170
One of the standard tools for the analysis of data arranged in matrix form is singular value decomposition (SVD). Few applications to genomic data have been reported to date, mainly for the analysis of gene expression microarray data. We review SVD properties, examine mathematical terms and assumptions implicit in the SVD formalism, and show that SVD can be applied to the analysis of matrices representing pairwise alignment scores between large sets of protein sequences. In particular, we illustrate SVD capabilities for data dimension reduction and for clustering protein sequences. A comparison is performed between SVD-generated clusters of proteins and annotation reported in the SWISS-PROT Database for a set of protein sequences forming the calycin superfamily, entailing all entries corresponding to the lipocalin, cytosolic fatty acid-binding protein, and avidin-streptavidin Prosite patterns.

17.
Assessing natural selection on a phenotypic trait in wild populations is of primary importance for evolutionary ecologists. To cope with the imperfect detection of individuals inherent to monitoring in the wild, we develop a nonparametric method for evaluating the form of natural selection on a quantitative trait using mark-recapture data. Our approach uses penalized splines to achieve flexibility in exploring the form of natural selection by avoiding the need to specify an a priori parametric function. If needed, it can help in suggesting a new parametric model. We employ Markov chain Monte Carlo sampling in a Bayesian framework to estimate model parameters. We illustrate our approach using data for a wild population of sociable weavers (Philetairus socius) to investigate survival in relation to body mass. In agreement with previous parametric analyses, we found that lighter individuals showed a reduction in survival. However, the survival function was not symmetric, indicating that body mass might not be under stabilizing selection as suggested previously.

18.
Pok G  Liu JC  Ryu KH 《Bioinformation》2010,4(8):385-389
The microarray technique has become a standard means of simultaneously examining the expression of all genes measured under different circumstances. As microarray data are typically characterized by high-dimensional features with a small number of samples, feature selection needs to be incorporated to identify a subset of genes that are meaningful for biological interpretation and account for the sample variation. In this article, we present a simple yet effective feature selection framework suitable for two-dimensional microarray data. Our correlation-based, nonparametric approach allows compact representation of class-specific properties with a small number of genes. We evaluated our method using publicly available experimental data and obtained favorable results.
