Genome-wide association studies (GWASs) have recently revealed many genetic associations that are shared between different diseases. We propose a method, disPCA, for genome-wide characterization of shared and distinct risk factors between and within disease classes. It flips the conventional GWAS paradigm by analyzing the diseases themselves, across GWAS datasets, to explore their “shared pathogenetics”. The method applies principal component analysis (PCA) to gene-level significance scores across all genes and across GWASs, thereby revealing shared pathogenetics between diseases in an unsupervised fashion. Importantly, it adjusts for potential sources of heterogeneity between GWASs that can confound the investigation of shared disease etiology. We applied disPCA to 31 GWASs, including autoimmune diseases, cancers, psychiatric disorders, and neurological disorders. The leading principal components separate these disease classes, and also separate inflammatory bowel diseases from other autoimmune diseases. Generally, distinct diseases from the same class tend to be less separated, which is in line with their greater shared etiology. Enrichment analysis of genes contributing to the leading principal components revealed pathways that are implicated in the immune system, while also pointing to pathways that have not yet been explored in this context. Our results point to the potential of disPCA to go beyond epidemiological findings of the co-occurrence of distinct diseases and to highlight novel genes and pathways that unsupervised learning suggests are key players in the variability across diseases.
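
The core step described in this abstract, PCA over a matrix of gene-level scores with one row per GWAS, can be sketched roughly as follows. This is a minimal illustration and not the disPCA implementation: the toy scores, the GWAS names, and the helper functions are all hypothetical, the heterogeneity adjustment mentioned in the abstract is omitted, and the leading component is found by plain power iteration.

```python
import math

def center_columns(matrix):
    """Subtract each gene's mean score so PCA captures variation across GWASs."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    return [[row[j] - means[j] for j in range(n_cols)] for row in matrix]

def leading_component(centered, iterations=200):
    """Power iteration on X^T X; returns a unit-length vector of gene loadings."""
    n_cols = len(centered[0])
    v = [float(j + 1) for j in range(n_cols)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        projections = [sum(x * w for x, w in zip(row, v)) for row in centered]
        v = [sum(p * row[j] for p, row in zip(projections, centered))
             for j in range(n_cols)]
        norm = math.sqrt(sum(w * w for w in v))
        if norm == 0.0:
            raise ValueError("data has no variance")
        v = [w / norm for w in v]
    return v

def project(centered, component):
    """Score of each GWAS (row) on the given component."""
    return [sum(x * w for x, w in zip(row, component)) for row in centered]

# Hypothetical toy scores: rows are GWASs, columns are genes.
gwas_names = ["autoimmune_A", "autoimmune_B", "psychiatric_A", "psychiatric_B"]
gene_scores = [
    [5.0, 4.0, 0.5],
    [4.0, 5.0, 0.6],
    [0.5, 1.0, 0.4],
    [1.0, 0.5, 0.5],
]

centered = center_columns(gene_scores)
pc1 = leading_component(centered)
pc1_scores = project(centered, pc1)
for name, score in zip(gwas_names, pc1_scores):
    print(name, round(score, 3))
```

In this toy matrix the two made-up disease classes land on opposite sides of the first component. The sign of a principal component is arbitrary, so only the relative positions of the rows are meaningful.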

Understanding the role of genetic variation in human diseases remains an important open problem in genomics. An important component of such variation consists of variations at single sites in DNA, or single nucleotide polymorphisms (SNPs). Typically, the problem of associating particular SNPs with phenotypes has been confounded by hidden factors such as population structure, family structure, or cryptic relatedness in the sample of individuals being analyzed. Such confounding factors lead to a large number of spurious associations and missed associations. Various statistical methods have been proposed to account for such confounding factors, such as linear mixed-effect models (LMMs) or methods that adjust data based on a principal components analysis (PCA), but these methods either suffer from low power or cease to be tractable for larger numbers of individuals in the sample. Here we present a statistical model for conducting genome-wide association studies (GWAS) that accounts for such confounding factors. The runtime of our method scales quadratically in the number of individuals being studied, with only a modest loss in statistical power compared to LMM-based and PCA-based methods when tested on synthetic data generated from a generalized LMM. Applying our method to both real and synthetic human genotype/phenotype data, we demonstrate the ability of our model to correct for confounding factors while requiring significantly less runtime than LMMs. We have implemented methods for fitting these models, which are available at http://www.microsoft.com/science.

In certain types of ecological investigations it may be desirable to investigate infraspecific variation in bacteria. Principal component analysis is demonstrated to be satisfactory for this purpose. Hypothetical bacterial populations were used to show that such analysis can be used to compare collections of bacterial isolates taken at different times or from different sources. Alternatively, given n isolates, whether they represent a single bacterial population can be determined. The method is applied to authentic collections of bacteria in three separate analyses. The results are compatible with current taxonomic tenets.

Object Oriented Data Analysis is a new area in statistics that studies populations of general data objects. In this article we consider populations of tree-structured objects as our focus of interest. We develop improved analysis tools for data lying in a binary tree space analogous to classical Principal Component Analysis methods in Euclidean space. Our extensions of PCA are analogs of one dimensional subspaces that best fit the data. Previous work was based on the notion of tree-lines.

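Objective: This paper proposes a dual-contrast optical projection tomography (DC-OPT) method based on principal component analysis (PCA) to obtain three-dimensional visualization of blood flow networks and skeletons in vivo. Methods: PCA was used to extract absorption images and blood flow images. The first principal component of the raw image sequence was used to obtain the absorption image, and the flow image was obtained by calculating the modulation depth of each pixel. Flow-contrast and absorption-contrast images from different projection positions were used for simultaneous reconstruction of the three-dimensional blood flow network and skeleton. Results: By combining PCA with OPT and separating the dynamic blood flow signal from the static background signal, three-dimensional imaging of the blood flow network and skeleton of small biological samples was achieved. Conclusion: The novelty of this study lies in obtaining fast, simultaneous, dual-contrast three-dimensional images of the blood flow network and skeleton with a single optical system. The experimental results can be used to study the physiological development of living organisms.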



The impact of scientific publications has traditionally been expressed in terms of citation counts. However, scientific activity has moved online over the past decade. To better capture scientific impact in the digital era, a variety of new impact measures has been proposed on the basis of social network analysis and usage log data. Here we investigate how these new measures relate to each other, and how accurately and completely they express scientific impact.


We performed a principal component analysis of the rankings produced by 39 existing and proposed measures of scholarly impact that were calculated on the basis of both citation and usage log data.


Our results indicate that the notion of scientific impact is a multi-dimensional construct that cannot be adequately measured by any single indicator, although some measures are more suitable than others. The commonly used citation Impact Factor is not positioned at the core of this construct, but at its periphery, and should thus be used with caution.

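To comprehensively evaluate the nutritional value of the cauliflower mushroom (绣球菌, genus Sparassis), principal component analysis was performed on six protein-quality indices of its mycelium and fruiting body: chemical score (CS), amino acid score (AAS), essential amino acid index (EAAI), biological value (BV), nutritional index (NI), and score of ratio coefficient of amino acid (AARCS). The results showed that the six original indices could be summarized by two principal components with a cumulative contribution rate of 99.543%. The first principal component was determined mainly by EAAI and BV, and the second by CS and AAS. According to the principal component scores and composite scores of the test materials, the mycelium grown on the corn flour formulation had a higher nutritional value than the fruiting body.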
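
The cumulative contribution rate and composite score reported in this abstract are typically derived from the eigenvalues of the correlation matrix of the indices. The sketch below shows one common way to compute them. The eigenvalues and sample scores are hypothetical placeholders rather than values from the study, and weighting each component score by its contribution rate is an assumed convention that the abstract does not spell out.

```python
def contribution_rates(eigenvalues):
    """Share of total variance explained by each principal component."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]

def components_needed(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative rate reaches threshold.

    Expects eigenvalues sorted from largest to smallest.
    """
    cumulative = 0.0
    for count, rate in enumerate(contribution_rates(eigenvalues), start=1):
        cumulative += rate
        if cumulative >= threshold:
            return count
    return len(eigenvalues)

def composite_score(component_scores, eigenvalues):
    """Weight each retained component score by its contribution rate (assumed weighting)."""
    rates = contribution_rates(eigenvalues)
    return sum(rate * score for rate, score in zip(rates, component_scores))

# Hypothetical eigenvalues for six standardized indices (they sum to about 6).
eigenvalues = [4.2, 1.5, 0.2, 0.06, 0.03, 0.01]
kept = components_needed(eigenvalues, threshold=0.90)

# Hypothetical scores of one sample on PC1 and PC2.
sample_scores = [1.0, -0.5]
composite = composite_score(sample_scores, eigenvalues)
print(kept, round(composite, 3))
```

Ranking materials by such a composite score is only as informative as the retained components, which is why the cumulative contribution rate is usually reported alongside it.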

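Numerical taxonomy methods were applied to 35 characters of 13 species and 2 varieties of Lycoris native to China, using Q-type and R-type cluster analysis and principal component analysis, to explore interspecific relationships among Chinese Lycoris and to evaluate the taxonomic characters. Q-type clustering divided the taxa into 2 major groups and 8 subgroups. Lycoris anhuiensis was very closely related to L. longituba, and treating it as a variety of L. longituba is considered more appropriate. The results also support a hybrid origin for L. rosea, L. haywardii, L. albiflora, L. houdyshelii, and L. straminea. R-type clustering divided the characters into 7 groups. Principal component analysis summarized the 35 characters into 5 principal components with a cumulative contribution rate of 82.55%. Based on the correlations between these 5 components and the characters, 16 influential characters were selected. Among them, bulb shape, tepal width, and the ratio of stamen length to tepal length were the most important and can serve as the basis for delimiting major groups, whereas the presence or absence of stripes on the tepals, filament color, flower color, scape thickness, the color of the tip and margin of young leaves, and the presence or absence of seeds can serve as important criteria for delimiting species.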

Summary: This note is in response to Wouters et al. (2003, Biometrics 59, 1131–1139), who compared three methods for exploring gene expression data. Contrary to their summary that principal component analysis is not very informative, we show that it is possible to determine principal component analyses that are useful for exploratory analysis of microarray data. We also present another biplot representation, the GE‐biplot (Gene Expression biplot), that is a useful method for exploring gene expression data with the major advantage of being able to aid interpretation of both the samples and the genes relative to each other.

Computed tomography (CT) has revolutionized diagnostic radiology but involves large radiation doses that directly impact image quality. In this paper, we propose an adaptive tensor-based principal component analysis (AT-PCA) algorithm for low-dose CT image denoising. Each pixel in the image is represented by its nearby neighbors and modeled as a patch. Adaptive search windows are calculated to find similar patches, which serve as training groups for further processing. Tensor-based PCA is used to obtain transformation matrices, and the coefficients are sequentially shrunk using the linear minimum mean square error (LMMSE) criterion. Reconstructed patches are obtained, and the denoised image is finally produced by aggregating all of these patches. Experimental results on a standard test image show that, according to six quantitative measures, the best results are obtained with two denoising rounds. In the experiment on clinical images, the proposed AT-PCA method suppresses noise, enhances edges, and improves image quality more effectively than the NLM and KSVD denoising methods.
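
The shrinkage step mentioned in this abstract can be illustrated with the standard LMMSE weight for zero-mean coefficients in additive noise. This sketch covers that single step only, under stated assumptions: the noise variance is taken as known, the coefficients along one transform axis are treated as zero-mean, and the function name and toy values are hypothetical. It does not reproduce the tensor-based PCA, the adaptive patch search, or the aggregation of the AT-PCA method.

```python
def lmmse_shrink(coefficients, noise_variance):
    """Shrink zero-mean transform coefficients toward zero with an LMMSE weight.

    The signal variance is estimated as the mean coefficient energy minus the
    noise variance, clipped at zero, and every coefficient is scaled by
    signal_variance / (signal_variance + noise_variance).
    """
    energy = sum(c * c for c in coefficients) / len(coefficients)
    signal_variance = max(energy - noise_variance, 0.0)
    denominator = signal_variance + noise_variance
    if denominator == 0.0:
        return [0.0 for _ in coefficients]
    weight = signal_variance / denominator
    return [weight * c for c in coefficients]

# Hypothetical coefficients of similar patches along one transform axis.
strong = lmmse_shrink([2.0, -2.0, 2.0, -2.0], noise_variance=1.0)
weak = lmmse_shrink([0.5, -0.5, 0.5, -0.5], noise_variance=1.0)
print(strong)
print(weak)
```

Axes whose energy is well above the noise level are kept almost intact, while axes dominated by noise are suppressed entirely, which is the behavior that makes this kind of shrinkage useful for denoising.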

Use of Principal Component Analysis in Selecting Monitoring Plants for Fluoride. Ma Enlin (Capital Normal University, Beijing 100053) ...

Although circadian and sleep research has made extraordinary progress in recent years, one remaining challenge is the objective quantification of sleepiness in individuals suffering from sleep deprivation, sleep restriction, and excessive somnolence. The major goal of the present study was to apply principal component analysis to the wake electroencephalographic (EEG) spectrum in order to establish an objective measure of sleepiness. The analysis was guided by the hypothesis that, in sleep-deprived individuals, the time course of self-rated sleepiness correlates with the time course of the score on the 2nd principal component of the EEG spectrum. The resting EEG of 15 young subjects was recorded at 2-h intervals for 32–50 h. Principal component analysis was performed on sets of 16 single-Hz log-transformed EEG powers (1–16 Hz frequency range). The time course of self-perceived sleepiness correlated strongly with the time course of the 2nd principal component score, irrespective of derivation (frontal or occipital) and of the analyzed section of the 7-min EEG record (the 2-min section with eyes open or any of the five 1-min sections with eyes closed). This result indicates the possibility of deriving an objective index of physiological sleepiness by applying principal component analysis to the wake EEG spectrum.
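
The pipeline outlined in this abstract (log-transform the spectral powers, run PCA, then correlate the 2nd component's score with self-rated sleepiness) can be sketched as below. Everything in the example is hypothetical: the toy data use 4 frequency bins instead of 16, the sleepiness-related pattern is planted in the data by construction, and the components are obtained by power iteration with deflation rather than by whatever software the authors used. It illustrates the computation only and does not replicate the finding.

```python
import math

def log_center(power_rows):
    """Log-transform spectral powers, then subtract each frequency bin's mean."""
    logged = [[math.log(p) for p in row] for row in power_rows]
    n_rows = len(logged)
    n_cols = len(logged[0])
    means = [sum(row[j] for row in logged) / n_rows for j in range(n_cols)]
    return [[row[j] - means[j] for j in range(n_cols)] for row in logged]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_components(centered, k, iterations=300):
    """First k principal axes via power iteration with deflation."""
    data = [row[:] for row in centered]
    n_cols = len(data[0])
    axes = []
    for _component in range(k):
        v = [float(j + 1) for j in range(n_cols)]  # arbitrary start vector
        for _step in range(iterations):
            proj = [dot(row, v) for row in data]
            v = [sum(p * row[j] for p, row in zip(proj, data)) for j in range(n_cols)]
            norm = math.sqrt(dot(v, v))
            if norm == 0.0:
                raise ValueError("fewer than k components carry variance")
            v = [w / norm for w in v]
        axes.append(v)
        # Deflate: remove the variance along the axis just found.
        proj = [dot(row, v) for row in data]
        data = [[x - p * w for x, w in zip(row, v)] for row, p in zip(data, proj)]
    return axes

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy data: 6 recordings x 4 frequency bins.
sleepiness = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]    # self-rated, one value per recording
overall = [4.0, -2.0, -2.0, -2.0, -2.0, 4.0]   # broadband shifts unrelated to sleepiness
powers = [
    [math.exp(a + 0.5 * s * sign) for sign in (1, 1, -1, -1)]
    for a, s in zip(overall, sleepiness)
]

centered = log_center(powers)
axes = top_components(centered, k=2)
pc2_scores = [dot(row, axes[1]) for row in centered]
r = pearson(pc2_scores, sleepiness)
print(round(abs(r), 3))
```

The sign of a principal component is arbitrary, so the magnitude of the correlation is what matters here, not whether it comes out positive or negative.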

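Principal component analysis applied to the classification of American ginseng samples   Total citations: 8, self-citations: 0, citations by others: 8
A classification method for American ginseng medicinal material was established. The contents of 15 inorganic elements in 12 American ginseng samples were determined by inductively coupled plasma mass spectrometry (ICP-MS), the contents of 7 ginsenosides in the same samples by high-performance liquid chromatography (HPLC), and the polysaccharide content by the anthrone-sulfuric acid method. Principal component analysis (PCA) was then applied to the 23 measured variables to classify the samples. The 12 samples were classified reasonably. The ginsenoside contents were the first key factor determining the classification, and the contents of the elements Mn, Cu, As, Ni, and Mo together with the polysaccharide content were the second key factor. PCA is an effective method for analyzing and classifying American ginseng.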

Fluorescence Lifetime Imaging (FLIM) is an attractive microscopy method in the life sciences, yielding information on the sample otherwise unavailable through intensity‐based techniques. A novel Noise‐Corrected Principal Component Analysis (NC‐PCA) method for time‐domain FLIM data is presented here. The presence and distribution of distinct microenvironments are identified at lower photon counts than previously reported, without requiring prior knowledge of their number or of the dye's decay kinetics. A noise correction based on the Poisson statistics inherent to Time‐Correlated Single Photon Counting is incorporated. The approach is validated using simulated data, and further applied to experimental FLIM data of HeLa cells stained with membrane dye di‐4‐ANEPPDHQ. Two distinct lipid phases were resolved in the cell membranes, and the modification of the order parameters of the plasma membrane during cholesterol depletion was also detected.

Noise‐corrected Principal Component Analysis of FLIM data resolves distinct microenvironments in cell membranes of live HeLa cells.

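Estimating methane (CH4) emissions from paddy fields is an important part of research on paddy methane emissions. Methane emission fluxes of different rice varieties were observed in paddy fields on red-yellow soil in southern China, and quantitative characteristics of internode tissue were measured for 16 early rice and 20 late rice varieties. Principal component analysis was performed on related factors, namely plant height, stem length, stem vascular bundle area/stem wall cross-sectional area, stem wall cross-sectional area/internode cross-sectional area, leaf sheath cross-sectional area/internode cross-sectional area, air cavity area/stem wall cross-sectional area, and total vascular bundle area/stem wall cross-sectional area, and a plant-based model for estimating CH4 emissions was established. The correlation coefficients of the estimation models for early and late rice were 0.827 and 0.853, respectively. A comprehensive evaluation function was also constructed to obtain a composite CH4 emission score for each rice variety, and these scores agreed well with the measured results. Simulated values from the estimation model were compared with measured values, and the relative errors were small, indicating that the model is valid and feasible. The model provides a reference for estimating CH4 emissions from rice and an empirical reference for evaluating the level of CH4 emissions of rice varieties.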
