Similar articles
 20 similar articles found (search time: 15 ms)
1.
Gathering vast data sets of cancer genomes requires more efficient and autonomous procedures to classify cancer types and to discover a few essential genes to distinguish different cancers. Because protein expression is more stable than gene expression, we chose reverse phase protein array (RPPA) data, a powerful and robust antibody-based high-throughput approach for targeted proteomics, to perform our research. In this study, we proposed a computational framework to classify the patient samples into ten major cancer types based on the RPPA data using the SMO (Sequential Minimal Optimization) method. A careful feature selection procedure was employed to select 23 important proteins from the total of 187 proteins by mRMR (minimum Redundancy Maximum Relevance feature selection) and IFS (Incremental Feature Selection) on the training set. By using the 23 proteins, we successfully classified the ten cancer types with an MCC (Matthews Correlation Coefficient) of 0.904 on the training set, evaluated by 10-fold cross-validation, and an MCC of 0.936 on an independent test set. Further analysis of these 23 proteins was performed. Most of these proteins can present the hallmarks of cancer; Chk2, for example, plays an important role in the proliferation of cancer cells. Our analysis of these 23 proteins lends credence to the importance of these genes as indicators of cancer classification. We also believe our methods and findings may shed light on the discoveries of specific biomarkers of different types of cancers.
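
The abstract above reports its results as a multiclass MCC. As a rough illustration of how such a score can be computed for more than two classes, the following is a minimal sketch using the standard multiclass generalization of the Matthews correlation coefficient. The function name `multiclass_mcc` and the toy labels are assumptions made for illustration; they are not taken from the paper, and the paper's own evaluation code is not shown here.

```python
from collections import Counter
from math import sqrt


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient (hypothetical helper).

    Uses the count-based form:
        (c*s - sum_k t_k*p_k) / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2))
    where s is the number of samples, c the number of correct predictions,
    t_k the number of samples whose true class is k, and p_k the number of
    samples predicted as class k.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    s = len(y_true)
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    t_counts = Counter(y_true)
    p_counts = Counter(y_pred)
    labels = set(t_counts) | set(p_counts)

    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    var_true = s * s - sum(v * v for v in t_counts.values())
    var_pred = s * s - sum(v * v for v in p_counts.values())
    denominator = sqrt(var_true * var_pred)

    # By convention, return 0.0 when the score is undefined
    # (for example, when every sample is assigned to a single class).
    return numerator / denominator if denominator else 0.0


# Toy example with three made-up classes (not data from the study).
truth = ["a", "a", "b", "b", "c", "c"]
guess = ["a", "a", "b", "c", "c", "c"]
score = multiclass_mcc(truth, guess)
```

For two classes this expression reduces to the familiar binary MCC, so the same helper can be used to sanity-check binary results.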

2.
3.
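This paper analyzes why commonly used normalization methods cause misclassification in tumor gene microarrays and proposes a normalization method based on class means. The method normalizes gene expression profiles in both directions and interleaves normalization with clustering, using the clustering results to correct the reference expression level. Five tumor microarray data sets were selected, and hierarchical clustering and K-means clustering were applied at different variance levels to compare gene expression data processed with the commonly used normalization and with the class-mean-based normalization. The experimental results show that class-mean-based normalization effectively improves the quality of clustering results for tumor gene expression profiles.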

4.
Isobaric multiplexed quantitative proteomics can complement high-resolution sample isolation techniques. Here, we report a simple workflow, exponentially modified protein abundance index (emPAI)-MW deconvolution (EMMOL), for normalizing isobaric reporter ratios within and between experiments, where small or unknown amounts of protein are used. EMMOL deconvolutes the isobaric tags for relative and absolute quantification (iTRAQ) data to yield the quantity of each protein of each sample in the pool, a new approach that enables the comparison of many samples without including a channel of reference standard. Moreover, EMMOL allows using a sufficient quantity of control sample to facilitate the peptide fractionation (isoelectric focusing was used in this report) and mass spectrometry MS/MS sequencing, yet relies on the broad dynamic range of iTRAQ quantitation to compare relative protein abundance. We demonstrated EMMOL by comparing four pooled samples with 20-fold range differences in protein abundance and performed data normalization without using prior knowledge of the amounts of proteins in each sample, simulating an iTRAQ experiment without protein quantitation prior to labeling. We used emPAI, the target protein MW, and the iTRAQ reporter ratios to calculate the amount of each protein in each of the four channels. Importantly, the EMMOL-delineated proteomes from separate iTRAQ experiments can be assorted for comparison without using a reference sample. We observed no compression of expression in iTRAQ ratios over a 20-fold range for all protein abundances. To complement this ability to analyze minute samples, we report an optimized iTRAQ labeling protocol for using 5 μg protein as the starting material.
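
The abstract above combines emPAI, protein molecular weight, and iTRAQ reporter ratios. The sketch below illustrates the general idea under stated assumptions: it uses the usual emPAI definition (10 raised to the ratio of observed to observable peptides, minus 1) and the usual emPAI-times-MW weight fraction, then splits each protein's pooled amount across channels in proportion to reporter intensities. The splitting step, the function names, and the dictionary layout are assumptions for illustration only; the abstract does not give the exact EMMOL formulas.

```python
def empai(n_observed, n_observable):
    """emPAI = 10 ** (observed peptides / observable peptides) - 1."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1


def pooled_weight_fractions(proteins):
    """Weight fraction of each protein in the pooled sample.

    `proteins` maps a protein name to a dict with the hypothetical keys
    'observed', 'observable' (peptide counts) and 'mw' (molecular weight).
    Weight fraction = emPAI * MW / sum(emPAI * MW).
    """
    weighted = {
        name: empai(p["observed"], p["observable"]) * p["mw"]
        for name, p in proteins.items()
    }
    total = sum(weighted.values())
    return {name: w / total for name, w in weighted.items()}


def split_by_reporters(pool_amount, reporter_intensities):
    """Assumed deconvolution step: divide a protein's pooled amount across
    channels in proportion to its reporter-ion intensities."""
    total = sum(reporter_intensities)
    return [pool_amount * r / total for r in reporter_intensities]


# Toy input (made-up numbers, not from the study).
example = {
    "protA": {"observed": 4, "observable": 8, "mw": 50_000},
    "protB": {"observed": 2, "observable": 10, "mw": 20_000},
}
fractions = pooled_weight_fractions(example)
channels = split_by_reporters(10.0, [1, 1, 2, 4])
```

Because each protein's amount is expressed as an absolute share of its own pool, results from separate experiments could in principle be compared without a shared reference channel, which is the property the abstract emphasizes.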

5.
6.
Real-time functional magnetic resonance imaging (rtfMRI) is a recently emerged technique that demands fast data processing within a single repetition time (TR), such as a TR of 2 seconds. Data preprocessing in rtfMRI has rarely involved spatial normalization, which cannot be accomplished in such a short time period. However, spatial normalization may be critical for accurate functional localization in a stereotactic space and is an essential procedure for some emerging applications of rtfMRI. In this study, we introduced an online spatial normalization method that adopts a novel affine registration (AFR) procedure based on principal axes registration (PA) and Gauss-Newton optimization (GN) using the self-adaptive β parameter, termed PA-GN(β) AFR, and nonlinear registration (NLR) based on the discrete cosine transform (DCT). In AFR, PA provides an appropriate initial estimate for GN to induce its rapid convergence. In addition, the β parameter, which relies on the rate of change of the cost function, is employed to self-adaptively adjust the iteration step of GN. The accuracy and performance of PA-GN(β) AFR were confirmed using both simulation and real data and compared with the traditional AFR. The appropriate cutoff frequency of the DCT basis function in NLR was determined to balance the accuracy and calculation load of the online spatial normalization. Finally, the validity of the online spatial normalization method was further demonstrated by brain activation in the rtfMRI data.

7.
Liquid chromatography mass spectrometry (LC-MS) has become one of the analytical platforms of choice for metabolomics studies. However, LC-MS metabolomics data can suffer from the effects of various systematic biases. These include batch effects, day-to-day variations in instrument performance, signal intensity loss due to time-dependent effects of the LC column performance, accumulation of contaminants in the MS ion source, and MS sensitivity, among others. In this study we aimed to test a singular value decomposition-based method, called EigenMS, for normalization of metabolomics data. We analyzed a clinical human dataset where LC-MS serum metabolomics data and physiological measurements were collected from thirty-nine healthy subjects and forty with type 2 diabetes, and applied EigenMS to detect and correct for any systematic bias. EigenMS works in several stages. First, EigenMS preserves the treatment group differences in the metabolomics data by estimating treatment effects with an ANOVA model (multiple fixed effects can be estimated). Singular value decomposition of the residuals matrix is then used to determine bias trends in the data. The number of bias trends is then estimated via a permutation test and the effects of the bias trends are eliminated. EigenMS removed bias of unknown complexity from the LC-MS metabolomics data, allowing for increased sensitivity in differential analysis. Moreover, normalized samples better correlated with both other normalized samples and corresponding physiological data, such as blood glucose level, glycated haemoglobin, exercise central augmentation pressure normalized to heart rate of 75, and total cholesterol. We were able to report 2578 discriminatory metabolite peaks in the normalized data (p<0.05) as compared to only 1840 metabolite signals in the raw data. Our results support the use of singular value decomposition-based normalization for metabolomics data.

8.
9.
DNA methylation is the most widely studied epigenetic mark and is known to be essential to normal development and frequently disrupted in disease. The Illumina HumanMethylation450 BeadChip assays the methylation status of CpGs at 485,577 sites across the genome. Here we present Subset-quantile Within Array Normalization (SWAN), a new method that substantially improves the results from this platform by reducing technical variation within and between arrays. SWAN is available in the minfi Bioconductor package.

10.
Proteins and small molecules are the effectors of physiological action in biological systems, and comprehensive methods are needed to analyze their modifications, expression levels and interactions. Systems-scale characterization of the proteome requires thousands of components in high-complexity samples to be isolated and simultaneously probed. While protein microarrays offer a promising approach to probe systems-scale changes in a high-throughput format, they are limited by the need to individually synthesize tens of thousands of proteins. We present an alternative technique, which we call diffusive gel (DiG) stamping, for patterning a microarray using a cellular lysate, enabling rapid visualization of dynamic changes in the proteome as well as protein interactions. A major advantage of the method described is that it requires no specialized equipment or in vitro protein synthesis, making it widely accessible to researchers. The method can be integrated with mass spectrometry, allowing for the discovery of novel protein interactions. Here, we describe and characterize the sensitivity and physical features of DiG-Stamping. We demonstrate the biologic utility of DiG-Stamping by (1) identifying the binding partners of a target protein within a cellular lysate and by (2) visualizing the dynamics of proteins with multiple post-translational modifications.

11.
The purpose of the experiments was to analyse the spatial cueing effects of the movements of soccer players executing normal and deceptive (step-over) turns with the ball. Stimuli comprised normal resolution or point-light video clips of soccer players dribbling a football towards the observer then turning right or left with the ball. Clips were curtailed before or on the turn (−160, −80, 0 or +80 ms) to examine the time course of direction prediction and spatial cueing effects. Participants were divided into higher-skilled (HS) and lower-skilled (LS) groups according to soccer experience. In experiment 1, accuracy on full video clips was higher than on point-light but results followed the same overall pattern. Both HS and LS groups correctly identified direction on normal moves at all occlusion levels. For deceptive moves, LS participants were significantly worse than chance and HS participants were somewhat more accurate but nevertheless substantially impaired. In experiment 2, point-light clips were used to cue a lateral target. HS and LS groups showed faster reaction times to targets that were congruent with the direction of normal turns, and to targets incongruent with the direction of deceptive turns. The reversed cueing by deceptive moves coincided with earlier kinematic events than cueing by normal moves. It is concluded that the body kinematics of soccer players generate spatial cueing effects when viewed from an opponent's perspective. This could create a reaction time advantage when anticipating the direction of a normal move. A deceptive move is designed to turn this cueing advantage into a disadvantage. Acting on the basis of advance information, the presence of deceptive moves primes responses in the wrong direction, which may be only partly mitigated by delaying a response until veridical cues emerge.

12.
A conserved family of herpesvirus protein kinases plays a crucial role in herpesvirus DNA replication and virion production. However, despite the fact that these kinases are potential therapeutic targets, no systematic studies have been performed to identify their substrates. We generated an Epstein-Barr virus (EBV) protein array to evaluate the targets of the EBV protein kinase BGLF4. Multiple proteins involved in EBV lytic DNA replication and virion assembly were identified as previously unrecognized substrates for BGLF4, illustrating the broad role played by this protein kinase. Approximately half of the BGLF4 targets were also in vitro substrates for the cellular kinase CDK1/cyclin B. Unexpectedly, EBNA1 was identified as a substrate and binding partner of BGLF4. EBNA1 is essential for replication and maintenance of the episomal EBV genome during latency. BGLF4 did not prevent EBNA1 binding to sites in the EBV latency origin of replication, oriP. Rather, we found that BGLF4 was recruited by EBNA1 to oriP in cells transfected with an oriP vector and BGLF4 and in lytically induced EBV-positive Akata cells. In cells transfected with an oriP vector, the presence of BGLF4 led to more rapid loss of the episomal DNA, and this was dependent on BGLF4 kinase activity. Similarly, expression of doxycycline-inducible BGLF4 in Akata cells led to a reduction in episomal EBV genomes. We propose that BGLF4 contributes to effective EBV lytic cycle progression, not only through phosphorylation of EBV lytic DNA replication and virion proteins, but also by interfering with the EBNA1 replication function.

Herpesviruses encode two families of serine/threonine protein kinases, one of which, the BGLF4 (Epstein-Barr virus [EBV])/UL97 (human cytomegalovirus)/UL13 (herpes simplex virus)/ORF36 (Kaposi's sarcoma-associated herpesvirus)/ORF47 (varicella-zoster virus) family, is the sole protein kinase encoded by beta and gamma herpesviruses. The protein kinases phosphorylate both viral and host proteins (16, 21, 42) and are necessary for efficient virus lytic replication. Consequently, these kinases have been of interest as potential targets for antiviral drug development (37), and the compound 1263W94 (maribavir), which inhibits the cytomegalovirus UL97 protein (3), has been used in phase I clinical trials (27, 31, 47).

EBV infection is prevalent worldwide, and primary infection in adolescence or early adulthood is associated in 30 to 40% of cases with infectious mononucleosis. EBV efficiently infects B cells in the lymphoid tissues of the Waldeyer ring (43). EBV infection of B cells is biased toward establishment of latency with limited viral-gene expression (49). During latent infection, EBV genomes are maintained as extrachromosomal episomes. Replication of episomal genomes utilizes the latency origin of replication, oriP. The only EBV-encoded protein required is the origin binding protein EBNA1. All other essential replication factors are provided by the cell. Expression of the EBV replicative cycle and production of progeny virus take place in terminally differentiated plasma B cells (11, 29), and epithelial cells may also contribute to the cycle of virus replication and spread that is an important component of both persistent infection of the individual and transmission of virus from one individual to the next (4, 22). Lytic DNA replication initiates at separate origins, oriLyt. EBV encodes a set of six core lytic replication proteins, along with ancillary proteins, such as thymidine kinase (TK), that are involved in nucleotide metabolism (13, 44).

Several substrates have been described for the EBV BGLF4 protein kinase, including the core lytic EBV replication protein BMRF1, the polymerase processivity factor (8, 17). BGLF4 has also been found to locate to sites of lytic viral replication (46), to be required for efficient lytic DNA replication and release of nucleocapsids from the nucleus (18), and to contribute to the compaction of cell chromatin seen in cells undergoing lytic replication (32). Protein chip technology provides a new tool for global analysis of activities for biologically important enzymes, such as ubiquitin ligases, DNA repair enzymes, and kinases (7, 19, 36, 38, 52). Using an EBV protein array for unbiased screening, we identified multiple new BGLF4 substrates involved in lytic DNA replication, capsid assembly, and DNA packaging. Unexpectedly, we also identified EBNA1 as a substrate and binding partner for BGLF4. The data suggest that the contribution of BGLF4 to the EBV lytic cycle extends beyond the previously recognized contributions to lytic DNA replication and virion production and includes facilitating the switch from latent to lytic DNA replication by downregulating the EBNA1 replication function.

13.
14.
15.
The Protein Data Bank   Total citations: 163, self-citations: 20, citations by others: 163
The Protein Data Bank (PDB; http://www.rcsb.org/pdb/ ) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.

16.
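Application of suspension arrays in nucleic acid and protein detection   Total citations: 4, self-citations: 0, citations by others: 4
The suspension array is a detection technology that has emerged in recent years. Unlike solid-phase gene chips, it integrates advanced techniques from polymer chemistry, molecular biology, immunology, laser detection, microfluidics, high-speed digital signal processing, and computer analysis, and can perform high-throughput qualitative and quantitative detection on small sample volumes. This review summarizes the basic principles of suspension array technology and outlines its applications in nucleic acid and protein detection. In these applications the technology offers notable advantages, including high throughput, simple operation, good reproducibility, high sensitivity, and a wide linear range. It can be widely applied in scientific research and is expected to spread gradually to clinical diagnostic laboratories, giving it broad application prospects.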

17.
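Objective: To investigate new and effective preprocessing methods for microarray gene expression data, based on Alzheimer's disease microarray gene expression data. Methods: Standard-deviation filtering, FSC (feature score criterion), and WPT-SAM (wavelet packet transform combined with significance analysis of microarrays) were first used to preprocess the microarray gene expression data, and the number of genes obtained and the FDR values after processing were compared. Classification and clustering methods were then applied to the processed data for classification clustering and hierarchical decision clustering, and the results were compared. Results: Standard-deviation filtering and FSC yielded more initially screened genes than WPT-SAM, but their FDR values were also higher and their subsequent classification and clustering results were poorer than those of WPT-SAM. Conclusion: WPT-SAM is a relatively flexible and well-suited method for preprocessing microarray gene expression data.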

18.
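Xia Yao (夏遥), Kong Wei (孔薇). 《生物磁学》 (Biomagnetism), 2011, (Z1): 4742-4747
Objective: To investigate new and effective preprocessing methods for microarray gene expression data, based on Alzheimer's disease microarray gene expression data. Methods: Standard-deviation filtering, FSC (feature score criterion), and WPT-SAM (wavelet packet transform combined with significance analysis of microarrays) were first used to preprocess the microarray gene expression data, and the number of genes obtained and the FDR values after processing were compared. Classification and clustering methods were then applied to the processed data for classification clustering and hierarchical decision clustering, and the results were compared. Results: Standard-deviation filtering and FSC yielded more initially screened genes than WPT-SAM, but their FDR values were also higher and their subsequent classification and clustering results were poorer than those of WPT-SAM. Conclusion: WPT-SAM is a relatively flexible and well-suited method for preprocessing microarray gene expression data.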

19.
20.
Count data from spatially contiguous regions offer a challenge to the statistician both from the data-analytic and the statistical modeling point of view. Important applications include epidemiological studies (e.g., cancer mortality over the counties of the USA) and Census surveys (e.g., undercount over the Census blocks of an urban area). It has long been recognized by time-series analysts that data close together in time usually exhibit higher dependence than those far apart. Time-series data analysis relies on methods of data transformation, detrending, and autocorrelation plotting. It is our intention in this article to generalize this approach to a spatial setting. To do this we consider a small spatial data set of 100 observations. Through the use of a square-root transformation, a weighted median polish and a variogram analysis of the median-polish residuals, we represent the transformed data as a trend plus stationary error. Thus we show how standard data-analytic techniques can be modified both to mitigate and to exploit the spatial relationships.
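
The decomposition described above (square-root transformation, then median polish into trend plus residual) can be sketched as follows. This is a simplified illustration and not the authors' procedure: it uses a plain, unweighted median polish on a rectangular grid rather than the weighted median polish named in the abstract, it omits the variogram step, and the function name `median_polish` and the toy counts are assumptions.

```python
from math import sqrt
from statistics import median


def median_polish(grid, n_iter=10):
    """Unweighted median polish of a rectangular grid (list of rows).

    Returns (overall, row_effects, col_effects, residuals) such that
    grid[i][j] == overall + row_effects[i] + col_effects[j] + residuals[i][j].
    """
    resid = [list(row) for row in grid]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols

    for _ in range(n_iter):
        # Sweep row medians out of the residuals into the row effects.
        for i in range(n_rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]

        # Sweep column medians out of the residuals into the column effects.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]

    return overall, row_eff, col_eff, resid


# Toy counts on a 3 x 3 grid (made-up numbers, not the 100 observations
# analysed in the article). The square root is applied before polishing.
counts = [[1, 4, 9], [4, 9, 16], [16, 25, 36]]
transformed = [[sqrt(c) for c in row] for row in counts]
overall, row_eff, col_eff, resid = median_polish(transformed)
```

The fitted trend at cell (i, j) is `overall + row_eff[i] + col_eff[j]`; the residuals are what a variogram analysis would then examine for remaining spatial dependence.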
