Similar literature
1.
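When differentially expressed genes associated with breast cancer metastasis, and their functions, are identified from gene expression profiles, the differentially expressed genes reported by different studies show low reproducibility, because gene expression varies considerably between individuals while sample sizes are relatively small. Based on two breast cancer metastasis gene expression datasets, this paper evaluates the reproducibility of the two sets of differentially expressed genes and of the functions in which they are enriched. The results show that the differentially expressed genes identified in the two datasets are highly consistent in their direction of expression change and show significant expression correlation. The metastasis-related functions identified from the two gene sets are highly reproducible across the two datasets and mainly involve cell division, cell cycle, DNA replication, chromosome segregation, phosphoinositide-mediated signaling, and response to DNA damage stimulus.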

2.
Breast cancer has various molecular subtypes and displays high heterogeneity. Aberrant DNA methylation is involved in tumor origin, development and progression. Moreover, distinct DNA methylation patterns are associated with specific breast cancer subtypes. We explored DNA methylation patterns in association with gene expression to assess their impact on the prognosis of breast cancer based on Infinium 450K arrays (training set) from The Cancer Genome Atlas (TCGA). The DNA methylation patterns of 12 featured genes that had a high correlation with gene expression were identified through univariate and multivariable Cox proportional hazards models and used to define the methylation risk score (MRS). The DNA methylation pattern of the 12 featured genes showed greater discriminative power (p = 0.00103) than the average methylation levels (p = 0.956) or gene expression (p = 0.909). Furthermore, the MRS provided good prognostic value for breast cancers even when the patients had the same receptor status. We found that ER-, PR- or Her2- samples with a high MRS had the worst 5-year survival rate and overall survival time. An independent test set including 28 patients with death as an outcome was used to test the validity of the MRS of the 12 featured genes; this analysis obtained a prognostic value equivalent to that of the training set. The predictive power was further validated using two independent datasets from the GEO database. The DNA methylation pattern is a powerful predictor of breast cancer survival and can predict outcomes within the same breast cancer molecular subtype.

3.
Breast cancer outcome can be predicted using models derived from gene expression data or clinical data. Only a few studies have created a single prediction model using both gene expression and clinical data. These studies often remain inconclusive regarding whether prediction performance actually improves. We rigorously compare three different integration strategies (early, intermediate, and late integration) as well as classifiers employing no integration (only one data type) using five classifiers of varying complexity. We perform our analysis on a set of 295 breast cancer samples, for which gene expression data and an extensive set of clinical parameters are available, as well as four breast cancer datasets containing 521 samples that we used as independent validation. On the 295 samples, a nearest mean classifier employing a logical OR operation (late integration) on clinical and expression classifiers significantly outperforms all other classifiers. Moreover, regardless of the integration strategy, the nearest mean classifier achieves the best performance. All five classifiers achieve their best performance when integrating clinical and expression data. Repeating the experiments using the 521 samples from the four independent validation datasets also indicated a significant performance improvement when integrating clinical and gene expression data. Whether integration also improves performance on other datasets (e.g. other tumor types) has not been investigated, but seems worth pursuing. Our work suggests that future models for predicting breast cancer outcome should exploit both data types by employing a late OR or intermediate integration strategy based on nearest mean classifiers.
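The late OR integration named in the abstract above can be illustrated with a minimal sketch. It assumes two separately trained nearest mean classifiers (one on clinical features, one on expression features) whose binary predictions are combined with a logical OR. The function names, the toy feature values, and the convention that label 1 means poor outcome are all hypothetical and are not taken from the paper.

```python
from statistics import fmean


def fit_nearest_mean(samples, labels):
    """Compute one centroid (feature-wise mean) per class label 0 and 1."""
    centroids = {}
    for cls in (0, 1):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        centroids[cls] = [fmean(col) for col in zip(*rows)]
    return centroids


def predict_nearest_mean(centroids, sample):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda cls: sqdist(centroids[cls], sample))


def late_or(clinical_pred, expression_pred):
    """Late integration: call poor outcome (1) if either classifier does."""
    return int(clinical_pred == 1 or expression_pred == 1)


# Hypothetical toy data: four patients, label 1 = poor outcome (assumption).
labels = [0, 0, 1, 1]
clinical = [[1.0], [1.2], [3.0], [3.2]]
expression = [[0.1, 0.2], [0.0, 0.3], [0.9, 1.0], [1.0, 0.8]]

clin_model = fit_nearest_mean(clinical, labels)
expr_model = fit_nearest_mean(expression, labels)

# A new patient who looks low-risk clinically but high-risk by expression.
combined = late_or(
    predict_nearest_mean(clin_model, [1.0]),
    predict_nearest_mean(expr_model, [0.95, 0.9]),
)
print(combined)
```

With an OR rule, either data type alone can flag a patient as poor outcome, which favors sensitivity; an AND rule would make the opposite trade-off.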

4.
Xu Y  Duanmu H  Chang Z  Zhang S  Li Z  Li Z  Liu Y  Li K  Qiu F  Li X 《Molecular biology reports》2012,39(2):1627-1637
Copy number variations (CNVs) are one type of human genetic variation and are pervasive in the human genome. It has been confirmed that they can play a causal role in complex diseases. Previous studies of CNVs focused more on identifying disease-specific CNV regions or candidate genes in these regions, and less on the synergistic actions between genes in CNV regions and other genes. Our research combined CNVs with related gene co-expression to reconstruct a gene co-expression network using single nucleotide polymorphism microarray datasets and gene expression microarray datasets of breast cancer, then extracted densely connected modules and analyzed their functions. Interestingly, the functions of all of these modules were related to breast cancer according to our enrichment analysis, and most of the genes in these modules have been reported to be involved in breast cancer. Our findings suggest that integrating CNVs and gene co-expression relations is a feasible way to analyze the roles of CNV genes and their synergistic genes in breast cancer, and they provide novel insight into the pathological mechanism of breast cancer.

5.
Breast cancer is the second leading cause of cancer death for women in the United States. In 2005, about 215,000 cases of invasive breast cancer (IBC) and 50,000 cases of ductal carcinoma in situ will be diagnosed and 40,000 women will die of IBC in the US. Yet there is presently no molecular marker that can be used to detect a precancerous state or identify which premalignant lesions will develop into invasive breast cancer. Here we report the gene expression analysis of atypical ductal hyperplastic tissues from patients with and without a history of breast cancer. We identify MMP-1 as a candidate marker that may be useful for identification of breast lesions that can develop into cancer.

6.
7.
8.
9.
10.
11.
Breast cancer is considered to be a multifactorial disorder caused by both genetic and non-genetic factors. Different histological types of breast cancer differ in response to treatment and may have a divergent clinical course. Breast tissue is heterogeneous, with components of epithelial, mesenchymal, endothelial and lymphopoietic derivation. The genetic heterogeneity of invasive breast cancer is reflected by the wide spectrum of histological types and differentiation grades. Nevertheless, the influences of these cell types on the tumour's total pattern of gene expression can be estimated analytically. Microarrays permit total tissue analysis and provide a stable molecular portrait of tumours. Some investigations suggest differences in the gene expression profiles of ductal and lobular carcinomas. It has been reported that inactivating mutations of the E-cadherin gene are very frequent in infiltrating lobular breast carcinomas. Other than altered expression of E-cadherin, little is known about the underlying biology that distinguishes ductal and lobular tumour subtypes. However, several genes have been identified as differentially expressed between lobular and ductal cancers: E-CD, survivin, cathepsin B, TPI1, SPRY1, SCYA14, TFAP2B, thrombospondin 4, osteopontin, HLA-G, and CHC1. Expression profiling of breast cancers can be used diagnostically to distinguish individual histologic subclassifications and may guide the selection of targeted therapeutics. However, future approaches will need to include methods for high-throughput clinical validation and the ability to analyze microscopic samples.

12.
13.

Introduction

The traditional staging system is inadequate to identify those patients with stage II colorectal cancer (CRC) at high risk of recurrence or with stage III CRC at low risk. A number of gene expression signatures to predict CRC prognosis have been proposed, but none is routinely used in the clinic. The aim of this work was to assess the prediction ability and potential clinical usefulness of these signatures in a series of independent datasets.

Methods

A literature review identified 31 gene expression signatures that used gene expression data to predict prognosis in CRC tissue. The search was based on the PubMed database and was restricted to papers published from January 2004 to December 2011. Eleven CRC gene expression datasets with outcome information were identified and downloaded from public repositories. A Random Forest classifier was used to build predictors from the gene lists. The Matthews correlation coefficient was chosen as a measure of classification accuracy, and its associated p-value was used to assess association with prognosis. To evaluate clinical usefulness, positive and negative post-test probabilities were computed in stage II and III samples.
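As an illustration of the two evaluation measures named in the Methods, the sketch below computes the Matthews correlation coefficient and the post-test probabilities from a 2x2 confusion matrix. It assumes the common definitions (positive post-test probability = TP/(TP+FP); negative post-test probability = FN/(FN+TN)), which may differ from the exact procedure the authors used. The counts are hypothetical, not results from the paper.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # conventional value when any marginal total is zero
    return (tp * tn - fp * fn) / denom


def post_test_probabilities(tp, fp, tn, fn):
    """Probability of recurrence given a high-risk or low-risk prediction.

    Assumes both prediction groups are non-empty.
    """
    positive = tp / (tp + fp)  # P(recurrence | predicted high risk)
    negative = fn / (fn + tn)  # P(recurrence | predicted low risk)
    return positive, negative


# Hypothetical counts for 200 patients (illustration only).
tp, fp, tn, fn = 30, 20, 120, 30
pre_test = (tp + fn) / (tp + fp + tn + fn)  # baseline recurrence rate
mcc = matthews_corrcoef(tp, fp, tn, fn)
pos, neg = post_test_probabilities(tp, fp, tn, fn)
print(f"pre-test={pre_test:.2f} MCC={mcc:.3f} post+={pos:.2f} post-={neg:.2f}")
```

A signature is clinically informative to the extent that the positive post-test probability rises above, and the negative one falls below, the pre-test recurrence rate.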

Results

Five gene signatures showed significant association with prognosis and provided reasonable prediction accuracy in their own training datasets. Nevertheless, all signatures showed low reproducibility in independent data. Stratified analyses by stage or microsatellite instability status showed significant association but limited discrimination ability, especially in stage II tumors. From a clinical perspective, the most predictive signatures showed a minor but significant improvement over the classical staging system.

Conclusions

The published signatures show low prediction accuracy but moderate clinical usefulness. Although gene expression data may inform prognosis, better strategies for signature validation are needed to encourage their widespread use in the clinic.

14.
15.
16.
With the advancement of microarray technology, it is now possible to study the expression profiles of thousands of genes across different experimental conditions or tissue samples simultaneously. Microarray cancer datasets, organized in a samples-by-genes layout, are being used to classify tissue samples as benign or malignant, or into their subtypes. They are also useful for identifying potential gene markers for each cancer subtype, which helps in successful diagnosis of particular cancer types. In this article, we present an unsupervised cancer classification technique based on multiobjective genetic clustering of the tissue samples. A real-coded encoding of the cluster centers is used, and cluster compactness and separation are optimized simultaneously. The resultant set of near-Pareto-optimal solutions contains a number of non-dominated solutions. A novel approach is proposed to combine the clustering information possessed by the non-dominated solutions through a Support Vector Machine (SVM) classifier. The final clustering is obtained by consensus among the clusterings yielded by different kernel functions. The performance of the proposed multiobjective clustering method has been compared with that of several other microarray clustering algorithms on three publicly available benchmark cancer datasets. Moreover, statistical significance tests have been conducted to establish the statistical superiority of the proposed clustering method. Furthermore, relevant gene markers have been identified using the clustering result produced by the proposed method and demonstrated visually. Biological relationships among the gene markers are also studied based on gene ontology. The results are promising and may have an important impact on unsupervised cancer classification as well as on gene marker identification for multiple cancer subtypes.
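The notion of non-dominated solutions used in the abstract above can be made concrete with a small sketch. It assumes each candidate clustering is scored by a pair (compactness, separation) in which lower compactness and higher separation are better; this sign convention and the toy scores are assumptions for illustration, and the filter shown is a plain Pareto filter, not the authors' genetic algorithm.

```python
def dominates(a, b):
    """True if solution a is at least as good as b on both objectives
    and strictly better on at least one.

    Each solution is a (compactness, separation) pair; compactness is
    minimized and separation is maximized (assumed convention).
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def non_dominated(solutions):
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]


# Hypothetical objective scores for five candidate clusterings.
candidates = [(1.0, 2.0), (0.8, 1.5), (1.2, 2.5), (1.1, 1.9), (0.9, 1.4)]
front = non_dominated(candidates)
print(front)
```

Because no point on the front is better than another on both objectives, a further step is needed to pick or merge them; the abstract describes doing this with an SVM-based consensus.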

17.
18.
One goal of single-cell RNA sequencing (scRNA seq) is to expose possible heterogeneity within cell populations due to meaningful, biological variation. Examining cell-to-cell heterogeneity, and further, identifying subpopulations of cells based on scRNA seq data has been of common interest in life science research. A key component to successfully identifying cell subpopulations (or clustering cells) is the (dis)similarity measure used to group the cells. In this paper, we introduce a novel measure, named SIDEseq, to assess cell-to-cell similarity using scRNA seq data. SIDEseq first identifies a list of putative differentially expressed (DE) genes for each pair of cells. SIDEseq then integrates the information from all the DE gene lists (corresponding to all pairs of cells) to build a similarity measure between two cells. SIDEseq can be implemented in any clustering algorithm that requires a (dis)similarity matrix. This new measure incorporates information from all cells when evaluating the similarity between any two cells, a characteristic not commonly found in existing (dis)similarity measures. This property is advantageous for two reasons: (a) borrowing information from cells of different subpopulations allows for the investigation of pairwise cell relationships from a global perspective and (b) information from other cells of the same subpopulation could help to ensure a robust relationship assessment. We applied SIDEseq to a newly generated human ovarian cancer scRNA seq dataset, a public human embryo scRNA seq dataset, and several simulated datasets. The clustering results suggest that the SIDEseq measure is capable of uncovering important relationships between cells, and outperforms or at least does as well as several popular (dis)similarity measures when used on these datasets.

19.
Multiple driver genes in individual patient samples may cause resistance to individual drugs in precision medicine. However, current computational methods have not studied how to fill the gap between personalized driver gene identification and combinatorial drug discovery for individual patients. Here, we developed a novel structural network controllability-based personalized driver genes and combinatorial drug identification algorithm (CPGD), aiming to identify combinatorial drugs for an individual patient by targeting personalized driver genes from a network controllability perspective. On two benchmark disease datasets (i.e. breast cancer and lung cancer datasets), the performance of CPGD is superior to that of other state-of-the-art driver gene-focused methods in terms of discovery rate among previously known clinically efficacious combinatorial drugs. In particular, on the breast cancer dataset, CPGD evaluated the synergistic effect of pairwise drug combinations by measuring the synergistic effect of their corresponding personalized driver gene modules, which are affected by the set of personalized driver genes that the drugs target. The results showed that CPGD performs better than existing synergistic combinatorial strategies in identifying clinically efficacious paired combinatorial drugs. Furthermore, CPGD enhanced cancer subtyping by computationally providing personalized side effect signatures for individual patients. In addition, CPGD identified 90 drug combination candidates from a SARS-CoV-2 dataset as potential drug repurposing candidates for the recently spreading COVID-19.

20.
Gene expression studies have been widely used in an effort to identify signatures that can predict clinical progression of cancer. In this study we focused instead on identifying gene expression differences between breast tumors and adjacent normal tissue, and between different subtypes of tumor classified by clinical marker status. We collected a set of 20 breast cancer tissues, each matched with adjacent pathologically normal tissue from the same patient. The cancer samples, representing each subtype of breast cancer defined by estrogen receptor (ER+/-) and Her2 (+/-) status and divided into four subgroups (ER+/Her2+, ER+/Her2-, ER-/Her2+, and ER-/Her2-), were hybridized on Affymetrix HG-U133 Plus 2.0 microarrays. By comparing cancer samples with their matched normal controls, we identified 3537 differentially expressed genes overall using data analysis methods from Bioconductor. Among the genes common to the four subgroups, we found 151 regulated genes, some of them encoding known targets for breast cancer treatment. Genes unique to each subgroup instead suggested gene regulation dependent on ER/Her2 marker status. In conclusion, the results indicate that microarray studies using robust analysis of matched tumor and normal samples from the same patients can identify genes differentially expressed in breast cancer tumor subtypes even when small numbers of samples are considered, and can further elucidate molecular features of breast cancer.
