Similar articles
 20 similar articles found (search time: 31 ms)
1.

Background

Numerous gene lists or "classifiers" have been derived from global gene expression data that assign breast cancers to good and poor prognosis groups. A remarkable feature of these molecular signatures is that they have few genes in common, prompting speculation that they may use distinct genes to measure the same pathophysiological process(es), such as proliferation. However, this supposition has not been rigorously tested. If gene-based classifiers function by measuring a minimal number of cellular processes, we hypothesized that the informative genes for these processes could be identified and the data sets could be adjusted for the predictive contributions of those genes. Such adjustment would then attenuate the predictive function of any signature measuring that same process.

Results

We tested this hypothesis directly using a novel iterative-subtractive approach. We evaluated five gene expression data sets that sample a broad range of breast cancer subtypes. In all data sets, the dominant cluster capable of predicting metastasis was heavily populated by genes that fluctuate in concert with the cell cycle. When six well-characterized classifiers were examined, all contained a higher than expected proportion of genes that correlate with this cluster. Furthermore, when the data sets were globally adjusted for the cell cycle cluster, each classifier lost its ability to assign tumors to appropriate high and low risk groups. In contrast, adjusting for other predictive gene clusters did not impact their performance.

Conclusion

These data indicate that the discriminative ability of breast cancer classifiers is dependent upon genes that correlate with cell cycle progression.

2.

Introduction

The classic anti-nucleolar antibodies anti-Th/To and anti-U3 ribonucleoprotein (anti-U3RNP) can help in the diagnosis, prediction of organ involvement and prognosis in systemic sclerosis (SSc); however, no validated commercial assay is available. We aimed to establish a novel quantitative real-time PCR (qPCR) method to detect these antibodies.

Methods

Standard immunoprecipitation (IP) was performed using K562 cell extract and RNA components were extracted. cDNA was reverse transcribed from RNA components and Th RNA and U3 RNA were detected by qPCR using custom primers. Cycle threshold (Ct) values were compared in a titration experiment to determine the assay efficacy. The new assay was evaluated by testing 22 anti-Th/To and 12 anti-U3RNP positive samples in addition to 88 controls, and the results were compared with IP as a gold standard.

Results

By testing serial 1:8 dilutions of cell lysate as the substrate in the IP step, RNA extracted after IP, and its derived cDNA, linear dose response curves were noted for both anti-Th/To and anti-U3RNP. With every dilution, Ct values changed by approximately three cycles, as expected, reflecting the eight-fold difference in cDNA. The Ct difference between positive and negative samples was 8 to 13, which was similar throughout the dilutions. In the specificity analysis, the Ct values of positive samples were clearly different from those of the negative groups, and the qPCR results had a near perfect correlation with IP.
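
The expected Ct shift per dilution step follows from the exponential nature of PCR amplification. The sketch below is illustrative only and is not taken from the article; the function name and the efficiency parameter are assumptions, with an efficiency of 1.0 standing for perfect doubling of product in each cycle.

```python
import math

def expected_ct_shift(dilution_factor, efficiency=1.0):
    """Expected increase in Ct when the template is diluted by dilution_factor.

    efficiency is the assumed per-cycle amplification efficiency
    (1.0 means the product doubles in every cycle).
    """
    # Each cycle multiplies the product by (1 + efficiency), so the number of
    # extra cycles needed is the logarithm of the dilution in that base.
    return math.log(dilution_factor) / math.log(1.0 + efficiency)

# An eight-fold dilution at perfect efficiency costs about three cycles.
shift_ideal = expected_ct_shift(8)
# At 90% efficiency the same dilution costs slightly more than three cycles.
shift_90 = expected_ct_shift(8, efficiency=0.9)
print(round(shift_ideal, 2), round(shift_90, 2))
```

Under this assumption a Ct change close to three per 1:8 dilution step is what a well-behaved assay should show; a noticeably larger shift would point to reduced amplification efficiency.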

Conclusions

Our new method readily detects these two clinically important antibodies in SSc. Making tests for anti-Th/To and anti-U3RNP antibodies widely available to clinicians should be helpful in the diagnosis and follow-up of SSc patients.

3.
4.

Background

The controversy surrounding the non-uniqueness of predictive gene lists (PGL) of small selected subsets of genes from very large potential candidates as available in DNA microarray experiments is now widely acknowledged [1]. Many of these studies have focused on constructing discriminative semi-parametric models and as such are also subject to the issue of random correlations of sparse model selection in high dimensional spaces. In this work we outline a different approach based around an unsupervised patient-specific nonlinear topographic projection in predictive gene lists.

Methods

We construct nonlinear topographic projection maps based on inter-patient gene-list relative dissimilarities. The Neuroscale, Stochastic Neighbor Embedding (SNE) and Locally Linear Embedding (LLE) techniques are used to construct two-dimensional projective visualisation plots of the 70-dimensional PGL of each patient. Classifiers are also constructed on the resulting projections to identify the prognosis indicator of each patient and to investigate whether the two prognosis groups are separable a posteriori on the evidence of the gene lists. A literature-proposed predictive gene list for breast cancer is benchmarked against a separate gene list using the above methods. Generalisation ability is investigated by using the mapping capability of Neuroscale to visualise the follow-up study, but based on the projections derived from the original dataset.

Results

The results indicate that small subsets of patient-specific PGLs have insufficient prognostic dissimilarity to permit a distinction between patients in the two prognosis groups. Uncertainty and diversity across multiple gene expressions prevent unambiguous or even confident patient grouping. Comparative projections across different PGLs provide similar results.

Conclusion

The random correlation effect to an arbitrary outcome induced by small subset selection from very high dimensional interrelated gene expression profiles leads to an outcome with associated uncertainty. This continuum and uncertainty preclude any attempts at constructing discriminative classifiers. However, a patient's gene expression profile could possibly be used in treatment planning, based on knowledge of other patients' responses. We conclude that many of the patients involved in such medical studies are intrinsically unclassifiable on the basis of the provided PGL evidence. This additional category of 'unclassifiable' should be accommodated within medical decision support systems if serious errors and unnecessary adjuvant therapy are to be avoided.

5.
Accurate molecular classification of cancer using simple rules   Total citations: 1 (self-citations: 0; citations by others: 1)

Background

One intractable problem with using microarray data analysis for cancer classification is how to reduce the extremely high-dimensional gene feature data to remove the effects of noise. Feature selection is often used to address this problem by selecting informative genes from among thousands or tens of thousands of genes. However, most of the existing methods of microarray-based cancer classification utilize too many genes to achieve accurate classification, which often hampers the interpretability of the models. For a better understanding of the classification results, it is desirable to develop simpler rule-based models with as few marker genes as possible.

Methods

We screened a small number of informative single genes and gene pairs on the basis of their dependency degrees, as defined in rough set theory. Applying the decision rules induced by the selected genes or gene pairs, we constructed cancer classifiers. We tested the efficacy of the classifiers by leave-one-out cross-validation (LOOCV) of training sets and classification of independent test sets.
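
To make the screening criterion concrete, the sketch below computes the standard rough-set dependency degree: the fraction of samples whose attribute values determine the class label unambiguously. The abstract does not say whether the authors used exactly this definition, so treat it as an assumption; the gene names, the high/low discretisation and the toy samples are all hypothetical.

```python
from collections import defaultdict

def dependency_degree(samples, attrs, decision):
    """Fraction of samples lying in the positive region of `decision`
    with respect to the condition attributes `attrs`."""
    # Group samples into equivalence classes by their values on attrs.
    classes = defaultdict(list)
    for s in samples:
        key = tuple(s[a] for a in attrs)
        classes[key].append(s[decision])
    # An equivalence class counts only if all of its samples share one label.
    consistent = sum(len(labels) for labels in classes.values()
                     if len(set(labels)) == 1)
    return consistent / len(samples)

# Hypothetical discretised expression values for two genes.
samples = [
    {"geneA": "high", "geneB": "low",  "cls": "ALL"},
    {"geneA": "high", "geneB": "high", "cls": "ALL"},
    {"geneA": "low",  "geneB": "low",  "cls": "AML"},
    {"geneA": "low",  "geneB": "high", "cls": "AML"},
    {"geneA": "high", "geneB": "low",  "cls": "ALL"},
    {"geneA": "low",  "geneB": "high", "cls": "AML"},
]

gamma_a = dependency_degree(samples, ["geneA"], "cls")
gamma_b = dependency_degree(samples, ["geneB"], "cls")
gamma_ab = dependency_degree(samples, ["geneA", "geneB"], "cls")
print(gamma_a, gamma_b, gamma_ab)
```

In this toy data geneA alone separates the two classes, so a single-gene rule of the form "geneA high implies ALL" would already classify every sample, while geneB carries no class information on its own.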

Results

We applied our methods to five cancer gene expression datasets: leukemia (acute lymphoblastic leukemia [ALL] vs. acute myeloid leukemia [AML]), lung cancer, prostate cancer, breast cancer, and leukemia (ALL vs. mixed-lineage leukemia [MLL] vs. AML). Accurate classification outcomes were obtained by utilizing just one or two genes. Some genes that correlated closely with the pathogenesis of relevant cancers were identified. In terms of both classification performance and algorithm simplicity, our approach outperformed or at least matched existing methods.

Conclusion

In cancer gene expression datasets, a small number of genes, even one or two if selected correctly, is capable of achieving an ideal cancer classification effect. This finding also means that very simple rules may perform well for cancer class prediction.

6.

Background

One of the major goals in gene and protein expression profiling of cancer is to identify biomarkers and build classification models for prediction of disease prognosis or treatment response. Many traditional statistical methods, based on microarray gene expression data alone and individual genes' discriminatory power, often fail to identify biologically meaningful biomarkers thus resulting in poor prediction performance across data sets. Nonetheless, the variables in multivariable classifiers should synergistically interact to produce more effective classifiers than individual biomarkers.

Results

We developed an integrated approach, namely network-constrained support vector machine (netSVM), for cancer biomarker identification with an improved prediction performance. The netSVM approach is specifically designed for network biomarker identification by integrating gene expression data and protein-protein interaction data. We first evaluated the effectiveness of netSVM using simulation studies, demonstrating its improved performance over state-of-the-art network-based methods and gene-based methods for network biomarker identification. We then applied the netSVM approach to two breast cancer data sets to identify prognostic signatures for prediction of breast cancer metastasis. The experimental results show that: (1) network biomarkers identified by netSVM are highly enriched in biological pathways associated with cancer progression; (2) prediction performance is much improved when tested across different data sets. Specifically, many genes related to apoptosis, cell cycle, and cell proliferation, which are hallmark signatures of breast cancer metastasis, were identified by the netSVM approach. More importantly, several novel hub genes, biologically important with many interactions in the PPI network but often showing little change in expression as compared with their downstream genes, were also identified as network biomarkers; the genes were enriched in signaling pathways such as the TGF-beta, MAPK, and JAK-STAT signaling pathways. These signaling pathways may provide new insight into the underlying mechanism of breast cancer metastasis.

Conclusions

We have developed a network-based approach for cancer biomarker identification, netSVM, resulting in an improved prediction performance with network biomarkers. We have applied the netSVM approach to breast cancer gene expression data to predict metastasis in patients. Network biomarkers identified by netSVM reveal potential signaling pathways associated with breast cancer metastasis, and help improve the prediction performance across independent data sets.

7.
8.
9.
Adapter-tagged competitive PCR (ATAC-PCR) is an advanced version of competitive quantitative PCR that is characterized by the addition of unique adapters to cDNA derived from each sample RNA. Using multiple adapters, we can accurately measure the relative expression ratios of many samples, with a calibration curve obtained from internal standards included in the same reaction. ATAC-PCR can identify differences in gene expression as small as twofold, even from very small amounts of sample RNA. This technique is suitable for confirming results obtained with cDNA microarrays or differential display, and it can process more than a thousand genes per day when used in conjunction with a capillary DNA sequencer.

10.
Adapter-tagged competitive PCR (ATAC-PCR) is an advanced version of competitive quantitative PCR that is characterized by the addition of unique adapters to cDNA derived from each sample RNA. Using multiple adapters, we can accurately measure the relative expression ratios of many samples, with a calibration curve obtained from internal standards included in the same reaction. ATAC-PCR can identify differences in gene expression as small as twofold, even from very small amounts of sample RNA. This technique is suitable for confirming results obtained with cDNA microarrays or differential display, and it can process more than a thousand genes per day when used in conjunction with a capillary DNA sequencer.

11.
12.
13.

Background

A tremendous amount of effort has been devoted to identifying genes for diagnosis and prognosis of diseases using microarray gene expression data. It has been demonstrated that gene expression data have cluster structure, where the clusters consist of co-regulated genes which tend to have coordinated functions. However, most available statistical methods for gene selection do not take the cluster structure into consideration.

Results

We propose a supervised group Lasso approach that takes into account the cluster structure in gene expression data for gene selection and predictive model building. For gene expression data without biological cluster information, we first divide genes into clusters using the K-means approach and determine the optimal number of clusters using the Gap method. The supervised group Lasso consists of two steps. In the first step, we identify important genes within each cluster using the Lasso method. In the second step, we select important clusters using the group Lasso. Tuning parameters are determined using V-fold cross validation at both steps to allow for further flexibility. Prediction performance is evaluated using leave-one-out cross validation. We apply the proposed method to disease classification and survival analysis with microarray data.

Conclusion

We analyze four microarray data sets using the proposed approach: two cancer data sets with binary cancer occurrence as outcomes and two lymphoma data sets with survival outcomes. The results show that the proposed approach is capable of identifying a small number of influential gene clusters and important genes within those clusters, and has better prediction performance than existing methods.

14.

Objective

It is widely recognized that the diagnosis of parathyroid carcinoma (PC) is often difficult because of the overlap of characteristics between malignant and benign parathyroid tumors, especially at an early stage. Based on the identification of the tumor suppressor gene HRPT2/CDC73 and its association with hereditary and sporadic PC, screening of gene mutations and detection of parafibromin immunoreactivity have been suggested as diagnostic instruments for PC in Whites. There is little information about HRPT2/CDC73 mutations and the corresponding protein expression in patients with sporadic PC in the Chinese population, and long-term follow-up data are scarce.

Methods

Paraffin-embedded tissues were obtained from 13 patients with PC, 13 patients with parathyroid adenoma (PA) and 7 patients with parathyroid hyperplasia (PH), and 6 normal parathyroid (NP) tissues served as controls. Peripheral blood from 11 patients with PC was collected. PCR products amplified from genomic DNA extracted from tumor tissues or blood were sequenced for the HRPT2/CDC73 gene. Expression of parafibromin in tumor tissues was evaluated by immunohistochemical analysis.

Results

Six mutations in 6 of 13 patients with PC were identified, with three being novel. Four of them were germ-line mutations. Patients with mutations were susceptible to recurrence of the PC. Complete (8/13, 61.5%) or partial (5/13, 38.5%) loss of parafibromin expression was observed in PC tissues. All tissue samples from normal parathyroid or benign parathyroid tumors displayed positive immunostaining of parafibromin except one adenoma.

Conclusions

The present study supplies information on the mutations and protein expression of the HRPT2/CDC73 gene and the phenotypes of parathyroid carcinoma in the Chinese population. The expanded mutation database of this gene may benefit patients in the diagnosis and treatment of this disease.

15.

Background

Evaluating copy numbers of given genes in Plasmodium falciparum parasites is of major importance for laboratory-based studies or epidemiological surveys. For instance, pfmdr1 gene amplification has been associated with resistance to quinine derivatives and several genes involved in anti-oxidant defence may play an important role in resistance to antimalarial drugs, although their potential involvement has been overlooked.

Methods

The ΔΔCt method of relative quantification using real-time quantitative PCR with SYBR Green I detection was adapted and optimized to estimate copy numbers of three genes previously indicated as putative candidates of resistance to quinolines and artemisinin derivatives (pfmdr1, pfatp6 (SERCA) and pftctp) and of six further genes involved in oxidative stress responses.
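
As a rough illustration of the ΔΔCt calculation, the sketch below estimates a gene copy number relative to a calibrator strain. It assumes 100% amplification efficiency and a calibrator carrying a known copy number; the function name, the single-copy reference gene and all Ct values are hypothetical and do not come from the article.

```python
def copy_number_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_calib, ct_ref_calib,
                     calib_copies=1.0):
    """Estimate the target gene copy number in a sample by the ddCt method."""
    # Delta Ct: target gene normalised to a reference gene in each DNA sample.
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_calib = ct_target_calib - ct_ref_calib
    # Delta-delta Ct: sample compared with the calibrator strain.
    dd_ct = d_ct_sample - d_ct_calib
    # Assumes the product doubles every cycle (100% efficiency).
    return calib_copies * 2.0 ** (-dd_ct)

# Hypothetical Ct values: the target crosses the threshold one cycle earlier
# (relative to the reference gene) in the sample than in the calibrator.
estimate = copy_number_ddct(20.0, 22.0, 21.0, 22.0)
print(estimate)
```

With these made-up values the sample is estimated to carry twice the calibrator's copy number. In practice the estimate is only as good as the efficiency assumption, which is why validation against strains with a known copy number matters.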

Results

Using carefully designed specific RT-qPCR oligonucleotides, the methods were optimized for each gene and validated by accurately measuring the previously known number of copies of the pfmdr1 gene in the laboratory reference strains P. falciparum 3D7 and Dd2. Subsequently, Standard Operating Procedures (SOPs) were developed for the remaining genes under study and successfully applied to DNA obtained from dried filter blood spots of field isolates of P. falciparum collected in São Tomé & Principe, West Africa.

Conclusion

The SOPs reported here may be used as a high throughput tool to investigate the role of these drug resistance gene candidates in laboratory studies or large scale epidemiological surveys.

16.
17.

Background

Tamoxifen (TAM) is a well characterized breast cancer drug and selective estrogen receptor modulator (SERM) which also has been associated with a small increase in risk for uterine cancers. TAM's partial agonist activation of estrogen receptor has been characterized for specific gene promoters but not at the genomic level in vivo. Furthermore, reducing uncertainties associated with cross-species extrapolations of pharmaco- and toxicogenomic data remains a formidable challenge.

Results

A comparative ligand and species analysis approach was conducted to systematically assess the physiological, morphological and uterine gene expression alterations elicited across time by TAM and ethynylestradiol (EE) in immature ovariectomized Sprague-Dawley rats and C57BL/6 mice. Differential gene expression was evaluated using custom cDNA microarrays, and the data was compared to identify conserved and divergent responses. 902 genes were differentially regulated in all four studies, 398 of which exhibit identical temporal expression patterns.

Conclusion

Comparative analysis of EE and TAM differentially expressed gene lists suggests that TAM regulates no unique uterine genes that are conserved in the rat and mouse. This demonstrates that the partial agonist activities of TAM extend to molecular targets in regulating only a subset of EE-responsive genes. Ligand-conserved, species-divergent expression of carbonic anhydrase 2 was observed in the microarray data and confirmed by real-time PCR. The identification of comparable temporal phenotypic responses linked to related gene expression profiles demonstrates that systematic comparative genomic assessments can elucidate important conserved and divergent mechanisms in rodent estrogen signalling during uterine proliferation.

18.

Objectives

The role of heparanase (HPSE) gene in cancers including hepatocellular carcinoma (HCC) is currently controversial. This study was aimed at investigating the impact of genetic alteration and expression change of HPSE on the progression and prognosis of HCC.

Methods

The HPSE gene was studied in three different aspects: (1) loss of heterozygosity (LOH) by a custom SNP microarray and DNA copy number by real-time PCR; (2) mRNA level by qRT-PCR; and (3) protein expression by immunohistochemistry. The clinical significance of allele loss and expression change of HPSE was analyzed.

Results

Microarray analysis showed that the average LOH frequency for 10 SNPs located within HPSE gene was 31.6%, three of which were significantly correlated with tumor grade, serum HBV-DNA level, and AFP concentration. In agreement with SNP LOH data, DNA copy number loss of HPSE was observed in 38.74% (43/111) of HCC cases. HPSE mRNA level was notably reduced in 74.1% (83/112) of tumor tissues compared with non-tumor liver tissues, which was significantly associated with DNA copy number loss, increased tumor size, and post-operative metastasis. HPSE protein level was also remarkably reduced in 66.3% (53/80) of tumor tissues, which was correlated with tumor grade. Patients with lower expression level of HPSE mRNA or protein had a significantly lower survival rate than those with higher expression. Cox regression analysis suggested that HPSE protein was an independent predictor of overall survival in HCC patients.

Conclusions

The results in this study demonstrate that genetic alteration and reduction of HPSE expression are associated with tumor progression and poor prognosis of HCCs, suggesting that HPSE behaves like a tumor suppressor gene and is a potential prognostic marker for HCC patients.

19.

Background

Genomic prediction faces two main statistical problems: multicollinearity and n ≪ p (many fewer observations than predictor variables). Principal component (PC) analysis is a multivariate statistical method that is often used to address these problems. The objective of this study was to compare the performance of PC regression (PCR) for genomic prediction with that of a commonly used REML model with a genomic relationship matrix (GREML) and to investigate the full potential of PCR for genomic prediction.

Methods

The PCR model used either a common or a semi-supervised approach, where PC were selected based either on their eigenvalues (i.e. proportion of variance explained by SNP (single nucleotide polymorphism) genotypes) or on their association with phenotypic variance in the reference population (i.e. the regression sum of squares contribution). Cross-validation within the reference population was used to select the optimum PCR model that minimizes mean squared error. Pre-corrected average daily milk, fat and protein yields of 1609 first lactation Holstein heifers, from Ireland, UK, the Netherlands and Sweden, which were genotyped with 50 k SNPs, were analysed. Each testing subset included animals from only one country, or from only one selection line for the UK.

Results

In general, accuracies of GREML and PCR were similar but GREML slightly outperformed PCR. Inclusion of genotyping information of validation animals into model training (semi-supervised PCR) did not result in more accurate genomic predictions. The highest achievable PCR accuracies were obtained across a wide range of numbers of PC fitted in the regression (from one to more than 1000), across test populations and traits. Using cross-validation within the reference population to derive the number of PC yielded substantially lower accuracies than the highest achievable accuracies obtained across all possible numbers of PC.

Conclusions

On average, PCR performed only slightly less well than GREML. When the optimal number of PC was determined based on realized accuracy in the testing population, PCR showed a higher potential in terms of achievable accuracy, which was not capitalized on when PC selection was based on cross-validation. A standard approach for selecting the optimal set of PC in PCR remains a challenge.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-014-0060-x) contains supplementary material, which is available to authorized users.

20.