Similar literature
Found 20 similar articles; search took 31 ms.
1.
Cancer has been increasingly recognized as a systems biology disease, since many investigators have demonstrated that this malignant phenotype emerges from abnormal protein-protein, regulatory and metabolic interactions induced by simultaneous structural and regulatory changes in multiple genes and pathways. Therefore, the identification of oncogenic interactions and cancer-related signaling networks is crucial for better understanding cancer. As experimental techniques for determining such interactions and signaling networks are labor-intensive and time-consuming, a computational approach capable of accomplishing this task would be of great value. For this purpose, we present here a novel computational approach, based on network topology and machine learning, capable of predicting oncogenic interactions and extracting relevant cancer-related signaling subnetworks from an integrated network of human gene interactions (INHGI). This approach, called graph2sig, is twofold: first, it assigns oncogenic scores to all interactions in the INHGI, and then these oncogenic scores are used as edge weights to extract oncogenic signaling subnetworks from the INHGI. Regarding the prediction of oncogenic interactions, we showed that graph2sig is able to recover 89% of known oncogenic interactions with a precision of 77%. Moreover, the interactions that received high oncogenic scores are enriched in genes for which mutations have been causally implicated in cancer. We also demonstrated that graph2sig is potentially useful in extracting oncogenic signaling subnetworks: more than 80% of constructed subnetworks contain more than 50% of the original interactions in their corresponding oncogenic linear pathways present in the KEGG PATHWAY database. In addition, the potential oncogenic signaling subnetworks discovered by graph2sig are supported by experimental evidence.
Taken together, these results suggest that graph2sig can be a useful tool for cancer researchers interested in detecting the signaling networks most prone to contribute to the emergence of the malignant phenotype.

2.
3.
4.
Gene expression profiling has been widely used to study molecular signatures of many diseases and to develop molecular diagnostics for disease prediction. Gene selection, as an important step for improved diagnostics, screens tens of thousands of genes and identifies a small subset that discriminates between disease types. A two-step gene selection method is proposed to identify informative gene subsets for accurate classification of multiclass phenotypes. In the first step, individually discriminatory genes (IDGs) are identified by using a one-dimensional weighted Fisher criterion (wFC). In the second step, jointly discriminatory genes (JDGs) are selected by sequential search methods, based on their joint class separability measured by a multidimensional weighted Fisher criterion (wFC). The performance of the selected gene subsets for multiclass prediction is evaluated by artificial neural networks (ANNs) and/or support vector machines (SVMs). By applying the proposed IDG/JDG approach to two microarray studies, that is, small round blue cell tumors (SRBCTs) and muscular dystrophies (MDs), we successfully identified a much smaller yet efficient set of JDGs for diagnosing SRBCTs and MDs with high prediction accuracies (96.9% for SRBCTs and 92.3% for MDs, respectively). These experimental results demonstrated that the two-step gene selection method is able to identify a subset of highly discriminative genes for improved multiclass prediction.
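The first step described above, ranking genes one at a time by a Fisher-type criterion, can be illustrated with a minimal sketch. The abstract does not give the weighting scheme of the wFC, so the code below uses the plain, unweighted multiclass Fisher score (between-class scatter divided by within-class scatter) as a stand-in; the function names `fisher_score` and `rank_genes` and the toy data are hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Unweighted multiclass Fisher score for a single gene.

    Assumption: this is the plain Fisher criterion, not the weighted
    variant (wFC) of the abstract, whose weights are not specified there.
    """
    overall = mean(values)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    # Between-class scatter: spread of the class means around the overall mean.
    between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups.values())
    # Within-class scatter: spread of the samples around their own class mean.
    within = sum(len(g) * pvariance(g) for g in groups.values())
    # A gene with no within-class spread is treated as perfectly separating.
    return between / within if within > 0 else float("inf")


def rank_genes(expression, labels, top_k):
    """Return the top_k gene names ordered by decreasing Fisher score."""
    scores = {gene: fisher_score(vals, labels) for gene, vals in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data (hypothetical): g1 separates the two classes, g2 does not.
expression = {"g1": [1.0, 1.1, 5.0, 5.1], "g2": [1.0, 5.0, 1.1, 5.1]}
labels = ["a", "a", "b", "b"]
top = rank_genes(expression, labels, top_k=1)
```

A univariate ranking like this only covers the IDG step; the JDG step in the abstract additionally searches for subsets that are jointly discriminative, which this sketch does not attempt.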

5.
To identify gene expression responses common to multiple pulmonary diseases, we collected microarray data for acute lung inflammation models from 12 studies and used these in a meta-analysis. The data used include exposures to air pollutants; bacterial, viral, and parasitic infections; and allergic asthma models. Hierarchical clustering revealed a cluster of 383 up-regulated genes with a common response. This cluster contained five subsets, each characterized by more specific functions such as inflammatory response, interferon-induced genes, immune signaling, or cell proliferation. Of these subsets, the inflammatory response was common to all models, interferon-induced responses were more pronounced in bacterial and viral models, and a cell division response was more prominent in parasitic and allergic models. A common cluster containing 157 moderately down-regulated genes was associated with the effects of tissue damage. Responses to influenza in macaques were weaker than in mice, reflecting differences in the degree of lung inflammation and/or virus replication. The existence of a common cluster shows that in vivo lung inflammation in response to various pathogens or exposures proceeds through shared molecular mechanisms.

6.
7.
8.
9.
Machine learning techniques offer a viable approach to cluster discovery from microarray data, which involves identifying and classifying biologically relevant groups in genes and conditions. It has been recognized that genes (whether or not they belong to the same gene group) may be co-expressed via a variety of pathways. Therefore, they can be adequately described by a diversity of coherence models. In fact, it is known that a gene may participate in multiple pathways that may or may not be co-active under all conditions. It is therefore biologically meaningful to simultaneously divide genes into functional groups and conditions into co-active categories, leading to the so-called biclustering analysis. For this, we have proposed a comprehensive set of coherence models to cope with various plausible regulation processes. Furthermore, a multivariate biclustering analysis based on fusion of different coherence models appears to be promising because the expression level of genes from the same group may follow more than one coherence model. The simulation studies further confirm that the proposed framework enjoys the advantage of high prediction performance.

10.
11.
Microarray gene expression data can provide insights into biological processes at a system-wide level and is commonly used for reverse engineering gene regulatory networks (GRN). Due to the amalgamation of noise from different sources, microarray expression profiles become inherently noisy, leading to a significant impact on the GRN reconstruction process. Microarray replicates (both biological and technical), generated to increase the reliability of data obtained under noisy conditions, have limited influence in enhancing the accuracy of reconstruction. Therefore, instead of the conventional GRN modeling approaches, which are deterministic, stochastic techniques are becoming increasingly necessary for inferring GRN from noisy microarray data. In this paper, we propose a new stochastic GRN model by investigating the incorporation of various standard noise measurements in the deterministic S-system model. Experimental evaluations performed for varying sizes of synthetic network, representing different stochastic processes, demonstrate the effect of noise on the accuracy of genetic network modeling and the significance of stochastic modeling for GRN reconstruction. The proposed stochastic model is subsequently applied to infer the regulations among genes in two real-life networks: (1) the well-studied IRMA network, a real-life in-vivo synthetic network constructed within the Saccharomyces cerevisiae yeast, and (2) the SOS DNA repair network in Escherichia coli.
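For orientation, the deterministic S-system that the abstract takes as its starting point is usually written in the following standard form. The symbols are the conventional ones and are assumptions here, since the abstract itself gives no notation: X_i is the expression level of gene i, N the number of genes, alpha_i and beta_i non-negative rate constants, and g_ij and h_ij the kinetic orders describing the influence of gene j on the production and degradation of gene i. How the authors add noise terms to this model is not stated in the abstract and is not reproduced here.

```latex
\frac{dX_i}{dt} = \alpha_i \prod_{j=1}^{N} X_j^{g_{ij}} - \beta_i \prod_{j=1}^{N} X_j^{h_{ij}},
\qquad i = 1, \dots, N
```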

12.
Overlaying differential changes in gene expression on protein interaction networks has proven to be a useful approach to interpreting the cell's dynamic response to a changing environment. Despite successes in finding active subnetworks in the context of a single species, the idea of overlaying lists of differentially expressed genes on networks has not yet been extended to support the analysis of multiple species' interaction networks. To address this problem, we designed a scalable, cross-species network search algorithm, neXus (Network-cross(X)-species-Search), that discovers conserved, active subnetworks based on parallel differential expression studies in multiple species. Our approach leverages functional linkage networks, which provide more comprehensive coverage of functional relationships than physical interaction networks by combining heterogeneous types of genomic data. We applied our cross-species approach to identify conserved modules that are differentially active in stem cells relative to differentiated cells based on parallel gene expression studies and functional linkage networks from mouse and human. We find hundreds of conserved active subnetworks enriched for stem cell-associated functions such as cell cycle, DNA repair, and chromatin modification processes. Using a variation of this approach, we also find a number of species-specific networks, which likely reflect mechanisms of stem cell function that have diverged between mouse and human. We assess the statistical significance of the subnetworks by comparing them with subnetworks discovered on random permutations of the differential expression data. We also describe several case examples that illustrate the utility of comparative analysis of active subnetworks.
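The permutation-based significance assessment mentioned above can be sketched in simplified form. In the abstract, the whole subnetwork search is rerun on permuted differential expression data; the sketch below assumes a much simpler setting in which one fixed candidate subnetwork is scored by the mean differential expression of its member genes and compared against the same statistic under random reassignment of scores to genes. The function name `empirical_p_value`, the scoring statistic, and the toy data are all hypothetical.

```python
import random


def empirical_p_value(scores, members, n_perm=1000, seed=0):
    """Empirical p-value for the mean score of a fixed gene set.

    Assumption: `scores` maps gene name to a differential expression score
    and `members` lists the genes of one candidate subnetwork. This is a
    simplification of the procedure in the abstract, which reruns the
    subnetwork search on each permuted data set.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    genes = list(scores)
    values = [scores[g] for g in genes]
    observed = sum(scores[g] for g in members) / len(members)
    hits = 0
    for _ in range(n_perm):
        shuffled = values[:]
        rng.shuffle(shuffled)  # randomly reassign scores to genes
        permuted = dict(zip(genes, shuffled))
        stat = sum(permuted[g] for g in members) / len(members)
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): the candidate subnetwork holds the three top-scoring genes.
scores = {f"g{i}": float(i) for i in range(20)}
p = empirical_p_value(scores, members=["g17", "g18", "g19"])
```

The add-one correction is a common convention for permutation tests; with `n_perm` permutations the smallest reportable p-value is 1 / (n_perm + 1).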

13.
Genetic interactions help map biological processes and their functional relationships. A genetic interaction is defined as a deviation from the expected phenotype when combining multiple genetic mutations. In Saccharomyces cerevisiae, most genetic interactions are measured under a single phenotype: growth rate in standard laboratory conditions. Recently, genetic interactions have been collected under different phenotypic readouts and experimental conditions. How different are these networks and what can we learn from their differences? We conducted a systematic analysis of quantitative genetic interaction networks in yeast performed under different experimental conditions. We find that networks obtained using different phenotypic readouts, in different conditions and from different laboratories overlap less than expected and provide significant unique information. To exploit this information, we develop a novel method to combine individual genetic interaction data sets and show that the resulting network improves gene function prediction performance, demonstrating that individual networks provide complementary information. Our results support the notion that using diverse phenotypic readouts and experimental conditions will substantially increase the amount of gene function information produced by genetic interaction screens.

14.
15.
An important problem in the analysis of large-scale gene expression data is the validation of gene expression clusters. By examining the temporal expression patterns of 74 genes expressed in rat spinal cord under three different experimental conditions, we have found evidence that some genes cluster together under multiple conditions. Using RT-PCR data from spinal cord development and two sets of microarray data from spinal injury, we applied Spearman correlation to identify clusters and to assign P values to pairs of genes with highly similar temporal expression patterns. We found that 15% of genes occurred in statistically significant pairs in all three experimental conditions, providing both statistical and experimental support for the idea that genes that cluster together are co-regulated. In addition, we demonstrated that DNA microarray and RT-PCR data are comparable, and can be combined to confirm gene expression relationships.
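The Spearman correlation used above to compare temporal expression patterns is the Pearson correlation computed on ranks. A minimal pure-Python sketch follows; the function names and the toy time courses are hypothetical, and the P-value assignment described in the abstract is not reproduced because the abstract does not say how it was computed.

```python
def ranks(xs):
    """1-based ranks of xs; tied values share the mean of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with xs[order[i]].
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman rank correlation between two equally long profiles."""
    return pearson(ranks(a), ranks(b))


# Toy time courses (hypothetical): gene_b rises with gene_a, gene_c falls.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [10.0, 20.0, 30.0, 45.0]
gene_c = [9.0, 7.0, 4.0, 1.0]
rho_ab = spearman(gene_a, gene_b)
rho_ac = spearman(gene_a, gene_c)
```

Because only ranks enter the calculation, any monotonic relationship between two profiles gives a coefficient of magnitude 1, which is why the measure suits expression data measured on different platforms such as RT-PCR and microarrays.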

16.
The growing body of DNA microarray data has the potential to advance our understanding of the molecular basis of disease. However, annotating microarray datasets with clinically useful information is not always possible, as this often requires access to detailed patient records. In this study we introduce GLAD, a new Semi-Supervised Learning (SSL) method for combining independent annotated datasets and unannotated datasets with the aim of identifying more robust sample classifiers. In our method, independent models are developed using subsets of genes for the annotated and unannotated datasets. These models are evaluated according to a scoring function that incorporates terms for classification accuracy on annotated data and relative cluster separation in unannotated data. Improved models are iteratively generated using a genetic algorithm feature selection technique. Our results show that the addition of unannotated data into training significantly improves classifier robustness.

17.
18.
19.
20.