Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Over the last decade, multiple functional genomic datasets studying chromosomal aberrations and their downstream effects on gene expression have accumulated for several cancer types. A vast majority of them are in the form of paired gene expression profiles and somatic copy number alteration (CNA) information on the same patients, identified using microarray platforms. In response, many algorithms and software packages have become available for integrating these paired data. Surprisingly, there has been no serious attempt to review the currently available methodologies or the novel insights gained by using them. In this work, we discuss the quantitative relationships observed between CNA and gene expression in multiple cancer types and the biological milestones achieved using the available methodologies. We discuss the conceptual evolution of both the step-wise and the joint data integration methodologies over the last decade. We conclude by providing suggestions for building efficient data integration methodologies and asking further biological questions.

2.
Development of high-throughput monitoring technologies enables interrogation of cancer samples at various levels of cellular activity. Capitalizing on these developments, various public efforts such as The Cancer Genome Atlas (TCGA) generate disparate omic data for large patient cohorts. As demonstrated by recent studies, these heterogeneous data sources provide the opportunity to gain insights into the molecular changes that drive cancer pathogenesis and progression. However, these insights are limited by the vast search space and, as a result, low statistical power to make new discoveries. In this paper, we propose methods for integrating disparate omic data using molecular interaction networks, with a view to gaining mechanistic insights into the relationship between molecular changes at different levels of cellular activity. Namely, we hypothesize that genes that play a role in cancer development and progression may be implicated by neither frequent mutation nor differential expression, and that network-based integration of mutation and differential expression data can reveal these “silent players”. For this purpose, we utilize network-propagation algorithms to simulate the information flow in the cell at a sample-specific resolution. We then use the propagated mutation and expression signals to identify genes that are not necessarily mutated or differentially expressed but have an essential role in tumor development and patient outcome. We test the proposed method on breast cancer and glioblastoma multiforme data obtained from TCGA. Our results show that the proposed method can identify important proteins that are not readily revealed by molecular data, providing insights beyond what can be gleaned by analyzing different types of molecular data in isolation.
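
The abstract above does not say which network-propagation algorithm is used. As a rough illustration of the general idea only, the sketch below runs a random walk with restart, one common propagation scheme, on a toy network. The function name `propagate`, the gene names, the graph, and the restart probability are all hypothetical and are not taken from the paper.

```python
def propagate(adjacency, seed_scores, restart=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart on an undirected graph (illustrative sketch).

    adjacency: dict mapping each node to the set of its neighbours
    seed_scores: dict mapping node -> initial signal (e.g. 1.0 for a mutated gene)
    """
    nodes = sorted(adjacency)
    total = sum(seed_scores.get(n, 0.0) for n in nodes)
    if total == 0:
        raise ValueError("at least one seed must carry signal")
    # Normalise the seeds so that the scores form a probability distribution.
    prior = {n: seed_scores.get(n, 0.0) / total for n in nodes}
    scores = dict(prior)
    for _ in range(max_iter):
        updated = {}
        for n in nodes:
            # Each neighbour m passes an equal share of its score to n.
            incoming = sum(scores[m] / len(adjacency[m]) for m in adjacency[n])
            updated[n] = restart * prior[n] + (1 - restart) * incoming
        change = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores


# Toy network: B links the two "mutated" genes A and C; D hangs off B.
network = {"A": {"B"}, "B": {"A", "C", "D"}, "C": {"B"}, "D": {"B"}}
mutated = {"A": 1.0, "C": 1.0}
scores = propagate(network, mutated)
print({gene: round(score, 3) for gene, score in scores.items()})
```

In this toy case B carries no seed signal of its own, yet it ends up with a higher propagated score than D because it sits between the two seeds. That is the kind of "silent player" effect the abstract describes, shown here under the stated assumptions rather than with the authors' method.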

3.
This review describes a vision of a proteomics-on-a-chip device to separate, detect and identify the proteome. It guides the reader towards a development strategy, avoiding some of the pitfalls. It also describes the current state-of-the-art developments in proteomic analysis including available technologies, current market issues, the elements of an envisaged proteomics-on-a-chip device, the required microfabrication processes and the integration of the elements into one device. Address-flow microfluidics is a tool for connecting separation and detection platforms. The final section contains an expert opinion on the recommended development strategies, benefits of proteomics-on-a-chip in the life sciences and the anticipated market.

5.
Advances in molecular “omics” technologies have motivated new methodologies for the integration of multiple sources of high-content biomedical data. However, most statistical methods for integrating multiple data matrices only consider data shared vertically (one cohort on multiple platforms) or horizontally (different cohorts on a single platform). This is limiting for data that take the form of bidimensionally linked matrices (e.g., multiple cohorts measured on multiple platforms), which are increasingly common in large-scale biomedical studies. In this paper, we propose bidimensional integrative factorization (BIDIFAC) for integrative dimension reduction and signal approximation of bidimensionally linked data matrices. Our method factorizes data into (a) globally shared, (b) row-shared, (c) column-shared, and (d) single-matrix structural components, facilitating the investigation of shared and unique patterns of variability. For estimation, we use a penalized objective function that extends the nuclear norm penalization for a single matrix. As an alternative to the complicated rank selection problem, we use results from random matrix theory to choose tuning parameters. We apply our method to integrate two genomics platforms (messenger RNA and microRNA expression) across two sample cohorts (tumor samples and normal tissue samples) using the breast cancer data from The Cancer Genome Atlas. We provide R code for fitting BIDIFAC, imputing missing values, and generating simulated data.
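
The abstract gives no notation, so the following is only one plausible way to write the four-part decomposition and the nuclear-norm-penalized objective it describes. All symbols here (the block indices i and j, the component names G, R, C, I, E, and the λ weights) are assumptions introduced for illustration.

```latex
% Assumed notation: X_{ij} is the data block for row group i and column group j.
% G = globally shared, R = row-shared, C = column-shared, I = single-matrix, E = noise.
X_{ij} = G_{ij} + R_{ij} + C_{ij} + I_{ij} + E_{ij},
\qquad i = 1,\dots,p,\; j = 1,\dots,q

% A penalized objective of the kind described, with one nuclear-norm term per component:
\min_{G,\,R,\,C,\,I} \;
  \tfrac{1}{2}\,\lVert X - G - R - C - I \rVert_F^2
  + \lambda_G \lVert G \rVert_*
  + \sum_{i=1}^{p} \lambda_{R_i} \lVert R_{i\cdot} \rVert_*
  + \sum_{j=1}^{q} \lambda_{C_j} \lVert C_{\cdot j} \rVert_*
  + \sum_{i=1}^{p} \sum_{j=1}^{q} \lambda_{I_{ij}} \lVert I_{ij} \rVert_*
```

With a single matrix (p = q = 1 and only one structural component), this would reduce to ordinary nuclear-norm penalization, which is the sense in which the objective "extends" the single-matrix case.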

6.
Recent advances in technology and associated methodology have made the current period one of the most exciting in molecular biology and medicine. Underlying these is an appreciation that modern research is driven by increasingly large amounts of data being interpreted by interdisciplinary collaborative teams, which are often geographically dispersed. The availability of cheap computing power, high-speed informatics networks and high-quality analysis software has been essential to this, as has the application of modern quality assurance methodologies. In this review, we discuss the application of modern 'High-Throughput' molecular biological technologies such as 'Microarrays' and 'Next Generation Sequencing' to scientific and biomedical research as we have observed it. We also offer guidance to help readers understand certain features of these technologies, as well as new strategies, and apply these i-Gene tools successfully in their own endeavours. Collectively, we term this 'i-Gene Analysis'. We also offer predictions as to the developments that are anticipated in the near and more distant future.

7.
Many of the steps in phylogenetic reconstruction can be confounded by “rogue” taxa, that is, taxa that cannot be placed with assurance anywhere within the tree and whose location within the tree varies with almost any choice of algorithm or parameters. Phylogenetic consensus methods, in particular, are known to suffer from this problem. In this paper, we provide a novel framework to define and identify rogue taxa. In this framework, we formulate a bicriterion optimization problem, the relative information criterion, that models the net increase in useful information present in the consensus tree when certain taxa are removed from the input data. We also provide an effective greedy heuristic to identify a subset of rogue taxa and use this heuristic in a series of experiments, with both pathological examples from the literature and a collection of large biological data sets. As the presence of rogue taxa in a set of bootstrap replicates can lead to deceptively poor support values, we propose a procedure to recompute support values in light of the rogue taxa identified by our algorithm; applying this procedure to our biological data sets caused a large number of edges to move from “unsupported” to “supported” status, indicating that many existing phylogenies should be recomputed and reevaluated to reduce any inaccuracies introduced by rogue taxa. We also discuss the implementation issues encountered while integrating our algorithm into RAxML v7.2.7, particularly those dealing with scaling up the analyses. This integration enables practitioners to benefit from our algorithm in the analysis of very large data sets (up to 2,500 taxa and 10,000 trees, although we present the results of even larger analyses).

8.
The structure and composition of forest ecosystems are expected to shift with climate‐induced changes in precipitation, temperature, fire, carbon mitigation strategies, and biological disturbance. These factors are likely to have biodiversity implications. However, climate‐driven forest ecosystem models used to predict changes to forest structure and composition are not coupled to models used to predict changes to biodiversity. We proposed integrating woodpecker response (biodiversity indicator) with forest ecosystem models. Woodpeckers are a good indicator species of forest ecosystem dynamics, because they are ecologically constrained by landscape‐scale forest components, such as composition, structure, disturbance regimes, and management activities. In addition, they are correlated with forest avifauna community diversity. In this study, we explore integrating woodpecker and forest ecosystem climate models. We review climate–woodpecker models and compare the predicted responses to observed climate‐induced changes. We identify inconsistencies between observed and predicted responses, explore the modeling causes, and identify the models pertinent to integration that address the inconsistencies. We found that predictions in the short term are not in agreement with observed trends for 7 of 15 evaluated species. Because niche constraints associated with woodpeckers are a result of complex interactions between climate, vegetation, and disturbance, we hypothesize that the lack of adequate representation of these processes in the current broad‐scale climate–woodpecker models results in model–data mismatch. As a first step toward improvement, we suggest a conceptual model of climate–woodpecker–forest modeling for integration. The integration model provides climate‐driven forest ecosystem modeling with a measure of biodiversity while retaining the feedback between climate and vegetation in woodpecker climate change modeling.

9.
Despite advances in treatment for glioblastoma multiforme (GBM), patient prognosis remains poor. Although there is growing evidence that molecular targeting could translate into better survival for GBM, current clinical data show limited impact on survival. Recent progress in GBM genomics implicates several activated pathways and numerous mutated genes. This molecular diversity can partially explain therapeutic resistance, and several approaches have been postulated to target molecular changes. Furthermore, most drugs are unable to reach effective concentrations within the tumor owing to elevated intratumoral pressure, restrictive vasculature and other limiting factors. Here, we describe the preclinical and clinical developments in treatment strategies of GBM. We review the current clinical trials for GBM and discuss the challenges and future directions of targeted therapies.

10.
11.
With the growing surge of biological measurements, the problem of integrating and analyzing different types of genomic measurements has become an immediate challenge for elucidating events at the molecular level. In order to address the problem of integrating different data types, we present a framework that locates variation patterns in two biological inputs based on the generalized singular value decomposition (GSVD). In this work, we jointly examine gene expression and copy number data and iteratively project the data on different decomposition directions defined by the projection angle θ in the GSVD. With the proper choice of θ, we locate similar and dissimilar patterns of variation between both data types. We discuss the properties of our algorithm using simulated data and conduct a case study with biologically verified results. Ultimately, we demonstrate the efficacy of our method on two genome-wide breast cancer studies to identify genes with large variation in expression and copy number across numerous cell line and tumor samples. Our method identifies genes that are statistically significant in both input measurements. The proposed method is useful for a wide variety of joint copy number and expression-based studies. Supplementary information is available online, including software implementations and experimental data.

12.
Comprehensive characterization of a gene's impact on phenotypes requires knowledge of the context of the gene. To address this issue, we introduce a systematic data integration method, Candidate Genes and SNPs (CANGES), that links SNP and linkage disequilibrium data to pathway and protein-protein interaction information. It can be used as a knowledge discovery tool to search for disease-associated causative variants from genome-wide studies, as well as to generate new hypotheses on synergistically functioning genes. We demonstrate the utility of CANGES by integrating pathway and protein-protein interaction data to identify putative functional variants for (i) the p53 gene and (ii) three glioblastoma multiforme (GBM) associated risk genes. For the GBM case, we further integrate the CANGES results with clinical and genome-wide data for 209 GBM patients and identify genes having effects on GBM patient survival. Our results show that selecting a focused set of genes can yield information beyond the traditional genome-wide association approaches. Taken together, a holistic approach to identifying possible interacting genes and SNPs with CANGES provides a means to rapidly identify networks for any set of genes and generate novel hypotheses. CANGES is available at http://csbi.ltdk.helsinki.fi/CANGES/

13.
The use of cross-species hybridization (CSH) to DNA microarrays, in which the target RNA and microarray probe are from different species, has increased in the past few years. CSH is used in comparative, evolutionary and ecological studies of closely related species, and for gene-expression profiling of many species that lack a representative microarray platform. However, unlike species-specific hybridization, CSH is still considered a non-standard use of microarrays. Here, we present the recent developments in the field of CSH for cDNA and oligomer microarray platforms. We discuss issues that influence the quality of CSH results, including platform choice, experiment design and data analysis, and suggest strategies that can lead to improvement of CSH studies to investigate species diversity.

14.
Treatment of BRAF mutant melanomas with specific BRAF inhibitors leads to tumor remission. However, most patients eventually relapse due to drug resistance. Therefore, we designed an integrated strategy using (phospho)proteomic and functional genomic platforms to identify drug targets whose inhibition sensitizes melanoma cells to BRAF inhibition. We found many proteins to be induced upon PLX4720 (BRAF inhibitor) treatment that are known to be involved in BRAF inhibitor resistance, including FOXD3 and ErbB3. Several proteins were down‐regulated, including Rnd3, a negative regulator of ROCK1 kinase. For our genomic approach, we performed two parallel shRNA screens using a kinome library to identify genes whose inhibition sensitizes to BRAF or ERK inhibitor treatment. By integrating our functional genomic and (phospho)proteomic data, we identified ROCK1 as a potential drug target for BRAF mutant melanoma. ROCK1 silencing increased melanoma cell elimination when combined with BRAF or ERK inhibitor treatment. Translating this to a preclinical setting, a ROCK inhibitor showed augmented melanoma cell death upon BRAF or ERK inhibition in vitro. These data merit exploration of ROCK1 as a target in combination with current BRAF mutant melanoma therapies.

15.
Today, toxicoproteomics still relies mainly on 2-DE followed by MS for detection and identification of proteins, which might characterize a certain state of disease, indicate toxicity or even predict carcinogenicity. We utilized the classical 2-DE/MS approach for the evaluation of early protein biomarkers that are predictive of chemically induced hepatocarcinogenesis in rats. We were able to identify statistically significantly deregulated proteins in N-nitrosomorpholine exposed rat liver tissue. Based on literature data, biological relevance in the early molecular process of hepatocarcinogenicity could be suggested for most of these potential biomarkers. However, in order to ensure reliable results and to create the prerequisites necessary for integration in routine toxicology studies in the future, these protein expression patterns need to be prevalidated using independent technology platforms. In the current study, we evaluated the usefulness of iTRAQ reagent technology (Applied Biosystems, Framingham, USA), a recently introduced MS-based protein quantitation method, for verification of the 2-DE/MS biomarkers. In summary, the regulation of 26 2-DE/MS derived protein biomarkers could be verified. Proteins like HSP 90-beta, annexin A5, ketohexokinase, N-hydroxyarylamine sulfotransferase, ornithine aminotransferase, and adenosine kinase showed highly comparable fold changes using both proteomic quantitation strategies. In addition, iTRAQ analysis delivered further potential biomarkers with biological relevance to the processes of hepatocarcinogenicity, e.g. the placental form of glutathione S-transferase (GST-P), carbonic anhydrase, and aflatoxin B1 aldehyde reductase. Our results show the usefulness of iTRAQ reagent technology both for biomarker prevalidation and for identification of further potential marker proteins that are indicative of hepatocarcinogenicity.

16.
Infection or cell damage triggers the release of pro-inflammatory cytokines such as interleukin (IL)-1α or β and tumor necrosis factor (TNF)α, which are key mediators of the host immune response. Following their identification and the elucidation of central signaling pathways, recent results show a highly complex crosstalk between various cytokines and their signaling effectors. The molecular mechanisms controlling signaling thresholds, signal integration and the function of feed-forward and feedback loops are currently being revealed by combining methods from biochemistry, genetics and in silico analysis. Evidence is mounting that defects in information processing circuits or their components can be causative for chronic or overshooting inflammation. As progress in biosciences has always benefitted from the use of well-studied model systems, research on inflammatory cytokines may function as a paradigm to reveal general principles of signal integration, crosstalk mechanisms and signaling networks.

17.
Nesvizhskii AI. Proteomics 2012, 12(10):1639-1655
Analysis of protein interaction networks and protein complexes using affinity purification and mass spectrometry (AP/MS) is among the most commonly used and successful applications of proteomics technologies. One of the foremost challenges of AP/MS data is the large number of false-positive protein interactions present in unfiltered data sets. Here we review computational and informatics strategies for detecting specific protein interaction partners in AP/MS experiments, with a focus on incomplete (as opposed to genome-wide) interactome mapping studies. These strategies range from standard statistical approaches, to empirical scoring schemes optimized for a particular type of data, to advanced computational frameworks. The common denominator among these methods is the use of label-free quantitative information such as spectral counts or integrated peptide intensities that can be extracted from AP/MS data. We also discuss related issues such as combining multiple biological or technical replicates, and dealing with data generated using different tagging strategies. Computational approaches for benchmarking of scoring methods are discussed, and the need for generation of reference AP/MS data sets is highlighted. Finally, we discuss the possibility of more extended modeling of experimental AP/MS data, including integration with external information such as protein interaction predictions based on functional genomics data.

18.
The ultimate goal of functional genomics is to define the function of all the genes in the genome of an organism. A large body of information on the biological roles of genes has been accumulated and aggregated in the past decades of research, both from traditional experiments detailing the role of individual genes and proteins, and from newer experimental strategies that aim to characterize gene function on a genomic scale. It is clear that the goal of functional genomics can only be achieved by integrating information and data sources from the variety of these different experiments. Integration of different data is thus an important challenge for bioinformatics. The integration of different data sources often helps to uncover non-obvious relationships between genes, but there are also two further benefits. First, it is likely that whenever information from multiple independent sources agrees, it should be more valid and reliable. Secondly, by looking at the union of multiple sources, one can cover larger parts of the genome. This is obvious for integrating results from multiple single gene or protein experiments, but also necessary for many of the results from genome-wide experiments since they are often confined to certain (although sizable) subsets of the genome. In this paper, we explore an example of such a data integration procedure. We focus on the prediction of membership in protein complexes for individual genes. For this, we recruit six different data sources that include expression profiles, interaction data, essentiality and localization information. Each of these data sources individually contains some weakly predictive information with respect to protein complexes, but we show how this prediction can be improved by combining all of them. Supplementary information is available at http://bioinfo.mbb.yale.edu/integrate/interactions/. Abbreviations: TP: true positive; TN: true negative; FP: false positive; FN: false negative; Y2H: yeast two-hybrid.
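
The abstract does not state how the six data sources are combined. As a sketch of why agreement among several weak, independent sources can sharpen a prediction, the example below multiplies likelihood ratios in a naive Bayes style. This combination rule, the function names, and every count in it are assumptions made for illustration; none of them come from the paper.

```python
import math


def likelihood_ratio(tp, fn, fp, tn):
    """Likelihood ratio of a positive call: sensitivity / false-positive rate."""
    sensitivity = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)
    return sensitivity / false_positive_rate


def combined_posterior(prior, ratios):
    """Posterior probability after combining sources assumed to be independent.

    Prior odds are multiplied by each source's likelihood ratio
    (done in log space), then converted back to a probability.
    """
    log_odds = math.log(prior / (1 - prior)) + sum(math.log(r) for r in ratios)
    return 1 / (1 + math.exp(-log_odds))


# Made-up benchmark counts for one weak source: TP=30, FN=70, FP=10, TN=90.
weak = likelihood_ratio(tp=30, fn=70, fp=10, tn=90)
prior = 0.01  # assumed prior probability that a gene belongs to a given complex

one_source = combined_posterior(prior, [weak])
three_sources = combined_posterior(prior, [weak, weak, weak])
print(round(one_source, 3), round(three_sources, 3))
```

Under these assumed numbers, a single weak source barely moves the 1% prior, while three agreeing sources raise it to roughly one in five. The independence assumption is a strong one and real data sources are often correlated, so this is only a rough picture of the benefit the abstract reports.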

19.

Metabolomics has advanced significantly in the past 10 years with important developments related to hardware, software and methodologies and an increasing complexity of applications. In discovery-based investigations, applying untargeted analytical methods, thousands of metabolites can be detected with no or limited prior knowledge of the metabolite composition of samples. In these cases, metabolite identification is required following data acquisition and processing. Currently, the process of metabolite identification in untargeted metabolomic studies is a significant bottleneck in deriving biological knowledge from metabolomic studies. In this review we highlight the different traditional and emerging tools and strategies applied to identify subsets of metabolites detected in untargeted metabolomic studies applying various mass spectrometry platforms. We indicate the workflows which are routinely applied and highlight the current limitations which need to be overcome to provide efficient, accurate and robust identification of metabolites in untargeted metabolomic studies. These workflows apply to the identification of metabolites, for which the structure can be assigned based on entries in databases, and for those which are not yet stored in databases and which require a de novo structure elucidation.


20.
The recently introduced term ‘integrative taxonomy’ refers to taxonomy that integrates all available data sources to frame species limits. We survey current taxonomic methods available to delimit species that integrate a variety of data, including molecular and morphological characters. A literature review of empirical studies using the term ‘integrative taxonomy’ assessed the kinds of data being used to frame species limits, and methods of integration. Almost all studies are qualitative and comparative – we are a long way from a repeatable, quantitative method of truly ‘integrative taxonomy’. The usual methods for integrating data in phylogenetic and population genetic paradigms are not appropriate for integrative taxonomy, either because of the diverse range of data used or because of the special challenges that arise when working at the species/population boundary. We identify two challenges that, if met, will facilitate the development of a more complete toolkit and a more robust research programme in integrative taxonomy using species tree approaches. We propose the term ‘iterative taxonomy’ for current practice that treats species boundaries as hypotheses to be tested with new evidence. A search for biological or evolutionary explanations for discordant evidence can be used to distinguish between competing species boundary hypotheses. We identify two recent empirical examples that use the process of iterative taxonomy.
