Similar literature
20 similar articles found.
1.
2.
3.
The decreasing cost of sequencing is leading to a growing repertoire of personal genomes. However, we are lagging behind in understanding the functional consequences of the millions of variants obtained from sequencing. Global system-wide effects of variants in coding genes are particularly poorly understood. It is known that while variants in some genes can lead to diseases, complete disruption of other genes, called ‘loss-of-function tolerant’, is possible with no obvious effect. Here, we build a systems-based classifier to quantitatively estimate the global perturbation caused by deleterious mutations in each gene. We first survey the degree to which gene centrality in various individual networks and a unified ‘Multinet’ correlates with the tolerance to loss-of-function mutations and evolutionary conservation. We find that functionally significant and highly conserved genes tend to be more central in physical protein-protein and regulatory networks. However, this is not the case for metabolic pathways, where the highly central genes have more duplicated copies and are more tolerant to loss-of-function mutations. Integration of three-dimensional protein structures reveals that the correlation with centrality in the protein-protein interaction network is also seen in terms of the number of interaction interfaces used. Finally, combining all the network and evolutionary properties allows us to build a classifier distinguishing functionally essential and loss-of-function tolerant genes with higher accuracy (AUC = 0.91) than any individual property. Application of the classifier to the whole genome shows its strong potential for interpretation of variants involved in Mendelian diseases and in complex disorders probed by genome-wide association studies.

4.
Next-generation sequencing (NGS) technologies generate thousands to millions of genetic variants per sample. Identification of potential disease-causal variants is labor intensive, as it relies on filtering using various annotation metrics and consideration of multiple pathogenicity prediction scores. We have developed VPOT (variant prioritization ordering tool), a Python-based command line tool that allows researchers to create a single fully customizable pathogenicity ranking score from any number of annotation values, each with a user-defined weighting. The use of VPOT can be informative when analyzing entire cohorts, as variants in a cohort can be prioritized. VPOT also provides additional functions to allow variant filtering based on a candidate gene list or by affected status in a family pedigree. VPOT outperforms similar tools in terms of efficacy, flexibility, scalability, and computational performance. VPOT is freely available for public use at GitHub (https://github.com/VCCRI/VPOT/). Documentation for installation along with a user tutorial, a default parameter file, and test data are provided.
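
The idea of a single ranking score built from weighted annotation values can be illustrated with a minimal Python sketch. This is not VPOT's actual scoring code or parameter format; the annotation names (`cadd_scaled`, `is_rare`), the weights, and the helper functions are hypothetical, and a plain weighted sum is assumed.

```python
from typing import Dict, List


def priority_score(annotations: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of annotation values; a missing annotation contributes 0."""
    return sum(w * annotations.get(name, 0.0) for name, w in weights.items())


def rank_variants(variants: List[dict], weights: Dict[str, float]) -> List[dict]:
    """Return the variants sorted from highest to lowest priority score."""
    return sorted(
        variants,
        key=lambda v: priority_score(v["annotations"], weights),
        reverse=True,
    )


# Hypothetical user-defined weights and annotated variants (illustration only).
weights = {"cadd_scaled": 2.0, "is_rare": 1.0}
variants = [
    {"id": "var_a", "annotations": {"cadd_scaled": 0.5, "is_rare": 1.0}},
    {"id": "var_b", "annotations": {"cadd_scaled": 0.9, "is_rare": 1.0}},
    {"id": "var_c", "annotations": {"cadd_scaled": 0.9}},  # no rarity annotation
]

ranked_ids = [v["id"] for v in rank_variants(variants, weights)]
print(ranked_ids)
```

Treating a missing annotation as zero is one possible design choice; a real tool might instead skip the variant or apply a user-defined default.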

5.
Hepatocellular carcinoma (HCC) is the world’s third most widespread cancer. Currently available circulating biomarkers for this silently progressing malignancy are not sufficiently specific and sensitive to meet all clinical needs. There is a pressing need for the identification of novel circulating biomarkers to increase the disease-free survival rate. In order to facilitate the selection of the most promising circulating protein biomarkers, we attempted to define an objective method likely to have a significant impact on the analysis of the vast data generated by cutting-edge technologies. The current study exploits data available in seven publicly accessible gene and protein databases, unveiling 731 liver-specific proteins through initial enrichment analysis. Verification of expression profiles, followed by integration of proteomic datasets enriched for the cancer secretome, filtered out 20 proteins, including 6 previously characterized circulating HCC biomarkers. Finally, interactome analysis of these proteins with midkine (MDK), dickkopf-1 (DKK-1), the current standard HCC biomarker alpha-fetoprotein (AFP), and its interacting partners, in conjunction with target filtration of HCC-specific circulating and liver-deregulated miRNAs, highlighted seven novel statistically significant putative biomarkers: complement component 8, alpha (C8A), mannose binding lectin (MBL2), antithrombin III (SERPINC1), 11β-hydroxysteroid dehydrogenase type 1 (HSD11B1), alcohol dehydrogenase 6 (ADH6), beta-ureidopropionase (UPB1) and cytochrome P450, family 2, subfamily A, polypeptide 6 (CYP2A6). Our proposed methodology provides a swift selection process for biomarker prioritization that ultimately reduces the economic burden of experimental evaluation. Further dedicated validation studies of the putative biomarkers on HCC patient blood samples are warranted.
We hope that the use of such an integrative secretome, interactome and miRNA target filtration approach will accelerate the selection of high-priority biomarkers for other diseases as well, yielding candidates that are more amenable to downstream clinical validation experiments.

6.
7.
The accumulation of various types of drug informatics data and computational approaches for drug repositioning can accelerate pharmaceutical research and development. However, the integration of multi-dimensional drug data for precision repositioning remains a pressing challenge. Here, we propose a systematic framework named PIMD to predict drug therapeutic properties by integrating multi-dimensional data for drug repositioning. In PIMD, drug similarity networks (DSNs) based on chemical, pharmacological, and clinical data are fused into an integrated DSN (iDSN) composed of many clusters. Rather than simple fusion, PIMD offers a systematic way to annotate clusters. Unexpected drugs within clusters and drug pairs with a high iDSN similarity score are therefore identified to predict novel therapeutic uses. PIMD provides new insights into the universality, individuality, and complementarity of different drug properties by evaluating the contribution of each type of property data. To test the performance of PIMD, we use chemical, pharmacological, and clinical properties to generate an iDSN. Analyses of the contributions of each drug property indicate that this iDSN is driven by all data types and performs better than other DSNs. Within the top 20 recommended drug pairs, 7 drugs have been reported to be repurposed. The source code for PIMD is available at https://github.com/Sepstar/PIMD/.

8.
Errors in sample annotation or labeling often occur in large-scale genetic or genomic studies and are difficult to avoid completely during data generation and management. For integrative genomic studies, it is critical to identify and correct these errors. Different types of genetic and genomic data are inter-connected by cis-regulations. On that basis, we developed a computational approach, Multi-Omics Data Matcher (MODMatcher), to identify and correct sample labeling errors in multiple types of molecular data, which can be used in further integrative analysis. Our results indicate that inspection of sample annotation and labeling error is an indispensable data quality assurance step. Applied to a large lung genomic study, MODMatcher increased statistically significant genetic associations and genomic correlations by more than two-fold. In a simulation study, MODMatcher provided more robust results by using three types of omics data than two types of omics data. We further demonstrate that MODMatcher can be broadly applied to large genomic data sets containing multiple types of omics data, such as The Cancer Genome Atlas (TCGA) data sets.

9.
Array CGH for the detection of genomic copy number variants has replaced G-banded karyotype analysis. This paper describes the technology and its application in a clinical diagnostic service laboratory. DNA extracted from a patient’s sample (blood, saliva or other tissue types) is labeled with a fluorochrome (either cyanine 5 or cyanine 3). A reference DNA sample is labeled with the opposite fluorochrome. There follows a cleanup step to remove unincorporated nucleotides before the labeled DNAs are mixed and resuspended in a hybridization buffer and applied to an array comprising ~60,000 oligonucleotide probes from loci across the genome, with high probe density in clinically important areas. Following hybridization, the arrays are washed, then scanned and the resulting images are analyzed to measure the red and green fluorescence for each probe. Software is used to assess the quality of each probe measurement, calculate the ratio of red to green fluorescence and detect potential copy number variants.
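
The final analysis step described above, computing a red-to-green ratio per probe and flagging possible copy number changes, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the software used in the paper: the patient DNA is assumed to be in the red channel, the log2 thresholds of ±0.3 are hypothetical, and real pipelines normally segment runs of adjacent probes rather than calling each probe on its own.

```python
import math
from typing import List


def log2_ratios(red: List[float], green: List[float]) -> List[float]:
    """Per-probe log2(red / green); 0 means equal signal in both channels."""
    return [math.log2(r / g) for r, g in zip(red, green)]


def call_probes(ratios: List[float], gain: float = 0.3, loss: float = -0.3) -> List[str]:
    """Label each probe using simple (hypothetical) log2-ratio thresholds."""
    calls = []
    for x in ratios:
        if x >= gain:
            calls.append("gain")
        elif x <= loss:
            calls.append("loss")
        else:
            calls.append("normal")
    return calls


# Made-up fluorescence intensities for four probes (patient = red, reference = green).
red = [1000.0, 1500.0, 500.0, 1000.0]
green = [1000.0, 1000.0, 1000.0, 1100.0]

ratios = log2_ratios(red, green)
calls = call_probes(ratios)
print(list(zip(ratios, calls)))
```

On a log2 scale, a single-copy gain in a diploid region (3 copies versus 2) gives a ratio of about 0.58 and a single-copy loss (1 versus 2) gives -1, which is why symmetric thresholds around zero are a common starting point.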

10.
Understanding the mechanisms and the temporal and spatial evolution of the penumbra following an ischemic stroke is crucially important for developing therapeutics aimed at preventing this area from evolving towards infarction. To help in integrating the available data, we decided to build a formal model. We first collected and categorised the major available evidence from animal models and human observations and summarized this knowledge in a flow-chart with the potential key components of an evolving stroke. Components were grouped in ten sub-models that could be modelled and tested independently: the sub-models of tissue reactions, ionic movements, oedema development, glutamate excitotoxicity, spreading depression, NO synthesis, inflammation, necrosis, apoptosis, and reperfusion. Then, we determined markers, identified mediators and chose the level of complexity at which to model these sub-models. We first applied this integrative approach to build a model based on cytotoxic oedema development following a stroke. Although this model includes only three sub-models and would need to integrate more mechanisms in each of these sub-models, the characteristics and the temporal and spatial evolution of the penumbra obtained by simulation are qualitatively and, to some extent, quantitatively consistent with those observed using medical imaging after a permanent occlusion or after an occlusion followed by a reperfusion.

11.
Laboratory inbred mouse models are a valuable resource to identify quantitative trait loci (QTL) for complex reproductive performance traits. Advances in mouse genomics and high density single nucleotide polymorphism mapping have enabled genome-wide association studies to identify genes linked with specific phenotypes. Gene expression profiles of reproductive tissues also provide potentially useful information for identifying genes that play an important role. We have developed a highly fecund inbred strain, QSi5, with accompanying genotyping for comparative analysis of reproductive performance. Here we analyzed the QSi5 phenotype using a comparative analysis with fecundity data derived from 22 inbred strains of mice from the Mouse Phenome Project, and integration with published expression data from mouse ovary development. Using a haplotype association approach, 400 fecundity-associated regions (FDR < 0.05) with 499 underlying genes were identified. The most significant associations were located on Chromosomes 14, 8, and 6, and the genes underlying these regions were extracted. When these genes were analyzed for expression in an ovarian development profile (GSE6916), several distinctive co-expression patterns across each developmental stage were identified. The genetic analysis also refined 21 fecundity-associated intervals on Chromosomes 1, 6, 9, 13, and 17 that overlapped with previously reported reproductive performance QTL. The combined use of phenotypic and in silico data with an integrative genomic analysis provides a powerful tool for elucidating the molecular mechanisms underlying fecundity.
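
The abstract reports regions passing an FDR < 0.05 threshold but does not say how the false discovery rate was controlled. As one common possibility, the Benjamini-Hochberg adjustment can be sketched as follows; the choice of this procedure and the example p-values are assumptions for illustration, not details taken from the study.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up association p-values for four regions.
pvals = [0.001, 0.04, 0.03, 0.5]
qvals = benjamini_hochberg(pvals)
significant = [q < 0.05 for q in qvals]
print(qvals, significant)
```

Note that raw p-values of 0.03 and 0.04 no longer pass once the correction is applied to this small made-up set, which is the practical effect of controlling the FDR across many tested regions.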

12.
Planning for Restoration: A Decision Analysis Approach to Prioritization   Total citations: 2, self-citations: 0, citations by others: 2
Ecological restoration often relies on the use of expert opinion to make management decisions in the face of uncertainty. The quantification of expert opinion can be difficult, especially when more than one expert is consulted and experts are not in agreement. Decision analysis can provide a framework to systematically deconstruct a complex problem and provide greater objectivity to restoration decisions. We utilized decision analysis techniques to identify restoration objectives and to quantify expert opinions to prioritize restoration activities at 112 prairie openings in the Edge of Appalachia Preserve in southern Ohio, U.S.A. We first created an objectives hierarchy to model how decision‐makers decide which prairies to manage. We then determined how to measure each component of the hierarchy and sampled all prairies for percent woody cover, geology, indicator species index (an index of plant species richness), slope, aspect, and distance to nearest prairie. We modeled seven different experts’ preferences for managing prairies with varying values for each of these ecological measures. We then interviewed the same decision‐makers to determine relative weights for each component of the objectives hierarchy using trade‐off analysis. By combining the weights, preference relationships, and sampling data, we were able to rank each prairie and management unit based on its management priority. Experts had similar preferences except for the measure of distance to nearest prairie. We found that decision‐makers gave different weights to each of the different components of the hierarchy. Generally, experts weighted percent woody cover, indicator species index, and geology more highly than slope, aspect, and distance to nearest prairie. Despite these differences, priorities for management, once all factors were weighted and combined, were similar.
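
Combining weights, preference relationships, and sampling data into a ranking can be sketched as a weighted additive value model. The sketch below is hypothetical: it assumes linear preference functions, invents the measure names, scales, and weights, and assumes that lower woody cover and a higher species index both raise priority, none of which is stated in the abstract.

```python
from typing import Dict, Tuple


def linear_preference(value: float, worst: float, best: float) -> float:
    """Map a raw measure onto 0..1 (worst -> 0, best -> 1), clamped to that range."""
    frac = (value - worst) / (best - worst)
    return max(0.0, min(1.0, frac))


def site_priority(
    measures: Dict[str, float],
    scales: Dict[str, Tuple[float, float]],
    weights: Dict[str, float],
) -> float:
    """Weighted average of per-measure preferences, normalised by total weight."""
    total_weight = sum(weights.values())
    weighted = sum(
        weights[name] * linear_preference(measures[name], *scales[name])
        for name in weights
    )
    return weighted / total_weight


# Hypothetical (worst, best) scales and expert weights.
scales = {"woody_cover_pct": (100.0, 0.0), "species_index": (0.0, 10.0)}
weights = {"woody_cover_pct": 0.6, "species_index": 0.4}

# Made-up field measurements for two prairie openings.
prairies = {
    "prairie_a": {"woody_cover_pct": 20.0, "species_index": 5.0},
    "prairie_b": {"woody_cover_pct": 80.0, "species_index": 9.0},
}

scores = {name: site_priority(m, scales, weights) for name, m in prairies.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

Running the same function with each expert's own weights and comparing the resulting rankings is one simple way to check whether disagreement about weights actually changes the priorities.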

13.
Next-generation sequencing (NGS) technologies provide the potential for developing high-throughput and low-cost platforms for clinical diagnostics. A limiting factor to clinical applications of genomic NGS is downstream bioinformatics analysis for data interpretation. We have developed an integrated approach for end-to-end clinical NGS data analysis from variant detection to functional profiling. Robust bioinformatics pipelines were implemented for genome alignment, single nucleotide polymorphism (SNP), small insertion/deletion (InDel), and copy number variation (CNV) detection of whole exome sequencing (WES) data from the Illumina platform. Quality-control metrics were analyzed at each step of the pipeline by use of a validated training dataset to ensure data integrity for clinical applications. We annotate the variants with data regarding the disease population and variant impact. Custom algorithms were developed to filter variants based on criteria, such as quality of variant, inheritance pattern, and impact of variant on protein function. The developed clinical variant pipeline links the identified rare variants to Integrated Genome Viewer for visualization in a genomic context and to the Protein Information Resource’s iProXpress for rich protein and disease information. With the application of our system of annotations, prioritizations, inheritance filters, and functional profiling and analysis, we have created a unique methodology for downstream variant filtering that empowers clinicians and researchers to interpret more effectively the relevance of genomic alterations within a rare genetic disease.

14.
15.
Cancer is a heterogeneous disease caused by diverse genomic alterations in oncogenes and tumor suppressor genes. Despite recent advances in high-throughput sequencing technologies and development of targeted therapies, novel cancer drug development is limited due to the high attrition rate from clinical studies. Patient-derived xenografts (PDX), which are established by the transfer of patient tumors into immunodeficient mice, serve as a platform for co-clinical trials by enabling the integration of clinical data, genomic profiles, and drug responsiveness data to determine precisely targeted therapies. PDX models retain many of the key characteristics of patients’ tumors including histology, genomic signature, cellular heterogeneity, and drug responsiveness. These models can also be applied to the development of biomarkers for drug responsiveness and personalized drug selection. This review summarizes our current knowledge of this field, including methodologic aspects, applications in drug development, challenges and limitations, and utilization for precision cancer medicine.

16.
17.
18.
19.
20.
Fanconi anemia (FA) is a heterogeneous recessive disorder associated with a markedly elevated risk of developing cancer. To date, sixteen FA genes have been identified, three of which predispose heterozygous mutation carriers to breast cancer. The FA proteins work together in a genome maintenance pathway, the so-called FA/BRCA pathway, which is important during the S phase of the cell cycle. Since not all FA patients can be linked to (one of) the sixteen known complementation groups, new FA genes remain to be identified. In addition, the complex FA network remains to be further unravelled. One of the FA genes, FANCI, has been identified via a combination of bioinformatic techniques exploiting FA protein properties and genetic linkage. The aim of this study was to develop a prioritization approach for proteins of the entire human proteome that potentially interact with the FA/BRCA pathway or are novel candidate FA genes. To this end, we combined the original bioinformatics approach based on the properties of the first thirteen FA proteins identified with publicly available tools for protein-protein interactions, literature mining (Nermal) and a protein function prediction tool (FuncNet). Importantly, the three newest FA proteins FANCO/RAD51C, FANCP/SLX4, and XRCC2 displayed scores in the range of the already known FA proteins. Likewise, a prime candidate FA gene based on next generation sequencing and having a very low score was subsequently disproven by functional studies for the FA phenotype. Furthermore, the approach strongly enriches for GO terms such as DNA repair, response to DNA damage stimulus, and cell cycle-regulated genes. Additionally, overlaying the top 150 with a haploinsufficiency probability score renders the approach more tailored for identifying breast cancer related genes. This approach may be useful for prioritization of putative novel FA or breast cancer genes from next generation sequencing efforts.
