Similar literature
 20 similar articles found
1.
Driver mutations are somatic mutations that provide a growth advantage to tumor cells, while passenger mutations are those not functionally related to oncogenesis. Distinguishing drivers from passengers is challenging because drivers occur much less frequently than passengers, they tend to have low prevalence, and their functions are multifactorial and not intuitively obvious. Missense mutations are excellent candidates as drivers, as they occur more frequently and are potentially easier to identify than other types of mutations. Although several methods have been developed for predicting the functional impact of missense mutations, only a few have been specifically designed for identifying driver mutations. As more mutations are being discovered, more accurate predictive models can be developed using machine learning approaches that systematically characterize the commonality and peculiarity of missense mutations against the background of specific cancer types. Here, we present a cancer driver annotation (CanDrA) tool that predicts missense driver mutations based on a set of 95 structural and evolutionary features computed by over 10 functional prediction algorithms such as CHASM, SIFT, and MutationAssessor. Through feature optimization and supervised training, CanDrA outperforms existing tools in analyzing the glioblastoma multiforme and ovarian carcinoma data sets in The Cancer Genome Atlas and the Cancer Cell Line Encyclopedia project.

2.
Cancer genome sequencing has shown that driver genes can often be distinguished not only by an elevated mutation frequency but also by specific nucleotide positions that accumulate changes at a high rate. However, properties associated with a residue's potential to drive tumorigenesis when mutated have not yet been systematically investigated. Here, using a novel methodological approach, we identify and characterize a compendium of 180 hotspot residues within 160 human proteins which occur with a significant frequency and are likely to have functionally relevant impact. We find that such mutations (i) are more prominent in proteins that can exist in the on and off state, (ii) reflect the identity of a tumor of origin, and (iii) often localize within interfaces which mediate interactions with other proteins or ligands. Following this, we further examine structural data for human protein complexes and identify a number of additional protein interfaces that accumulate cancer mutations at a high rate. Jointly, these analyses suggest that disruption and dysregulation of protein interactions can be instrumental in switching functions of cancer proteins and activating downstream changes.

3.
Cancer is a genetic disease that results from a variety of genomic alterations. Identification of some of these causal genetic events has enabled the development of targeted therapeutics and spurred efforts to discover the key genes that drive cancer formation. Rapidly improving sequencing and genotyping technology continues to generate increasingly large datasets that require analytical methods to identify functional alterations that deserve additional investigation. This review examines statistical and computational approaches for the identification of functional changes among sets of single-nucleotide substitutions. Frequency-based methods identify the most highly mutated genes in large-scale cancer sequencing efforts while bioinformatics approaches are effective for independent evaluation of both non-synonymous mutations and polymorphisms. We also review current knowledge and tools that can be utilized for analysis of alterations in non-protein-coding genomic sequence.

4.
Bioinformatic tools are widely utilized to predict functional single nucleotide polymorphisms (SNPs) for genotyping in molecular epidemiological studies. However, the extent to which these approaches are mirrored by epidemiological findings has not been fully explored. In this study, we first surveyed SNPs examined in case-control studies of lung cancer, the most extensively studied cancer type. We then computed SNP functional scores using four popular bioinformatics tools: SIFT, PolyPhen, SNPs3D, and PMut, and determined their predictive potential using the odds ratios (ORs) reported. Spearman's correlation coefficients (r) between the SNP scores from SIFT, PolyPhen, SNPs3D, and PMut and the summary ORs were r=-0.36 (p=0.007), r=0.25 (p=0.068), r=-0.20 (p=0.205), and r=-0.12 (p=0.370), respectively. By creating a combined score using information from all four tools we were able to achieve a correlation coefficient of r=0.51 (p<0.001). These results indicate that scores of predicted functionality could explain a certain fraction of the lung cancer risk detected in genetic association studies and more accurate predictions may be obtained by combining information from a variety of tools. Our findings suggest that bioinformatic tools are useful in predicting SNP functionality and may facilitate future genetic epidemiological studies.
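
The rank correlation reported in this abstract can be illustrated with a minimal sketch. The function names `average_ranks` and `spearman` and all numbers below are hypothetical illustrations, not data or code from the study; the sketch only shows how a Spearman coefficient between per-SNP tool scores and odds ratios might be computed, using average ranks for ties.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's r: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up example values (NOT from the study): one tool score and one
# summary odds ratio per SNP.
tool_scores = [0.01, 0.20, 0.05, 0.80, 0.45]
odds_ratios = [1.60, 1.10, 1.35, 0.95, 1.05]
r = spearman(tool_scores, odds_ratios)
```

A negative coefficient for a SIFT-like score would be unsurprising, since lower SIFT scores indicate substitutions predicted to be less tolerated.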

5.
Breast cancer is one of the most common cancers among women worldwide. Several genes are known to be responsible for conferring susceptibility to breast cancer. Among them, TP53 is one of the major genetic risk factors and is known to be mutated in many breast tumor types. TP53 mutations in breast cancer are known to be related to a poor prognosis and chemoresistance. This renders them a promising molecular target for the treatment of breast cancer. In this study, we present a computational screening and molecular dynamics simulation of breast cancer associated deleterious non-synonymous single nucleotide polymorphisms in TP53. We have predicted three deleterious coding non-synonymous single nucleotide polymorphisms rs11540654 (R110P), rs17849781 (P278A) and rs28934874 (P151T) in TP53 with a phenotype in breast tumors using the computational tools SIFT, Polyphen-2 and MutDB. We have performed molecular dynamics simulations to study the structural and dynamic effects of these TP53 mutations in comparison to the wild-type protein. Results from our simulations revealed a detailed consequence of the mutations on the p53 DNA-binding core domain that may provide insight for therapeutic approaches in breast cancer.

6.
As large-scale re-sequencing of genomes reveals many protein mutations, especially in human cancer tissues, prediction of their likely functional impact becomes an important practical goal. Here, we introduce a new functional impact score (FIS) for amino acid residue changes using evolutionary conservation patterns. The information in these patterns is derived from aligned families and sub-families of sequence homologs within and between species using a combinatorial entropy formalism. The score performs well on a large set of human protein mutations in separating disease-associated variants (∼19 200), assumed to be strongly functional, from common polymorphisms (∼35 600), assumed to be weakly functional (area under the receiver operating characteristic curve of ∼0.86). In cancer, using recurrence, multiplicity and annotation for ∼10 000 mutations in the COSMIC database, the method does well in assigning higher scores to more likely functional mutations (‘drivers’). To guide experimental prioritization, we report a list of about 1000 top human cancer genes frequently mutated in one or more cancer types ranked by likely functional impact; and, an additional 1000 candidate cancer genes with rare but likely functional mutations. In addition, we estimate that at least 5% of cancer-relevant mutations involve a switch of function, rather than simply loss or gain of function.
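
The area under the ROC curve quoted in this abstract has a simple pairwise interpretation: it is the probability that a randomly chosen disease-associated variant scores higher than a randomly chosen polymorphism. The sketch below illustrates that definition only; the function name `roc_auc` and the example scores are hypothetical and are not taken from the FIS method or its data.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is the quadratic-time
    definition, adequate for a small illustration.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Made-up impact scores (NOT from the study).
disease_scores = [3.1, 2.4, 1.9, 2.8]
polymorphism_scores = [0.7, 2.0, 1.2, 0.3]
auc = roc_auc(disease_scores, polymorphism_scores)
```

An AUC of 0.5 corresponds to a score that ranks the two classes no better than chance, and 1.0 to perfect separation.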

7.
Cancer is a genetic disease that develops through a series of somatic mutations, a subset of which drive cancer progression. Although cancer genome sequencing studies are beginning to reveal the mutational patterns of genes in various cancers, identifying the small subset of “causative” mutations from the large subset of “non-causative” mutations, which accumulate as a consequence of the disease, is a challenge. In this article, we present an effective machine learning approach for identifying cancer-associated mutations in human protein kinases, a class of signaling proteins known to be frequently mutated in human cancers. We evaluate the performance of 11 well-known supervised learners and show that a multiple-classifier approach, which combines the performances of individual learners, significantly improves the classification of known cancer-associated mutations. We introduce several novel features related specifically to structural and functional characteristics of protein kinases and find that the level of conservation of the mutated residue at specific evolutionary depths is an important predictor of oncogenic effect. We consolidate the novel features and the multiple-classifier approach to prioritize and experimentally test a set of rare unconfirmed mutations in the epidermal growth factor receptor tyrosine kinase (EGFR). Our studies identify T725M and L861R as rare cancer-associated mutations inasmuch as these mutations increase EGFR activity in the absence of the activating EGF ligand in cell-based assays.

8.
Predicting the phenotypes of missense mutations uncovered by large‐scale sequencing projects is an important goal in computational biology. High‐confidence predictions can be an aid in focusing experimental and association studies on those mutations most likely to be associated with causative relationships between mutation and disease. As an aid in developing these methods further, we have derived a set of random mutations of the enzymatic domains of human cystathionine beta synthase. This enzyme is a dimeric protein that catalyzes the condensation of serine and homocysteine to produce cystathionine. Yeast missing this enzyme cannot grow on medium lacking a source of cysteine, while transfection of functional human CBS into yeast strains missing endogenous enzyme can successfully complement for the missing gene. We used PCR mutagenesis with error‐prone Taq polymerase to produce 948 colonies and compared cell growth in the presence or absence of a cysteine source as a measure of CBS function. We were able to infer the phenotypes of 204 single‐site mutants, 79 of them deleterious and 125 neutral. This set was used to test the accuracy of six publicly available prediction methods for phenotype prediction of missense mutations: SIFT, PolyPhen, PMut, SNPs3D, PhD‐SNP, and nsSNPAnalyzer. The top methods are PolyPhen, SIFT, and nsSNPAnalyzer, which have similar performance. Using kernel discriminant functions, we found that the difference in position‐specific scoring matrix values is more predictive than the wild‐type PSSM score alone, and that the relative surface area in the biologically relevant complex is more predictive than that of the monomeric proteins. Proteins 2010. © 2010 Wiley‐Liss, Inc.

9.
We assessed the disease-causing potential of single nucleotide polymorphisms (SNPs) based on a simple set of sequence-based features. We focused on SNPs from the dbSNP database in G-protein-coupled receptors (GPCRs), a large class of important transmembrane (TM) proteins. Apart from the location of the SNP in the protein, we evaluated the predictive power of three major classes of features to differentiate between disease-causing mutations and neutral changes: (i) properties derived from amino-acid scales, such as volume and hydrophobicity; (ii) position-specific phylogenetic features reflecting evolutionary conservation, such as normalized site entropy, residue frequency and SIFT score; and (iii) substitution-matrix scores, such as those derived from the BLOSUM62, GRANTHAM and PHAT matrices. We validated our approach using a control dataset consisting of known disease-causing mutations and neutral variations. Logistic regression analyses indicated that position-specific phylogenetic features that describe the conservation of an amino acid at a specific site are the best discriminators of disease mutations versus neutral variations, and integration of all our features improves discrimination power. Overall, we identify 115 SNPs in GPCRs from dbSNP that are likely to be associated with disease and thus are good candidates for genotyping in association studies.
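
One of the conservation features named in this abstract, normalized site entropy, can be sketched as the Shannon entropy of a single alignment column divided by its maximum possible value. The abstract does not give the exact normalization, so dividing by log2(20) (for the 20 standard amino acids) and ignoring gap characters are assumptions of this sketch, and the function name `normalized_site_entropy` is hypothetical.

```python
from collections import Counter
from math import log2


def normalized_site_entropy(column, alphabet_size=20):
    """Shannon entropy of one alignment column, scaled to the range [0, 1].

    `column` is a string or list of one-letter residues, one per aligned
    sequence. Gap characters ('-') are skipped (an assumption of this sketch).
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        raise ValueError("column contains no residues")
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * log2(n / total) for n in counts.values())
    return entropy / log2(alphabet_size)


# Made-up alignment columns (NOT from the study).
conserved = normalized_site_entropy("RRRRRRRR")   # fully conserved site
variable = normalized_site_entropy("RKQ-EDNS")    # highly variable site
```

Values near 0 indicate a strongly conserved position, where a substitution is more likely to be disruptive; values closer to 1 indicate a position that tolerates many residues.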

10.
Neuronal ceroid lipofuscinoses are a group of fatal progressive neurodegenerative diseases predominantly affecting children. Identification of mutations that cause neuronal ceroid lipofuscinosis, and subsequent functional and pathological studies of the affected genes, underpins efforts to investigate disease mechanisms and identify and test potential therapeutic strategies. These functional studies and pre-clinical trials necessitate the use of model organisms in addition to cell and tissue culture models as they enable the study of protein function within a complex organ such as the brain and the testing of therapies on a whole organism. To this end, a large number of disease models and genetic tools have been identified or created in a variety of model organisms. In this review, we will discuss the ethical issues associated with experiments using model organisms, the factors underlying the choice of model organism, the disease models and genetic tools available, and the contributions of those disease models and tools to neuronal ceroid lipofuscinosis research. This article is part of a Special Issue entitled: The Neuronal Ceroid Lipofuscinoses or Batten Disease.

11.
Protein kinases are the most common protein domains implicated in cancer, where somatically acquired mutations are known to be functionally linked to a variety of cancers. Resequencing studies of protein kinase coding regions have emphasized the importance of sequence and structure determinants of cancer-causing kinase mutations in understanding the mutation-dependent activation process. We have developed an integrated bioinformatics resource, which consolidated and mapped all currently available information on genetic modifications in protein kinase genes with sequence, structure and functional data. The integration of diverse data types provided a convenient framework for kinome-wide study of sequence-based and structure-based signatures of cancer mutations. The database-driven analysis has revealed a differential enrichment of SNP categories in functional regions of the kinase domain, demonstrating that a significant number of cancer mutations could fall at structurally equivalent positions (mutational hotspots) within the catalytic core. We have also found that structurally conserved mutational hotspots can be shared by multiple kinase genes and are often enriched by cancer driver mutations with high oncogenic activity. Structural modeling and energetic analysis of the mutational hotspots have suggested a common molecular mechanism of kinase activation by cancer mutations, and have allowed the experimental data to be reconciled. According to the proposed mechanism, the structural effect of kinase mutations with a high oncogenic potential may manifest in a significant destabilization of the autoinhibited kinase form, which is likely to drive tumorigenesis at some level. Structure-based functional annotation and prediction of cancer mutation effects in protein kinases can facilitate an understanding of the mutation-dependent activation process and inform experimental studies exploring molecular pathology of tumorigenesis.

12.
Li BQ, Huang T, Liu L, Cai YD, Chou KC. PLoS ONE 2012, 7(4): e33393
One of the most important and challenging problems in biomedicine and genomics is how to identify disease genes. In this study, we developed a computational method to identify colorectal cancer-related genes based on (i) the gene expression profiles, and (ii) the shortest path analysis of functional protein association networks. The former has been used to select differentially expressed genes as disease genes for quite a long time, while the latter has been widely used to study the mechanism of diseases. With the existing protein-protein interaction data from STRING (Search Tool for the Retrieval of Interacting Genes), a weighted functional protein association network was constructed. By means of the mRMR (Maximum Relevance Minimum Redundancy) approach, six genes were identified that can distinguish colorectal tumors from normal adjacent colonic tissues based on their gene expression profiles. Meanwhile, according to the shortest path approach, we further found an additional 35 genes, of which some have been reported to be relevant to colorectal cancer and some are very likely to be relevant to it. Interestingly, the genes we identified from both the gene expression profiles and the functional protein association network include more cancer genes than the genes identified from the gene expression profiles alone. In addition, these genes also had greater functional similarity with the reported colorectal cancer genes than the genes identified from the gene expression profiles alone. All these indicate that our method as presented in this paper is quite promising. The method may become a useful tool, or at least play a complementary role to existing methods, for identifying colorectal cancer genes. It has not escaped our notice that the method can be applied to identify the genes of other diseases as well.
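
The shortest path step described in this abstract can be sketched with Dijkstra's algorithm on a small weighted graph. This is a generic illustration, not the authors' implementation: the node names, the edge weights, and the idea of deriving a weight as 1000 minus an interaction confidence score are all assumptions made for the example, as is the function name `shortest_path`.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a dict-of-dicts graph with non-negative weights.

    Returns (path, cost), or (None, inf) if the target is unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry for an already settled node
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(heap, (nd, neighbor))
    if target not in dist:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]


# Hypothetical network: weights assumed to be 1000 minus a confidence score,
# so that high-confidence interactions are "short" edges.
network = {
    "geneA": {"geneB": 100, "geneC": 700},
    "geneB": {"geneA": 100, "geneC": 200},
    "geneC": {"geneA": 700, "geneB": 200},
}
path, cost = shortest_path(network, "geneA", "geneC")
# Interior nodes of the path would be the candidate genes in this sketch.
candidates = path[1:-1]
```

Under these assumptions, genes lying on shortest paths between already selected genes are the natural candidates to report, which is one plausible reading of how additional genes could be found.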

13.
Fibroblast growth factor receptors (FGFRs) are recurrently altered by single nucleotide variants (SNVs) in many human cancers. The prevalence of SNVs in FGFRs depends on the cancer type. In some tumors, such as urothelial carcinoma, mutations of FGFRs occur at very high frequency (up to 60%). Many characterized mutations occur in the extracellular or transmembrane domains, while fewer known mutations are found in the kinase domain. In this study, we performed a bioinformatics analysis to identify novel putative cancer driver or therapeutically actionable mutations of the kinase domain of FGFRs. To pinpoint those mutations that may be clinically relevant, we exploited the recurrence of alterations on analogous amino acid residues within the kinase domain (PK_Tyr_Ser-Thr) of different kinases as a predictor of functional impact. By exploiting the MutationAligner and LowMACA bioinformatics resources, we highlighted novel uncharacterized mutations of FGFRs which recur in other protein kinases. By revealing unanticipated correspondence with known variants, we were able to infer their functional effects, as alterations clustering on similar residues in analogous proteins have a high probability of eliciting similar effects. As FGFRs represent an important class of oncogenes and drug targets, our study opens the way for further studies to validate their driver and/or actionable nature and, in the long term, for a more efficacious application of precision oncology.

14.
The promise of personalized cancer medicine cannot be fulfilled until we gain better understanding of the connections between the genomic makeup of a patient's tumor and its response to anticancer drugs. Several datasets that include both pharmacologic profiles of cancer cell lines as well as their genomic alterations have been recently developed and extensively analyzed. However, most analyses of these datasets assume that mutations in a gene will have the same consequences regardless of their location. While this assumption might be correct in some cases, such analyses may miss subtler, yet still relevant, effects mediated by mutations in specific protein regions. Here we study such perturbations by separating effects of mutations in different protein functional regions (PFRs), including protein domains and intrinsically disordered regions. Using this approach, we have been able to identify 171 novel associations between mutations in specific PFRs and changes in the activity of 24 drugs that couldn't be recovered by traditional gene-centric analyses. Our results demonstrate how focusing on individual protein regions can provide novel insights into the mechanisms underlying the drug sensitivity of cancer cell lines. Moreover, while these new correlations are identified using only data from cancer cell lines, we have been able to validate some of our predictions using data from actual cancer patients. Our findings highlight how gene-centric experiments (such as systematic knock-out or silencing of individual genes) are missing relevant effects mediated by perturbations of specific protein regions. All the associations described here are available from http://www.cancer3d.org.

15.
A large number of mutations have been reported in the SCO2 (synthesis of cytochrome c oxidase) gene in association with COX deficiency reported in different diseases such as cardioencephalomyopathy, cardiomyopathy and Leigh syndrome. However, very few of these mutations have been functionally analyzed. The SCO2 gene encodes an essential assembly factor for the formation of cytochrome c oxidase (COX). It is a nuclear encoded protein that helps in the transfer of copper ions to COX. This study is an attempt to understand the possible effect of these mutations on the structure and function of the SCO2 protein, by using different in silico tools. As per the Human Gene Mutation Database, a total of 11 nonsynonymous variations have been reported in the SCO2 gene. Among these 11 variations, only E140K and R171W are functionally proven to cause COX deficiency. They have been used as controls in this study. The remaining variations were further analyzed using the ClustalW, SIFT, PolyPhen-2, GOR4, MuPro and Panther software. As compared to the results of the controls, most of these variations were predicted to affect the structure of the SCO2 protein and hence may cause COX dysfunction. Thus, we hypothesize that these variations have the potential to result in a disease phenotype and should be investigated by subsequent functional analyses. This will help in an appropriate diagnosis and management of the wide spectrum of COX deficiency diseases.

16.
Precision oncology is premised on identifying and drugging proteins and pathways that drive tumorigenesis or are required for survival of tumor cells. Across diverse cancer types, the signaling pathway emanating from receptor tyrosine kinases on the cell surface to RAS and the MAP kinase pathway is the most frequent target of oncogenic mutations, and key proteins in this signaling axis including EGFR, SHP2, RAS, BRAF, and MEK have long been a focus in cancer drug discovery. In this review, we provide an overview of historical and recent efforts to develop inhibitors targeting these nodes with an emphasis on the role that an understanding of protein structure and regulation has played in inhibitor discovery and characterization. Beyond its well‐established role in structure‐based drug design, structural biology has revealed mechanisms of allosteric regulation, distinct effects of activating oncogenic mutations, and other vulnerabilities that have opened new avenues in precision cancer drug discovery.

17.
Fibroadenoma is the most common type of benign breast tumor, accounting for 90% of benign lesions in India. Somatic mutations in the mediator complex subunit 12 (MED12) gene play a critical role in fibroepithelial tumorigenesis. The current study evaluated the hotspot region encompassing exon 2 of the MED12 gene, in benign and malignant breast tumor tissue from women who presented for breast lump evaluation. A total of 100 (80 fibroadenoma and 20 breast cancer) samples were analyzed by polymerase chain reaction-Sanger sequencing. Sequence variant analysis showed that 68.75% of nucleotide changes were found in exon 2 and the remaining in the adjacent intron 1. Codon 44 was implicated as a hotspot mutation in benign tumors, and 86.36% of the identified mutations involved this codon. An in silico functional analysis of missense mutations using consensus scoring [sorting intolerant from tolerant (SIFT), SIFT seq, Polyphen2, Mutation Assessor, SIFT transFIC, Polyphen2 transFIC, Mutation Assessor transFIC, I-Mutant, DUET, PON-PS, SNAP2, and protein variation effect analyzer] revealed that apart from variants involving codon 44 (G44S; G44H), others like V41A and E55D were also predicted to be deleterious. Most of the missense mutations appeared in the loop region of the MED12 protein, which is expected to affect its functional interaction with cyclin C–CDK8/CDK19, causing loss of mediator-associated cyclin-dependent kinase (CDK) activity. These results suggest a key role of MED12 somatic variations in the pathogenesis of fibroadenoma. For the first time, it was demonstrated that MED12 sequence variations are present in benign breast tumors in the south Indian population.

18.
Whole-genome approaches to identify genetic and epigenetic alterations in cancer genomes have begun to provide new insights into the range of molecular events that occur in human tumours. Although in some cases this knowledge immediately illuminates a path towards diagnostic or therapeutic implementation, the bewildering lists of mutations in each tumour make it clear that systematic functional approaches are also necessary to obtain a comprehensive molecular understanding of cancer. Here we review the current range of methods, assays and approaches for genome-scale interrogation of gene function in cancer. We also discuss the integration of functional-genomics approaches with the outputs from cancer genome sequencing efforts.

19.
Pyrroline-5-carboxylate reductase (P5CR), encoded by the PYCR1 gene, is a housekeeping enzyme that catalyzes the reduction of P5C to proline using NAD(P)H as the cofactor. In this study, we used in silico approaches to examine the role of nonsynonymous single-nucleotide polymorphisms in the PYCR1 gene and their putative functions in the pathogenesis of Cutis Laxa. Among the 348 identified SNPs, 15 were predicted to be potentially damaging by both the SIFT and PolyPhen tools; of these, two SNP‐derived mutations, R119G and G206W, have been previously reported to correlate with Cutis Laxa. These two mutations were therefore selected to be mapped to the wild‐type (WT) P5CR structure for further structural and functional analyses. The results of comparative computational analyses using I-Mutant and Autodock reveal reductions in both stability and cofactor binding affinity of these two mutants. Comparative molecular dynamics (MD) simulations were performed to evaluate the changes in dynamic properties of P5CR upon mutation. The results reveal that the two mutations enhance the rigidity of the P5CR structure, especially that of the cofactor binding site, which could result in decreased kinetics of cofactor entrance and egress. Comparison between the structural properties of the WT and mutants during MD simulations shows that the enhanced rigidity of the mutants results most likely from the increased number of inter‐atomic interactions and the decreased number of dynamic hydrogen bonds. Our study provides novel insight into the deleterious effects of the R119G and G206W mutations on P5CR, and sheds light on the mechanisms by which these mutations mediate Cutis Laxa.

20.
Single amino acid substitutions in the globin chain are the most common forms of genetic variations that produce hemoglobinopathies, the most widespread inherited disorders worldwide. Several hemoglobinopathies result from homozygosity or compound heterozygosity for beta-globin (HBB) gene mutations, such as those producing sickle cell hemoglobin (HbS), HbC, HbD and HbE. Several of these mutations are deleterious and result in moderate to severe hemolytic anemia, with associated complications, requiring lifelong care and management. Even though many hemoglobinopathies result from single amino acid changes producing similar structural abnormalities, there are functional differences in the generated variants. Using in silico methods, we examined the genetic variations that can alter the expression and function of the HBB gene. Using the sequence homology-based Sorting Intolerant from Tolerant (SIFT) server, we searched for SNPs, which showed that 200 (80%) non-synonymous polymorphisms were found to be deleterious. The structure-based method via the PolyPhen server indicated that 135 (40%) non-synonymous polymorphisms may modify protein function and structure. The Pupa Suite software showed that the SNPs will have a phenotypic consequence on the structure and function of the altered protein. Structure analysis was performed on the key mutations that occur in the native protein coded by the HBB gene and that cause hemoglobinopathies, such as HbC (E→K), HbD (E→Q), HbE (E→K) and HbS (E→V). Atomic Non-Local Environment Assessment (ANOLEA), Yet Another Scientific Artificial Reality Application (YASARA), the CHARMM-GUI webserver for macromolecular dynamics and mechanics, and Normal Mode Analysis, Deformation and Refinement (NOMAD-Ref) of the Gromacs server were used to perform molecular dynamics simulations and energy minimization calculations on the β-chain residues of the HBB gene product before and after mutation. Furthermore, in the native and altered protein models, amino acid residues were determined and secondary structures were observed for solvent accessibility to confirm the protein stability. The functional study in this investigation may be a good model for additional future studies.
