Similar articles
 Found 20 similar articles; search took 15 ms
1.
2.
Tumorigenesis is a multi-step process in which normal cells transform into malignant tumors following the accumulation of genetic mutations that enable them to evade the growth control checkpoints that would normally suppress their growth or result in apoptosis. It is therefore important to identify those combinations of mutations that collaborate in cancer development and progression. DNA copy number alterations (CNAs) are one of the ways in which cancer genes are deregulated in tumor cells. We hypothesized that synergistic interactions between cancer genes might be identified by looking for regions of co-occurring gain and/or loss. To this end we developed a scoring framework to separate truly co-occurring aberrations from passenger mutations and dominant single signals present in the data. The resulting regions of high co-occurrence can be investigated for between-region functional interactions. Analysis of high-resolution DNA copy number data from a panel of 95 hematological tumor cell lines correctly identified co-occurring recombinations at the T-cell receptor and immunoglobulin loci in T- and B-cell malignancies, respectively, showing that we can recover truly co-occurring genomic alterations. In addition, our analysis revealed networks of co-occurring genomic losses and gains that are enriched for cancer genes. These networks are also highly enriched for functional relationships between genes. We further examine sub-networks of these networks, termed core networks, which contain many known cancer genes. The core network for co-occurring DNA losses we find seems to be independent of the canonical cancer genes within the network. Our findings suggest that large-scale, low-intensity copy number alterations may be an important feature of cancer development or maintenance by affecting gene dosage of a large interconnected network of functionally related genes.

3.
A major challenge in interpreting the large volume of mutation data identified by next-generation sequencing (NGS) is to distinguish driver mutations from neutral passenger mutations to facilitate the identification of targetable genes and new drugs. Current approaches are primarily based on mutation frequencies of single genes, which lack the power to detect infrequently mutated driver genes and ignore functional interconnection and regulation among cancer genes. We propose a novel mutation network method, VarWalker, to prioritize driver genes in large-scale cancer mutation data. VarWalker fits generalized additive models for each sample based on sample-specific mutation profiles and builds on the joint frequency of both mutated genes and their close interactors. These interactors are selected and optimized using the Random Walk with Restart algorithm in a protein-protein interaction network. We applied the method to more than 300 tumor genomes in two large-scale NGS benchmark datasets: 183 lung adenocarcinoma samples and 121 melanoma samples. In each cancer, we derived a consensus mutation subnetwork containing significantly enriched consensus cancer genes and cancer-related functional pathways. These cancer-specific mutation networks were then validated using independent datasets for each cancer. Importantly, VarWalker prioritizes well-known, infrequently mutated genes, which are shown to interact with highly recurrently mutated genes yet have been ignored by conventional single-gene-based approaches. Utilizing VarWalker, we demonstrated that network-assisted approaches can be effectively adapted to facilitate the detection of cancer driver genes in NGS data.
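The abstract above names the Random Walk with Restart algorithm as the way interactors are selected in a protein-protein interaction network. The following is a minimal, generic sketch of that algorithm only, not the VarWalker implementation; the function name, the `restart` value of 0.5, and the toy four-gene chain are illustrative assumptions.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Generic Random Walk with Restart on an undirected graph.

    adjacency: dict mapping each node to the set of its neighbours.
    seeds:     set of start nodes (assumed to be a subset of adjacency).
    Returns a dict of steady-state visiting probabilities per node.
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            if not adjacency[u]:
                # Isolated node: the walking mass has nowhere to go, so it stays.
                nxt[u] += (1.0 - restart) * p[u]
                continue
            # Otherwise the walker moves to a uniformly chosen neighbour.
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: a chain A - B - C - D with A as the mutated seed gene.
toy_network = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
scores = random_walk_with_restart(toy_network, seeds={"A"})
```

In this toy chain the score decays with distance from the seed, which is the property that lets a walk of this kind rank close interactors of a mutated gene above distant ones.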

4.
Background: Genome-wide association studies have successfully identified several hundred independent loci harboring common cancer susceptibility alleles that are distinct from the more than 110 cancer predisposition genes. The latter are generally characterized by disruptive mutations in coding genes that have been established as ‘drivers’ of cancer in large somatic sequencing studies. We set out to determine whether, similarly, common cancer susceptibility loci map to genes that have altered frequencies of mutation. Results: In our analysis of the intervals defined by the cancer susceptibility markers, we observed that cancer susceptibility regions have gene mutation frequencies comparable to background mutation frequencies. Restricting analyses to genes that have been determined to be pleiotropic across cancer types, genes affected by expression quantitative trait loci, or functional genes indicates that most cancer susceptibility genes classified into these subgroups do not display mutation frequencies that deviate from those expected. We observed limited evidence that cancer susceptibility regions that harbor common alleles with small estimated effect sizes are preferential targets for altered somatic mutation frequencies. Conclusions: Our findings suggest a complex interplay between germline susceptibility and somatic mutation, underscoring the cumulative effect of common variants on redundant pathways as opposed to driver genes. Complex biological pathways and networks likely link these genetic features of carcinogenesis, particularly as they relate to distinct polygenic models for each cancer type.

5.
张杨  沈晓沛  王靖  朱晶  郭政 《生物信息学》2011,9(3):217-219,223
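Cancer is a genetically heterogeneous disease involving alterations in multiple genes, with genetic variants in different genes across many biological functional pathways. Identifying cancer genes is therefore a challenging task. We propose a method for screening candidate cancer genes by searching for genes whose mutations significantly co-occur in cancer samples. Applying this method to mutation profile data of protein kinase genes in cancer tissues, we found 167 significantly co-mutated gene pairs comprising 85 genes. Analysis of these 167 gene pairs showed that (1) co-mutated genes are enriched for known cancer genes, and (2) co-mutated gene pairs tend to jointly perturb pairs of cancer-related pathways. These results suggest that genes with significantly co-occurring mutations in cancer samples tend to be candidate cancer genes, and that genes playing important roles in carcinogenesis tend to cooperatively perturb different cancer-related cellular biological processes.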
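The abstract above does not say which statistical test was used to call a gene pair significantly co-mutated. As a rough illustration of the idea, the sketch below assumes a one-sided hypergeometric (Fisher's exact) test on sample counts; the function name and the example counts are hypothetical.

```python
from math import comb


def cooccurrence_pvalue(n_samples, n_a, n_b, n_both):
    """One-sided hypergeometric p-value for co-mutation of two genes.

    n_samples: total number of tumour samples
    n_a, n_b:  number of samples in which gene A / gene B is mutated
    n_both:    number of samples in which both genes are mutated
    Returns P(overlap >= n_both) if the mutations were independent.
    """
    total = comb(n_samples, n_b)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(n_a, k) * comb(n_samples - n_a, n_b - k)
        for k in range(n_both, min(n_a, n_b) + 1)
    )
    return tail / total


# Hypothetical counts: 10 samples, each gene mutated in 3, both mutated in the same 3.
p_value = cooccurrence_pvalue(n_samples=10, n_a=3, n_b=3, n_both=3)
```

A small p-value indicates that the two genes are mutated in the same samples more often than chance would suggest. In a screen over many gene pairs, these p-values would still need a multiple-testing correction.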

6.
Large-scale sequencing of cancer genomes has revealed many novel mutations and inter-tumoral heterogeneity. Therefore, prioritizing variants according to their potential deleterious effects has become essential. We constructed a disease gene network and proposed a Bayesian ensemble approach that integrates diverse sources to predict the functional effects of missense variants. We analyzed 23,336 missense disease mutations and 36,232 neutral polymorphisms of 12,039 human proteins. The results showed successful improvement of prediction accuracy in both sensitivity and specificity, and we demonstrated the utility of the method by applying it to somatic mutations obtained from colorectal and breast cancer cell lines. The candidate genes with predicted deleterious mutations as well as known cancer genes were significantly enriched in many KEGG pathways related to carcinogenesis, supporting genetic homogeneity of cancer at the pathway level. The breast cancer-specific network increased the prediction accuracy for breast cancer mutations. This study provides a ranked list of deleterious mutations and candidate cancer genes and suggests that mutations affecting cancer may occur in important pathways and should be interpreted on the phenotype-related network or pathway. A disease gene network may be of value in predicting functional effects of novel disease-specific mutations.

7.
The genetic etiology of hereditary breast cancer has not been fully elucidated. Although germline mutations of high-penetrance genes such as BRCA1/2 are implicated in the development of hereditary breast cancers, at least half of all breast cancer families are not linked to these genes. To identify a comprehensive spectrum of genetic factors for hereditary breast cancer in a Chinese population, we performed an analysis of germline mutations in 2,165 coding exons of 152 genes associated with hereditary cancer using next-generation sequencing (NGS) in 99 breast cancer patients from families of cancer patients regardless of cancer types. Forty-two deleterious germline mutations were identified in 21 genes of 34 patients, including 18 (18.2%) BRCA1 or BRCA2 mutations, 3 (3%) TP53 mutations, 5 (5.1%) DNA mismatch repair gene mutations, 1 (1%) CDH1 mutation, 6 (6.1%) Fanconi anemia pathway gene mutations, and 9 (9.1%) mutations in other genes. Of seven patients who carried mutations in more than one gene, four were BRCA1/2 mutation carriers, and their average onset age was much younger than that of patients with only BRCA1/2 mutations. Almost all identified high-penetrance gene mutations in those families fulfill the typical phenotypes of hereditary cancer syndromes listed in the National Comprehensive Cancer Network (NCCN) guidelines, except two TP53 and three mismatch repair gene mutations. Furthermore, functional studies of MSH3 germline mutations confirmed the association between MSH3 mutation and tumorigenesis, and segregation analysis suggested antagonism between BRCA1 and MSH3. We also identified many low-penetrance gene mutations. Although the clinical significance of those newly identified low-penetrance gene mutations has not been fully appreciated yet, these new findings do provide valuable epidemiological information for future studies.
Together, these findings highlight the importance of genetic testing based on NCCN guidelines and suggest that a multi-gene analysis using NGS may be a supplement to traditional genetic counseling.

8.
BACKGROUND: Cancer sequencing projects are now measuring somatic mutations in large numbers of cancer genomes. A key challenge in interpreting these data is to distinguish driver mutations, mutations important for cancer development, from passenger mutations that have accumulated in somatic cells but without functional consequences. A common approach to identify genes harboring driver mutations is a single gene test that identifies individual genes that are recurrently mutated in a significant number of cancer genomes. However, the power of this test is reduced by: (1) the necessity of estimating the background mutation rate (BMR) for each gene; (2) the mutational heterogeneity of most cancers, which means that groups of genes (e.g. pathways), rather than single genes, are the primary target of mutations. RESULTS: We investigate the problem of discovering driver pathways, groups of genes containing driver mutations, directly from cancer mutation data and without prior knowledge of pathways or other interactions between genes. We introduce two generative models of somatic mutations in cancer and study the algorithmic complexity of discovering driver pathways in both models. We show that a single gene test for driver genes is highly sensitive to the estimate of the BMR. In contrast, we show that an algorithmic approach that maximizes a straightforward measure of the mutational properties of a driver pathway successfully discovers these groups of genes without an estimate of the BMR. Moreover, this approach is also successful in the case when the observed frequencies of passenger and driver mutations are indistinguishable, a situation where single gene tests fail. CONCLUSIONS: Accurate estimation of the BMR is a challenging task. Thus, methods that do not require an estimate of the BMR, such as the ones we provide here, can give increased power for the discovery of driver genes.

9.
Improved efforts are necessary to define the functional product of cancer mutations currently being revealed through large-scale sequencing efforts. Using genome-scale pooled shRNA screening technology, we mapped negative genetic interactions across a set of isogenic cancer cell lines and confirmed hundreds of these interactions in orthogonal co-culture competition assays to generate a high-confidence genetic interaction network of differential essentiality (DiE) genes. The network uncovered examples of conserved genetic interactions, densely connected functional modules derived from comparative genomics with model systems data, functions for uncharacterized genes in the human genome and targetable vulnerabilities. Finally, we demonstrate a general applicability of DiE gene signatures in determining genetic dependencies of other non-isogenic cancer cell lines. For example, the PTEN−/− DiE genes reveal a signature that can preferentially classify PTEN-dependent genotypes across a series of non-isogenic cell lines derived from breast, pancreatic and ovarian cancers. Our reference network suggests that many cancer vulnerabilities remain to be discovered through systematic derivation of a network of differentially essential genes in an isogenic cancer cell model.

10.
11.
Gundry M  Vijg J 《Mutation research》2012,729(1-2):1-15
DNA mutations are the source of genetic variation within populations. The majority of mutations with observable effects are deleterious. In humans, mutations in the germ line can cause genetic disease. In somatic cells, multiple rounds of mutation and selection lead to cancer. The study of genetic variation has progressed rapidly since the completion of the draft sequence of the human genome. Recent advances in sequencing technology, most importantly the introduction of massively parallel sequencing (MPS), have resulted in more than a hundred-fold reduction in the time and cost required for sequencing nucleic acids. These improvements have greatly expanded the use of sequencing as a practical tool for mutation analysis. While in the past the high cost of sequencing limited mutation analysis to selectable markers or small forward mutation targets assumed to be representative of the genome overall, current platforms allow whole genome sequencing for less than $5000. This has already given rise to direct estimates of germline mutation rates in multiple organisms including humans by comparing whole genome sequences between parents and offspring. Here we present a brief history of the field of mutation research, with a focus on classical tools for the measurement of mutation rates. We then review MPS, how it is currently applied and the new insight into human and animal mutation frequencies and spectra that has been obtained from whole genome sequencing. While great progress has been made, we note that the single most important limitation of current MPS approaches for mutation analysis is the inability to address low-abundance mutations that turn somatic tissues into mosaics of cells. Such mutations are at the basis of intra-tumor heterogeneity, with important implications for clinical diagnosis, and could also contribute to somatic diseases other than cancer, including aging.
Some possible approaches to gain access to low-abundance mutations are discussed, with a brief overview of new sequencing platforms that are currently waiting in the wings to advance this exploding field even further.

12.
Wang J  Zhang Y  Shen X  Zhu J  Zhang L  Zou J  Guo Z 《Molecular bioSystems》2011,7(4):1158-1166
Finding candidate cancer genes playing causal roles in carcinogenesis is an important task in cancer research. The non-randomness of the co-mutation of genes in cancer samples can provide statistical evidence for these genes' involvement in carcinogenesis. It can also provide important information on the functional cooperation of gene mutations in cancer. However, due to the relatively small sample sizes used in current high-throughput somatic mutation screening studies and the extraordinarily large number of hypothesis tests, the statistical power of finding co-mutated gene pairs based on high-throughput somatic mutation data of cancer genomes is very low. Thus, we proposed a stratified FDR (False Discovery Rate) control approach for identifying significantly co-mutated gene pairs according to the mutation frequency of genes. We then compared the co-mutated gene pairs identified by pre-selecting genes with higher mutation frequencies with those identified by the stratified FDR control approach. Finally, we searched for pairs of pathways annotated with significantly more between-pathway co-mutated gene pairs to evaluate the functional roles of the identified co-mutated gene pairs. Based on two datasets of somatic mutations in cancer genomes, we demonstrated that, at a given FDR level, the power of finding co-mutated gene pairs could be increased by pre-selecting genes with higher mutation frequencies. However, many true co-mutations between genes with lower mutation rates will still be missed. By the stratified FDR control approach, many more co-mutated gene pairs could be found. Finally, the identified pathway pairs significantly overrepresented with between-pathway co-mutated gene pairs suggested that their co-dysregulations may play causal roles in carcinogenesis.
The stratified FDR control strategy is efficient in identifying co-mutated gene pairs, and the genes in the identified co-mutated gene pairs can be considered as candidate cancer genes because their non-random co-mutations in cancer genomes are highly unlikely to be attributable to chance.
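The abstract above describes controlling the FDR separately within strata defined by mutation frequency, but it does not spell out the procedure. The sketch below assumes the Benjamini-Hochberg step-up procedure applied independently per stratum; the function names, the stratum labels, and all p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the indices rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its BH threshold.
        if pvalues[i] <= alpha * rank / m:
            cutoff = rank
    return set(order[:cutoff])


def stratified_bh(pvalues, strata, alpha=0.05):
    """Apply Benjamini-Hochberg separately within each stratum.

    pvalues: dict mapping a gene pair to its p-value
    strata:  dict mapping the same gene pair to a stratum label
    Returns the set of gene pairs rejected in any stratum.
    """
    groups = {}
    for pair, stratum in strata.items():
        groups.setdefault(stratum, []).append(pair)
    rejected = set()
    for pairs in groups.values():
        ps = [pvalues[pair] for pair in pairs]
        rejected |= {pairs[i] for i in benjamini_hochberg(ps, alpha)}
    return rejected


# Hypothetical p-values: one pair in stratum "s1" and nine pairs in stratum "s2".
pvals = {"g1-g2": 0.04}
labels = {"g1-g2": "s1"}
for idx, p in enumerate([0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9]):
    pvals[f"pair{idx}"] = p
    labels[f"pair{idx}"] = "s2"

pooled = benjamini_hochberg(list(pvals.values()))
stratified = stratified_bh(pvals, labels)
```

In this toy input the pooled correction rejects nothing, because the single small p-value is judged against all ten tests, while the stratified version rejects `g1-g2`, because that pair is judged only within its own stratum. This shows how stratification changes the thresholds; it is not evidence about power on real data.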

13.
14.
The availability of the human genome sequence and progress in sequencing and bioinformatic technologies have enabled genome-wide investigation of somatic mutations in human cancers. This article briefly reviews challenges arising in the statistical analysis of mutational data of this kind. A first challenge is that of designing studies that efficiently allocate sequencing resources. We show that this can be addressed by two-stage designs and demonstrate via simulations that even relatively small studies can produce lists of candidate cancer genes that are highly informative for future research efforts. A second challenge is to distinguish mutated genes that are selected for by cancer (drivers) from mutated genes that have no role in the development of cancer and simply happened to mutate (passengers). We suggest that this question is best approached as a classification problem and discuss some of the difficulties of more traditional testing-based approaches. A third challenge is to identify biologic processes affected by the driver genes. This can be pursued by gene set analyses. These can reliably identify functional groups and pathways that are enriched for mutated genes even when the individual genes involved in those pathways or sets are not mutated at sufficient frequencies to provide conclusive evidence as drivers.

15.
Background: Preliminary investigation revealed that low-density lipoprotein receptor-related protein 1b (LRP1B) and FAT atypical cadherin (FAT) family mutations might serve as immune regulators in certain tumor microenvironments. Experimental design: We curated a total of 70 non-small cell lung cancer (NSCLC) patients who harbored alterations in LRP1B and/or FAT family (FAT1/2/3/4) based on next-generation sequencing (NGS) to analyze multiple-dimensional data types, including comutant status, tumor mutation burden (TMB), programmed death receptor ligand 1 (PD-L1) expression, T cell-inflamed gene expression profiling (GEP) and therapy response. Results: Twenty patients with co-occurring mutations in LRP1B and FAT1/2/3/4 showed a relatively higher TMB level of 17.05 mut/Mb compared with 7.60 mut/Mb and 8.80 mut/Mb in the single LRP1B and FAT mutation groups, respectively. LRP1B and FAT members showed specifically enriched T cell-inflamed genes and co-occurring mutant TP53 status in NSCLC patients who harbor LRP1B/FAT comutations. Conclusions: This work provides evidence that co-occurring mutations of LRP1B and FAT in NSCLC may serve as a group of potential predictive factors in guiding immunotherapy on the basis of their association with TMB status.

16.
The decreasing cost of sequencing is leading to a growing repertoire of personal genomes. However, we are lagging behind in understanding the functional consequences of the millions of variants obtained from sequencing. Global system-wide effects of variants in coding genes are particularly poorly understood. It is known that while variants in some genes can lead to diseases, complete disruption of other genes, called ‘loss-of-function tolerant’, is possible with no obvious effect. Here, we build a systems-based classifier to quantitatively estimate the global perturbation caused by deleterious mutations in each gene. We first survey the degree to which gene centrality in various individual networks and a unified ‘Multinet’ correlates with the tolerance to loss-of-function mutations and evolutionary conservation. We find that functionally significant and highly conserved genes tend to be more central in physical protein-protein and regulatory networks. However, this is not the case for metabolic pathways, where the highly central genes have more duplicated copies and are more tolerant to loss-of-function mutations. Integration of three-dimensional protein structures reveals that the correlation with centrality in the protein-protein interaction network is also seen in terms of the number of interaction interfaces used. Finally, combining all the network and evolutionary properties allows us to build a classifier distinguishing functionally essential and loss-of-function tolerant genes with higher accuracy (AUC = 0.91) than any individual property. Application of the classifier to the whole genome shows its strong potential for interpretation of variants involved in Mendelian diseases and in complex disorders probed by genome-wide association studies.

17.
Natural selection is a significant force that shapes the architecture of the human genome and introduces diversity across global populations. The question of whether advantageous mutations have arisen in the human genome as a result of single or multiple mutation events remains unanswered except for the fact that there exist a handful of genes such as those that confer lactase persistence, affect skin pigmentation, or cause sickle cell anemia. We have developed a long-range-haplotype method for identifying genomic signatures of positive selection to complement existing methods, such as the integrated haplotype score (iHS) or cross-population extended haplotype homozygosity (XP-EHH), for locating signals across the entire allele frequency spectrum. Our method also locates the founder haplotypes that carry the advantageous variants and infers their corresponding population frequencies. This presents an opportunity to systematically interrogate, across the whole human genome, whether a selection signal shared across different populations is the consequence of a single mutation process followed subsequently by gene flow between populations or of convergent evolution due to the occurrence of multiple independent mutation events either at the same variant or within the same gene. The application of our method to data from 14 populations across the world revealed that positive-selection events tend to cluster in populations of the same ancestry. Comparing the founder haplotypes for events that are present across different populations revealed that convergent evolution is a rare occurrence and that the majority of shared signals stem from the same evolutionary event.

18.
Protein kinases are the most common protein domains implicated in cancer, where somatically acquired mutations are known to be functionally linked to a variety of cancers. Resequencing studies of protein kinase coding regions have emphasized the importance of sequence and structure determinants of cancer-causing kinase mutations in understanding the mutation-dependent activation process. We have developed an integrated bioinformatics resource, which consolidated and mapped all currently available information on genetic modifications in protein kinase genes with sequence, structure and functional data. The integration of diverse data types provided a convenient framework for kinome-wide study of sequence-based and structure-based signatures of cancer mutations. The database-driven analysis has revealed a differential enrichment of SNP categories in functional regions of the kinase domain, demonstrating that a significant number of cancer mutations could fall at structurally equivalent positions (mutational hotspots) within the catalytic core. We have also found that structurally conserved mutational hotspots can be shared by multiple kinase genes and are often enriched in cancer driver mutations with high oncogenic activity. Structural modeling and energetic analysis of the mutational hotspots have suggested a common molecular mechanism of kinase activation by cancer mutations, and have allowed us to reconcile the experimental data. According to the proposed mechanism, the structural effect of kinase mutations with a high oncogenic potential may manifest in a significant destabilization of the autoinhibited kinase form, which is likely to drive tumorigenesis at some level. Structure-based functional annotation and prediction of cancer mutation effects in protein kinases can facilitate an understanding of the mutation-dependent activation process and inform experimental studies exploring molecular pathology of tumorigenesis.

19.
Cancer drivers are genomic alterations that provide cells containing them with a selective advantage over their local competitors, whereas neutral passengers do not change the somatic fitness of cells. Cancer-driving mutations are usually discriminated from passenger mutations by their higher degree of recurrence in tumor samples. However, there is increasing evidence that many additional driver mutations may exist that occur at very low frequencies among tumors. This observation has prompted alternative methods for driver detection, including finding groups of mutually exclusive mutations and incorporating prior biological knowledge about gene function or network structure. Dependencies among drivers due to epistatic interactions can also result in low mutation frequencies, but this effect has been ignored in driver detection so far. Here, we present a new computational approach for identifying genomic alterations that occur at low frequencies because they depend on other events. Unlike passengers, these constrained mutations display punctuated patterns of occurrence in time. We test this driver–passenger discrimination approach based on mutation timing in extensive simulation studies, and we apply it to cross-sectional copy number alteration (CNA) data from ovarian cancer, CNA and single-nucleotide variant (SNV) data from breast tumors and SNV data from colorectal cancer. Among the top ranked predicted drivers, we find low-frequency genes that have already been shown to be involved in carcinogenesis, as well as many new candidate drivers. The mutation timing approach is orthogonal and complementary to existing driver prediction methods. It will help identify, from cancer genome data, the alterations that drive tumor progression.

20.