Similar articles
20 similar articles found (search time: 31 ms)
1.
Mutational processes shape the genomes of cancer patients and their understanding has important applications in diagnosis and treatment. Current modeling of mutational processes by identifying their characteristic signatures views each base substitution in a limited context of a single flanking base on each side. This context definition gives rise to 96 categories of mutations that have become the standard in the field, even though wider contexts have been shown to be informative in specific cases. Here we propose a data-driven approach for constructing a mutation categorization for mutational signature analysis. Our approach is based on the assumption that tumor cells that are exposed to similar mutational processes show similar expression levels of DNA damage repair genes that are involved in these processes. We attempt to find a categorization that maximizes the agreement between mutation and gene expression data, and show that it outperforms the standard categorization over multiple quality measures. Moreover, we show that the categorization we identify generalizes to unseen data from different cancer types, suggesting that mutation context patterns extend beyond the immediate flanking bases.
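
The figure of 96 categories follows from simple counting, which the sketch below makes explicit. It assumes the common convention (not spelled out in the abstract) of writing each substitution from the pyrimidine strand, which leaves six substitution classes, each combined with four possible 5' bases and four possible 3' bases. The function names are illustrative only.

```python
from itertools import product

BASES = "ACGT"
# Six substitution classes, written from the pyrimidine strand (assumed convention).
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def trinucleotide_categories():
    """Return every single-flank category as a string such as 'A[C>T]G'."""
    return [
        f"{five}[{sub}]{three}"
        for sub in SUBSTITUTIONS
        for five, three in product(BASES, repeat=2)
    ]


def count_categories(flank):
    """Number of categories when `flank` bases are considered on each side."""
    return len(SUBSTITUTIONS) * len(BASES) ** (2 * flank)


categories = trinucleotide_categories()
print(len(categories), count_categories(2))
```

Widening the context to two flanking bases per side multiplies the count by 16, which illustrates why wider contexts quickly become sparse and why a data-driven grouping may be attractive.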

2.
Paul Little, Li Hsu, Wei Sun. Biometrics, 2023, 79(3): 2705-2718
Somatic mutations in cancer patients are inherently sparse and potentially high dimensional. Cancer patients may share the same set of deregulated biological processes perturbed by different sets of somatically mutated genes. Therefore, when assessing the associations between somatic mutations and clinical outcomes, gene-by-gene analysis is often under-powered because it does not capture the complex disease mechanisms shared across cancer patients. Rather than testing genes one by one, an intuitive approach is to aggregate somatic mutation data of multiple genes to assess their joint association with clinical outcomes. The challenge is how to aggregate such information. Building on the optimal transport method, we propose a principled approach to estimate the similarity of somatic mutation profiles of multiple genes between tumor samples, while accounting for gene–gene similarities defined by gene annotations or empirical mutational patterns. Using such similarities, we can assess the associations between somatic mutations and clinical outcomes by kernel regression. We have applied our method to analyze somatic mutation data of 17 cancer types and identified at least five cancer types, where somatic mutations are associated with overall survival, progression-free interval, or cytolytic activity.

3.
Protein-protein interactions are critical to most biological processes, and locating protein-protein interfaces on protein structures is an important task in molecular biology. We developed a new experimental strategy called the ‘absence of interference’ approach to determine surface residues involved in protein-protein interaction of established yeast two-hybrid pairs of interacting proteins. One of the proteins is subjected to high-level randomization by error-prone PCR. The resulting library is selected by the yeast two-hybrid system for interacting clones that are isolated and sequenced. The interaction region can be identified by an absence or depletion of mutations. For data analysis and presentation, we developed a Web interface that analyzes the mutational spectrum and displays the mutational frequency on the surface of the structure (or a structural model) of the randomized protein. Additionally, this interface might be of use for the display of mutational distributions determined by other types of random mutagenesis experiments. We applied the approach to map the interface of the catalytic domain of the DNA methyltransferase Dnmt3a with its regulatory factor Dnmt3L. Dnmt3a was randomized with high mutational load. A total of 76 interacting clones were isolated and sequenced, and 648 mutations were identified. The mutational pattern allowed us to identify a unique interaction region on the surface of Dnmt3a, which comprises about 500-600 Å². The results were confirmed by site-directed mutagenesis and structural analysis. The absence-of-interference approach will allow high-throughput mapping of protein interaction sites suitable for functional studies and protein docking.

4.
As the ultimate source of genetic variation, spontaneous mutation is essential to evolutionary change. Theoretical studies over several decades have revealed the dependence of evolutionary consequences of mutation on specific mutational properties, including genomic mutation rates, U, and the effects of newly arising mutations on individual fitness, s. The recent resurgence of empirical effort to infer these properties for diverse organisms has not achieved consensus. Estimates, which have been obtained by methods that assume mutations are unidirectional in their effects on fitness, are imprecise. Both because a general approach must allow for occurrence of fitness-enhancing mutations, even if these are rare, and because recent evidence demands it, we present a new method for inferring mutational parameters. For the distribution of mutational effects, we retain Keightley's assumption of the gamma distribution, to take advantage of the flexibility of its shape. Because the conventional gamma is one sided, restricting it to unidirectional effects, we include an additional parameter, rho, as an amount it is displaced from zero. Estimation is accomplished by Markov chain Monte Carlo maximum likelihood. Through a limited set of simulations, we verify the accuracy of this approach. We apply it to analyze data on two reproductive fitness components from a 17-generation mutation-accumulation study of a Columbia accession of Arabidopsis thaliana in which 40 lines sampled in three generations were assayed simultaneously. For these traits, U ≈ 0.1-0.2, with distributions of mutational effects broadly spanning zero, such that roughly half the mutations reduce reproductive fitness. One evolutionary consequence of these results is lower extinction risks of small populations of A. thaliana than expected from the process of mutational meltdown.
A comprehensive view of the evolutionary consequences of mutation will depend on quantitatively accounting for fitness-enhancing, as well as fitness-reducing, mutations.
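
The displaced gamma idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: the abstract does not give the sign convention or parameter values, so here an effect is drawn as a gamma variate minus rho, negative values are read as effects on the opposite side of zero, and the shape, scale, and rho values are made up for illustration.

```python
import random


def sample_displaced_gamma(shape, scale, rho, n, seed=0):
    """Draw n effects from a gamma(shape, scale) shifted left by rho.

    With rho = 0 this is the conventional one-sided gamma; with rho > 0
    the distribution spans zero, so effects of both signs can occur.
    The sign convention is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    return [rng.gammavariate(shape, scale) - rho for _ in range(n)]


def fraction_below_zero(effects):
    """Share of sampled effects that fall below zero."""
    return sum(1 for e in effects if e < 0) / len(effects)


one_sided = sample_displaced_gamma(shape=2.0, scale=1.0, rho=0.0, n=5000)
# rho chosen near the median of gamma(2, 1), so about half the draws are negative.
displaced = sample_displaced_gamma(shape=2.0, scale=1.0, rho=1.678, n=5000)
print(fraction_below_zero(one_sided), fraction_below_zero(displaced))
```

Choosing rho close to the median of the underlying gamma reproduces the qualitative situation described above, in which roughly half of the effects lie on each side of zero.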

5.
We have used mathematical modeling and statistical analysis to examine the correlation between UV-induced DNA damage and resulting base-substitution mutations in mammalian cells. The frequency and site specificity of UV-induced photoproducts in the supF gene of the pZ189 shuttle vector plasmid were compared with the frequency and site specificity of base-substitution mutations induced upon passage of the UV-irradiated vector in monkey cells. The hypothesis that the observed mutational spectrum is due to a preferential insertion of adenosine opposite UV photoproducts in the DNA template was found to best explain the mutational data. Models in which it was postulated that only (6-4) photoproducts, and not cyclobutane dimers, are mutagenic, or that the relative frequency of photoproduct formation does not influence mutation frequencies, fit the data much less well. This analysis demonstrates that molecular mechanisms of mutagenesis in mammalian cells can be deduced from mutational data obtained with a shuttle vector system.

6.
7.
Griswold CK, Whitlock MC. Genetics, 2003, 165(4): 2181-2192
Pleiotropy allows for the deterministic fixation of bidirectional mutations: mutations with effects both in the direction of selection and opposite to selection for the same character. Mutations with deleterious effects on some characters can fix because of beneficial effects on other characters. This study analytically quantifies the expected frequency of mutations that fix with negative and positive effects on a character and the average size of a fixed effect on a character when a mutation pleiotropically affects from very few to many characters. The analysis allows for mutational distributions that vary in shape and provides a framework that would allow for varying the frequency at which mutations arise with deleterious and positive effects on characters. The results show that a large fraction of fixed mutations will have deleterious pleiotropic effects even when mutation affects as few as two characters and only directional selection is occurring, and, not surprisingly, as the degree of pleiotropy increases the frequency of fixed deleterious effects increases. As a point of comparison, we show how stabilizing selection and random genetic drift affect the bidirectional distribution of fixed mutational effects. The results are then applied to QTL studies that seek to find loci that contribute to phenotypic differences between populations or species. It is shown that QTL studies are biased against detecting chromosome regions that have deleterious pleiotropic effects on characters.

8.
Protein kinases are the most common protein domains implicated in cancer, where somatically acquired mutations are known to be functionally linked to a variety of cancers. Resequencing studies of protein kinase coding regions have emphasized the importance of sequence and structure determinants of cancer-causing kinase mutations in understanding the mutation-dependent activation process. We have developed an integrated bioinformatics resource, which consolidated and mapped all currently available information on genetic modifications in protein kinase genes with sequence, structure and functional data. The integration of diverse data types provided a convenient framework for kinome-wide study of sequence-based and structure-based signatures of cancer mutations. The database-driven analysis has revealed a differential enrichment of SNP categories in functional regions of the kinase domain, demonstrating that a significant number of cancer mutations could fall at structurally equivalent positions (mutational hotspots) within the catalytic core. We have also found that structurally conserved mutational hotspots can be shared by multiple kinase genes and are often enriched by cancer driver mutations with high oncogenic activity. Structural modeling and energetic analysis of the mutational hotspots have suggested a common molecular mechanism of kinase activation by cancer mutations, and have allowed us to reconcile the experimental data. According to a proposed mechanism, the structural effect of kinase mutations with a high oncogenic potential may manifest in a significant destabilization of the autoinhibited kinase form, which is likely to drive tumorigenesis at some level. Structure-based functional annotation and prediction of cancer mutation effects in protein kinases can facilitate an understanding of the mutation-dependent activation process and inform experimental studies exploring molecular pathology of tumorigenesis.

9.
Ferenci T. Heredity, 2008, 100(5): 446-452
The spread of beneficial mutations through populations is at the core of evolutionary change. A long-standing hindrance to understanding mutational sweeps was that beneficial mutations have been slow to be identified, even in commonly studied experimental populations. The lack of information on what constitutes a beneficial mutation has led to many uncertainties about the frequency, fitness benefit and fixation of beneficial mutations. A more complete picture is currently emerging for a limited set of identified mutations in bacterial populations. In turn, this will allow quantitation of several features of mutational sweeps. Most importantly, the 'benefit' of beneficial mutations can now be explained in terms of physiological function and how variations in the environment change the selectability of mutations. Here, the sweep of rpoS mutations in Escherichia coli, in both experimental and natural populations, is described in detail. These studies reveal the subtleties of physiology and regulation that strongly influence the benefit of a mutation and explain differences in sweeps between strains and between various environments.

10.
Jiang C, Zhao Z. Genomics, 2006, 88(5): 527-534
So far, there is no genome-wide estimation of the mutational spectrum in humans. In this study, we systematically examined the directionality of the point mutations and maintenance of GC content in the human genome using approximately 1.8 million high-quality human single nucleotide polymorphisms and their ancestral sequences in chimpanzees. The frequency of C→T (G→A) changes was the highest among all mutation types and the frequency of each type of transition was approximately fourfold that of each type of transversion. In intergenic regions, when the GC content increased, the frequency of changes from G or C increased. In exons, the frequency of G:C→A:T was the highest among the genomic categories and was contributed mainly by the frequent mutations at the CpG sites. In contrast, mutations at the CpG sites, or CpG→TpG/CpA mutations, occurred less frequently in the CpG islands relative to intergenic regions with similar GC content. Our results suggest that the GC content is overall not in equilibrium in the human genome, with a trend toward shifting the human genome to be AT rich and shifting the GC content of a region to approach the genome average. Our results, which differ from previous estimates based on limited loci or on the rodent lineage, provide the first representative and reliable mutational spectrum in the recent human genome and categorized genomic regions.
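
The transition/transversion distinction and the pooling of complementary changes, such as C→T with G→A, can be sketched as follows. This is a minimal illustration rather than the authors' pipeline; the function names are hypothetical, and folding onto the pyrimidine strand is one common convention assumed here.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mutation_class(ancestral, derived):
    """Classify a point change as a transition or a transversion."""
    if ancestral == derived:
        raise ValueError("ancestral and derived bases must differ")
    pair = {ancestral, derived}
    # A transition stays within purines or within pyrimidines.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def fold_strand(ancestral, derived):
    """Express a change from the pyrimidine strand so that, for example,
    G->A is pooled with its complement C->T (assumed convention)."""
    if ancestral in PURINES:
        return COMPLEMENT[ancestral], COMPLEMENT[derived]
    return ancestral, derived


print(mutation_class("C", "T"), fold_strand("G", "A"))
```

Each base has one possible transition and two possible transversions, which is why comparing the frequency of each type of transition with each type of transversion, as the abstract does, differs from comparing the pooled totals.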

11.
ABSTRACT: BACKGROUND: Cancer sequencing projects are now measuring somatic mutations in large numbers of cancer genomes. A key challenge in interpreting these data is to distinguish driver mutations, mutations important for cancer development, from passenger mutations that have accumulated in somatic cells but without functional consequences. A common approach to identify genes harboring driver mutations is a single gene test that identifies individual genes that are recurrently mutated in a significant number of cancer genomes. However, the power of this test is reduced by: (1) the necessity of estimating the background mutation rate (BMR) for each gene; (2) the mutational heterogeneity in most cancers, meaning that groups of genes (e.g. pathways), rather than single genes, are the primary target of mutations. RESULTS: We investigate the problem of discovering driver pathways, groups of genes containing driver mutations, directly from cancer mutation data and without prior knowledge of pathways or other interactions between genes. We introduce two generative models of somatic mutations in cancer and study the algorithmic complexity of discovering driver pathways in both models. We show that a single gene test for driver genes is highly sensitive to the estimate of the BMR. In contrast, we show that an algorithmic approach that maximizes a straightforward measure of the mutational properties of a driver pathway successfully discovers these groups of genes without an estimate of the BMR. Moreover, this approach is also successful in the case when the observed frequencies of passenger and driver mutations are indistinguishable, a situation where single gene tests fail. CONCLUSIONS: Accurate estimation of the BMR is a challenging task. Thus, methods that do not require an estimate of the BMR, such as the ones we provide here, can give increased power for the discovery of driver genes.

12.
The frequency of the most common sporadic Apert syndrome mutation (C755G) in the human fibroblast growth factor receptor 2 gene (FGFR2) is 100–1,000 times higher than expected from average nucleotide substitution rates based on evolutionary studies and the incidence of human genetic diseases. To determine if this increased frequency was due to the nucleotide site having the properties of a mutation hot spot, or some other explanation, we developed a new experimental approach. We examined the spatial distribution of the frequency of the C755G mutation in the germline by dividing four testes from two normal individuals each into several hundred pieces, and, using a highly sensitive PCR assay, we measured the mutation frequency of each piece. We discovered that each testis was characterized by rare foci with mutation frequencies 10³ to >10⁴ times higher than the rest of the testis regions. Using a model based on what is known about human germline development forced us to reject (p < 10⁻⁶) the idea that the C755G mutation arises more frequently because this nucleotide simply has a higher than average mutation rate (hot spot model). This is true regardless of whether mutation is dependent or independent of cell division. An alternate model was examined where positive selection acts on adult self-renewing Ap spermatogonial cells (SrAp) carrying this mutation such that, instead of only replacing themselves, they occasionally produce two SrAp cells. This model could not be rejected given our observed data. Unlike the disease site, similar analysis of C-to-G mutations at a control nucleotide site in one testis pair failed to find any foci with high mutation frequencies.
The rejection of the hot spot model and lack of rejection of a selection model for the C755G mutation, along with other data, provides strong support for the proposal that positive selection in the testis can act to increase the frequency of premeiotic germ cells carrying a mutation deleterious to an offspring, thereby unfavorably altering the mutational load in humans. Studying the anatomical distribution of germline mutations can provide new insights into genetic disease and evolutionary change.

13.
Jacob KD, Eckert KA. Mutation Research, 2007, 619(1-2): 93-103
Slipped strand mispairing during DNA synthesis is one proposed mechanism for microsatellite or short tandem repeat (STR) mutation. However, the DNA polymerase(s) responsible for STR mutagenesis have not been determined. In this study, we investigated the effect of the Escherichia coli dinB gene product (Pol IV) on mononucleotide and dinucleotide repeat stability, using an HSV-tk gene episomal reporter system for microsatellite mutations. For the control vector (HSV-tk gene only) we observed a statistically significant 3.5-fold lower median mutation frequency in dinB(-) than dinB(+) cells (p<0.001, Wilcoxon-Mann-Whitney test). For vectors containing an in-frame mononucleotide allele ([G/C](10)) or either of two dinucleotide alleles ([GT/CA](10) and [TC/AG](11)) we observed no statistically significant difference in the overall HSV-tk mutation frequency observed between dinB(+) and dinB(-) strains. To determine if a mutational bias exists for mutations made by Pol IV, mutational spectra were generated for each STR vector and strain. No statistically significant differences between strains were observed for either the proportion of mutational events at the STR or STR specificity among the three vectors. However, the specificity of mutational events at the STR alleles in each strain varied in a statistically significant manner as a consequence of microsatellite sequence. Our results indicate that while Pol IV contributes to spontaneous mutations within the HSV-tk coding sequence, Pol IV does not play a significant role in spontaneous mutagenesis at [G/C](10), [GT/CA](10), or [TC/AG](11) microsatellite alleles. Our data demonstrate that in a wild type genetic background, the major factor influencing microsatellite mutagenesis is the allelic sequence composition.

14.
Forward mutations induced by the ultimate carcinogen N-acetoxy-N-2-acetylaminofluorene (N-Aco-AAF) in the tetracycline resistance gene carried on plasmid pBR322 are shown to be dependent upon the induction of the host SOS functions in wild-type and umuC Escherichia coli cells. The mutation frequency in the umuC strain is equal to about 40% of the mutation frequency observed in the umu+ background. In the excision-repair-deficient uvrA mutant strain the mutagenic response is the same as in SOS-induced wild-type cells whether or not the uvrA bacteria are SOS-induced. Equal mutation frequencies are obtained in both the wild-type and the uvrA strains for equal modification levels although the survival of AAF-modified plasmid DNA is greatly reduced in the uvrA strain as compared to the wild-type strain. Sequence analysis of the mutations reveals that more than 90% of the N-Aco-AAF-induced mutations are frameshift mutations. Two types of mutational hotspots are observed occurring either at repetitive sequences or at non-repetitive sequences. Both types of mutants appear at similar locations and frequencies in both the wild-type and the uvrA strains. On the other hand, only the non-repetitive sequence mutants are obtained in the umuC background. These non-repetitive sequence mutants preferentially occur within the sequence 5' G-G-C-G-C-C 3' (the NarI restriction enzyme recognition sequence). The analysis of the -AAF binding spectrum to the same DNA fragment shows that there is no direct correlation between the modification spectrum and the mutation spectrum. We suggest that certain sequences are "mutation-prone" in the sense that only these sequences can be efficiently mutated as the result of an active processing mediated by specific proteins. When a sequence is said to be mutation-prone it probably corresponds to a particular structure that is induced within this sequence as a result of the binding to the DNA of the mutagen.
This sequence-specific conformational change is the substrate for the protein(s) that fixes the mutation. The mutagenic processing pathway(s) is part of the cellular response to DNA-damaging agents (the so-called SOS response). Two pathways for frameshift mutagenesis are suggested by the data: a umuC-dependent pathway, which is involved in the mutagenic processing of lesions within repetitive sequences; and a umuC-independent pathway responsible for the fixation of mutations within specific non-repetitive sequences.

15.
HPRT mutations in humans: biomarkers for mechanistic studies. Total citations: 7 (self-citations: 0; citations by others: 7)
The X-chromosomal gene for hypoxanthine-guanine phosphoribosyltransferase (HPRT), first recognized through its human germinal mutations, quickly became a useful target for studies of somatic mutations in vitro and in vivo in humans and animals. In this role, HPRT serves as a simple reporter gene. The in vivo mutational studies have concentrated on peripheral blood lymphocytes, for obvious reasons. In vivo mutations in T cells are now used to monitor humans exposed to environmental mutagens with analyses of molecular mutational spectra serving as adjuncts for determining causation. Studies of the distributions of HPRT mutants among T cell receptor (TCR) gene-defined T cell clones in vivo have revealed an unexpected clonality, suggesting that HPRT mutations may be probes for fundamental cellular and biological processes. Use of HPRT in this way has allowed the analyses of V(D)J recombinase mediated mutations as markers of a mutational process with carcinogenic potential, the use of somatic mutations as surrogate markers for the in vivo T cell proliferation that underlies immunological processes, and the discovery and study of mutator phenotypes in non-malignant T cells. In this last application, the role of HPRT is related to its function, as well as to its utility as a reporter of mutation. Most recently, HPRT is finding use in studies of in vivo selection for in vivo mutations arising in either somatic or germinal cells.

16.
Molecular epidemiology studies have used the counts of different mutational types like transitions, transversions, etc. to identify putative mutagens, with little reference to gene organization and structure–function of the translated product. Moreover, geographical variation in the mutational spectrum is not limited to the mutational types at the nucleotide level but also has a bearing at the functional level. Here, we developed a novel measure to estimate the rate of spontaneous detrimental mutations called “mutation index” for comparing the mutational spectra consisting of all single base, missense, and non-missense changes. We have analyzed 1609 mutations occurring in 38 exons in 24 populations in three diseases viz. hemophilia B (F9 gene – 420 mutations in 9 populations across 8 exons), hemophilia A (F8 gene – 650, 8 and 26, respectively) and ovarian carcinoma (TP53 gene – 539, 7 and 4, respectively). We considered exons as units of evolution instead of the entire gene and observed feeble differences among populations implying lack of a mutagen-specific effect and the possibility of mutation-causing endogenous factors. In all the three genes we observed elevated rates of detrimental mutations in exons encoding regions of significance for the molecular function of the protein. We propose that this can be extended to the entire exome with implications in exon-shuffling and complex human diseases.

17.
18.
Mutational bias is a potentially important agent of evolution, but it is difficult to disentangle the effects of mutation from those of natural selection. Mutation-accumulation experiments, in which mutations are allowed to accumulate at very small population size, thus minimizing the efficiency of natural selection, are the best way to separate the effects of mutation from those of selection. Body size varies greatly among species of nematode in the family Rhabditidae; mutational biases are both a potential cause and a consequence of that variation. We report data on the cumulative effects of mutations that affect body size in three species of rhabditid nematode that vary fivefold in adult size. Results are very consistent with previous studies of mutations underlying fitness in the same strains: two strains of Caenorhabditis briggsae decline in body size about twice as fast as two strains of C. elegans, with a concomitant higher point estimate of the genomic mutation rate; the confamilial Oscheius myriophila is intermediate. There is an overall mutational bias, such that mutations reduce size on average, but the bias appears consistent between species. The genetic correlation between mutations that affect size and those underlying fitness is large and positive, on average.

19.
Wang J, Zhang Y, Shen X, Zhu J, Zhang L, Zou J, Guo Z. Molecular BioSystems, 2011, 7(4): 1158-1166
Finding candidate cancer genes playing causal roles in carcinogenesis is an important task in cancer research. The non-randomness of the co-mutation of genes in cancer samples can provide statistical evidence for these genes' involvement in carcinogenesis. It can also provide important information on the functional cooperation of gene mutations in cancer. However, due to the relatively small sample sizes used in current high-throughput somatic mutation screening studies and the extraordinarily large-scale hypothesis tests, the statistical power of finding co-mutated gene pairs based on high-throughput somatic mutation data of cancer genomes is very low. Thus, we proposed a stratified FDR (False Discovery Rate) control approach, for identifying significantly co-mutated gene pairs according to the mutation frequency of genes. We then compared the identified co-mutated gene pairs separately by pre-selecting genes with higher mutation frequencies and by the stratified FDR control approach. Finally, we searched for pairs of pathways annotated with significantly more between-pathway co-mutated gene pairs to evaluate the functional roles of the identified co-mutated gene pairs. Based on two datasets of somatic mutations in cancer genomes, we demonstrated that, at a given FDR level, the power of finding co-mutated gene pairs could be increased by pre-selecting genes with higher mutation frequencies. However, many true co-mutations between genes with lower mutation rates will still be missed. By the stratified FDR control approach, many more co-mutated gene pairs could be found. Finally, the identified pathway pairs significantly overrepresented with between-pathway co-mutated gene pairs suggested that their co-dysregulations may play causal roles in carcinogenesis.
The stratified FDR control strategy is efficient in identifying co-mutated gene pairs and the genes in the identified co-mutated gene pairs can be considered as candidate cancer genes because their non-random co-mutations in cancer genomes are highly unlikely to be attributable to chance.
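
The general idea of stratified FDR control can be sketched as below. The abstract does not say which FDR procedure is used within each stratum, so this example assumes the Benjamini-Hochberg step-up procedure applied separately per stratum; the p-values and stratum labels are invented purely for illustration.

```python
def benjamini_hochberg(pvalues, alpha):
    """Return the indices rejected by the Benjamini-Hochberg step-up rule."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its threshold.
        if pvalues[idx] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


def stratified_bh(pvalues, strata, alpha):
    """Apply Benjamini-Hochberg separately within each stratum
    (an assumed reading of 'stratified FDR control')."""
    groups = {}
    for i, label in enumerate(strata):
        groups.setdefault(label, []).append(i)
    rejected = []
    for members in groups.values():
        local = benjamini_hochberg([pvalues[i] for i in members], alpha)
        rejected.extend(members[j] for j in local)
    return sorted(rejected)


# Hypothetical p-values for eight gene pairs, labeled by mutation-frequency stratum.
pvals = [0.001, 0.02, 0.03, 0.5, 0.9, 0.8, 0.7, 0.6]
strata = ["high", "high", "high", "low", "low", "low", "low", "low"]
pooled = benjamini_hochberg(pvals, alpha=0.05)
stratified = stratified_bh(pvals, strata, alpha=0.05)
print(pooled, stratified)
```

In this toy input, pooling all tests dilutes the signal-rich stratum with uninformative tests, whereas testing within strata recovers more pairs at the same nominal level, which is the kind of power gain the abstract describes.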

20.
Exposure to tobacco carcinogens is the major cause of human lung cancer, but even heavy smokers have only about a 10% life-time risk of developing lung cancer. Currently used screening processes, based largely on age and exposure status, have proven to be of limited clinical utility in predicting cancer risk. More precise methods of assessing an individual's risk of developing lung cancer are needed. Because of their sensitivity to DNA damage, microsatellites are potentially useful for the assessment of somatic mutational load in normal cells. We assessed mutational load using hypermutable microsatellites in buccal cells obtained from lung carcinoma cases and controls to test if such a measure could be used to estimate lung cancer risk. There was no significant association between smoking status and mutation frequency with any of the markers tested. No significant association between case status and mutation frequency was observed. Age was significantly related to mutation frequency in the microsatellite marker D7S1482. These observations indicate that somatic mutational load, as measured using mutation frequency of microsatellites in buccal cells, increases with increasing age but that subjects who develop lung cancer have a mutational load similar to that of those who remain cancer free. This finding suggests that mutation frequency of microsatellite mutations in buccal cells may not be a promising biomarker for lung cancer risk.
